I have an endpoint that sets a pretty short TTL, but the semantics of my service are such that it’s always useful to return the response to queued requesters even if it’s a bit out of date. BUT as soon as the request finishes, I want to start queueing requests to the backend again.
Essentially, I want to allow only one request at a time to my service and always serve its response to queued requesters, even if that response is a bit stale (but not serve the cached response to anyone who requests after the initial request has finished).
I can’t seem to find a way to do this… perhaps I’m missing something obvious so help would be greatly appreciated!
We’re only using Delivery right now, but we’ll probably need Compute soon-ish to do auth and origin routing. So if moving there immediately is the easiest way to solve this, that’s an option.
Stale-while-revalidate (SWR) is an instruction for Fastly to hold the object in the cache for an additional window (commonly configured to 30-60s max) while the CDN fetches a fresh object from your origin or receives a revalidation instruction to start the cache TTL timer over. You’ll also need to configure ETag headers on your origin so the CDN knows when an object has changed.
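To make that concrete, here is a rough sketch of what an origin might send. The `respond` and `cache_headers` helpers, the 5s `max-age`, and the 30s stale window are illustrative assumptions, not values taken from this thread; the `Cache-Control` and `ETag` header names are standard HTTP.

```python
import hashlib
from typing import Dict, Optional, Tuple


def cache_headers(body: bytes, max_age: int = 5, swr: int = 30) -> Dict[str, str]:
    """Build caching headers: a short TTL plus a stale-while-revalidate window."""
    # A strong ETag derived from the body, so the cache can revalidate cheaply.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        "Cache-Control": f"max-age={max_age}, stale-while-revalidate={swr}",
        "ETag": etag,
    }


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a possibly conditional request."""
    headers = cache_headers(body)
    if if_none_match == headers["ETag"]:
        # Object unchanged: the cache can restart its TTL without a new body.
        return 304, headers, b""
    return 200, headers, body
```

With headers like these, a cache may keep serving the stale object during the window while it revalidates in the background, which is exactly the "immediate cached response" behaviour that SWR is designed to give.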
That doesn’t quite work for my specific application: the HTTP API reads a log from a server. Clients ask for a specific log and an offset. The server can hold the request for a bit if the client is at the tail of the log, in case a new message is appended.
If there are new messages, the clients will repeat the API call at the new offset. But if there aren’t, they’ll keep polling the same offset until there’s a new message.
So SWR wouldn’t work, as it would mean the client gets an immediate cached response. I’d rather they all get collapsed into a single connection to the origin while it waits to respond with a new message.
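The behaviour being asked for is sometimes called "single flight" coalescing. The sketch below shows the intended semantics in plain Python with threads. It is only an illustration of the idea, not how Fastly implements request collapsing, and `SingleFlight`, `do`, and the `fetch` callable are hypothetical names.

```python
import threading
from concurrent.futures import Future
from typing import Any, Callable, Dict, Hashable


class SingleFlight:
    """Allow one in-flight fetch per key; concurrent callers share its result."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[Hashable, Future] = {}

    def do(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        with self._lock:
            fut = self._inflight.get(key)
            leader = fut is None
            if leader:
                # First caller for this key: register a future others can wait on.
                fut = Future()
                self._inflight[key] = fut
        if not leader:
            # A fetch is already running: queue behind it and share its result.
            return fut.result()
        try:
            result = fetch()
        except BaseException as exc:
            with self._lock:
                del self._inflight[key]
            fut.set_exception(exc)
            raise
        # Forget the key before publishing the result, so nothing is cached:
        # a caller arriving after completion starts a new fetch.
        with self._lock:
            del self._inflight[key]
        fut.set_result(result)
        return result
```

Keying on something like the log name plus offset would mean every client polling the same tail offset shares one held origin request, and the first request after it completes goes back to the origin.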
This is not really a good fit for our normal delivery/caching tools, unfortunately. Streaming delivery of dynamic content is generally not cacheable, and not collapsible, because every client connection may need different data.
We have a product designed for this sort of thing, called Fastly Fanout, which is based on an open source system called Pushpin. That tool allows for client connections to be held at the edge, with no connections to the origin at all, and then the origin can publish messages to be delivered to all active connections. It may be worth some investigation for your application.
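As a rough illustration of the hold-and-publish model, the sketch below builds the two pieces Pushpin's GRIP protocol uses for long-polling, as I understand it: response headers that tell the proxy to hold a connection on a channel, and a JSON item that the origin publishes to release the held connections. The helper names, channel name, and timeout are hypothetical, and the exact header names and publish endpoint should be checked against the Pushpin and Fastly Fanout documentation before relying on them.

```python
import json
from typing import Dict


def hold_response_headers(channel: str, timeout_s: int = 55) -> Dict[str, str]:
    """Headers asking the proxy to hold this request until a publish or timeout."""
    return {
        "Grip-Hold": "response",       # long-poll style hold
        "Grip-Channel": channel,       # channel the held connection subscribes to
        "Grip-Timeout": str(timeout_s),
    }


def publish_payload(channel: str, body: str) -> str:
    """JSON body the origin would POST to the proxy's publish endpoint."""
    item = {
        "channel": channel,
        "formats": {"http-response": {"body": body}},
    }
    return json.dumps({"items": [item]})


# Hypothetical channel naming: one channel per (log, offset) pair.
headers = hold_response_headers("log-a-offset-42")
payload = publish_payload("log-a-offset-42", "new message\n")
```

In this model, every client waiting at the same offset is held at the edge on one channel, and a single publish from the origin answers all of them.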