Inviting inputs on a new Compute@Edge Cache API

Fastly is evaluating a new feature that will expose Fastly’s cache as a simple set of APIs for developers building on C@E. The goal is to allow developers to cache arbitrary content using this API (e.g. the result of an expensive compute operation) via a set of simple CRUD calls (get/set/delete), and to be able to set an expiry time for cached content.

We are starting this thread to talk about how you might use this, what it should look like, and what name you would use to refer to such an API — we’d love your input.


My initial reaction is: how is this different from KVStore, which has the same CRUD operations?

Personally, these are the limitations I’ve hit with KVStore:

  1. Cannot cache large objects
  2. Cannot make range requests
  3. No metadata API to determine the total content length of an item in the cache

And then my more general asks for caching are:

  1. Ability to properly shield on C@E
  2. Ability to request collapse on C@E
  3. Ability to cache large files (streaming miss + segmented caching)
  4. Ability to make range requests

As for how we could use this, I imagine it’s something that plays into the fetch() API. One problem with VCL is that it won’t request-collapse until it knows an object is collapsible. This always felt too magical to me, and I think we could have better control by telling C@E that it’s a cacheable item up front:

const cache = new FastlyCache("default")
const cacheKey = req.url.pathname // simple for example purposes
const cacheTrx = cache.transaction(cacheKey, {
  maxAge: 60,
  staleWhileRevalidate: 300,
  staleIfError: 3600,
  shield: "iad1"
})
const fetcher = async (trx) => fetch(req.url.pathname, { backend: "s3", cache: trx })
const stream = await cache.stream(cacheTrx, fetcher)
return new Response(stream)

That API does a few things:

  1. Creates an initial transaction with a set of options describing how something will be cached
  2. Creates a fetcher function which takes in a trx and can be passed to the fetch() API. The trx can tell the fetch important things, like whether it should use a shield. It also ties the fetch to the trx cache key, which can be useful for partial ranges
  3. Creates a stream based on a trx which can either return something directly from cache, or fetch it and insert it into cache, taking into account concurrent requests for the same cache key (the object should only be inserted once)

I’m not sure of the right way to properly handle range requests. If the fetcher requests a partial range, maybe it could signal to the transaction that it’s a partial range, and the cache API can figure out the correct way to reconstruct the response.
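One way the signalling above could work is sketched below. Everything here is hypothetical — `CacheTransaction`, `makeFetcher`, and `recordSegment` are invented names for illustration, not real Fastly APIs; the only concrete part is parsing the standard single-range form of the `Range` header.

```javascript
// Hypothetical sketch: the fetcher parses the Range header and tells the
// transaction which byte range the cached segment covers, so the cache could
// later reconstruct a full response from stored segments.
function parseRange(header) {
  // Handles the single-range form "bytes=start-end"; returns null otherwise.
  const m = /^bytes=(\d+)-(\d+)$/.exec(header || "");
  return m ? { start: Number(m[1]), end: Number(m[2]) } : null;
}

class CacheTransaction {
  constructor(key) {
    this.key = key;
    this.segments = []; // byte ranges known to be cached for this key
  }
  recordSegment(range) {
    if (range) this.segments.push(range); // cache can now index this segment
  }
}

// A fetcher that signals partial ranges to its transaction.
function makeFetcher(trx) {
  return (rangeHeader) => {
    const range = parseRange(rangeHeader);
    trx.recordSegment(range);
    return range;
  };
}
```

With this shape, the cache API owns segment bookkeeping and the fetcher only reports what it actually requested.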


It would be good to understand the intended use case for this API. For example, can I use it for rapidly changing data, with access patterns more like Redis, updated at very high rates, perhaps 100 times per second or more? An API that could do, e.g., atomic increment operations on things that act as integers would be very handy for counting and summarizing things like events or callbacks.
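To make the atomic-increment idea concrete, here is a minimal sketch. `EdgeCounter` and its methods are invented purely for illustration, with a plain `Map` standing in for the POP-local store; a real edge implementation would need the increment to be atomic across concurrent instances.

```javascript
// Hypothetical counter API for high-rate event counting at the edge.
class EdgeCounter {
  constructor() {
    this.store = new Map(); // stand-in for POP-local storage
  }
  // Add `delta` to `key` and return the new value. In a real edge runtime
  // this read-modify-write would have to be atomic across instances.
  increment(key, delta = 1) {
    const next = (this.store.get(key) || 0) + delta;
    this.store.set(key, next);
    return next;
  }
  get(key) {
    return this.store.get(key) || 0;
  }
}
```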

I suppose this would be more like the KVStore that’s already there, except with differing assumptions about update rates.

I have a few additions to @AndrewBarba’s excellent suggestions:

  1. Cache Responses, not BLOBs. I want to interact with the cache at the HTTP layer. I want it to obey Cache-Control and Surrogate-Control. I don’t want to have to encode a Response with all its headers in order to cache it.
  2. Vary support. I don’t want my C@E service to have to know that /some/particular/path might vary on Accept-Language. The origin knows what things vary on and should be in control.
  3. Purge by anything Fastly supports. I want the cache API to obey Surrogate-Key headers on Responses. I want to be able to purge objects via PURGE request or via the UI.
  4. Purge authentication. I want to be able to use Fastly-Purge-Requires-Auth=1 or another mechanism to ensure that only authenticated users (or tokens with purge permissions) can purge content.
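Point 1 can be illustrated with a small sketch of deriving a TTL from the response’s own headers rather than asking the developer for one. The precedence (Surrogate-Control max-age, then s-maxage, then max-age) mirrors how Fastly treats these headers for regular proxy traffic; the function name is invented for illustration.

```javascript
// Sketch of "cache Responses, not BLOBs": compute a TTL from the
// response's caching headers. Surrogate-Control wins over Cache-Control.
function ttlFromHeaders(headers) {
  const surrogate = headers["surrogate-control"];
  const cacheControl = headers["cache-control"];
  const maxAge = (value, directive) => {
    const m = new RegExp(`${directive}=(\\d+)`).exec(value || "");
    return m ? Number(m[1]) : null;
  };
  // Precedence: Surrogate-Control max-age, then s-maxage, then max-age.
  return (
    maxAge(surrogate, "max-age") ??
    maxAge(cacheControl, "s-maxage") ??
    maxAge(cacheControl, "max-age") ??
    0
  );
}
```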

Thank you @AndrewBarba for your detailed feedback! It is super useful as we plan for Cache APIs on the C@E roadmap. To your question on how this will be different from the KV store: while the KV store is durable storage and data can be accessed globally, data written via the Cache API will be ephemeral and not global, i.e. it will only be accessible from the POP where it was initially written.

The scenarios you call out all make sense - the documentation for the upcoming API release will answer most of those, please stay tuned.

I also agree with @AndrewBarba’s sketch - the way I think about it, this is mostly about request collapsing, because that tends to require tight coupling of fetch and cache and makes it hard to treat the two as separate. You want to be able to do things like:

  1. Fetch without touching cache
  2. Fetch without consulting cache but update/populate the cache with the response
  3. Fetch, using cache if available, but don’t update cache with any origin response (opposite of 2)
  4. Fetch, using cache if available and populate cache with response (standard cache behaviour)
  5. Lookup in cache but don’t go to network if not found (cache-only)
  6. Populate into cache without making a network fetch (cache-only)
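The six modes above can be summarized as three flags: whether the cache is read, whether it is written, and whether the network may be used. The mode names in this sketch are invented for illustration, not a real Fastly API.

```javascript
// Map each of the six fetch/cache modes to a policy triple.
function cachePolicy(mode) {
  switch (mode) {
    case "bypass":        return { read: false, write: false, network: true };  // 1. ignore cache entirely
    case "refresh":       return { read: false, write: true,  network: true };  // 2. skip lookup, populate cache
    case "read-only":     return { read: true,  write: false, network: true };  // 3. use cache, never update it
    case "standard":      return { read: true,  write: true,  network: true };  // 4. normal cache behaviour
    case "cache-only":    return { read: true,  write: false, network: false }; // 5. lookup only, no network
    case "populate-only": return { read: false, write: true,  network: false }; // 6. insert without fetching
    default: throw new Error(`unknown cache mode: ${mode}`);
  }
}
```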

The fetch standard actually already defines a cache property on the fetch options (RequestInit) that describes some of these options:


A string indicating how the request will interact with the HTTP cache. The possible values, default, no-store, reload, no-cache, force-cache, and only-if-cached, are documented in the article for the cache property of the Request object.
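For reference, the semantics of those six standard values (per the Fetch specification) can be summarized as a simple lookup:

```javascript
// Behaviour of each standard RequestCache value, per the Fetch spec.
const REQUEST_CACHE_MODES = {
  "default":        "use a fresh cached response; revalidate stale ones; otherwise fetch and store",
  "no-store":       "never read or write the HTTP cache",
  "reload":         "always fetch from the network, then store the response",
  "no-cache":       "always revalidate with the server before using a cached response",
  "force-cache":    "use any cached response, even stale; fetch only if absent",
  "only-if-cached": "serve from cache only; fail if not cached (no network)",
};
```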

That said, I also feel like we should make the fetch and cache APIs operate as independently as possible, and in most use cases I actually don’t feel like they need to be intertwined. But request collapsing does make that tricky. If 10 C@E instances want the same resource at the same time, they could all do the cache lookups and initiate fetches independently, and we could identify that the requests are potentially collapsible and forward only one of them to origin. But then when that response comes back, if we return it to the first waiting C@E instance, it would be hard to know if that instance then decided to write that response to cache, and unclear whether it would still be appropriate to use it to satisfy the other queued requests.

I think the answer is possibly a fetch extension that allows a cache instance to be injected into the fetch:

fetch(url, {
  cache: new FastlyCache({ /* ... options ... */ })
});

If the only entrypoint to FastlyCache is via fetch, how would one write a synthetic response to cache?

I would expect to be able to do something like

const url = new URL(request.url)

if (url.pathname === '/expensive/calculation') {
  const cache = new FastlyCache(/* ... */);
  const body = await expensiveCalculation(request)
  const response = new Response(body, {
    status: 200,
    headers: {
      'content-type': 'application/json',
      'cache-control': 'public, max-age=3600',
    }
  })

  // put this into cache in the background to make this response faster
  // or do it in the foreground to coalesce requests
  fetchEvent.waitUntil(cache.put(request, response));

  return response
}

return fetch(request) // fall through to cache, then origin

@jamesarosen yes totally - my example was just intended to convey one use case: enabling request collapsing. C@E needs to support this (and ideally without the developer realising the need for it):

  1. Fetch from C@E instance B is paused because there’s already an identical fetch in progress triggered by C@E instance A.
  2. That A fetch completes…
    a) … and is written to cache, so is copied and returned to C@E instance B as the response to its request. The fetch from instance B is never sent to the network.
    b) … and turns out to not be cacheable, so B’s fetch is unpaused, sent to the network, and the response is returned to instance B.
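The pause/unpause flow above might be sketched as follows. `CollapsingFetcher`, the `origin` function signature, and the `cacheable` flag are all invented stand-ins for illustration, not a real Fastly API.

```javascript
// Sketch: instance B's identical fetch awaits A's in-flight promise. If A's
// response is cacheable, B gets a copy (2a); otherwise B's own fetch is
// released to the network (2b).
class CollapsingFetcher {
  constructor(origin) {
    this.origin = origin;      // async (url) => { cacheable, body }
    this.inFlight = new Map(); // url -> promise for the leader's fetch
  }
  async fetch(url) {
    const leader = this.inFlight.get(url);
    if (leader) {
      // B's case: wait for A's result instead of going to the network.
      const result = await leader;
      if (result.cacheable) return result; // 2a: copy of A's response
      return this.origin(url);             // 2b: unpause B's own fetch
    }
    // A's case: become the leader for this URL.
    const p = this.origin(url);
    this.inFlight.set(url, p);
    try {
      return await p;
    } finally {
      this.inFlight.delete(url);
    }
  }
}
```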

To achieve this, we need to be able to touch the cache before and after the fetch. And request collapsing is too important an optimisation to make it opt-in. So ideally we would want a few modes. The simplest does caching transparently, as now, and supports request collapsing without telling you:
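Under the hood, that transparent mode might behave like this minimal sketch, where `cacheFetch`, `cache`, and `origin` are illustrative stand-ins rather than real Fastly APIs:

```javascript
// Transparent mode: the developer just fetches; cache consultation,
// population, and (in a real runtime) request collapsing all happen
// without any opt-in.
async function cacheFetch(url, cache, origin) {
  if (cache.has(url)) return cache.get(url); // transparent cache hit
  const resp = await origin(url);            // single (collapsed) origin fetch
  cache.set(url, resp);                      // transparent population
  return resp;
}
```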


If you wanted to do something completely off-piste you’d have the option to interact entirely separately (but with no request collapsing):

const c = new FastlyCache();
if (!c.matches(url)) {
    const resp = await fetch(url, { cache: false });
    c.put(url, resp);
}

And then we clearly need something that allows for custom configuration but still allows for request collapsing. This could be in the form of passing a configured cache object as the value of the cache property on fetch:

fetch(url, {
  cache: new FastlyCache({ /* ... options ... */ })
});

The main thing we want to achieve in such a scenario is to inspect and amend the object before it’s cached - which is generically the solution that delivers on all of @jamesarosen’s requirements above. I guess I’ve always assumed we would do that via a callback:

fetch(url, {
  cache: new FastlyCache({
    onPut: (key, resp) => {
      return !(resp.headers.get("Vary") || "").includes("foo") ? resp : null;
    }
  })
});