Nothing is being cached

I’m potentially being stupid here, but I’m new to Fastly and VCL, and I’m trying to cache my site, but nothing seems to be caching.

My site runs on MediaWiki, which by default sends the response header “Vary: Accept-Encoding, Cookie, Authorization”. I’ve removed Authorization from Vary in the VCL but kept Cookie and Accept-Encoding, and the site still doesn’t seem to cache: every request is a miss.

If I remove “Cookie” from the Vary header, then it caches properly, but that isn’t ideal: I don’t want to remove it if possible, because then users won’t be able to log in. I also see other websites running on Fastly that keep Cookie in Vary and still cache. Is there anything I can do to force caching without removing “Cookie” from Vary? This is my VCL. I’ve set it to pass to the origin if the “[sS]ession” or “Token” cookies are set, since those signify a logged-in user and nothing should be cached for them.

Any help appreciated! I’ve also put below the headers that appear when a cache miss occurs.

Cache-Control: public, max-age=3600
Content-Language: en
Content-Length: 88506
Content-Type: text/html; charset=UTF-8
Date: Thu, 12 Oct 2023 21:04:23 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Last-Modified: Thu, 12 Oct 2023 20:59:29 GMT
Server: Apache
Vary: Accept-Encoding, Cookie
Via: 1.1 varnish
X-Cache: MISS

This is an example of another website that also uses Fastly, and caches despite the cookie vary:

HTTP/2 200 
content-type: text/html; charset=UTF-8
x-content-type-options: nosniff
content-language: en
content-security-policy: upgrade-insecure-requests
last-modified: Wed, 11 Oct 2023 09:44:43 GMT
x-request-id: 075e1ac910e2364e473fbe45f4604f56
content-encoding: br
x-envoy-upstream-service-time: 106
x-datacenter: SJC
x-cacheable: YES
accept-ranges: bytes
date: Thu, 12 Oct 2023 21:09:19 GMT
age: 41076
x-served-by: cache-wk-sjc11420-SJC, cache-lcy-eglc8600051-LCY
x-cache: HIT, HIT
x-cache-hits: 5, 2
x-timer: S1697144960.646185,VS0,VE0
vary: Accept-Encoding, Cookie
cache-control: private, s-maxage=0, max-age=0, must-revalidate
content-length: 38029
X-Firefox-Spdy: h2

Can anyone point me in the right direction of what I can possibly do here?

Hi @Lrzxft! You’re not being stupid, I can assure you 🙂

Cookies are a tricky business. If your origin server includes a vary: cookie response header, then our ability to cache the response really depends on how much variation there is in the possible cookie values.

You should certainly be caching responses that don’t vary on cookie. As a first step I’d check that your origin server is not applying vary: cookie to all responses. After all, scripts, images and stylesheets should not need that.

Regarding your HTML page responses, where varying on cookie is totally reasonable: if you want to cache the logged-out experience separately from the logged-in experience, consider validating your session cookie at the edge and normalising your Cookie header into something less varied, like a special “Edge-Auth: logged-in” header, which only has two values.
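As a rough, untested sketch of that normalisation in Fastly VCL: the snippet below only checks whether a session-like cookie is present (it does not validate it), and the cookie pattern and the "anonymous" value are assumptions based on the cookie names mentioned in this thread. The origin would also need to vary on Edge-Auth rather than Cookie for this to help.

```vcl
# Snippet body for vcl_recv (for example, a VCL snippet of type "recv").
# Assumption: any cookie whose name ends in "session", "Session" or "Token"
# marks a logged-in user. This checks presence only, not validity.
# The header is always overwritten, so a client cannot supply its own value.
if (req.http.Cookie ~ "([sS]ession|Token)=") {
  set req.http.Edge-Auth = "logged-in";
} else {
  set req.http.Edge-Auth = "anonymous";
}
```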

If the response is different for each user, then that’s more of a caching challenge! Is that what you’re dealing with?

Thank you for your reply @triblondon, it’s tricky business. My origin server isn’t sending the vary: cookie on every page; things like CSS and JS I’ve since managed to cache effectively. It’s just the main pages in this instance, i.e. the actual content that users see.

You’re correct that I want to cache the logged-out experience differently from the logged-in one: for logged-in users I want to pass the request to the origin and cache nothing, since they should ideally be seeing up-to-date content.

I did have the idea of normalising the cookies, as you say, which is doable in Varnish. After some deeper digging, though, I don’t know exactly how this translates into Fastly VCL, since there do appear to be some differences. I reached out to a friend who runs MediaWiki with standalone Varnish and has managed to get this set up, and they pointed me to their Varnish VCL file, which has the following:

sub evaluate_cookie {
	# Replace all session/token values with a non-unique global value for caching purposes.
	if (req.restarts == 0) {
		unset req.http.X-Orig-Cookie;
		if (req.http.Cookie) {
			set req.http.X-Orig-Cookie = req.http.Cookie;
			if (req.http.Cookie ~ "([sS]ession|Token)=") {
				set req.http.Cookie = "Token=1";
			} else {
				unset req.http.Cookie;
			}
		}
	}
}

If I upload this as a snippet, and add it into vcl_recv then it does cache everything, as expected, but then it prevents logging in due to what I presume is an issue setting the cookie.

Later down the line, they do:

# Initiate a backend fetch
sub vcl_backend_fetch {
	# Restore original cookies
	if (bereq.http.X-Orig-Cookie) {
		set bereq.http.Cookie = bereq.http.X-Orig-Cookie;
		unset bereq.http.X-Orig-Cookie;
	}
}

I presume this restores the original cookies, which is what allows logging in, or something like that. I couldn’t find anything similar to vcl_backend_fetch in the Fastly configuration, so I’m not entirely sure how it translates.

I presume that this is something similar to the normalisation you mention, or is there an easier way to do it? I know this is a very long-winded reply, but what I’m trying to get across is that the cookies don’t really change the content of what a user sees (logged in or not); they are just used to log users in and keep them logged in.

The cookie values are generally of the form session={session token} and token={token} where the values of those cookies could be anything, not sure if that makes a difference?

The technique you’re describing here should work, with a small modification. Fastly doesn’t have a vcl_backend_fetch subroutine but we do have vcl_miss and vcl_pass, so just put the same code in both of those.
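A minimal sketch of what that could look like, assuming your vcl_recv code has already saved the client’s cookies in X-Orig-Cookie as in the evaluate_cookie subroutine you posted (untested against your service):

```vcl
# Snippet body to add to BOTH vcl_miss and vcl_pass.
# Assumes vcl_recv stored the client's original cookies in
# req.http.X-Orig-Cookie before normalising req.http.Cookie.
if (req.http.X-Orig-Cookie) {
  # Send the real cookies to the origin so login keeps working.
  set bereq.http.Cookie = req.http.X-Orig-Cookie;
  # Do not forward the helper header to the origin.
  unset bereq.http.X-Orig-Cookie;
}
```

Only the backend request (bereq) is changed here; req.http.Cookie keeps its normalised value.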

However, the effect of this is that if your origin server adds vary: cookie to a response, we will think that response is usable for any request with a matching cookie header post-normalisation, which means if we cache any logged-in variants, they will be served to other users from cache on subsequent requests.

If the logged-in experience is the same for all logged-in users, for example if the page doesn’t include the user’s name, then that may be what you want. If not, then ensure you set a Cache-Control: no-store, private on the logged-in responses.

If that’s the case, incidentally, and you only want to cache the logged-out variant, then in theory that should work without any normalisation: your origin server would serve all responses with vary: cookie and only allow caching on responses where the request did not include a cookie, so we would only cache one variant. However, this is oversimplistic in practice, for two reasons. First, even logged-out users typically have cookies, e.g. for Google Analytics or whatever. Second, we have a limit of a few hundred variants for each cache object, and even if responses turn out to be uncacheable, we create the cache marker before the backend request is made, which may evict the logged-out variant that you want to cache.
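If changing the origin’s Cache-Control on logged-in responses is awkward, one edge-side alternative is to refuse to store them in vcl_fetch. This is only a sketch: it assumes the same cookie pattern used elsewhere in this thread identifies logged-in requests, and it is not a substitute for checking what your origin actually returns.

```vcl
# Snippet body for vcl_fetch.
# Assumption: a session/token cookie on the request means "logged in".
# This also matches a normalised value such as "Token=1".
if (req.http.Cookie ~ "([sS]ession|Token)=") {
  # Deliver the response to this client but do not cache it.
  return(pass);
}
```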

In short, normalising cache key inputs is usually a good idea, and you should just take care to ensure you don’t serve the wrong content to the wrong users as a result.

I thought I had possibly solved the issue using the method you stated: adding the code to normalise the cookies into vcl_recv and then in vcl_miss and vcl_pass putting the original cookies back, but that seemed to stop everything being cached, oddly enough.

Potentially I am missing something blatantly obvious here?