Googlebot always seems to bypass the cache


#1

Hi guys,

I’ve got caching working with something like this:

```
res.set({
    // 'Cache-Control': 's-maxage=86400'
    'Cache-Control': 'no-cache, no-store, must-revalidate',
    'Surrogate-Control': 'max-age=86400',
    'Pragma': 'no-cache',
    'Expires': 0,
});
```

It works, but when I use Fetch as Google in Webmaster Tools, every request hits my origin.

I figured it was hitting a different POP each time, which it was, but even after enabling shielding the requests still get through to my origin.

These are the response headers Googlebot receives:

```
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Wed, 20 Apr 2016 13:17:17 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d4277cb882bba4e2c64c73047786f4b4f1461158237; expires=Thu, 20-Apr-17 13:17:17 GMT; path=/; domain=.xxxx.com; HttpOnly
Status: 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Access-Control-Allow-Origin: *
Pragma: no-cache
Access-Control-Allow-Headers: X-Requested-With
Expires: 0
Set-Cookie: ff=; Path=/
X-Powered-By: Phusion Passenger 5.0.25
Via: 1.1 vegur
Via: 1.1 varnish
Fastly-Debug-Digest: c839f034d8e0abbc67f39899758e6584ba34b305f3abd5ef06fcf4aab0a6dbb7
Via: 1.1 varnish
Age: 0
X-Served-By: cache-jfk1022-JFK, cache-ord1731-ORD
X-Cache: MISS, MISS
X-Cache-Hits: 0, 0
X-Timer: S1461158237.183803,VS0,VE134
CF-RAY: 2968e025e8822629-DFW
```

#2

And here are the response headers for a browser request that is served from the cache:

```
access-control-allow-headers:X-Requested-With
access-control-allow-origin:*
age:1395
cache-control:no-cache, no-store, must-revalidate
cf-ray:2968ed9eca981d68-MEL
content-encoding:gzip
content-type:text/html; charset=utf-8
date:Wed, 20 Apr 2016 13:26:28 GMT
expires:0
fastly-debug-digest:a7b7ff5fbbfb6b4d6a4135e2b64da745fe75f1277a6a5d57bde4f418ce574c3b
pragma:no-cache
server:cloudflare-nginx
status:200
status:200 OK
via:1.1 varnish
via:1.1 vegur
via:1.1 varnish
x-cache:MISS, HIT
x-cache-hits:0, 2
x-powered-by:Phusion Passenger 5.0.25
x-served-by:cache-jfk1036-JFK, cache-mel6521-MEL
x-timer:S1461158788.924244,VS0,VE0
```


#3

Hi Thomas,

The Set-Cookie header prevents us from caching the response. This keeps user-specific content from being cached and served to other users.

It looks like the cookie only gets set when the request arrives without a corresponding Cookie header, as the Googlebot request does. The request from your browser already carries the cookie, so its response has no Set-Cookie header and can be cached.
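To make the difference between the two responses concrete, here is a small sketch that just checks a header set for Set-Cookie. The helper name `findSetCookie` is made up for illustration, the header objects are abridged from the two dumps earlier in the thread, and only the `ff=; Path=/` cookie is included on the assumption that it is the one coming from the origin. This is a diagnostic aid, not Fastly's actual caching logic.

```javascript
// Hypothetical helper: returns the Set-Cookie values found in a response.
// `headers` is a plain object keyed by lower-cased header names, the shape
// Node uses for `http.IncomingMessage#headers`.
function findSetCookie(headers) {
  const value = headers['set-cookie'];
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Abridged from the Googlebot response quoted above (assumption: the
// `ff` cookie is the one set by the origin app).
const googlebotResponse = {
  'cache-control': 'no-cache, no-store, must-revalidate',
  'set-cookie': ['ff=; Path=/'],
  'x-cache': 'MISS, MISS',
};

// Abridged from the browser response quoted above: no Set-Cookie at all.
const browserResponse = {
  'cache-control': 'no-cache, no-store, must-revalidate',
  'x-cache': 'MISS, HIT',
};

console.log(findSetCookie(googlebotResponse)); // [ 'ff=; Path=/' ]
console.log(findSetCookie(browserResponse));   // []
```

A non-empty result lines up with the `X-Cache: MISS, MISS` response, and the empty one with `MISS, HIT`.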

If you need to, you can remove the set-cookie header from the response:

https://docs.fastly.com/guides/basic-configuration/removing-headers-from-backend-response
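If you would rather handle this in the app than in the Fastly configuration, something along these lines might work. It is only a sketch: `markCacheable` and `makeStubResponse` are hypothetical names, and the stub stands in for a real response object so the snippet runs on its own. It relies on `setHeader`, `getHeader`, and `removeHeader`, which Node's `http.ServerResponse` (and therefore an Express `res`) provides.

```javascript
// Hypothetical helper: call it just before sending a response that is
// meant to be stored by the shared cache.
function markCacheable(res, ttlSeconds) {
  // Drop any cookie queued on this response so the cache can store it.
  res.removeHeader('Set-Cookie');
  // Same Surrogate-Control header as in the original res.set() call.
  res.setHeader('Surrogate-Control', `max-age=${ttlSeconds}`);
  return res;
}

// Minimal stand-in for a response object, for demonstration only.
function makeStubResponse() {
  const headers = new Map();
  return {
    setHeader(name, value) { headers.set(name.toLowerCase(), value); },
    getHeader(name) { return headers.get(name.toLowerCase()); },
    removeHeader(name) { headers.delete(name.toLowerCase()); },
  };
}

const res = makeStubResponse();
res.setHeader('Set-Cookie', 'ff=; Path=/');
markCacheable(res, 86400);

console.log(res.getHeader('Set-Cookie'));        // undefined
console.log(res.getHeader('Surrogate-Control')); // max-age=86400
```

One caveat: if some middleware only adds the cookie at the moment the headers are written, this call would run too early to remove it, and stripping the header at the cache as described in the guide above is the safer route.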