Way to map GCP surrogate headers to Fastly?

GCP Cloud Storage uses custom headers for surrogate control (similar to AWS)

x-goog-meta-surrogate-key
x-goog-meta-surrogate-control

In our vcl_fetch we’re trying to map these headers through for Fastly like so:

if (req.http.host ~ "googleapis") {
  set beresp.http.Surrogate-Key = beresp.http.x-goog-meta-surrogate-key;
  set beresp.http.Surrogate-Control = beresp.http.x-goog-meta-surrogate-control;
}

…but it doesn’t seem to be having any effect.

I can’t think of the right place in the VCL for this logic - any tips?

This seems like it should work to me! Things I’d check/debug:

  • Does Google’s surrogate control / surrogate key format match ours? Our formats are documented on the Fastly Developer Hub pages for Surrogate-Control and Surrogate-Key.
  • Does req.http.host actually contain “googleapis”? It may not if you are setting the host on bereq instead of on req, or if you are using a backend host header override in the backend settings. I suggest adding a log statement inside the if block to confirm that the condition is actually true.
  • Is Google sending those headers in the response? If you are not stripping them in VCL, you should see them on the eventual client response.
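To make the second check concrete, here is a minimal sketch of what that log statement could look like. It assumes custom VCL on a service with a log endpoint already configured; the endpoint name `debug_log` is a placeholder I made up, so swap in your own. In Fiddle a plain `log "...";` is enough.

```vcl
sub vcl_fetch {
#FASTLY fetch

  if (req.http.host ~ "googleapis") {
    # "debug_log" is a hypothetical endpoint name; use one that exists on your service.
    # If this line never shows up in your logs, the condition is not matching.
    log "syslog " req.service_id " debug_log :: host matched, x-goog skey="
        beresp.http.x-goog-meta-surrogate-key;

    set beresp.http.Surrogate-Key = beresp.http.x-goog-meta-surrogate-key;
    set beresp.http.Surrogate-Control = beresp.http.x-goog-meta-surrogate-control;
  }
}
```

If the log line appears but the key still seems to be missing, that at least narrows the problem down to something after the if block.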

Thanks for the tips!

I noticed another thread where someone claimed setting these headers in vcl_fetch had no effect, which seems to match what we’re seeing.

The x-goog-meta-* headers are present in the response, and the formats seem to match:

< HTTP/2 200
< cache-control: no-cache, no-store, must-revalidate
< expires: Thu, 31 Oct 2024 19:47:27 GMT
< last-modified: Tue, 31 Oct 2023 20:08:52 GMT
< x-goog-meta-surrogate-control: max-age=31536000
< x-goog-meta-surrogate-key: juno
< content-type: application/javascript
< accept-ranges: bytes
< date: Wed, 01 Nov 2023 19:47:27 GMT
< via: 1.1 varnish
< x-served-by: cache-yyc1430029-YYC
< x-cache-hits: 0
< vary: Accept-Encoding
< content-length: 16090

We are setting the host explicitly in the VCL as well as in the backend definition.

in vcl_recv:

set req.backend = F_google_cloud_storage;
set req.http.host = "storage.googleapis.com";

our backend def:

backend F_google_cloud_storage {
  .connect_timeout = 1s;
  .dynamic = true;
  .port = "443";
  .host = "storage.googleapis.com";
  .first_byte_timeout = 15s;
  .max_connections = 200;
  .between_bytes_timeout = 10s;
  .share_key = "REDACTED";

  .probe = {
    .request = "HEAD /unii-frontend/ping HTTP/1.1" "Host: storage.googleapis.com" "Connection: close";
    .window = 5;
    .threshold = 1;
    .timeout = 2s;
    .initial = 5;
    .dummy = true;
  }
}

I’ll see what I can recreate locally with a log statement

Hi @kieran

I’ve tried to reproduce this using Fastly Fiddle and it looks like it should work.

Let me take you through the reduced test case I created here:
https://fiddle.fastly.dev/fiddle/9d9c3107

Fiddle uses https://http-me.glitch.me as its default backend, and I’ve modified the incoming request to include parameters that mimic the behaviour we want (i.e. the backend returns a response that includes an x-goog-meta-surrogate-key header).

I’ve also modified the path to have the backend return a standard surrogate-key header just so I could see how Fastly behaves when handling the traditional use case.

In the vcl_fetch I add the following code:

log beresp.http.x-goog-meta-surrogate-key;
log beresp.http.surrogate-key;
set beresp.http.Surrogate-Key = beresp.http.x-goog-meta-surrogate-key;

So I’m logging the values of both headers and I’m setting Surrogate-Key on the beresp so that Fastly knows it has a surrogate key for the object it’s about to cache.

In vcl_deliver you’ll see I also log the Surrogate-Key on the resp object to see what its value is (because, as documented, resp is what’s sent back to the client and Fastly should be stripping Surrogate-Key from the client response):

log resp.http.surrogate-key;

OK, so the first thing we notice is that the backend response does include the headers we want it to send :white_check_mark:


The next thing we look at is the Fiddle data for the vcl_fetch subroutine…

From this we see two useful things…

  1. We see the log output is what we expect :white_check_mark:
  2. We also see that Fastly is aware of the surrogate key (see the skeys) :white_check_mark:.

We can also see that skeys was originally set to example2 (which came from the surrogate-key header returned in the backend response) and is what we would expect :white_check_mark:.

The reason it’s no longer set to that but to example1 is of course because our code logic has overridden Fastly’s default behaviour by setting Surrogate-Key explicitly to example1 (which comes from the Google header x-goog-meta-surrogate-key returned in the backend response) and again, is what we would expect from this code logic :white_check_mark:.

NOTE: For anyone unaware the Surrogate-Key header can contain multiple keys but they need to be space separated. So if we wanted to keep both key values then we’d append to the header instead of overwriting it completely. Something like…

set beresp.http.Surrogate-Key = beresp.http.Surrogate-Key + " " + beresp.http.x-goog-meta-surrogate-key;
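One caveat with the append above: if the backend sends only one of the two headers, you can end up with a stray space or a literal empty value in the result. A slightly more defensive sketch (my own variation, not what the Fiddle runs) checks which headers are present first:

```vcl
sub vcl_fetch {
#FASTLY fetch

  # Only touch Surrogate-Key when Google actually sent a key.
  if (beresp.http.x-goog-meta-surrogate-key) {
    if (beresp.http.Surrogate-Key) {
      # Both present: keep the existing keys and append the Google one, space separated.
      set beresp.http.Surrogate-Key = beresp.http.Surrogate-Key " " beresp.http.x-goog-meta-surrogate-key;
    } else {
      # Only the Google header is present: copy it across as-is.
      set beresp.http.Surrogate-Key = beresp.http.x-goog-meta-surrogate-key;
    }
  }
}
```

With the Fiddle's test response this would leave the object tagged with both example2 and example1.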


The next thing we look at is the Fiddle data for the vcl_deliver subroutine and we see that Fastly has indeed stripped the Surrogate-Key header from the resp object so that header won’t be sent to the client. This is as per the Fastly documentation, and so the (null) log output is what we would expect to see here :white_check_mark:

Lastly, we can confirm the removal of Surrogate-Key from the response when we look at the actual response sent to the client…

Notice that the Surrogate-Key header is removed as expected, but the Google equivalent, x-goog-meta-surrogate-key, is still present. We should probably remove that too, either from beresp or (more likely) from resp, so that the key names aren’t exposed to clients.
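For the clean-up, a minimal sketch of stripping the Google headers from the client response in vcl_deliver might look like the following. The Fastly-FF check is an assumption on my part: it only matters if shielding is enabled, where an edge POP may still need to see the headers coming from the shield.

```vcl
sub vcl_deliver {
#FASTLY deliver

  # Assumption: skip the clean-up when the request came from another Fastly POP
  # (shielding), so the edge still receives the x-goog-meta-* headers.
  if (!req.http.Fastly-FF) {
    unset resp.http.x-goog-meta-surrogate-key;
    unset resp.http.x-goog-meta-surrogate-control;
  }
}
```

Doing this on resp rather than beresp keeps the original headers on the cached object, which can be handy while you are still debugging.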

But otherwise it looks like this should work fine :thinking:

So my suggestion would be to remove any logic branches that might not be evaluating the way you expect, or to log the values they test (or log inside the conditional blocks to be sure they are being reached).

Hopefully this helps.

Let us know how you get on.