Monitoring health status from fastly health checks


#1

Hi fastly team,
I am building our service failure detection.

Is there a way to get information on the status of Fastly's health checks?

We expected a log upon an unhealthy event, but my first testing has not shown one.

Thank you,
Malte


#2

Hi Malte,

Unfortunately, this is not information that we expose currently. But I’ll point this post out to our product team, so that they can keep your expectations in mind while working out how that information should be exposed.

Cheers,

Doc


#3

Hi Doc,
thanks, any method would be fine for me.
For now we run a local health check routine and some extra load monitoring.

Cheers


#4

Hi, has there been any development on this feature @drwilco? I’m working on the exact same thing.

Thanks,

Chris


#5

@drwilco Any word on Fastly monitoring for health checks?

Being unable to view health check status makes debugging, especially real-time debugging during a 3rd party outage, a real pain.


#6

Hi @rmharrison @chris.usick @malte and all,

Regarding this topic, yes, we’re aware of your needs, as this is one of our most frequently asked questions.
We have no ETA for this, but there is an internal feature request ticket and I believe the product team is working on it.

In the meantime, I would like to share a tip on how you can achieve backend monitoring using VCL.
To start, you may already know the VCL variable req.backend.healthy, but let me bring it up again here.

req.backend.healthy - Read Only
Whether or not the backend has been marked unhealthy by health checks.

We can definitely use it for this specific purpose. Some of you may think about putting it into your custom logging format and monitoring it through your logging pipeline. The problem with that approach is that you only receive logs after a client has sent a request to us, so it’s not really “proactive” monitoring. If your service continuously gets a large amount of traffic that may be fine, but if not, you won’t have enough data (logs) to determine backend health.

So, instead of relying on the logs, we can make a tiny endpoint in your VCL which returns the value of the req.backend.healthy variable.

In your VCL, you can see backend top-level objects are defined like this:

backend ${backend_name_1} { ... }
backend ${backend_name_2} { ... }

${backend_name_n} is important here, because the endpoint below selects a backend by this name.
Next, create the VCL Snippet below in the vcl_recv subroutine:

# endpoint:
# /fastly/status?backend=${backend_name_n}
if (req.url ~ "^/fastly/status") {
  # https://community.fastly.com/t/accessing-query-string-parameters/1089/3
  set req.http.x-api-parameter = subfield(req.url.qs, "backend", "&");

  if (req.http.x-api-parameter == "${backend_name_1}") {
    set req.backend = ${backend_name_1};
    error 710;
  } else if (req.http.x-api-parameter == "${backend_name_2}") {
    set req.backend = ${backend_name_2};
    error 710;
  }

  # return 403 for unknown parameter
  error 403 "Forbidden";
}

Then create the VCL Snippet below in the vcl_error subroutine:

# 0=unhealthy, 1=healthy 
if (obj.status == 710) {
  synthetic "{" LF
      {"  "timestamp": ""} now {"","} LF
      {"  "req.backend": ""} req.backend {"","} LF
      {"  "req.backend.healthy": "} req.backend.healthy LF
      "}"
  ;
  set obj.status = 200;
  set obj.response = "OK";
  set obj.http.Content-Type = "application/json";
  set obj.http.X-Backend = req.backend;
  set obj.http.X-Backend-Healthy = req.backend.healthy;
  return (deliver);
}

You can try a live demo here: https://fiddle.fastlydemo.net/fiddle/b2ace262

What this means is that you have created an endpoint that returns a JSON body containing the health status of the backend specified by the URL query parameter.
Now you can point any third-party monitoring tool at this endpoint to monitor your backend health status as seen from a Fastly POP.
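
As a rough illustration of the monitoring side, here is a minimal Python sketch that parses the JSON body produced by the vcl_error snippet above. The function name parse_status_body and the sample values (backend name, timestamp) are hypothetical, and it assumes req.backend.healthy is rendered as 0 or 1, as the comment in the snippet says.

```python
import json


def parse_status_body(body: str) -> dict:
    """Parse the JSON body returned by the /fastly/status endpoint.

    Assumption: the body has the three keys emitted by the vcl_error
    snippet, and "req.backend.healthy" is 0 (unhealthy) or 1 (healthy).
    """
    data = json.loads(body)
    raw = data["req.backend.healthy"]
    return {
        "timestamp": data["timestamp"],
        "backend": data["req.backend"],
        # Accept 1, "1" and JSON true as healthy; anything else is unhealthy.
        "healthy": raw in (1, "1"),
    }


# Hypothetical sample body, shaped like the synthetic response above.
sample = """{
  "timestamp": "Mon, 01 Jan 2024 00:00:00 GMT",
  "req.backend": "origin_0",
  "req.backend.healthy": 1
}"""

if __name__ == "__main__":
    print(parse_status_body(sample))
```

A monitoring tool would fetch the endpoint on a schedule and alert when the healthy field is false.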

“BUT” this solution is still not that useful on its own. Because we have multiple POPs around the world, you can only check one single path (POP A -> backend), the one from the POP closest to your monitoring server, and not every other path (POP B -> backend, POP C -> backend, etc.) that you probably want to cover.

OK, there’s still a workaround to achieve that.
Most of you probably haven’t used it before, but we have a nice “edge check” API, /content/edge_check:
https://docs.fastly.com/api/tools#content

/content/edge_check - GET
Retrieve headers and MD5 hash of the content for a particular URL from each Fastly edge server

This is the exact same functionality as the “Check cache” button in your Fastly service configuration UI.
It sends a request for the specified URL to all Fastly POPs and returns the headers and MD5 hash of the content.

“headers”…
Yes, this is why I added the set obj.http.X-Backend-Healthy = req.backend.healthy; line to the VCL Snippet above. The status endpoint exposes the backend health status in a response header as well as in the body. If you call the status endpoint through the edge check API, you get the backend health status as seen from all our POPs at once. You should be able to parse the JSON response and retrieve each status.

$ curl 'https://api.fastly.com/content/edge_check?url=${hostname}/fastly/status?backend=${backend_name}' -H 'Fastly-Key: ${your_fastly_token}'
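
To give an idea of the parsing step, here is a minimal Python sketch that builds the edge check URL and extracts the X-Backend-Healthy header per edge server. It assumes the edge check response is a JSON array whose entries carry a "server" name and a "response" object with "headers"; please verify that shape against the API documentation. The function names, server names and sample data are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def edge_check_url(hostname: str, backend: str) -> str:
    """Build the edge check URL, URL-encoding the nested status URL."""
    target = f"{hostname}/fastly/status?backend={backend}"
    query = urllib.parse.urlencode({"url": target})
    return "https://api.fastly.com/content/edge_check?" + query


def fetch_edge_check(hostname: str, backend: str, token: str) -> list:
    """Call the edge check API (network access and a valid token needed)."""
    request = urllib.request.Request(
        edge_check_url(hostname, backend),
        headers={"Fastly-Key": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def health_by_server(entries: list, header: str = "x-backend-healthy") -> dict:
    """Map each edge server to True/False, or None if the header is absent.

    Assumption: each entry looks like
    {"server": "...", "response": {"headers": {...}}}.
    """
    result = {}
    for entry in entries:
        headers = (entry.get("response") or {}).get("headers") or {}
        # Header names are matched case-insensitively.
        lowered = {name.lower(): value for name, value in headers.items()}
        value = lowered.get(header)
        server = entry.get("server", "unknown")
        result[server] = None if value is None else str(value) == "1"
    return result


# Hypothetical sample data in the assumed response shape.
sample_entries = [
    {"server": "cache-aaa0001", "response": {"status": 200, "headers": {"X-Backend-Healthy": "1"}}},
    {"server": "cache-bbb0002", "response": {"status": 200, "headers": {"x-backend-healthy": "0"}}},
    {"server": "cache-ccc0003", "response": {"status": 503, "headers": {}}},
]

if __name__ == "__main__":
    print(health_by_server(sample_entries))
```

Servers that map to None did not return the header at all, which is worth treating as its own alert condition rather than as "unhealthy".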

Hope it helps.

Cheers,
Shohei