Guidance needed for configuring Next.js 15 streaming / suspense with Fastly CDN

We are adopting Next.js 15 to leverage its server-side streaming capabilities with React Suspense. While streaming works perfectly in our local development environment, we are encountering issues when deploying our application behind Fastly CDN.

The Problem

When a page with Suspense boundaries is requested through Fastly, the response appears to be fully buffered. Instead of the initial shell being sent immediately followed by streamed chunks, the browser only receives the complete HTML document after the slowest data fetch has resolved on the server. This behavior negates the primary user experience benefit of streaming.

Expected Behavior

We expect Fastly to act as a pass-through for the streamed response from the Next.js server. The initial HTML shell should be delivered to the client instantly, and subsequent chunks generated by resolved Suspense boundaries should be sent as they become available, without being buffered by the CDN.

Hey @adardesign, just a few follow-up questions. Apologies if these seem basic:

  • Are you getting any HTTP errors when testing? I’m wondering if either your origin or Fastly is throwing 5xxs here that might be useful clues for further diagnostics.
  • Are you seeing any variance between large and small response bodies? I’m wondering if this is a case of large response bodies that are missing proper range / transfer-encoding headers.

If you’re comfortable, a sample output of a curl command (with identifying information removed) that triggers the behavior might be worth sharing.
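If curl output is awkward to share, chunk arrival times can also be captured with a short script. The following is only a rough sketch, not a definitive diagnostic: the helper names (`read_chunks`, `record_arrivals`, `looks_buffered`) and the 0.5 second threshold are made up for illustration, and a fast page that finishes rendering quickly will also look "buffered" under this heuristic.

```python
import time
import urllib.request
from typing import Callable, Iterable, Iterator, List


def read_chunks(url: str, size: int = 4096) -> Iterator[bytes]:
    """Yield body data as soon as the socket delivers it."""
    with urllib.request.urlopen(url) as response:
        while True:
            # read1() performs at most one underlying read, so it returns
            # whatever has arrived instead of waiting for the full body.
            data = response.read1(size)
            if not data:
                break
            yield data


def record_arrivals(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Return the seconds-since-start at which each non-empty chunk arrived."""
    start = clock()
    arrivals = []
    for chunk in chunks:
        if chunk:
            arrivals.append(clock() - start)
    return arrivals


def looks_buffered(arrivals: List[float], min_spread: float = 0.5) -> bool:
    """Heuristic: True if all chunks landed within min_spread seconds of the first."""
    if len(arrivals) < 2:
        return True
    return (arrivals[-1] - arrivals[0]) < min_spread
```

Usage would be something like `looks_buffered(record_arrivals(read_chunks(url)))`, run once against the origin directly and once through the CDN, with `url` pointing at a page that has a slow Suspense boundary.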

See here:

This one behaves as expected (it is hosted by Vercel)

vs this one (served via Fastly CDN)
https://www.adorama.com/poc/streaming/product/canon-eos-r5

My question is: what do I need to configure so that Suspense works and the response is streamed through Fastly?

I’m not a React expert, and I’m still looking for other cases where this has happened. For what it’s worth, I can reproduce the behavior in WebPageTest.

In my digging, I found this blog post from a Fastly customer outlining their use of React Suspense. It might be helpful in case there’s a configuration issue within the code. https://www.contentful.com/blog/what-is-react-suspense/

After some additional digging on forums and a review of the example responses, I noticed that the origin isn’t including a Transfer-Encoding: chunked header in the response.

That response header instructs Fastly to pull and deliver the response body in chunks rather than waiting for the file to be fully downloaded before delivering it. This might be the missing piece of the equation.
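For context on why that header matters: with chunked transfer coding (RFC 7230, section 4.1), each piece of the body is framed with its own length, so a receiver can forward a piece without knowing the total size. A minimal sketch of the wire format follows; the function name `encode_chunked` is made up for illustration and this is not how Next.js or Fastly implement it internally.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each body part as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    for part in parts:
        if not part:
            # A zero-length chunk marks the end of the body, so skip empty parts.
            continue
        yield b"%x\r\n" % len(part) + part + b"\r\n"
    # Terminating chunk: length 0 followed by an empty trailer section.
    yield b"0\r\n\r\n"
```

Each yielded value is complete on its own, which is what lets the shell HTML go out before the slower Suspense content has been produced.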

Thanks for the hint. The response is indeed streamed if Transfer-Encoding: chunked is present in the origin’s headers.

However, in our setup the origin sits behind a GCS API gateway, and the gateway removes the connection headers (including Transfer-Encoding). The details of the implementation can be found here: External Application Load Balancer overview | Load Balancing | Google Cloud

This prevents streaming from working end to end when both the GCS gateway and the Fastly CDN are in the request path.

Note:
As far as I can see, Fastly’s servers behave the same way: if the Transfer-Encoding header is present in the response, the header is removed but the response is still streamed.

Both servers are acting in accordance with RFC 7230 - Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing:

A proxy or gateway MUST parse a received Connection header field before a message is forwarded and, for each connection-option in this field, remove any header field(s) from the message with the same name as the connection-option, and then remove the Connection header field itself (or replace it with the intermediary’s own connection options for the forwarded message).

but it seems that the following part is ignored:

or replace it with the intermediary’s own connection options for the forwarded message
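To make the quoted rule concrete, here is a minimal sketch of what an intermediary does with the Connection header, including the optional replacement step. The function name `strip_connection_headers` and the `own_options` parameter are made up for illustration, and this says nothing about how the Google load balancer or Fastly actually implement it.

```python
from typing import Dict, Optional


def strip_connection_headers(
    headers: Dict[str, str],
    own_options: Optional[str] = None,
) -> Dict[str, str]:
    """Drop every header named in Connection, then Connection itself.

    If own_options is given, a new Connection header carrying the
    intermediary's own options is added to the forwarded message.
    """
    # Header names are case-insensitive, so look up Connection by lowercase name.
    by_lower = {name.lower(): name for name in headers}
    connection_value = headers.get(by_lower.get("connection", ""), "")
    named = {
        option.strip().lower()
        for option in connection_value.split(",")
        if option.strip()
    }
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in named and name.lower() != "connection"
    }
    if own_options:
        forwarded["Connection"] = own_options
    return forwarded
```

For example, a response carrying `Connection: keep-alive, Transfer-Encoding` would be forwarded without `Keep-Alive`, `Transfer-Encoding`, or the original `Connection` header, and with a new `Connection` header only if `own_options` is set.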

Is it still possible to stream the response to the client even if the Connection/Transfer-Encoding headers are missing from the origin response (the GCS load balancer/gateway in our case)?

Thanks in advance,
AndriyS

Hey @AndriyS, if I understand your use case correctly, our Streaming Miss functionality might be helpful here: Streaming Miss | Fastly Documentation

Just a note that Streaming Miss is incompatible with gzip/brotli compression at the edge (we can’t compress and send an object we haven’t fully received from the origin). But if you’re already compressing content on your origin, you should already be covered.