We are adopting Next.js 15 to leverage its server-side streaming capabilities with React Suspense. While streaming works perfectly in our local development environment, we are encountering issues when deploying our application behind Fastly CDN.
The Problem
When a page with Suspense boundaries is requested through Fastly, the response appears to be fully buffered. Instead of the initial shell being sent immediately followed by streamed chunks, the browser only receives the complete HTML document after the slowest data fetch has resolved on the server. This behavior negates the primary user experience benefit of streaming.
Expected Behavior
We expect Fastly to act as a pass-through for the streamed response from the Next.js server. The initial HTML shell should be delivered to the client instantly, and subsequent chunks generated by resolved Suspense boundaries should be sent as they become available, without being buffered by the CDN.
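One way to check whether a response is streamed or buffered is to record when each body chunk arrives at the client. The sketch below is only an illustration: `recordChunkTimings`, `looksBuffered`, and the 200 ms threshold are hypothetical names and values, not part of Next.js or Fastly, and the check is a rough heuristic rather than a definitive test.

```typescript
// Arrival time and size of one body chunk, measured from the start of the request.
interface ChunkTiming {
  elapsedMs: number;
  bytes: number;
}

// Reads the response body chunk by chunk and records when each chunk arrives.
// Uses the global fetch and performance APIs (browsers, or Node.js 18+).
async function recordChunkTimings(url: string): Promise<ChunkTiming[]> {
  const start = performance.now();
  const res = await fetch(url);
  const timings: ChunkTiming[] = [];
  if (!res.body) return timings;
  const reader = res.body.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    timings.push({ elapsedMs: performance.now() - start, bytes: value.byteLength });
  }
  return timings;
}

// Heuristic (an assumption, not a standard): if the first and last chunks arrive
// within minSpreadMs of each other, the body was probably delivered in one burst,
// which is what a fully buffered response looks like.
function looksBuffered(timings: ChunkTiming[], minSpreadMs = 200): boolean {
  const first = timings[0];
  const last = timings[timings.length - 1];
  if (!first || !last || timings.length < 2) return true;
  return last.elapsedMs - first.elapsedMs < minSpreadMs;
}
```

Running `recordChunkTimings` against the origin directly and then against the Fastly hostname, and comparing the two results, would show whether the delay is introduced at the CDN or earlier in the chain.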
Hey @adardesign, just a few follow-up questions. Apologies if these seem basic:
Are you getting any HTTP errors when testing? I’m wondering if either your origin or Fastly is throwing 5xxs here that might be useful clues for further diagnostics.
Are you seeing any variance between large and small response bodies? I’m wondering if this is a case of large response bodies that are missing proper range / transfer-encoding headers.
If you’re comfortable, a sample output of a curl command (with identifying information removed) that triggers the behavior might be worth sharing.
I’m not a React expert, though I’m still investigating other cases where this has happened. I can reproduce the behavior within WebPageTest, for what it’s worth.
In my digging, I found this blog post from a Fastly customer outlining their use of React Suspense. It might be helpful in case there’s a configuration issue within the code. https://www.contentful.com/blog/what-is-react-suspense/
After doing some additional digging on forums and reviewing the example responses, I found that the origin isn’t including a Transfer-Encoding: chunked header in the response.
That response header instructs Fastly to pull and deliver the response body in chunks rather than waiting for the file to be fully downloaded before delivering it. This might be the missing piece of the equation.
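To make the header check concrete, here is a minimal sketch that classifies how an HTTP/1.1 response body is framed, based on the headers seen in something like curl output. The function name `bodyFraming` is hypothetical, and the sketch ignores special cases such as HEAD requests and 1xx, 204, and 304 responses. It also does not apply to HTTP/2, which has no chunked transfer coding.

```typescript
type Framing = "chunked" | "content-length" | "close-delimited";

// Classifies HTTP/1.1 response body framing from response headers.
// Header names are compared case-insensitively.
function bodyFraming(headers: Record<string, string>): Framing {
  const lower: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }

  // Transfer-Encoding takes precedence over Content-Length when both are present.
  const transferEncoding = lower["transfer-encoding"];
  if (transferEncoding !== undefined) {
    const codings = transferEncoding.split(",").map((c) => c.trim().toLowerCase());
    // "chunked" only frames the body when it is the final coding listed.
    if (codings[codings.length - 1] === "chunked") return "chunked";
    return "close-delimited";
  }

  if ("content-length" in lower) return "content-length";

  // Neither header: the body ends when the server closes the connection.
  return "close-delimited";
}
```

For example, `bodyFraming({ "Content-Type": "text/html", "Content-Length": "5120" })` returns `"content-length"`, which would suggest that the sender knew the full size of the body before sending it.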
This prevents streaming from working end to end when both the GCS gateway and the Fastly CDN participate in the request.
note:
As far as I can see, Fastly’s servers behave the same way: if a Transfer-Encoding header is present in the response, the header is removed but the response is still streamed.
A proxy or gateway MUST parse a received Connection header field before a message is forwarded and, for each connection-option in this field, remove any header field(s) from the message with the same name as the connection-option, and then remove the Connection header field itself (or replace it with the intermediary’s own connection options for the forwarded message).
but it seems that the following part is ignored:
or replace it with the intermediary’s own connection options for the forwarded message
Is it possible to still stream the response to the client even if the Connection/Transfer-Encoding headers are missing from the origin (the GCS load balancer/gateway in our case)?
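For reference, the rule quoted above can be sketched as a small function. This is only an illustration of the quoted text, not how Fastly or the GCS gateway actually implement it; `stripConnectionOptions` is a hypothetical name, and the optional step of replacing Connection with the intermediary’s own options is left out.

```typescript
// Removes every header named as a connection-option in the Connection header,
// then removes the Connection header itself, as the quoted rule requires.
function stripConnectionOptions(headers: Record<string, string>): Record<string, string> {
  // Collect the connection-option tokens (case-insensitive header lookup).
  const options = new Set<string>();
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "connection") {
      for (const token of value.split(",")) {
        const option = token.trim().toLowerCase();
        if (option.length > 0) options.add(option);
      }
    }
  }

  // Copy across everything that is neither Connection nor a listed option.
  const forwarded: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    const lowerName = name.toLowerCase();
    if (lowerName === "connection" || options.has(lowerName)) continue;
    forwarded[name] = value;
  }
  return forwarded;
}
```

With the input `{ "Connection": "keep-alive, X-Hop", "Keep-Alive": "timeout=5", "X-Hop": "1", "Content-Type": "text/html" }`, only `Content-Type` is forwarded. A header that is absent from the origin response and not listed in Connection is unaffected by this rule.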
Just a note that Streaming Miss is incompatible with gzip/brotli compression at the edge (we can’t compress and send an object we haven’t fully received from the origin). But if you’re already compressing content on your origin, you should already be covered.