
2026-05-07 · perf, lessons

How "Vary: Cookie" costs you 600ms of TTFB on a server-rendered SPA

We're running Brainy Records on Cloud Run, fronted by Firebase Hosting, which uses Fastly as its CDN. Server-rendered marketing pages — pricing, blog posts, band profiles, genre hubs — were marked with the obvious cache directive: Cache-Control: public, max-age=300, s-maxage=600. We assumed the CDN was caching them. It wasn't.

The symptom

Every TTFB measurement showed cache MISS. Numbers from a UAE edge node, before the fix:

x-cache: MISS on every single request, including immediate retries of the same URL. That's not a cold-start problem; that's the CDN refusing to keep anything.

The cause

The response headers, as they hit the user, looked like this:

cache-control: public, max-age=600, s-maxage=1200

vary: Origin,cookie,need-authorization, x-fh-requested-host, accept-encoding

The Vary: Cookie is the killer. Per HTTP caching semantics, a cache may reuse a stored response only when every request header named in Vary matches the request that produced it, so each distinct value of a Vary'd header is effectively a separate cache entry. Every visitor's cookies are slightly different (Firebase Analytics ID, session preferences, advertising IDs from referrers), so every request was a unique cache key. The CDN, faced with effectively unbounded cache keys, gives up entirely.
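To make the cardinality problem concrete, here is a minimal sketch of how a Vary-aware cache could derive its key. It is a simplified model, not Fastly's or Firebase Hosting's actual algorithm; the `cacheKey` function and the cookie values are hypothetical.

```typescript
// Simplified model of a Vary-aware cache key (an illustration, not any CDN's real algorithm).
// Request header names are assumed to be lowercased already.
function cacheKey(url: string, vary: string, requestHeaders: Record<string, string>): string {
  const fields = vary
    .split(",")
    .map((field) => field.trim().toLowerCase())
    .filter((field) => field.length > 0)
    .sort();
  // Each header named in Vary contributes its request value to the key.
  const parts = fields.map((field) => `${field}=${requestHeaders[field] ?? ""}`);
  return [url, ...parts].join("|");
}

// Two hypothetical visitors: same URL and encoding, different cookies.
const visitorA = { "accept-encoding": "br", cookie: "_ga=GA1.1.111" };
const visitorB = { "accept-encoding": "br", cookie: "_ga=GA1.1.222" };

// With Vary: Cookie the keys differ, so visitor B cannot reuse visitor A's entry.
const splitKeys =
  cacheKey("/pricing", "Cookie, Accept-Encoding", visitorA) !==
  cacheKey("/pricing", "Cookie, Accept-Encoding", visitorB);

// With Vary: Accept-Encoding alone, both visitors share one entry.
const sharedKey =
  cacheKey("/pricing", "Accept-Encoding", visitorA) ===
  cacheKey("/pricing", "Accept-Encoding", visitorB);

console.log({ splitKeys, sharedKey });
```

In this model the number of cache entries per URL grows with the number of distinct cookie strings, which for real traffic is close to the number of visitors.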

We hadn't set Vary: Cookie ourselves. The extra values were being added elsewhere in the stack: partly by Express's CORS middleware (which sets Vary: Origin for any cross-origin response), and partly by Firebase Hosting's edge layer, which adds Cookie and need-authorization to Vary by default for rewrites to Cloud Run. That default defends against the case where the upstream serves personalized responses.

The defense makes sense if your backend personalizes responses. Ours doesn't — every visitor sees the same /pricing page. But the CDN can't know that; it just sees Vary: Cookie and refuses to cache.

The fix

Override the Vary header on responses we know are public. We did it as Express middleware that watches every setHeader call: if the route sets Cache-Control: public with a max-age or s-maxage, we replace any Vary value with just Accept-Encoding. That's the only header that genuinely changes the response (gzip vs brotli vs identity).

The middleware is fifteen lines. It runs once per request, only mutates Vary when Cache-Control opts in, and otherwise gets out of the way. It does NOT affect /api/* responses (those don't set public Cache-Control).
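A minimal sketch of that idea follows. It is not the exact production code: the names `publicVaryOverride` and `isPubliclyCacheable` are illustrative, and the structural `ResponseLike` type is an assumption that stands in for Express's response object so the block compiles without Express installed.

```typescript
// Minimal structural types (assumption: Express's response object is compatible with this shape).
type HeaderValue = string | number | readonly string[];

interface ResponseLike {
  setHeader(name: string, value: HeaderValue): unknown;
  getHeader(name: string): HeaderValue | undefined;
}

// True when Cache-Control contains "public" plus a max-age or s-maxage directive.
function isPubliclyCacheable(cacheControl: HeaderValue | undefined): boolean {
  if (cacheControl === undefined) return false;
  const directives = String(cacheControl)
    .toLowerCase()
    .split(",")
    .map((d) => d.trim());
  return (
    directives.includes("public") &&
    directives.some((d) => d.startsWith("max-age=") || d.startsWith("s-maxage="))
  );
}

// Hypothetical middleware: wraps setHeader so that publicly cacheable
// responses always end up with Vary: Accept-Encoding and nothing else.
function publicVaryOverride(_req: unknown, res: ResponseLike, next: () => void): void {
  const originalSetHeader = res.setHeader.bind(res);

  res.setHeader = (name: string, value: HeaderValue) => {
    const lower = name.toLowerCase();

    // A later Vary write on an already-public response is replaced.
    if (lower === "vary" && isPubliclyCacheable(res.getHeader("Cache-Control"))) {
      return originalSetHeader(name, "Accept-Encoding");
    }

    const result = originalSetHeader(name, value);

    // When a route opts in via Cache-Control, overwrite any Vary set earlier.
    if (lower === "cache-control" && isPubliclyCacheable(value)) {
      originalSetHeader("Vary", "Accept-Encoding");
    }
    return result;
  };

  next();
}
```

In an Express app this would be registered with `app.use(publicVaryOverride)` ahead of the routes and ahead of any middleware that writes Vary, so that every later header write passes through the wrapper. Responses that never set a public Cache-Control keep whatever Vary they were given.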

The result

Same edge node, same URLs, after redeploy:

Ten out of ten pages tested moved from x-cache: MISS on every request to x-cache: HIT on cached requests. Effective TTFB for repeat visitors and AI crawlers dropped from 500-700 ms to 1 ms.

Why this matters more than the timing suggests

Three reasons.

1. AI crawlers revisit frequently. ClaudeBot and GPTBot crawl the same URL multiple times across days. With caching enabled, those revisits cost us almost nothing in Cloud Run minutes. Without it, every crawl was a cold-path Firestore read.

2. HN-style traffic spikes are bursty. A few hundred concurrent visitors hitting /pricing within the first hour of a Show HN post would have hammered the backend. With CDN caching, the second visitor pays nothing — the response is served from the edge.

3. Core Web Vitals. TTFB feeds directly into Google's LCP metric, because nothing can paint before the first byte arrives. A 600 ms TTFB is borderline; a 1 ms TTFB after first request is comfortable. This matters for search ranking, especially as we accumulate more pages worth indexing.

What to check on your own server-rendered SPA

If you have an Express app on Cloud Run / Firebase Hosting / similar:

The general lesson is the same as a lot of perf work: read the actual response headers your CDN is seeing, not the ones you think you set.
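One way to act on that lesson is to paste the output of `curl -sI <url>` into a small checker. The sketch below is hypothetical (the `inspectCacheHeaders` helper and `CacheReport` shape are illustrative names); it only flags the two conditions discussed in this post, and the sample input reuses the headers shown earlier.

```typescript
// Hypothetical helper: parse a raw response-header block and report
// whether the response opts in to public caching and what Vary adds.
interface CacheReport {
  publicCacheable: boolean; // Cache-Control has "public" plus max-age or s-maxage
  extraVary: string[]; // Vary fields other than Accept-Encoding
}

function inspectCacheHeaders(rawHeaders: string): CacheReport {
  const headers = new Map<string, string>();
  for (const line of rawHeaders.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // skips lines without a header name, e.g. "HTTP/2 200"
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    // Repeated headers are merged into one comma-separated value.
    const previous = headers.get(name);
    headers.set(name, previous === undefined ? value : `${previous}, ${value}`);
  }

  const directives = (headers.get("cache-control") ?? "")
    .toLowerCase()
    .split(",")
    .map((d) => d.trim());
  const publicCacheable =
    directives.includes("public") &&
    directives.some((d) => d.startsWith("max-age=") || d.startsWith("s-maxage="));

  const extraVary = (headers.get("vary") ?? "")
    .toLowerCase()
    .split(",")
    .map((f) => f.trim())
    .filter((f) => f.length > 0 && f !== "accept-encoding");

  return { publicCacheable, extraVary };
}

// Sample input built from the response headers quoted in this post.
const sample = [
  "HTTP/2 200",
  "cache-control: public, max-age=600, s-maxage=1200",
  "vary: Origin,cookie,need-authorization, x-fh-requested-host, accept-encoding",
].join("\n");

console.log(inspectCacheHeaders(sample));
```

A response that reports `publicCacheable: true` together with a non-empty `extraVary` list is the combination that bit us: the origin asks for caching while the Vary header splits the cache per visitor.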