Caching is one of the simplest ways to improve delivery. A cache saves a copy of a response so it can be served without a round trip to the origin. The core control surface is the Cache-Control
header and the time-to-live (TTL). This guide explains how freshness and validation work, which directives do what, and how to choose sane defaults that fit real sites and APIs.
Freshness vs validation
A cache serves a response when it is fresh. Freshness comes from an explicit lifetime such as max-age or from an older Expires date. After freshness ends, a cache either revalidates or fetches again.
Validation uses validators to ask the origin if a stored response is still good. Two common validators are ETag and Last-Modified. A validating request includes If-None-Match or If-Modified-Since. If the representation has not changed, the origin replies with 304 Not Modified (see HTTP status codes) and the cache can serve its copy without transferring the body. This saves bandwidth and keeps latency low even when TTLs are short.
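The validation exchange described above can be sketched from the origin's side. This is a minimal illustration, not a full HTTP implementation: the function names `strip_weak` and `respond` are hypothetical, and the comma split assumes entity tags contain no commas.

```python
def strip_weak(tag: str) -> str:
    """Drop the W/ prefix so weak and strong forms of a tag compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET that may be conditional."""
    candidates = request_headers.get("If-None-Match", "")
    # If-None-Match may list several tags, or "*" to match any representation.
    # Naive split: assumes no tag contains a comma.
    tags = [strip_weak(t) for t in candidates.split(",") if t.strip()]
    if "*" in tags or strip_weak(current_etag) in tags:
        # Unchanged: confirm the cached copy without resending the body.
        return 304, {"ETag": current_etag}, b""
    # Changed, or the request was unconditional: send the full representation.
    return 200, {"ETag": current_etag}, body


status, headers, payload = respond({"If-None-Match": '"abc123"'}, '"abc123"', b"hello")
print(status, len(payload))
```

The 304 branch is where the bandwidth saving comes from: only headers cross the wire.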
The main Cache-Control directives
Freshness: max-age, s-maxage, and Expires
- max-age=N sets the freshness lifetime in seconds for all caches.
- s-maxage=N overrides max-age for shared caches such as CDNs, leaving browsers to obey max-age.
- Expires: <http-date> is the older way to express freshness. If both are present, Cache-Control wins.
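The precedence among these three can be sketched as a small function. This is an illustration under assumptions: `parse_cache_control` and `freshness_lifetime` are hypothetical helper names, the parser is deliberately naive (it does not handle commas inside quoted arguments), and an unparseable Expires value is treated as already expired.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def freshness_lifetime(headers: dict, shared: bool, response_date: datetime) -> Optional[int]:
    """Seconds the response stays fresh, or None if no explicit lifetime is given."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if shared and cc.get("s-maxage") is not None:
        return int(cc["s-maxage"])          # shared caches prefer s-maxage
    if cc.get("max-age") is not None:
        return int(cc["max-age"])           # Cache-Control wins over Expires
    if "Expires" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            return max(0, int((expires - response_date).total_seconds()))
        except (TypeError, ValueError):
            return 0                        # invalid date: treat as expired
    return None


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
headers = {"Cache-Control": "public, s-maxage=600, max-age=60"}
print(freshness_lifetime(headers, shared=True, response_date=now))
print(freshness_lifetime(headers, shared=False, response_date=now))
```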
A subtle point: max-age=0 does not mean “do not cache.” It allows storage but marks the response stale immediately. Because caches and clocks have second-level resolution, a stored object can still be reused for up to one second before revalidation kicks in. If you do not want a stored copy reused without a check, use no-cache (store but always revalidate). If you do not want the response stored at all, use no-store.
Storage scope: public, private, no-store
- public permits storage in shared caches (CDNs) and browsers.
- private limits storage to a single user’s browser; shared caches should not store it.
- no-store forbids storage anywhere. Use this for highly sensitive or per-user pages where even revalidation is undesirable.
Responses to requests that carry an Authorization header are not stored by shared caches by default. To allow shared caching of those responses, you must send explicit directives (for example public, s-maxage=60).
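The storage rules in this section can be condensed into one check. The sketch below is an assumption-laden simplification: `may_store` is a hypothetical name, and the set of directives that unlock shared caching for authorized requests (public, s-maxage, must-revalidate) follows my reading of the HTTP caching specification rather than anything stated in this guide.

```python
def may_store(cache_control: str, shared: bool,
              request_has_authorization: bool = False) -> bool:
    """Decide whether a cache is allowed to store a response."""
    directives = {
        d.strip().split("=")[0].lower()
        for d in cache_control.split(",") if d.strip()
    }
    if "no-store" in directives:
        return False                      # nobody may store it
    if shared and "private" in directives:
        return False                      # only the user's own browser may
    if shared and request_has_authorization:
        # Shared caches need explicit permission for authorized requests.
        return bool(directives & {"public", "s-maxage", "must-revalidate"})
    return True


print(may_store("private, max-age=60", shared=True))
print(may_store("public, s-maxage=60", shared=True, request_has_authorization=True))
```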
Revalidation controls: no-cache and must-revalidate
- no-cache allows storage but requires revalidation with the origin before each reuse. Prefer it over no-store when you still want conditional requests to save bandwidth.
- must-revalidate tells caches not to serve stale content when they cannot reach the origin. Pair it with short TTLs when correctness beats availability.
- proxy-revalidate is the shared-cache variant: it has the same meaning as must-revalidate but applies only to shared caches.
Resilience: stale-while-revalidate and stale-if-error
These optional directives let caches deliver better continuity:
- stale-while-revalidate=N allows a cache to serve an expired response for up to N seconds while it refreshes in the background. Users get instant responses; the cache updates itself afterward.
- stale-if-error=N allows serving a stale response for up to N seconds when the origin responds with an error or is unreachable. This is useful during incidents or deploys.
Both improve perceived uptime and smooth out traffic spikes. Size each window to the staleness your content can tolerate rather than defaulting to several minutes.
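How a cache might combine freshness, the two stale windows, and must-revalidate can be sketched as a decision function. This is a simplified model, not any particular CDN's behavior: `serve_decision` and its return labels are hypothetical, all values are in seconds, and it assumes must-revalidate switches off both stale windows.

```python
def serve_decision(age: int, max_age: int, swr: int = 0, sie: int = 0,
                   origin_ok: bool = True, must_revalidate: bool = False) -> str:
    """Pick what a cache does with a stored response of the given age."""
    if age < max_age:
        return "serve-fresh"
    staleness = age - max_age            # how long ago freshness ended
    if must_revalidate:
        swr = sie = 0                    # stale reuse is not allowed
    if staleness <= swr and swr > 0:
        # Answer immediately from cache, refresh asynchronously.
        return "serve-stale-and-refresh"
    if origin_ok:
        return "revalidate"              # conditional request to the origin
    if staleness <= sie and sie > 0:
        return "serve-stale-on-error"    # origin failed, stale copy tolerated
    return "error"


# Policy: max-age=60, stale-while-revalidate=300, stale-if-error=600
print(serve_decision(100, 60, swr=300, sie=600))
print(serve_decision(500, 60, swr=300, sie=600, origin_ok=False))
```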
Other useful directives
- immutable tells browsers the resource will not change during its lifetime. Use on versioned static assets to avoid revalidation.
- no-transform asks intermediaries not to transcode or resize responses.
- only-if-cached is a request directive that tells the cache to serve only if it already has a copy (no origin fetch).
Vary and the cache key
The Vary header tells caches which request headers change the representation and therefore the cache key. Every distinct combination yields a separate object.
- Always vary by Accept-Encoding so compressed and uncompressed forms do not collide.
- Avoid Vary: Cookie. Cookies change often and will destroy your cache hit ratio. If you need cookie-aware responses, consider moving personalization to client-side logic or use separate cookie names that do not apply to static routes.
- Be careful with Vary: User-Agent. It creates a huge number of variants; prefer feature detection or server-side negotiation with a smaller set of hints.
- Vary: * disables caching and should be avoided.
Many CDNs also support “custom cache keys” outside HTTP Vary. Use them sparingly to normalize query strings, ignore marketing parameters, or pin exact headers.
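To make the effect of Vary on the cache key concrete, here is a small sketch. The `cache_key` function and the tuple layout are illustrative assumptions; real caches store variants differently, but the principle that each distinct header combination becomes its own object is the same.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str):
    """Build a lookup key from the URL plus the request headers named in Vary."""
    if vary.strip() == "*":
        return None                      # Vary: * means never reuse
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    # Headers not listed in Vary are ignored, so they cannot fragment the cache.
    variant = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, variant)


gzip_key = cache_key("GET", "https://example.com/a",
                     {"Accept-Encoding": "gzip", "Cookie": "session=1"},
                     "Accept-Encoding")
brotli_key = cache_key("GET", "https://example.com/a",
                       {"Accept-Encoding": "br"}, "Accept-Encoding")
print(gzip_key != brotli_key)
```

Adding Cookie or User-Agent to the Vary list would put those high-cardinality values into the key, which is why the hit ratio collapses.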
Choosing TTLs by content type
Versioned static assets
For hashed filenames (for example app.9f1c2.js), use a very long TTL and mark immutable:
Cache-Control: public, max-age=31536000, immutable
When you deploy a new version, the filename changes and caches naturally fetch the new asset.
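One way to produce such filenames is to embed a short content hash at build time. The sketch below is an assumption about how a build step might do it: `hashed_name` is a hypothetical helper, and the choice of SHA-256 truncated to 8 hex characters is arbitrary.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    # app.js -> app.<digest>.js, keeping any directory prefix intact.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


print(hashed_name("static/app.js", b"console.log('v1');"))
print(hashed_name("static/app.js", b"console.log('v2');"))
```

Because the name depends only on the bytes, an unchanged file keeps its URL across deploys and stays cached, while any edit yields a new URL.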
HTML documents and primary JSON pages
HTML often needs quick updates. A common pattern is a short freshness window combined with background refresh for continuity:
Cache-Control: public, s-maxage=60, max-age=0, stale-while-revalidate=300, stale-if-error=600
Browsers treat the page as stale (they will revalidate), while the CDN can serve fresh copies for a minute and continue serving stale during revalidation or short outages.
APIs
Public, read-heavy endpoints:
Cache-Control: public, s-maxage=120, max-age=30, stale-while-revalidate=60
ETag: "abc123"
Personalized or sensitive endpoints:
Cache-Control: private, no-store
If you need shared caching for authorized requests, return explicit directives:
Cache-Control: public, s-maxage=60, must-revalidate
File downloads and media
For large, infrequently changing files, prefer long TTLs and support byte ranges. Ensure your CDN honors partial responses without re-fetching the entire object.
ETag and Last-Modified in practice
Validators are the backbone of revalidation.
- ETag should uniquely identify the representation (content hash or a stable revision). Weak validators (W/"...") mark representations as semantically equivalent rather than byte-identical, so a cached copy may keep validating across minor changes, which can be fine for HTML.
- Last-Modified is easy to implement but less precise. Clocks and build pipelines can make timestamps noisy.
Clients that revalidate correctly receive 304 Not Modified, which keeps traffic low even with short TTLs. Always return validators on HTML and JSON that you expect to be revalidated frequently.
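Generating both validators for a response body might look like the following. It is a sketch under assumptions: `validators` is a hypothetical helper, the ETag is a truncated SHA-256 of the body (one of several reasonable choices), and `modified` is expected to be a timezone-aware datetime.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def validators(body: bytes, modified: datetime) -> dict:
    """Return ETag and Last-Modified headers for a representation."""
    # Strong ETag: changes whenever any byte of the body changes.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    # HTTP dates are always expressed in GMT.
    last_modified = format_datetime(modified.astimezone(timezone.utc), usegmt=True)
    return {"ETag": etag, "Last-Modified": last_modified}


print(validators(b"<html>hello</html>",
                 datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)))
```

A content-derived ETag stays stable across rebuilds that do not change the output, which avoids the noisy-timestamp problem noted for Last-Modified.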
Expires vs Cache-Control
Expires dates were the original freshness mechanism. Cache-Control is newer and more expressive. If both appear, Cache-Control rules override Expires. Use Cache-Control everywhere. Keep Expires only for legacy clients if your platform needs it.
Edge vs browser caching
Browsers and CDNs do not always behave the same way.
- s-maxage targets shared caches and overrides max-age there. This lets you give the CDN longer freshness than the browser.
- Some CDNs support a separate Surrogate-Control header for edge-specific policies. For example, you can keep a long surrogate TTL and a short browser TTL in one response:
Surrogate-Control: max-age=600
Cache-Control: max-age=60
- Many CDNs also let you set default TTLs when the origin omits headers. Use defaults as a backstop, not as the main policy.
Common pitfalls and how to avoid them
- Thinking max-age=0 disables caching. It does not. Use no-cache to force revalidation or no-store to forbid storage.
- Over-broad Vary. Vary: Cookie or Vary: User-Agent can kill caching. Minimize the vary set.
- Missing validators. Without ETag or Last-Modified, revalidation falls back to full fetches.
- Conflicting headers. If you must send Expires, ensure it matches Cache-Control. In conflicts, caches follow Cache-Control.
- Caching private content. Mark personalized responses as private or no-store. Do not rely on path secrecy.
- Ignoring Authorization. Responses to authenticated requests are not stored by shared caches unless you explicitly allow it with directives such as public, s-maxage=....
- Relying on defaults. Be explicit on HTML and API responses. Defaults differ between browsers and CDNs.
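Several of these pitfalls can be caught mechanically. The following is a rough sketch of a header check you could run in tests or CI; `lint_cache_headers` and its warning strings are hypothetical, and it covers only three of the pitfalls above (missing Cache-Control, over-broad Vary, missing validators).

```python
def lint_cache_headers(headers: dict) -> list:
    """Return human-readable warnings for common caching mistakes."""
    h = {k.lower(): v for k, v in headers.items()}
    directives = {
        d.strip().split("=")[0].lower()
        for d in h.get("cache-control", "").split(",") if d.strip()
    }
    vary = {v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()}
    warnings = []
    if not directives:
        warnings.append("no Cache-Control: caches fall back to their own defaults")
    risky = sorted(vary & {"cookie", "user-agent", "*"})
    if risky:
        warnings.append("over-broad Vary: " + ", ".join(risky))
    # Validators are pointless only when the response is never stored.
    if "no-store" not in directives and "etag" not in h and "last-modified" not in h:
        warnings.append("no ETag or Last-Modified: revalidation needs a full fetch")
    return warnings


for warning in lint_cache_headers({"Vary": "Cookie, Accept-Encoding"}):
    print(warning)
```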
Heuristic caching when headers are absent
If you omit freshness information, shared caches may apply heuristics, often a small fraction of the time since Last-Modified. This is unpredictable and rarely what you want. Prefer explicit max-age or s-maxage, even if short.
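To see why heuristics are unpredictable, consider a sketch of the calculation. The 10% fraction used here is an assumption: it is a commonly cited typical value, but each cache chooses its own. The function name `heuristic_lifetime` is hypothetical.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date_header: str, last_modified: str, percent: int = 10) -> int:
    """Estimate a freshness lifetime in seconds from the age of the content."""
    date = parsedate_to_datetime(date_header)
    modified = parsedate_to_datetime(last_modified)
    seconds_since_change = int((date - modified).total_seconds())
    # Integer arithmetic avoids floating-point rounding surprises.
    return max(0, seconds_since_change * percent // 100)


# A page last modified ten days before it was served.
print(heuristic_lifetime("Mon, 01 Jan 2024 12:00:00 GMT",
                         "Fri, 22 Dec 2023 12:00:00 GMT"))
```

Under this assumption, a file untouched for ten days would be cached for a full day with no header asking for it, and the lifetime keeps growing as the file ages.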
Normalizing the cache key
Normalize ignorable query parameters (for example marketing tags) so they do not fragment the cache. Many CDNs can ignore specific parameters or order them consistently. The goal is that semantically equal URLs map to the same cached object.
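If your CDN does not offer this, the same normalization can be done at the origin or in an edge function. The sketch below makes assumptions about which parameters are ignorable: the `utm_` prefix and the `fbclid` and `gclid` names are examples, not a complete list, and `normalize_url` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

IGNORED_PREFIXES = ("utm_",)            # example marketing prefixes
IGNORED_NAMES = {"fbclid", "gclid"}     # example click identifiers


def normalize_url(url: str) -> str:
    """Drop ignorable query parameters and put the rest in a stable order."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(IGNORED_PREFIXES) and key not in IGNORED_NAMES
    ]
    # Stable sort by name only, so repeated keys keep their relative order.
    params.sort(key=lambda pair: pair[0])
    # The fragment is dropped: it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(params), ""))


print(normalize_url("https://Example.com/p?b=2&utm_source=mail&a=1"))
```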
Testing and rollout
- Inspect headers. Use curl -I https://example.com/ and your browser’s network panel. Confirm the presence and values of Cache-Control, ETag, Last-Modified, and Vary.
- Watch cache outcome headers. Many CDNs add X-Cache or similar to show HIT, MISS, EXPIRED, or STALE. Use them during rollout.
- Measure hit ratio and TTFB. Track changes in cache hit ratio and TTFB after you deploy new rules.
- Exercise revalidation. Force short TTLs in a test environment and confirm that 304 Not Modified responses appear for unchanged content.
- Plan for purge. Even with long TTLs, you need targeted invalidation. Tie purges to deploys for HTML and API payloads that must flip quickly.
Practical starter policies
Use these as a baseline and adjust after measuring.
- Versioned static assets
Cache-Control: public, max-age=31536000, immutable
- HTML shell
Cache-Control: public, s-maxage=60, max-age=0, stale-while-revalidate=300, stale-if-error=600
ETag: "rev-<build>"
- Public API (read-heavy)
Cache-Control: public, s-maxage=120, max-age=30, stale-while-revalidate=60
ETag: "sha256-<body>"
- Personalized or sensitive
Cache-Control: private, no-store
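One way to apply these starter policies is a single lookup in your response pipeline. This is only a sketch: the `/static/` and `/api/` path prefixes, the `personalized` flag, and the function name `cache_control_for` are assumptions about a hypothetical site layout, while the header values are the ones listed above.

```python
POLICIES = {
    "static": "public, max-age=31536000, immutable",
    "html": "public, s-maxage=60, max-age=0, stale-while-revalidate=300, stale-if-error=600",
    "api": "public, s-maxage=120, max-age=30, stale-while-revalidate=60",
    "private": "private, no-store",
}


def cache_control_for(path: str, personalized: bool = False) -> str:
    """Pick a Cache-Control value for a response."""
    if personalized:
        return POLICIES["private"]      # checked first: never share user data
    if path.startswith("/static/"):
        return POLICIES["static"]       # assumes hashed, versioned filenames
    if path.startswith("/api/"):
        return POLICIES["api"]
    return POLICIES["html"]


print(cache_control_for("/static/app.9f1c2.js"))
print(cache_control_for("/api/profile", personalized=True))
```

Keeping the policies in one table makes it easier to adjust values after measuring, and checking the personalized case first avoids accidentally sharing a per-user response.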
See also Purging CDN Content for strategies to remove cached objects when immediate updates are required.