Cache-Control & TTLs: Getting Caching Right

Caching is one of the simplest ways to cut latency and origin load. A cache saves a copy of a response so it can be served without a round trip to the origin. The core control surface is the Cache-Control header and the time-to-live (TTL). This guide explains how freshness and validation work, which directives do what, and how to choose sane defaults that fit real sites and APIs.

Freshness vs validation

A cache can serve a stored response without contacting the origin while that response is fresh. Freshness comes from an explicit lifetime such as max-age or from the older Expires header. After freshness ends, a cache either revalidates the stored copy or fetches it again. ...
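As a rough illustration of the freshness rule described above, the following Python sketch derives a freshness lifetime from response headers and checks whether a stored response is still fresh. The function names freshness_lifetime and is_fresh are hypothetical, and the header mapping is assumed to use canonical header names. The sketch covers only max-age and Expires; it deliberately ignores s-maxage, no-cache, no-store, and heuristic freshness, which a real cache must also handle.

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str]) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None if absent.

    Hypothetical helper: max-age is checked first, then Expires minus Date.
    """
    # max-age takes precedence over Expires when both are present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip('"')
        if name.lower() == "max-age" and value.isdigit():
            return int(value)

    expires = headers.get("Expires")
    date = headers.get("Date")
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            # An unparseable date is treated as already expired.
            return 0
        return max(0, int(delta.total_seconds()))

    # No explicit lifetime was given.
    return None


def is_fresh(headers: Mapping[str, str], age_seconds: int) -> bool:
    """A stored response is fresh while its age is below its lifetime."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and age_seconds < lifetime
```

When is_fresh returns False, the cache would either revalidate the stored copy with the origin or fetch a new one, as the text describes.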

August 15, 2025

Cache Consistency and Purging Across CDNs

Synopsis

This chapter explains how to keep caches consistent when more than one CDN serves the same content. It covers cache key design, HTTP caching headers, immutable assets, purge methods, sequencing across providers, and the operational controls that keep correctness and latency predictable.

Purpose and scope

Multi-CDN changes cache behavior because different networks fetch, store, and expire objects using different rules and clocks. The goal is that users receive the same bytes for the same URL regardless of the serving CDN and that changes reach users in a controlled and explainable way. The guidance applies to static assets and dynamic responses that allow caching. It also covers the surrounding systems that push and purge content. ...

Origin Architecture for Multi-CDN

Synopsis

This chapter explains how to design and operate origin infrastructure that can serve more than one CDN at the same time. It covers topology choices, origin shielding, authentication, cache key consistency, deployment and consistency models, failover behavior, and operational practices. The goal is to keep content correctness and performance stable while multiple CDNs fetch from the same source.

Role of the origin in multi-CDN

The origin is the source of truth for content and APIs. In a multi-CDN setup, more than one provider fetches from it. The design must handle higher fan-in, different retry behaviors, and different cache semantics without breaking correctness. It should also keep the number of variables low so that problems can be diagnosed during incidents. ...