Synopsis

This chapter explains how to keep caches consistent when more than one CDN serves the same content. It covers cache key design, HTTP caching headers, immutable assets, purge methods, sequencing across providers, and the operational controls that keep correctness and latency predictable.

Purpose and scope

Multi-CDN changes cache behavior because different networks fetch, store, and expire objects using different rules and clocks. The goal is that users receive the same bytes for the same URL regardless of the serving CDN and that changes reach users in a controlled and explainable way. The guidance applies to static assets and dynamic responses that allow caching. It also covers the surrounding systems that push and purge content.

Cache key design

A stable cache key is the foundation. Use scheme, host, normalized path, and a controlled set of query parameters. Avoid free-form headers in the key. If a header must vary the response, make the variation explicit and keep the set small. Accept-Encoding and Accept can create large key spaces if left open. Prefer predictable variants that match what clients actually send.

Keep the Host header used between CDN and origin consistent. If provider A forwards a different Host than provider B, virtual hosting and cache identity can diverge. If a shield or proxy sits in front of the origin, it should apply the same normalization rules as the edges. Document the key in one place so developers, operations, and vendors speak the same language.
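
As an illustration of the key described above, the following is a minimal sketch of a normalizer that could serve as the single documented definition. The function name cache_key and the allow-list ALLOWED_PARAMS are assumptions made for this example, not settings of any particular CDN, and leaving the port out of the key is a simplification.

```python
import posixpath
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allow-list: only these query parameters take part in the key.
ALLOWED_PARAMS = frozenset({"v", "lang"})


def cache_key(url: str, allowed_params=ALLOWED_PARAMS) -> str:
    """Build a key from scheme, host, normalized path, and allowed query parameters."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # the port is left out of the key in this sketch

    # Resolve "." and ".." segments and collapse repeated slashes.
    path = "/" + posixpath.normpath(parts.path or "/").lstrip("/")
    if parts.path.endswith("/") and path != "/":
        path += "/"  # keep a trailing slash: /dir and /dir/ may be different resources

    # Drop parameters outside the allow-list and sort the rest so that
    # ?a=1&b=2 and ?b=2&a=1 map to the same key.
    kept = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k in allowed_params
    )
    query = urlencode(kept)
    return f"{scheme}://{host}{path}" + (f"?{query}" if query else "")
```

Running the same function in tests against every provider configuration is one way to detect drift: two URLs that produce the same key here should be served from the same cached object on each CDN.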

HTTP headers that control caches

HTTP caching works when headers are clear. Use Cache-Control to state whether a response is public, how long it can be stored, and whether stale usage is allowed during errors or revalidation. The stale-while-revalidate and stale-if-error directives improve resilience without custom logic. Use ETag and Last-Modified so conditional requests work across providers and at the origin. If validators are missing, or if origin servers generate different validators for the same bytes, conditional fetches return full bodies instead of 304 responses and waste capacity.

Avoid combining a long max-age with frequent purges on the same path. Either publish immutable versioned URLs and rely on long TTLs, or keep stable URLs with short TTLs and accept higher origin load. Mixing both patterns usually gives the worst of both: every missed purge leaves a long-lived stale object, and every successful purge discards the offload the long TTL was meant to provide.
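
The two patterns can be expressed as two header policies. The sketch below builds Cache-Control values for each; the helper name cache_control and the TTL numbers are illustrative assumptions. The immutable directive comes from RFC 8246, and not every cache honors it.

```python
def cache_control(max_age: int, immutable: bool = False,
                  stale_while_revalidate: int = 0, stale_if_error: int = 0) -> str:
    """Assemble a Cache-Control value for a publicly cacheable response."""
    directives = ["public", f"max-age={max_age}"]
    if immutable:
        directives.append("immutable")
    if stale_while_revalidate:
        directives.append(f"stale-while-revalidate={stale_while_revalidate}")
    if stale_if_error:
        directives.append(f"stale-if-error={stale_if_error}")
    return ", ".join(directives)


# Pattern 1: versioned URL, long TTL, no purges expected (values are examples).
VERSIONED_ASSET = cache_control(max_age=31536000, immutable=True)

# Pattern 2: stable URL, short TTL, stale serving allowed during revalidation and errors.
STABLE_URL = cache_control(max_age=60, stale_while_revalidate=30, stale_if_error=300)
```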

Immutable assets and change strategy

Static files such as scripts, styles, and images should be published under versioned or content addressed URLs. When a file changes, the URL changes. Caches then update naturally without reliance on purge. This pattern also makes rollback safe because the prior version remains available at its old URL and is often still cached.
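
A content addressed URL can be derived from the file bytes at build time. The following is a minimal sketch; the function name versioned_name, the choice of SHA-256, and the 12-character digest length are assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 12) -> str:
    """Insert a truncated content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    # /static/app.js -> /static/app.<digest>.js
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name depends only on the bytes, rebuilding an unchanged file yields the same URL, and any change to the bytes yields a new one.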

Dynamic endpoints sometimes need cache. Keep cache windows short and validators accurate. When a change must reach users quickly, do not rely on a low TTL to expire everywhere at once. Use targeted revalidation or purge and confirm the effect.

Purge methods

CDNs expose several purge methods. Purge by URL removes a specific object. Purge by prefix removes a path subtree. Some providers support logical tags or surrogate keys that group related objects. Hard purge removes the object immediately. Soft purge marks an object stale and allows revalidation on next request. Soft purge reduces origin spikes during large purges and should be the default when correctness allows it.

A multi-CDN setup needs a controller that issues equivalent purges to each provider and records the outcomes. The controller must handle rate limits, backoff, and retry. It must map the purge intent to each provider API. For example, a tag based purge may need to translate to a list of URLs on a provider that lacks tags.
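
The controller responsibilities above can be sketched as follows. Everything here is hypothetical: the Provider wrapper, its send callable, the RateLimited exception, and the tag_index mapping stand in for real vendor clients, whose APIs differ and are not modeled.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class RateLimited(Exception):
    """Raised by a (hypothetical) provider client when the API asks callers to slow down."""


@dataclass
class Provider:
    name: str
    supports_tags: bool
    send: Callable  # hypothetical signature: send(kind, targets) -> request id


@dataclass
class PurgeController:
    providers: list
    tag_index: dict  # tag -> list of URLs, used for providers without tag support
    max_attempts: int = 4
    base_delay: float = 0.5
    sleep: Callable = time.sleep
    outcomes: list = field(default_factory=list)

    def purge_tag(self, tag: str) -> list:
        """Issue an equivalent purge on every provider and record each outcome."""
        results = []
        for p in self.providers:
            if p.supports_tags:
                kind, targets = "tag", [tag]
            else:
                # Translate the tag into the URLs it groups.
                kind, targets = "url", list(self.tag_index.get(tag, []))
            results.append(self._send_with_retry(p, kind, targets))
        self.outcomes.extend(results)
        return results

    def _send_with_retry(self, p: Provider, kind: str, targets: list) -> dict:
        outcome = {"provider": p.name, "kind": kind, "targets": targets}
        for attempt in range(1, self.max_attempts + 1):
            try:
                request_id = p.send(kind, targets)
                return {**outcome, "ok": True, "request_id": request_id, "attempts": attempt}
            except RateLimited:
                if attempt < self.max_attempts:
                    # Exponential backoff: base, 2x base, 4x base, ...
                    self.sleep(self.base_delay * 2 ** (attempt - 1))
        return {**outcome, "ok": False, "request_id": None, "attempts": self.max_attempts}
```

The sleep function is injected so the backoff schedule can be checked in tests without waiting.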

Ordering and idempotency

Changes should have an order. When a new asset or configuration is published, first ensure that the new bytes are present at origin and any shields. Then update links or routing. If purge is required, send purge commands after publish and in a way that can repeat safely. Purge APIs should be called with idempotent identifiers so that retries do not widen the scope.
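
One way to make purge calls repeat safely is to derive an identifier from the purge intent itself, so a retry always carries the same scope. The sketch below assumes a controller-side ledger of completed keys; whether a given provider API accepts an idempotency key is not assumed, and the names purge_key and send_once are illustrative.

```python
import hashlib
import json


def purge_key(release_id: str, kind: str, targets) -> str:
    """Derive a stable identifier from the release and the exact purge scope."""
    intent = {"release": release_id, "kind": kind, "targets": sorted(set(targets))}
    canonical = json.dumps(intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def send_once(ledger: set, key: str, send) -> bool:
    """Call send() unless this exact intent already succeeded; return True if sent."""
    if key in ledger:
        return False
    send()           # if this raises, the key is not recorded and a retry may run
    ledger.add(key)
    return True
```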

Sequence matters between providers. If one provider updates first, users can see a mix of old and new versions during the window where the second provider still serves the old one. Reduce this window by using immutable URLs or by soft purging and letting revalidation fetch the new content once links change.

Consistency expectations

Perfect global simultaneity is not realistic. Aim for bounded inconsistency. Define acceptable windows for user facing changes and design the process to hit those windows with margin. With immutable URLs the window is near zero because old and new assets do not clash. With purge, the window is set by the slowest provider: its purge API latency plus propagation inside the CDN, extended by user path effects such as intermediate caches.
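
A rough estimate of the purge window can be computed from measured timings. The sketch below makes one modeling assumption: within a provider, API latency and internal propagation happen one after the other, and the overall bound is the slowest provider plus user path effects. The field names and all numbers in the usage example are made up.

```python
def inconsistency_window(providers: dict, user_path_s: float):
    """Return (slowest provider, estimated worst-case window in seconds)."""
    per_provider = {
        name: t["api_s"] + t["propagation_s"] for name, t in providers.items()
    }
    slowest = max(per_provider, key=per_provider.get)
    return slowest, per_provider[slowest] + user_path_s


# Illustrative timings only; real values come from drills and provider telemetry.
timings = {
    "cdn_a": {"api_s": 2, "propagation_s": 5},
    "cdn_b": {"api_s": 1, "propagation_s": 30},
}
slowest, window = inconsistency_window(timings, user_path_s=10)
```

Comparing the result against the acceptable window shows how much margin remains and which provider consumes it.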

Define what must be strictly consistent. Critical API responses that drive money movement may require no cache at all or very short TTLs with strong validation. Marketing pages can tolerate longer windows.

Negative caching and error handling

Unsuccessful responses can be cached by some providers. Control this with Cache-Control for status codes such as 404 and 5xx. If 404s are temporary during rollout, keep negative TTLs very short. Use stale-if-error to continue serving recent good responses during transient origin faults. Confirm how each CDN treats negative caching and align settings so behavior is the same.
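
At the origin, negative caching can be made explicit by choosing Cache-Control per status code. This is a minimal sketch; the TTL table and the choice of no-store for all other error statuses are assumptions for illustration, and each CDN may still apply its own defaults unless configured otherwise.

```python
# Hypothetical negative TTLs in seconds; keep them short during rollouts.
NEGATIVE_TTLS = {404: 10, 410: 60}

SUCCESS_POLICY = "public, max-age=300, stale-if-error=600"


def cache_control_for_status(status: int) -> str:
    """Pick a Cache-Control value based on the response status."""
    if 200 <= status < 300:
        return SUCCESS_POLICY
    if status in NEGATIVE_TTLS:
        return f"public, max-age={NEGATIVE_TTLS[status]}"
    # Other errors are not stored, so caches that hold a good copy
    # can keep serving it under stale-if-error.
    return "no-store"
```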

Revalidation and freshness

Revalidation reduces origin load while keeping data fresh. Ensure that validators are stable and that the origin handles If-None-Match and If-Modified-Since efficiently. Support 304 responses with correct headers so caches can update freshness without fetching bodies. For dynamic pages, server timing headers can help diagnose when revalidation occurs and how long it takes.
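
The origin side of revalidation can be sketched as a handler that answers If-None-Match with a 304 carrying the same validator and freshness headers as the full response. The function names and the hash-based ETag are assumptions; header lookup is simplified to an exact-case dictionary key.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body so every origin server agrees on it."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, request_headers: dict, cache_control: str = "public, max-age=60"):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": cache_control}
    candidates = [
        t.strip() for t in request_headers.get("If-None-Match", "").split(",") if t.strip()
    ]
    if "*" in candidates or any(_opaque(c) == etag for c in candidates):
        # 304 repeats the validator and freshness headers but carries no body.
        return 304, headers, b""
    return 200, headers, body
```

Deriving the ETag from content rather than from file metadata is one way to keep validators identical across origin servers.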

CDN differences that matter

Providers differ in purge latency, API limits, and features such as tag support. Some propagate purges hierarchically through mid tiers. Some apply them at the edge first. Time to live handling, default negative caching, and redirect caching also vary. Maintain a provider matrix and test the corner cases when a new vendor is onboarded. Where features do not align, prefer the lowest common denominator unless there is a strong reason to adopt a richer feature and emulate it elsewhere.

Telemetry for cache health

Management requires visibility. Measure edge hit rate, shield hit rate, origin fetch rate, and purge API results per provider. Track revalidation ratios and 304 rates. Record purge request ids and map them to traffic changes. During incidents it must be possible to prove whether caches changed as expected or whether a failure left stale objects in place.
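
The ratios named above can be computed from per-request log records. The sketch below assumes a hypothetical record shape with provider, cache (HIT, MISS, or REVALIDATED), and an optional origin_status; real CDN logs use vendor-specific field names that would need mapping first.

```python
from collections import defaultdict


def cache_health(records):
    """Compute hit rate, origin fetch rate, and 304 rate per provider."""
    counts = defaultdict(lambda: {"total": 0, "hit": 0, "origin": 0, "origin_304": 0})
    for r in records:
        c = counts[r["provider"]]
        c["total"] += 1
        if r["cache"] == "HIT":
            c["hit"] += 1
        else:
            # MISS and REVALIDATED both contact the origin.
            c["origin"] += 1
            if r.get("origin_status") == 304:
                c["origin_304"] += 1
    return {
        provider: {
            "hit_rate": c["hit"] / c["total"],
            "origin_fetch_rate": c["origin"] / c["total"],
            # Share of origin contacts answered without a body.
            "rate_304": c["origin_304"] / c["origin"] if c["origin"] else 0.0,
        }
        for provider, c in counts.items()
    }
```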

Publish and purge pipeline

Treat content changes as a pipeline with clear stages and checkpoints. A basic flow is to build artifacts, write to storage, verify checksum and metadata, warm shields where needed, update references in HTML or manifests, and finally issue purges or revalidation hints. Rollback reverses the reference update and leaves the old assets in place. The pipeline should be automatic and repeatable.
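
The staged flow can be sketched as a small runner that stops at the first failed checkpoint. The stage names and the run_release function are hypothetical; each step is assumed to be a callable that returns True on success. Rollback only reverts references, matching the description above, and is triggered only if the reference update already happened.

```python
def run_release(stages, revert_references):
    """Run (name, step) pairs in order; stop and roll back on the first failure."""
    completed = []
    for name, step in stages:
        if not step():
            if "update_references" in completed:
                # Old assets stay in place, so pointing back at them is enough.
                revert_references()
            return {"ok": False, "failed": name, "completed": completed}
        completed.append(name)
    return {"ok": True, "failed": None, "completed": completed}
```

A run would pass the stages in the order given in the text, for example build, write_storage, verify, warm_shields, update_references, purge.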

sequenceDiagram
    participant Dev as Publisher
    participant Store as Storage
    participant Shield as Origin Shield
    participant CDN1 as CDN A
    participant CDN2 as CDN B
    Dev->>Store: Upload versioned asset
    Dev->>Store: Verify checksum and metadata
    Dev->>Shield: Warm critical assets (optional)
    Dev->>Dev: Flip references to new URLs
    Dev->>CDN1: Soft purge by tag or URL
    Dev->>CDN2: Soft purge by tag or URL
    CDN1->>Store: Revalidate on next request
    CDN2->>Store: Revalidate on next request

Safety controls

Large purges can overload origins if many edges fetch at once. Prefer soft purge with revalidation, apply rate limits per path or per region, and warm only the objects that cannot tolerate a cold fetch. Maintain a manual block list to prevent accidental purges of critical prefixes. Require change tickets or review for purge scopes above a defined threshold. Log who issued each purge and the stated reason.
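
A pre-flight check can enforce the block list and the review threshold before any purge reaches a provider. The prefixes, the threshold, and the function name check_purge below are illustrative assumptions. A purge is treated as touching a protected prefix when either path contains the other, so a purge of / is blocked as well.

```python
# Hypothetical block list and review threshold.
PROTECTED_PREFIXES = ("/checkout/", "/api/payments/")
REVIEW_THRESHOLD = 10_000


def check_purge(prefix: str, estimated_objects: int, reviewed: bool = False):
    """Return (allowed, reason) for a prefix purge request."""
    for protected in PROTECTED_PREFIXES:
        # Blocked if the purge sits inside a protected prefix or would cover it.
        if prefix.startswith(protected) or protected.startswith(prefix):
            return False, f"overlaps protected prefix {protected}"
    if estimated_objects > REVIEW_THRESHOLD and not reviewed:
        return False, "scope above threshold, review required"
    return True, "ok"
```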

Testing and drills

Test cache behavior the same way failover is tested. Create a synthetic site that exercises redirects, variants, validators, and error responses. Run it through both CDNs and confirm that results match. Drill purge scenarios such as retiring a CSS file referenced by many pages. Record timings and verify that the slowest leg remains within the target window.
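
Confirming that results match can be automated by reducing each response to a fingerprint and diffing the fingerprints from both CDNs. This sketch leaves out the HTTP client and works on already captured responses; the compared header list and the function names are assumptions.

```python
import hashlib

# Headers assumed to matter for cache behavior in this sketch.
COMPARED_HEADERS = ("cache-control", "etag", "content-type", "location", "vary")


def fingerprint(status: int, headers: dict, body: bytes) -> dict:
    """Reduce a response to the fields that should be identical across CDNs."""
    lowered = {k.lower(): v for k, v in headers.items()}
    fp = {"status": status, "body_sha256": hashlib.sha256(body).hexdigest()}
    fp.update({h: lowered.get(h) for h in COMPARED_HEADERS})
    return fp


def diff(a: dict, b: dict) -> dict:
    """Return the fields that differ as {field: (value_a, value_b)}."""
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}
```

An empty diff for every synthetic URL is the pass condition; any non-empty diff names the field to investigate.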

Handling user facing pages

HTML is sensitive to inconsistency because it references many assets. Publish HTML with short TTLs and strong validators. Reference immutable asset URLs inside HTML. When a new release lands, update HTML first only if old assets remain available. If old assets are removed, update HTML and assets together and rely on purge or revalidation to make the new set visible. Avoid query string cache busting in HTML for long lived content because proxies and intermediate caches may handle it inconsistently.

Range requests and partial content

For large files, caches should serve ranges without refetching the entire object. Ensure that purges clear partial content consistently. Verify that revalidation works with ranges and that 206 responses include validators. Check how each CDN stores ranged segments and whether a purge removes all segments.

Configuration management

Keep cache rules in version control. Use templates or a central policy so that providers do not drift. Review changes with both vendors present when rules affect identity, TTLs, or purge behavior. Align default behaviors such as whether to cache 301 and 302, how to treat Set-Cookie, and whether to strip or forward headers that may affect cache keys.

Operations

During incidents, prefer turning off a route or pinning traffic to a known good CDN over wide purges. If purge is required, start from the smallest scope and observe effects before scaling. Confirm that monitoring thresholds do not trigger on expected revalidation spikes. After the event, document the sequence and update the playbook if steps were unclear.

flowchart TD
    Start[Change ready] --> Check[Assets present and checks pass]
    Check -->|no| Stop[Hold release]
    Check -->|yes| Refs[Update references]
    Refs --> Purge[Issue soft purges]
    Purge --> Verify[Observe revalidation and hit rates]
    Verify -->|good| Done[Complete]
    Verify -->|bad| Rollback[Revert references]

For steering behavior that interacts with cache identity see /multicdn/traffic-steering/. For origin design and shielding see /multicdn/origin-architecture/. For measurement that validates cache health see /multicdn/signals-telemetry/.

Further reading

RFC 9110 defines HTTP semantics and validators. RFC 9111 defines HTTP caching. The Cache-Control extensions for stale-while-revalidate and stale-if-error are documented in RFC 5861. Consult each CDN vendor's reference documentation for purge latency and limits.