Add a client-side response cache honoring SEP-2549 caching hints by maxisbey · Pull Request #3023 · modelcontextprotocol/python-sdk

maxisbey · 2026-06-29T17:33:07Z

Implements the client half of SEP-2549: Client now honors the ttlMs/cacheScope hints servers stamp on the six cacheable results, serving fresh repeat calls from a response cache instead of refetching. Builds on #3015 (the server-side authoring surface); part of #2899.

Surface

Client(cache=...): None (default) = on with a per-client in-memory store; a CacheConfig to customize (pluggable async ResponseCacheStore, partition, server-identity override, default TTL, injectable clock); False = off entirely.
Per-call cache_mode: "use" | "refresh" | "bypass" on list_tools, list_prompts, list_resources, list_resource_templates, read_resource.
mcp.client exports the config and store-contract types; CacheableMethod/CACHEABLE_METHODS moved to mcp_types.methods (derived from the method registry, re-exported from mcp.server.caching unchanged).

Semantics worth reviewing

Hints are honored only on 2026-07-28 sessions. Legacy peers can put ttlMs/cacheScope keys on the wire (the 2025 surfaces validate-and-discard unknown keys), so wire presence is not an era signal — the cache gates on the negotiated version and treats legacy results as hint-absent.
Sharing is partitioned. Keys carry a sha256 of a client-derived server identity (the URL you dialed, userinfo-stripped, or CacheConfig.target_id; never serverInfo) plus the caller-supplied partition. "public" entries are partition-scoped by default; CacheConfig.share_public (constructor-only) opts into cross-partition serving. A custom store requires an explicit partition; a custom store with no derivable server identity is a construction-time error.
Multi-round-trip reads are never cached (spec MUST): seeded read_resource calls skip the cache entirely, input_required results are never stored, and terminal results reached through driver rounds are not cached.
Cursor continuation pages never read or write the cache; an expired-cursor INVALID_PARAMS rejection evicts the method's entry (spec SHOULD).
Notifications evict: *_list_changed evicts the matching list(s), resources/updated evicts that URI (exact-URI, both arms).
Calls carrying meta always reach the server — a progress token or tracing key is never swallowed by a cache hit.
A tools/list cache hit re-applies the session's post-fetch pipeline (x-mcp-header MUST filter, Mcp-Param-* header maps, output schemas), so served and fetched listings behave identically on later tools/calls.
Negative inbound ttlMs is now treated as 0 at the parse seams (spec SHOULD) instead of failing the call — this also fixes mode="auto" silently downgrading to the legacy handshake when a discover result carried a negative TTL.
Hardening: 24h TTL cap, deep copies on write and serve, delete-opposite-arm-first writes with a post-set eviction re-check (cancellation-shielded), store failures degrade to misses and never fail a successful fetch.

Server-side fixes that ride along

Configured cache_hints now apply to mapping-shaped handler returns per-field (previously a dict return on a hinted method failed the request at 2026, since the surface requires the fields and the stamp only ran on model instances).
validate_cache_hints reports non-string unknown keys with the intended ValueError instead of a TypeError from sorted/join.

Docs & tests

docs/advanced/caching.md client section rewritten around the real cache (was "no built-in response cache"), tutorial003 is now a runnable cache demo with an injected clock, migration note added. ~130 new tests (unit, e2e against Server(cache_hints=...), scripted legacy/modern peers, fault-injecting stores), 100% branch coverage maintained.

_{AI Disclaimer}

…S from the method registry

…n-memory store

…solution, eviction

…notification eviction

… TTL cap

…fresh purge

…wn-key error formatting

cubic-dev-ai

4 issues found across 16 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs_src/caching/tutorial003.py">

<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>

_{Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Fix all with cubic | Re-trigger cubic}

cubic-dev-ai · 2026-06-29T17:50:40Z

-    async with Client(mcp) as client:
-        tools = await client.list_tools()
-        print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}")
+    global now


P2: tutorial003.main() mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At docs_src/caching/tutorial003.py, line 27: <comment>`tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</comment> <file context> @@ -1,15 +1,34 @@ - async with Client(mcp) as client: - tools = await client.list_tools() - print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}") + global now + async with Client(server, cache=CacheConfig(clock=lambda: now)) as client: + await client.list_tools() # fetch 1 </file context>

Suggested change

global now

global now, fetches

now = 1_000_000.0

fetches = 0

…e, remove section header

cubic-dev-ai

1 issue found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs_src/caching/tutorial003.py">

<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>

_{Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic}

claude

The store set-failure issue I flagged earlier is now fixed at HEAD (the failed-set path purges the warm own-arm entry, with a covering test), and I found no new bugs in this revision — but this PR adds a whole client-side caching subsystem with multi-tenant sharing semantics, so it still needs a human reviewer.

Extended reasoning...

Overview

This PR implements the client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, and a ClientResponseCache coordinator), wiring into Client (per-call cache_mode, server-identity hashing, notification-driven eviction), an inbound ttlMs clamp in ClientSession, the move of CacheableMethod/CACHEABLE_METHODS into mcp_types.methods, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.

Status of prior findings

My earlier inline comment flagged that a failed store set during a refresh left the superseded warm entry serving. At the current head, ClientResponseCache.write checks the boolean from _set and purges the own arm on failure (src/mcp/client/caching.py:286-289), and tests/client/test_caching.py adds test_a_failed_set_purges_the_pre_existing_own_arm_entry covering the warm-entry case. The cubic doc-wording comments also appear addressed. This run's bug hunt surfaced no new issues.

Security risks

The cache introduces real cross-tenant sharing surface: partition/arm key construction, the share_public opt-in, server-identity derivation from URLs (userinfo stripping, sha256 before key material), and era scoping. The defaults look conservative (private-by-default, partition-scoped public entries, custom store requires explicit partition), but these are exactly the kinds of decisions a maintainer should sign off on, since a subtle keying mistake leaks one principal's responses to another.

Level of scrutiny

High. This touches the core Client request path for five verbs, changes default behavior (caching on by default for hint-bearing 2026 servers), and adds public API surface (CacheConfig, ResponseCacheStore, cache_mode) that will be hard to change later. Cancellation-shielded store cleanup and eviction-race generation tracking are also concurrency-sensitive.

Other factors

Test coverage is unusually thorough (store contract tests, fault-injecting stores, scripted legacy/modern peers, e2e eviction tests), which raises confidence, but the size, new public API, and multi-tenant semantics put this well outside the bar for shadow approval.

claude

Thanks for addressing the failed-set purge — the fix in ClientResponseCache.write (with the warm-entry regression test) resolves my earlier inline comment, and I found no new issues in this revision. That said, this is a large change to the client's core verbs with security-relevant cache partitioning semantics, so it deserves a maintainer's review rather than an automated approval.

Extended reasoning...

Overview

This PR adds the full client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, ClientResponseCache coordinator), wiring through Client (cache= kwarg, per-call cache_mode, server-identity derivation, notification-driven eviction), a ttlMs clamp at the session parse seam, tool-listing absorption/pruning changes in ClientSession, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.

Prior feedback status

My earlier inline comment (ignored _set return leaving a superseded warm entry servable after a failed store set) is fixed at HEAD: write() now purges the own arm when _set returns False, mirroring the failed-opposite-arm-delete path, and test_a_failed_set_purges_the_pre_existing_own_arm_entry covers the warm-entry case. The cubic findings on userinfo stripping and the docs qualifications were also addressed in subsequent commits. No new issues were identified in this pass.

Security risks

The cache introduces genuinely security-relevant surface: partition isolation of private entries in shared stores, the share_public cross-tenant opt-in, server-identity hashing (credentials stripped from URLs before keying), and era-scoping of entries. The implementation appears careful (partition-scoped public arm by default, sha256 of identity, constructor-only sharing widening, scope verification on read), and these properties are pinned by tests — but the consequences of a mistake here are cross-tenant data leakage, which is exactly the kind of judgment call a human maintainer should sign off on.

Level of scrutiny

High. This touches every cacheable Client verb's hot path, changes default behavior (cache on by default, even though hint-less servers see identical traffic), and adds a public API surface (CacheConfig, ResponseCacheStore, cache_mode) the project will need to maintain. There are also design decisions worth a maintainer's eye: deviating from the TypeScript SDK on public-entry partitioning, the meta-implies-refresh rule, and the no-coalescing/no-stale-if-error rulings.

Other factors

Test coverage is unusually thorough (store contract, fault-injecting stores, cancellation/shielding paths, e2e against scripted legacy/modern peers), which raises confidence in correctness, but the sheer size and the API/design decisions keep this firmly outside the scope of automated approval.

maxisbey added 12 commits June 29, 2026 17:22

Move CacheableMethod to mcp_types.methods and derive CACHEABLE_METHOD…

1797d9a

…S from the method registry

Add client response cache primitives: config, keys, store protocol, i…

d2d4a25

…n-memory store

Add client response cache coordinator: scope arms, era gating, TTL re…

5f8a392

…solution, eviction

Wire the response cache into Client: configuration, server identity, …

ba95005

…notification eviction

Serve cacheable client verbs through the response cache

0a19550

Treat negative inbound ttlMs as zero at the client parse seams

8fae9cf

Document the client response cache

98941ac

Add end-to-end hardening tests for the client response cache

c689c20

Cover float negative ttlMs on the discover seam in the auto-mode test

e3bb712

Document eviction timing, refetch policy, shared-store races, and the…

b81b7dd

… TTL cap

Tighten response cache: session guard, identity and meta handling, re…

efac31f

…fresh purge

Apply configured cache hints to mapping handler results and fix unkno…

d1cebd1

…wn-key error formatting

maxisbey marked this pull request as ready for review June 29, 2026 17:39

cubic-dev-ai Bot reviewed Jun 29, 2026

View reviewed changes

maxisbey added 3 commits June 29, 2026 17:52

Trim comments and docstrings

21a779a

Address review feedback: drop tutorial globals, plain-ASCII docs pros…

804043b

…e, remove section header

Keep the tutorial handler a plain function with a separate state holder

fb1c510

claude Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/mcp/client/caching.py Outdated

maxisbey added 2 commits June 29, 2026 18:59

Era-scope cache arms and harden store interaction paths

ef7fdff

Strip userinfo textually and address review notes on the docs

af36ead

cubic-dev-ai Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/mcp/client/client.py Outdated

claude Bot reviewed Jun 29, 2026

View reviewed changes

Address review feedback on identity stripping and stale tool-map pruning

b44a891

claude Bot reviewed Jun 29, 2026

View reviewed changes

felixweinberger approved these changes Jun 30, 2026

View reviewed changes

maxisbey merged commit b15b1d5 into main Jun 30, 2026
35 checks passed

maxisbey deleted the client-response-cache branch June 30, 2026 10:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add a client-side response cache honoring SEP-2549 caching hints#3023

Add a client-side response cache honoring SEP-2549 caching hints#3023
maxisbey merged 18 commits into
mainfrom
client-response-cache

maxisbey commented Jun 29, 2026

Uh oh!

cubic-dev-ai Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

cubic-dev-ai Bot Jun 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

claude Bot left a comment

Uh oh!

claude Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    global now
+    global now, fetches
+    now = 1_000_000.0
+    fetches = 0

Uh oh!

Conversation

maxisbey commented Jun 29, 2026

Surface

Semantics worth reviewing

Server-side fixes that ride along

Docs & tests

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cubic-dev-ai Bot Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Status of prior findings

Security risks

Level of scrutiny

Other factors

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Prior feedback status

Security risks

Level of scrutiny

Other factors

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cubic-dev-ai Bot left a comment •

edited

Loading

cubic-dev-ai Bot Jun 29, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading