Skip to content

Add a client-side response cache honoring SEP-2549 caching hints#3023

Merged
maxisbey merged 18 commits into
mainfrom
client-response-cache
Jun 30, 2026
Merged

Add a client-side response cache honoring SEP-2549 caching hints#3023
maxisbey merged 18 commits into
mainfrom
client-response-cache

Conversation

@maxisbey

Copy link
Copy Markdown
Contributor

Implements the client half of SEP-2549: Client now honors the ttlMs/cacheScope hints servers stamp on the six cacheable results, serving fresh repeat calls from a response cache instead of refetching. Builds on #3015 (the server-side authoring surface); part of #2899.

Surface

  • Client(cache=...): None (default) = on with a per-client in-memory store; a CacheConfig to customize (pluggable async ResponseCacheStore, partition, server-identity override, default TTL, injectable clock); False = off entirely.
  • Per-call cache_mode: "use" | "refresh" | "bypass" on list_tools, list_prompts, list_resources, list_resource_templates, read_resource.
  • mcp.client exports the config and store-contract types; CacheableMethod/CACHEABLE_METHODS moved to mcp_types.methods (derived from the method registry, re-exported from mcp.server.caching unchanged).

Semantics worth reviewing

  • Hints are honored only on 2026-07-28 sessions. Legacy peers can put ttlMs/cacheScope keys on the wire (the 2025 surfaces validate-and-discard unknown keys), so wire presence is not an era signal — the cache gates on the negotiated version and treats legacy results as hint-absent.
  • Sharing is partitioned. Keys carry a sha256 of a client-derived server identity (the URL you dialed, userinfo-stripped, or CacheConfig.target_id; never serverInfo) plus the caller-supplied partition. "public" entries are partition-scoped by default; CacheConfig.share_public (constructor-only) opts into cross-partition serving. A custom store requires an explicit partition; a custom store with no derivable server identity is a construction-time error.
  • Multi-round-trip reads are never cached (spec MUST): seeded read_resource calls skip the cache entirely, input_required results are never stored, and terminal results reached through driver rounds are not cached.
  • Cursor continuation pages never read or write the cache; an expired-cursor INVALID_PARAMS rejection evicts the method's entry (spec SHOULD).
  • Notifications evict: *_list_changed evicts the matching list(s), resources/updated evicts that URI (exact-URI, both arms).
  • Calls carrying meta always reach the server — a progress token or tracing key is never swallowed by a cache hit.
  • A tools/list cache hit re-applies the session's post-fetch pipeline (x-mcp-header MUST filter, Mcp-Param-* header maps, output schemas), so served and fetched listings behave identically on later tools/calls.
  • Negative inbound ttlMs is now treated as 0 at the parse seams (spec SHOULD) instead of failing the call — this also fixes mode="auto" silently downgrading to the legacy handshake when a discover result carried a negative TTL.
  • Hardening: 24h TTL cap, deep copies on write and serve, delete-opposite-arm-first writes with a post-set eviction re-check (cancellation-shielded), store failures degrade to misses and never fail a successful fetch.

Server-side fixes that ride along

  • Configured cache_hints now apply to mapping-shaped handler returns per-field (previously a dict return on a hinted method failed the request at 2026, since the surface requires the fields and the stamp only ran on model instances).
  • validate_cache_hints reports non-string unknown keys with the intended ValueError instead of a TypeError from sorted/join.

Docs & tests

docs/advanced/caching.md client section rewritten around the real cache (was "no built-in response cache"), tutorial003 is now a runnable cache demo with an injected clock, migration note added. ~130 new tests (unit, e2e against Server(cache_hints=...), scripted legacy/modern peers, fault-injecting stores), 100% branch coverage maintained.

AI Disclaimer

@maxisbey maxisbey marked this pull request as ready for review June 29, 2026 17:39

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 16 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs_src/caching/tutorial003.py">

<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>

Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Fix all with cubic | Re-trigger cubic

Comment thread src/mcp/client/caching.py Outdated
Comment thread docs_src/caching/tutorial003.py Outdated
async with Client(mcp) as client:
tools = await client.list_tools()
print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}")
global now

@cubic-dev-ai cubic-dev-ai Bot Jun 29, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: tutorial003.main() mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs_src/caching/tutorial003.py, line 27:

<comment>`tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</comment>

<file context>
@@ -1,15 +1,34 @@
-    async with Client(mcp) as client:
-        tools = await client.list_tools()
-        print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}")
+    global now
+    async with Client(server, cache=CacheConfig(clock=lambda: now)) as client:
+        await client.list_tools()  # fetch 1
</file context>
Suggested change
global now
global now, fetches
now = 1_000_000.0
fetches = 0
Fix with cubic

Comment thread docs/migration.md Outdated
Comment thread docs/advanced/caching.md Outdated
Comment thread src/mcp/client/caching.py Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs_src/caching/tutorial003.py">

<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic

Comment thread src/mcp/client/client.py Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The store set-failure issue I flagged earlier is now fixed at HEAD (the failed-set path purges the warm own-arm entry, with a covering test), and I found no new bugs in this revision — but this PR adds a whole client-side caching subsystem with multi-tenant sharing semantics, so it still needs a human reviewer.

Extended reasoning...

Overview

This PR implements the client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, and a ClientResponseCache coordinator), wiring into Client (per-call cache_mode, server-identity hashing, notification-driven eviction), an inbound ttlMs clamp in ClientSession, the move of CacheableMethod/CACHEABLE_METHODS into mcp_types.methods, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.

Status of prior findings

My earlier inline comment flagged that a failed store set during a refresh left the superseded warm entry serving. At the current head, ClientResponseCache.write checks the boolean from _set and purges the own arm on failure (src/mcp/client/caching.py:286-289), and tests/client/test_caching.py adds test_a_failed_set_purges_the_pre_existing_own_arm_entry covering the warm-entry case. The cubic doc-wording comments also appear addressed. This run's bug hunt surfaced no new issues.

Security risks

The cache introduces real cross-tenant sharing surface: partition/arm key construction, the share_public opt-in, server-identity derivation from URLs (userinfo stripping, sha256 before key material), and era scoping. The defaults look conservative (private-by-default, partition-scoped public entries, custom store requires explicit partition), but these are exactly the kinds of decisions a maintainer should sign off on, since a subtle keying mistake leaks one principal's responses to another.

Level of scrutiny

High. This touches the core Client request path for five verbs, changes default behavior (caching on by default for hint-bearing 2026 servers), and adds public API surface (CacheConfig, ResponseCacheStore, cache_mode) that will be hard to change later. Cancellation-shielded store cleanup and eviction-race generation tracking are also concurrency-sensitive.

Other factors

Test coverage is unusually thorough (store contract tests, fault-injecting stores, scripted legacy/modern peers, e2e eviction tests), which raises confidence, but the size, new public API, and multi-tenant semantics put this well outside the bar for shadow approval.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the failed-set purge — the fix in ClientResponseCache.write (with the warm-entry regression test) resolves my earlier inline comment, and I found no new issues in this revision. That said, this is a large change to the client's core verbs with security-relevant cache partitioning semantics, so it deserves a maintainer's review rather than an automated approval.

Extended reasoning...

Overview

This PR adds the full client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, ClientResponseCache coordinator), wiring through Client (cache= kwarg, per-call cache_mode, server-identity derivation, notification-driven eviction), a ttlMs clamp at the session parse seam, tool-listing absorption/pruning changes in ClientSession, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.

Prior feedback status

My earlier inline comment (ignored _set return leaving a superseded warm entry servable after a failed store set) is fixed at HEAD: write() now purges the own arm when _set returns False, mirroring the failed-opposite-arm-delete path, and test_a_failed_set_purges_the_pre_existing_own_arm_entry covers the warm-entry case. The cubic findings on userinfo stripping and the docs qualifications were also addressed in subsequent commits. No new issues were identified in this pass.

Security risks

The cache introduces genuinely security-relevant surface: partition isolation of private entries in shared stores, the share_public cross-tenant opt-in, server-identity hashing (credentials stripped from URLs before keying), and era-scoping of entries. The implementation appears careful (partition-scoped public arm by default, sha256 of identity, constructor-only sharing widening, scope verification on read), and these properties are pinned by tests — but the consequences of a mistake here are cross-tenant data leakage, which is exactly the kind of judgment call a human maintainer should sign off on.

Level of scrutiny

High. This touches every cacheable Client verb's hot path, changes default behavior (cache on by default, even though hint-less servers see identical traffic), and adds a public API surface (CacheConfig, ResponseCacheStore, cache_mode) the project will need to maintain. There are also design decisions worth a maintainer's eye: deviating from the TypeScript SDK on public-entry partitioning, the meta-implies-refresh rule, and the no-coalescing/no-stale-if-error rulings.

Other factors

Test coverage is unusually thorough (store contract, fault-injecting stores, cancellation/shielding paths, e2e against scripted legacy/modern peers), which raises confidence in correctness, but the sheer size and the API/design decisions keep this firmly outside the scope of automated approval.

@maxisbey maxisbey merged commit b15b1d5 into main Jun 30, 2026
35 checks passed
@maxisbey maxisbey deleted the client-response-cache branch June 30, 2026 10:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants