feat(agent): run MCP tools as scripts#2771
Conversation
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
packages/agent/src/mcp-scripting/signatures.ts:104-106
`*/` sequence in tool description breaks the JSDoc comment early. Any MCP tool whose description contains `*/` (e.g. `"Computes a*/b"`) will produce output like `/** Computes a*/b */`, which closes the JSDoc at the first `*/` and leaves trailing garbage text. The model then sees malformed TypeScript that could silently misrepresent the available signatures.
```suggestion
function oneLine(text: string): string {
return text.replace(/\s+/g, " ").trim().replace(/\*\//g, "* /");
}
```
### Issue 2 of 2
packages/agent/src/mcp-scripting/runner.ts:96-103
**Double-budget timing: total wall-clock time up to 2× `timeoutMs`**
The `run` IIFE calls `script.runInContext(context, { timeout: timeoutMs })` synchronously. This synchronous phase completes — and blocks — entirely *before* `withTimeout(run, timeoutMs)` starts its wall-clock timer. So a script with a synchronous CPU-bound loop (capped at `timeoutMs`) followed by async tool calls (capped at another `timeoutMs`) can run for nearly 2× the configured budget. At the `MAX_TIMEOUT_MS` of 120 s, that ceiling is 240 s. The `RunScriptOptions` docs and the tool's parameter description both promise a single "wall-clock budget," but the actual guarantee is two independent, sequential budgets.
Reviews (1): Last reviewed commit: "feat(agent): expose run_mcp_script / lis..." | Re-trigger Greptile |
4c0a6f6 to
49f7c85
Compare
|
React Doctor found no issues in the changed files. 🎉 Reviewed by React Doctor for commit |
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
packages/agent/src/mcp-scripting/runner.ts:80-81
**`setTimeout` callbacks outlive `pool.close()` and spawn leaked connections**
The sandbox receives the real Node.js `setTimeout`, so a script can schedule work that fires after `runScript` returns and after the `finally { await pool.close() }` in `tools.ts`. When that callback calls `tools.someServer.someMethod()`, `getClient` finds `this.clients` cleared but `this.configs` still populated, so it opens a brand-new connection — spawning a fresh subprocess for stdio servers or a new HTTP/SSE socket for remote ones — with no tracking and no cleanup. The tool call succeeds silently, appears in neither `logs` nor `result`, and the connection leaks until the process exits.
Reviews (2): Last reviewed commit: "fix(agent): single timeout budget and JS..." | Re-trigger Greptile |
| setTimeout, | ||
| clearTimeout, |
There was a problem hiding this comment.
setTimeout callbacks outlive pool.close() and spawn leaked connections
The sandbox receives the real Node.js setTimeout, so a script can schedule work that fires after runScript returns and after the finally { await pool.close() } in tools.ts. When that callback calls tools.someServer.someMethod(), getClient finds this.clients cleared but this.configs still populated, so it opens a brand-new connection — spawning a fresh subprocess for stdio servers or a new HTTP/SSE socket for remote ones — with no tracking and no cleanup. The tool call succeeds silently, appears in neither logs nor result, and the connection leaks until the process exits.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/mcp-scripting/runner.ts
Line: 80-81
Comment:
**`setTimeout` callbacks outlive `pool.close()` and spawn leaked connections**
The sandbox receives the real Node.js `setTimeout`, so a script can schedule work that fires after `runScript` returns and after the `finally { await pool.close() }` in `tools.ts`. When that callback calls `tools.someServer.someMethod()`, `getClient` finds `this.clients` cleared but `this.configs` still populated, so it opens a brand-new connection — spawning a fresh subprocess for stdio servers or a new HTTP/SSE socket for remote ones — with no tracking and no cleanup. The tool call succeeds silently, appears in neither `logs` nor `result`, and the connection leaks until the process exits.
How can I resolve this? If you propose a fix, please make it concise.Lets the agent write one JS script that calls connected MCP tools as async functions instead of one tool-call at a time. Adds: - McpClientPool: opens MCP clients from the session's McpServerConfig map, inheriting auth (stdio env, http/sse headers) verbatim - buildToolsProxy: lazy tools.<server>.<tool>(args) proxy - runScript: constrained node:vm sandbox with wall-clock timeout, captured console, and no ambient fs/net/process authority - renderToolsetSignatures: JSON Schema to TS-style signatures
Registers the scripting tools in the local-tools registry and threads the session's external MCP server map into LocalToolCtx from both the Claude (claude-agent.ts) and Codex (codex-agent.ts) adapters, so a script dials the same servers with inherited auth. Tools self-disable when no external MCP servers are connected.
- runScript now enforces timeoutMs as one shared wall-clock deadline across the synchronous and async phases (previously up to 2x the budget) - signature rendering neutralizes */ in tool descriptions so a description can't close the generated JSDoc block early
f3a43fa to
5bf7d60
Compare
Problem
The agent calls connected MCP tools one at a time. Anything that fans out (read 100 issues, comment on the stale ones) becomes 100 sequential round-trips, each paying full tool-call overhead and burning context. There was no way to express "loop over these results and act on each" in a single step.
Changes
Adds a capability that lets the agent write one JavaScript script that calls connected MCP tools as async functions:
Two local tools sit alongside the existing signed-git tools:
list_mcp_toolsreturns.d.ts-style signatures for everytools.<server>.<tool>(args), generated from each tool's MCP input schema.run_mcp_scripttakes{ script, timeoutMs? }and returns{ result, logs, error? }.New module
packages/agent/src/mcp-scripting/:client-pool.tsopens and caches MCPClients from the session config map, withlistTools/callTooland ascriptableServerNameshelper.proxy.tsbuilds the lazytools.<server>.<tool>(args)proxy and rejects on toolisError.runner.tsruns the script in a constrainednode:vmsandbox with a wall-clock timeout and captured console.signatures.tsrenders JSON Schema into TS-style signatures.tools.tsholds the two local-tool definitions.Wiring:
local-tools/registry.tscarries the scriptable MCP server map onLocalToolCtx,local-tools/index.tsregisters the tools, and both the Claude and Codex adapters thread the external MCP server map through.Build, not adopt
Evaluated Cloudflare Code Mode,
@utcp/code-mode/code-mode-mcp, andmcpac. Each runs as a separate process or MCP server with its own config and credentials, and several layer in a second abstraction (UTCP). None reuse the in-processMcpServerConfigmap with already-resolved credentials, which is the entire integration here. Building it is a thin layer over the MCP SDKClientwe already depend on, with no new runtime dependencies (only@modelcontextprotocol/sdkandzod, both already present). Full rationale in the module README.Auth and sandbox
No new auth path. The proxy dials the exact
McpServerConfigmap the agent's own MCP tools use, so credentials are inherited verbatim (stdioenv, http/sseheaders). In-processsdkservers are excluded since they have no dialable transport. A script can only reach servers the session was already authorized for; nothing the model writes can set or escalate credentials.The
node:vmsandbox runs with an explicit-allowlist global set (tools, capturedconsole, pure helpers) and deniesrequire,import,process,global/globalThis,Buffer,fetch, and filesystem access.codeGenerationis disabled soeval/new Functionthrow. This is deliberately not a hard boundary against adversarial in-process code, which is documented as a known limitation: the script author is the same agent that already calls these tools, and cloud runs sandbox the whole agent for real isolation.How did you test this?
Automated tests only (authored by an agent):
mcp-scripting.test.ts(20) andclient-pool.integration.test.ts(7), the latter running end-to-end against a real stdio MCP server fixture. They cover proxy generation, script execution, looping/batching, timeout enforcement, error surfacing (script throw and toolisError), blocked sandbox escapes (require/process/global/Buffer/fetch/new Function), signature rendering, tool gating, stdioenvpropagation, and reporting of unreachable servers.packages/agentsuite: 761 passing across 58 files.Automatic notifications
🤖 Agent context
Autonomy: Human-driven (agent-assisted). Michael Matloka directed the design and scope; an agent implemented and tested it. The capability is gated behind explicit local tools and reuses existing session credentials, so it adds no new auth surface.