feat(agent): run MCP tools as scripts by Twixes · Pull Request #2771 · PostHog/code

Twixes · 2026-06-19T07:27:01Z

Problem

The agent calls connected MCP tools one at a time. Anything that fans out (read 100 issues, comment on the stale ones) becomes 100 sequential round-trips, each paying full tool-call overhead and burning context. There was no way to express "loop over these results and act on each" in a single step.

Changes

Adds a capability that lets the agent write one JavaScript script that calls connected MCP tools as async functions:

const issues = await tools.linear.listIssues({ teamId })
const stale = issues.filter((x) => x.stale)
for (const i of stale) {
  await tools.linear.createComment({ issueId: i.id, body: "bump" })
}
// Report how many issues were commented on, not how many were fetched
return { bumped: stale.length }

Two local tools sit alongside the existing signed-git tools:

- list_mcp_tools returns .d.ts-style signatures for every tools.<server>.<tool>(args), generated from each tool's MCP input schema.
- run_mcp_script takes { script, timeoutMs? } and returns { result, logs, error? }.
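
To make the .d.ts-style output concrete, here is a minimal sketch of how an MCP input schema could be rendered into a signature. `JsonSchema`, `tsType`, and `renderSignature` are hypothetical names used for illustration, not the code in signatures.ts, and only a small subset of JSON Schema is handled.

```typescript
// Hypothetical, simplified subset of JSON Schema (assumption, not the PR's types).
type JsonSchema = {
  type?: string;
  properties?: Record<string, JsonSchema>;
  required?: string[];
  items?: JsonSchema;
  enum?: string[];
};

// Map one schema node to a TS-style type string.
function tsType(schema: JsonSchema): string {
  if (schema.enum) return schema.enum.map((v) => JSON.stringify(v)).join(" | ");
  switch (schema.type) {
    case "string":
      return "string";
    case "number":
    case "integer":
      return "number";
    case "boolean":
      return "boolean";
    case "array": {
      const inner = schema.items ? tsType(schema.items) : "unknown";
      // Parenthesize unions so `("a" | "b")[]` keeps its meaning.
      return inner.includes(" | ") ? `(${inner})[]` : `${inner}[]`;
    }
    case "object": {
      const required = new Set(schema.required ?? []);
      const fields = Object.entries(schema.properties ?? {}).map(
        ([key, value]) => `${key}${required.has(key) ? "" : "?"}: ${tsType(value)}`,
      );
      return fields.length > 0 ? `{ ${fields.join("; ")} }` : "{}";
    }
    default:
      return "unknown";
  }
}

// Render one tool as a tools.<server>.<tool>(args) signature line.
function renderSignature(server: string, tool: string, input: JsonSchema): string {
  return `tools.${server}.${tool}(args: ${tsType(input)}): Promise<unknown>`;
}

const exampleSignature = renderSignature("linear", "createComment", {
  type: "object",
  properties: { issueId: { type: "string" }, body: { type: "string" } },
  required: ["issueId"],
});
console.log(exampleSignature);
```

Optional properties are marked with `?` based on the schema's `required` list, which is the part of the schema the model most needs to see before writing a script.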

New module packages/agent/src/mcp-scripting/:

- client-pool.ts opens and caches MCP Clients from the session config map, with listTools / callTool and a scriptableServerNames helper.
- proxy.ts builds the lazy tools.<server>.<tool>(args) proxy and rejects on tool isError.
- runner.ts runs the script in a constrained node:vm sandbox with a wall-clock timeout and captured console.
- signatures.ts renders JSON Schema into TS-style signatures.
- tools.ts holds the two local-tool definitions.
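
As a rough illustration of the lazy proxy idea, the sketch below resolves tools.<server>.<tool> on property access and turns an isError result into a rejection. `buildToolsProxySketch`, `CallTool`, and `ToolResult` are hypothetical names and shapes assumed for this example; the actual proxy.ts may differ.

```typescript
// Hypothetical result shape, loosely modeled on an MCP tool result (assumption).
type ToolResult = { isError?: boolean; content: unknown };
type CallTool = (server: string, tool: string, args: unknown) => Promise<ToolResult>;
type ToolFn = (args?: unknown) => Promise<unknown>;
type ToolsProxy = Record<string, Record<string, ToolFn>>;

function buildToolsProxySketch(servers: string[], callTool: CallTool): ToolsProxy {
  const allowed = new Set(servers);
  return new Proxy({} as ToolsProxy, {
    get(_target, server) {
      // Only servers the session already knows about are reachable.
      if (typeof server !== "string" || !allowed.has(server)) return undefined;
      const serverName = server;
      return new Proxy({} as Record<string, ToolFn>, {
        get(_inner, tool) {
          // "then" stays undefined so awaiting the server object is not treated as a promise.
          if (typeof tool !== "string" || tool === "then") return undefined;
          const toolName = tool;
          // The function is created on access; nothing is dialed until it is called.
          const fn: ToolFn = async (args) => {
            const result = await callTool(serverName, toolName, args);
            if (result.isError) {
              throw new Error(`${serverName}.${toolName} failed: ${JSON.stringify(result.content)}`);
            }
            return result.content;
          };
          return fn;
        },
      });
    },
  });
}

// Fake transport for demonstration only: the "fail" tool reports isError.
const fakeCallTool: CallTool = async (_server, tool, args) =>
  tool === "fail" ? { isError: true, content: "boom" } : { content: args };

const demoTools = buildToolsProxySketch(["linear"], fakeCallTool);
```

Rejecting on isError means a script can use ordinary try / catch around a tool call instead of inspecting every result.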

Wiring: local-tools/registry.ts carries the scriptable MCP server map on LocalToolCtx, local-tools/index.ts registers the tools, and both the Claude and Codex adapters thread the external MCP server map through.

Build, not adopt

Evaluated Cloudflare Code Mode, @utcp/code-mode / code-mode-mcp, and mcpac. Each runs as a separate process or MCP server with its own config and credentials, and several layer in a second abstraction (UTCP). None reuse the in-process McpServerConfig map with already-resolved credentials, which is the entire integration here. Building it is a thin layer over the MCP SDK Client we already depend on, with no new runtime dependencies (only @modelcontextprotocol/sdk and zod, both already present). Full rationale in the module README.

Auth and sandbox

No new auth path. The proxy dials the exact McpServerConfig map the agent's own MCP tools use, so credentials are inherited verbatim (stdio env, http/sse headers). In-process sdk servers are excluded since they have no dialable transport. A script can only reach servers the session was already authorized for; nothing the model writes can set or escalate credentials.

The node:vm sandbox runs with an explicit-allowlist global set (tools, captured console, pure helpers) and denies require, import, process, global / globalThis, Buffer, fetch, and filesystem access. codeGeneration is disabled so eval / new Function throw. This is deliberately not a hard boundary against adversarial in-process code, which is documented as a known limitation: the script author is the same agent that already calls these tools, and cloud runs sandbox the whole agent for real isolation.
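
A minimal sketch of the allowlist idea, using Node's `vm.createContext` with `codeGeneration` disabled and a `timeout` on `runInContext`. `runSandboxed` and the `add` helper are hypothetical and exist only for this example; the sketch covers synchronous code only and, like the real runner, should not be read as a hard security boundary.

```typescript
import * as vm from "node:vm";

// Run `code` with only the given globals visible (hypothetical helper, synchronous code only).
function runSandboxed(code: string, globals: Record<string, unknown>, timeoutMs: number): unknown {
  const context = vm.createContext(
    { ...globals },
    // With strings: false, eval and new Function throw inside the context.
    { codeGeneration: { strings: false, wasm: false } },
  );
  // `timeout` bounds the synchronous execution of the script.
  return new vm.Script(code).runInContext(context, { timeout: timeoutMs });
}

// Host globals such as require and process are not defined unless passed in explicitly.
const probe = runSandboxed(
  "({ require: typeof require, process: typeof process, sum: add(2, 3) })",
  { add: (a: number, b: number) => a + b },
  1000,
);
console.log(probe);
```

Anything the script should be able to use, such as the tools proxy or a captured console, would be passed in through `globals` in the same way as `add` here.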

How did you test this?

Automated tests only (authored by an agent):

- 27 new tests across mcp-scripting.test.ts (20) and client-pool.integration.test.ts (7), the latter running end-to-end against a real stdio MCP server fixture. They cover proxy generation, script execution, looping/batching, timeout enforcement, error surfacing (script throw and tool isError), blocked sandbox escapes (require / process / global / Buffer / fetch / new Function), signature rendering, tool gating, stdio env propagation, and reporting of unreachable servers.
- Full packages/agent suite: 761 passing across 58 files.
- Typecheck and build both clean.

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

🤖 Agent context

Autonomy: Human-driven (agent-assisted). Michael Matloka directed the design and scope; an agent implemented and tested it. The capability is gated behind explicit local tools and reuses existing session credentials, so it adds no new auth surface.

greptile-apps · 2026-06-19T07:32:47Z

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/agent/src/mcp-scripting/signatures.ts:104-106
`*/` sequence in tool description breaks the JSDoc comment early. Any MCP tool whose description contains `*/` (e.g. `"Computes a*/b"`) will produce output like `/** Computes a*/b */`, which closes the JSDoc at the first `*/` and leaves trailing garbage text. The model then sees malformed TypeScript that could silently misrepresent the available signatures.

```suggestion
function oneLine(text: string): string {
  return text.replace(/\s+/g, " ").trim().replace(/\*\//g, "* /");
}
```
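
For illustration, here is the suggested `oneLine` applied to the example description from this issue. The `jsDoc` wrapper is a hypothetical helper assumed for the demonstration; it is not taken from signatures.ts.

```typescript
// Same body as the suggestion above: collapse whitespace, then break up any "*/".
function oneLine(text: string): string {
  return text.replace(/\s+/g, " ").trim().replace(/\*\//g, "* /");
}

// Hypothetical wrapper that places a description inside a JSDoc comment.
function jsDoc(description: string): string {
  return `/** ${oneLine(description)} */`;
}

const rendered = jsDoc("Computes\n  a*/b");
console.log(rendered); // prints "/** Computes a* /b */"
```

After the replacement, the only `*/` left in the output is the one that closes the comment.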

### Issue 2 of 2
packages/agent/src/mcp-scripting/runner.ts:96-103
**Double-budget timing: total wall-clock time up to 2× `timeoutMs`**

The `run` IIFE calls `script.runInContext(context, { timeout: timeoutMs })` synchronously. This synchronous phase completes — and blocks — entirely *before* `withTimeout(run, timeoutMs)` starts its wall-clock timer. So a script with a synchronous CPU-bound loop (capped at `timeoutMs`) followed by async tool calls (capped at another `timeoutMs`) can run for nearly 2× the configured budget. At the `MAX_TIMEOUT_MS` of 120 s, that ceiling is 240 s. The `RunScriptOptions` docs and the tool's parameter description both promise a single "wall-clock budget," but the actual guarantee is two independent, sequential budgets.
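
One way to get a single budget is to fix the deadline before the synchronous phase starts and give the async phase only what is left. The sketch below assumes a hypothetical `runWithSharedDeadline` helper in which `syncPhase` stands in for the `script.runInContext` call; it is not the fix that landed in runner.ts.

```typescript
// Enforce one wall-clock deadline across a synchronous phase and the promise it returns.
async function runWithSharedDeadline<T>(
  timeoutMs: number,
  syncPhase: (budgetMs: number) => Promise<T> | T,
): Promise<T> {
  // The deadline is fixed before any script code runs.
  const deadline = Date.now() + timeoutMs;
  // The synchronous phase may use up to the full budget (e.g. as the vm `timeout` option).
  const pending = syncPhase(timeoutMs);
  // Time already spent synchronously is subtracted from the async phase.
  const remaining = Math.max(0, deadline - Date.now());
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs} ms`)), remaining);
  });
  try {
    return await Promise.race([Promise.resolve(pending), timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

With this shape, a script that burns 60 ms synchronously under a 100 ms budget has roughly 40 ms left for its tool calls, rather than a fresh 100 ms.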

Reviews (1): Last reviewed commit: "feat(agent): expose run_mcp_script / lis..."

github-actions · 2026-06-19T07:38:53Z

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit 5bf7d60.

greptile-apps · 2026-06-20T12:10:54Z

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/agent/src/mcp-scripting/runner.ts:80-81
**`setTimeout` callbacks outlive `pool.close()` and spawn leaked connections**

The sandbox receives the real Node.js `setTimeout`, so a script can schedule work that fires after `runScript` returns and after the `finally { await pool.close() }` in `tools.ts`. When that callback calls `tools.someServer.someMethod()`, `getClient` finds `this.clients` cleared but `this.configs` still populated, so it opens a brand-new connection — spawning a fresh subprocess for stdio servers or a new HTTP/SSE socket for remote ones — with no tracking and no cleanup. The tool call succeeds silently, appears in neither `logs` nor `result`, and the connection leaks until the process exits.
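
One possible mitigation, sketched under assumptions: have the pool remember that it was closed and refuse to dial afterwards, so a late timer callback gets a rejection instead of a fresh connection. `ClosablePoolSketch` and `Connection` are hypothetical and much simpler than the pool in client-pool.ts.

```typescript
// Hypothetical minimal connection type (assumption for this sketch).
type Connection = { close(): Promise<void> };

class ClosablePoolSketch {
  private readonly clients = new Map<string, Promise<Connection>>();
  private readonly connect: (server: string) => Promise<Connection>;
  private closed = false;

  constructor(connect: (server: string) => Promise<Connection>) {
    this.connect = connect;
  }

  getClient(server: string): Promise<Connection> {
    // Once closed, never dial again, even though the configs are still known.
    if (this.closed) {
      return Promise.reject(new Error(`pool is closed; refusing to dial "${server}"`));
    }
    let client = this.clients.get(server);
    if (!client) {
      // Cache the promise so concurrent callers share one connection attempt.
      client = this.connect(server);
      this.clients.set(server, client);
    }
    return client;
  }

  async close(): Promise<void> {
    this.closed = true;
    const open = Array.from(this.clients.values());
    this.clients.clear();
    await Promise.all(
      open.map(async (pending) => {
        try {
          await (await pending).close();
        } catch {
          // A failed connect or close should not block closing the others.
        }
      }),
    );
  }
}

// Demonstration with a fake connector that counts how often it dials.
let dialed = 0;
const demoPool = new ClosablePoolSketch(async () => {
  dialed += 1;
  return { close: async () => {} };
});
```

This only prevents the leaked connection; the late callback still runs, so clearing or tracking sandbox timers when the script finishes would be a complementary step.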

Reviews (2): Last reviewed commit: "fix(agent): single timeout budget and JS..."

greptile-apps · 2026-06-20T12:10:58Z

+      setTimeout,
+      clearTimeout,


setTimeout callbacks outlive pool.close() and spawn leaked connections

The sandbox receives the real Node.js setTimeout, so a script can schedule work that fires after runScript returns and after the finally { await pool.close() } in tools.ts. When that callback calls tools.someServer.someMethod(), getClient finds this.clients cleared but this.configs still populated, so it opens a brand-new connection — spawning a fresh subprocess for stdio servers or a new HTTP/SSE socket for remote ones — with no tracking and no cleanup. The tool call succeeds silently, appears in neither logs nor result, and the connection leaks until the process exits.

Lets the agent write one JS script that calls connected MCP tools as async functions instead of one tool-call at a time. Adds:

- McpClientPool: opens MCP clients from the session's McpServerConfig map, inheriting auth (stdio env, http/sse headers) verbatim
- buildToolsProxy: lazy tools.<server>.<tool>(args) proxy
- runScript: constrained node:vm sandbox with wall-clock timeout, captured console, and no ambient fs/net/process authority
- renderToolsetSignatures: JSON Schema to TS-style signatures

Registers the scripting tools in the local-tools registry and threads the session's external MCP server map into LocalToolCtx from both the Claude (claude-agent.ts) and Codex (codex-agent.ts) adapters, so a script dials the same servers with inherited auth. Tools self-disable when no external MCP servers are connected.

- runScript now enforces timeoutMs as one shared wall-clock deadline across the synchronous and async phases (previously up to 2x the budget)
- signature rendering neutralizes */ in tool descriptions so a description can't close the generated JSDoc block early

stamphog

The review agent couldn't reach its LLM backend — an infrastructure or credentials issue, not a problem with this PR. The Stamphog label has been kept; the review retries automatically on the next push, or re-apply the label once the backend recovers.

Twixes self-assigned this Jun 19, 2026

greptile-apps Bot reviewed Jun 19, 2026

Comment thread packages/agent/src/mcp-scripting/signatures.ts

Comment thread packages/agent/src/mcp-scripting/runner.ts Outdated

Twixes force-pushed the feat/mcp-tools-as-scripts branch from 4c0a6f6 to 49f7c85 Compare June 19, 2026 07:38

Twixes marked this pull request as ready for review June 20, 2026 12:06

greptile-apps Bot reviewed Jun 20, 2026

Twixes added 3 commits June 20, 2026 15:11

Twixes force-pushed the feat/mcp-tools-as-scripts branch from f3a43fa to 5bf7d60 Compare June 20, 2026 13:12

charlesvien added the Stamphog label ("This will request an autostamp by stamphog on small changes") Jun 21, 2026

stamphog Bot reviewed Jun 21, 2026

stamphog Bot removed the Stamphog label ("This will request an autostamp by stamphog on small changes") Jun 21, 2026

Twixes wants to merge 3 commits into main from feat/mcp-tools-as-scripts
