What is MetaMCP?
The idea
Your model should write clean, sequential tool calls. Everything else -- connection management, error recovery, schema discovery, process lifecycle -- should be invisible.
MetaMCP is a runtime that sits between your model and N child MCP servers. It absorbs the complexity downward so the model never encounters it. The model sees 4 tools. The runtime handles the rest.
                    ┌─── playwright (52 tools)
                    │
LLM ──► MetaMCP ────┼─── fetch (3 tools)
        (4 tools)   │
                    ├─── sqlite (6 tools)
                    │
                    └─── ... N more servers

The problem it solves
Every MCP server registers tool schemas with the LLM. Each schema eats context tokens.
| Servers | Tools per server | Total schemas | Estimated tokens |
|---|---|---|---|
| 2 | 15 | 30 | ~4,500 |
| 5 | 20 | 100 | ~15,000 |
| 10 | 25 | 250 | ~37,500 |
At 5 servers you spend ~15,000 tokens on schemas alone. Every request pays this cost, even when the model only needs one tool from one server. Large schema sets also degrade tool selection accuracy -- more descriptions mean more ambiguity.
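The table's estimates follow from a simple back-of-envelope formula. The ~150 tokens-per-schema figure is an assumption used for illustration, not a measured constant:

```typescript
// Back-of-envelope schema cost. TOKENS_PER_SCHEMA is an assumed average;
// real schemas vary with description length and parameter count.
const TOKENS_PER_SCHEMA = 150;

function schemaCost(servers: number, toolsPerServer: number): number {
  return servers * toolsPerServer * TOKENS_PER_SCHEMA;
}

console.log(schemaCost(5, 20)); // 100 schemas ≈ 15,000 tokens, as in the table
```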
What the runtime absorbs
MetaMCP collapses everything into 4 constant-cost tools (~1,000 tokens). Behind those 4 tools, the runtime handles each layer of complexity:
Connection lifecycle. Servers are spawned lazily on first use, pooled with LIFO eviction, and shut down with escalating signals. The model never manages a process.
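A minimal sketch of that lifecycle -- lazy spawn on first use plus stack-based eviction when the pool is full. All names (`Pool`, `ChildServer`, `maxSize`) are illustrative, not MetaMCP's actual API, and the eviction policy shown is one reading of LIFO:

```typescript
// Illustrative lazy pool: servers are spawned on first acquire, and when
// the pool is full the most recently added entry is evicted (LIFO sketch).
interface ChildServer {
  name: string;
  shutdown(): void; // the real runtime escalates signals on shutdown
}

class Pool {
  private stack: ChildServer[] = [];
  constructor(
    private maxSize: number,
    private spawn: (name: string) => ChildServer,
  ) {}

  acquire(name: string): ChildServer {
    const existing = this.stack.find(s => s.name === name);
    if (existing) return existing; // already running: no respawn

    if (this.stack.length >= this.maxSize) {
      this.stack.pop()!.shutdown(); // pool full: evict from the top
    }
    const server = this.spawn(name); // spawned lazily, on first use
    this.stack.push(server);
    return server;
  }
}
```

The point of the sketch is the invariant, not the policy details: the model only ever names a server; spawning, reuse, and eviction happen below it.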
Error classification. Not all errors are equal. Auth failures (401, 403) are logged but never trip the circuit breaker. Transient errors (timeouts, crashes) count toward the breaker threshold and trigger automatic cooldown. The model gets a clean error message; the runtime decides what to do about it.
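The two-way split above can be sketched as a classifier feeding a breaker. The shapes and names here are assumptions for illustration, not MetaMCP's internals:

```typescript
// Sketch: auth errors are logged but never counted; everything else
// (timeouts, crashes) accumulates toward the breaker threshold.
type ErrorClass = "auth" | "transient";

function classify(err: { status?: number; code?: string }): ErrorClass {
  if (err.status === 401 || err.status === 403) return "auth";
  return "transient";
}

class CircuitBreaker {
  private failures = 0;
  constructor(private threshold: number) {}

  record(err: { status?: number; code?: string }): void {
    if (classify(err) === "transient") this.failures++;
  }

  get open(): boolean {
    // open breaker = cooldown: the runtime stops routing calls for a while
    return this.failures >= this.threshold;
  }
}
```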
Schema discovery. Tool catalogs are built on first connect, cached to disk for fast cold starts, and searched with hybrid ranking (semantic + keyword). The model calls mcp_discover and gets ranked results. It never parses raw schema JSON.
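Hybrid ranking can be illustrated as a weighted blend of a semantic similarity score and a keyword-overlap score. The blend weight and scoring functions here are assumptions, not MetaMCP's actual formula:

```typescript
// Illustrative hybrid ranking: alpha blends semantic similarity with
// plain keyword overlap over the tool name and description.
interface ToolEntry { name: string; description: string; }

function keywordScore(query: string, entry: ToolEntry): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const text = `${entry.name} ${entry.description}`.toLowerCase();
  const hits = terms.filter(t => text.includes(t)).length;
  return terms.length ? hits / terms.length : 0;
}

function hybridRank(
  query: string,
  entries: ToolEntry[],
  semantic: (q: string, e: ToolEntry) => number, // e.g. embedding cosine sim
  alpha = 0.5,
): ToolEntry[] {
  const score = (e: ToolEntry) =>
    alpha * semantic(query, e) + (1 - alpha) * keywordScore(query, e);
  return [...entries].sort((a, b) => score(b) - score(a));
}
```

Keyword overlap catches exact tool-name matches that embeddings can miss; the semantic half catches paraphrases that share no terms.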
Sandboxed execution. mcp_execute runs JavaScript in a locked-down V8 context with all provisioned servers in scope. The model composes multi-step workflows -- loops, conditionals, error handling -- in a single call. No network access, no filesystem, no eval.
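For a feel of what runs inside that single call, here is the kind of multi-step script a model might submit, shown against a mocked `servers` object. The in-sandbox API shape (`servers.<name>.<tool>`) is an assumption for illustration:

```typescript
// Mocked stand-in for the servers the sandbox would expose in scope.
const servers = {
  fetch: { get: (url: string) => `<html>${url}</html>` },
  sqlite: { run: (sql: string) => ({ ok: true, sql }) },
};

// A loop, two tool calls per iteration, and error handling -- all inside
// what would be one mcp_execute call instead of many model round-trips.
function workflow(urls: string[]) {
  const results: { ok: boolean; sql: string }[] = [];
  for (const url of urls) {
    try {
      const page = servers.fetch.get(url);                   // step 1: fetch
      results.push(servers.sqlite.run(                        // step 2: store
        `INSERT INTO pages VALUES ('${url}', ${page.length})`));
    } catch {
      // recovery happens in-sandbox; the model decides, per item
    }
  }
  return results;
}
```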
Transport abstraction. Local servers connect over stdio. Remote servers connect over HTTP or SSE with optional OAuth. The model calls mcp_call the same way regardless of transport.
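One natural way to model that split is a discriminated union, so dispatch stays uniform regardless of transport. The field names here are illustrative, not MetaMCP's config schema:

```typescript
// Sketch: three transports behind one type; callers switch on `kind`
// and everything downstream is transport-agnostic.
type Transport =
  | { kind: "stdio"; command: string; args: string[] }
  | { kind: "http"; url: string; oauth?: boolean }
  | { kind: "sse"; url: string; oauth?: boolean };

function describe(t: Transport): string {
  switch (t.kind) {
    case "stdio":
      return `local process: ${t.command}`;
    case "http":
    case "sse":
      return `remote: ${t.url}${t.oauth ? " (OAuth)" : ""}`;
  }
}
```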
The 4 tools
- mcp_discover -- search tool catalogs across all child servers. Without a query, returns server status.
- mcp_provision -- describe what you need. MetaMCP resolves the right server, including auto-installing from npm.
- mcp_call -- forward a tool call to a specific child server. Retries once on crash for vital servers.
- mcp_execute -- run code in a V8 sandbox with access to all provisioned servers.
For a detailed breakdown, see The Four Tools.
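A typical turn chains these tools in order: discover, then provision, then call. The sequence below runs against a stub runtime; the argument and return shapes are assumptions, not MetaMCP's actual schema:

```typescript
// Stub runtime standing in for the 4-tool surface the model sees.
const runtime = {
  mcp_discover: (query: string) =>
    [{ server: "fetch", tool: "get", score: 0.9, query }],
  mcp_provision: (server: string) => ({ server, status: "ready" }),
  mcp_call: (server: string, tool: string, args: object) =>
    ({ ok: true, server, tool, args }),
};

// discover -> provision -> call, the common three-step turn
const hits = runtime.mcp_discover("download a web page");
const prov = runtime.mcp_provision(hits[0].server);
const result = runtime.mcp_call(prov.server, hits[0].tool, {
  url: "https://example.com",
});
```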
When to use it
MetaMCP makes sense when you run 3 or more MCP servers. The token savings scale with every server you add. If you only use one or two, the indirection may not be worth it.
It is a good fit when:
- You want constant token cost regardless of how many servers you add
- You need a single connection point for the LLM
- You want code-mode composition across multiple servers in a single turn
- You want error recovery and connection management handled automatically
Next steps
- Installation to get MetaMCP running.
- Quick Start for a hands-on walkthrough.
- Architecture for connection pooling, circuit breakers, and shutdown behavior.