MCP isn't another function-calling wrapper — it collapses the N×M integration problem into N+M.
You already write tool schemas and wire up APIs. MCP's engineering value isn't "now I can call tools"; it's that it changes the topology of integration: instead of writing glue for every model × every data source (N×M), you write a server once and every client reuses it (N+M). Whether MCP actually works for you comes down to three points most people overlook: the control axis of the three primitives (what belongs in a Tool vs a Resource); the process model of the transport (one obscure stdio bug can silently corrupt your whole JSON-RPC stream); and MCP's biggest production failure mode, the context tax of tool definitions (connect 10 servers and you burn thousands of tokens before the user says a word). This issue assumes you know what MCP is and goes straight to engineering it and avoiding the traps.
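An MCP server can expose three primitives: Tools, Resources, and Prompts. The difference isn't function but control axis: Tools are model-controlled (the model decides when to call them), Resources are application-controlled (the host decides what gets injected and when), and Prompts are user-controlled (a human explicitly triggers them).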
Most people make everything a Tool, turning "read a config file" into a model tool call — wasting a decision and adding noise to tool selection. The right rule: read-only data → Resource, side-effecting action → Tool, user-initiated workflow → Prompt.
This control axis is also a trust axis: model-controlled Tools carry the highest risk (the model may fire side effects when you didn't expect it), so hosts broadly put permission gates on Tools; application-controlled Resources are host-curated — what gets injected and when is under control; user-controlled Prompts are human-initiated, the highest trust. Misplacing an action as a Resource bypasses an approval that should exist; misplacing data as a Tool drops a zero-risk read into the high-risk, approval-gated channel. Primitive choice isn't just UX — it's the permission boundary.
# Classify before you design the server — pin to the top of the file
# Does it change external state? → yes → Tool
# Does it just "let the model see" data? → yes → Resource
# Does the user have to click to fire it? → yes → Prompt
# Unsure: default to Resource (cheapest, no tool-list pollution)
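The checklist above can also be written as a small function, which is handy in a design review or as a lint step over a server's planned surface. This is only a sketch: `Capability` and `classify_primitive` are hypothetical names, not part of any MCP SDK, and the fall-through default to Resource mirrors the last line of the checklist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical description of something a server wants to expose."""
    name: str
    changes_external_state: bool = False
    user_must_trigger: bool = False


def classify_primitive(cap: Capability) -> str:
    """Map a capability to an MCP primitive following the checklist."""
    if cap.changes_external_state:
        # Side effects go through the model-controlled, permission-gated channel.
        return "tool"
    if cap.user_must_trigger:
        # Human-initiated workflows are prompt templates.
        return "prompt"
    # Read-only data, and anything we are unsure about, defaults to a resource.
    return "resource"


if __name__ == "__main__":
    for cap in (
        Capability("send_email", changes_external_state=True),
        Capability("read_config"),
        Capability("trip_brief", user_must_trigger=True),
    ):
        print(f"{cap.name} -> {classify_primitive(cap)}")
```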
The official Python SDK's FastMCP automates JSON-RPC, schema generation, and transport; you write three decorators. The key engineering point carries over from Day 4 Tool Use: parameter descriptions > parameter names, and the docstring is read by the model, not by humans. Type hints auto-convert to JSON Schema and docstrings auto-become tool descriptions — these two directly decide whether the model picks the right tool and passes the right args.
Know the limits of the automation: simple type hints (str / int / list[str]) convert to clean schema, but a type name alone cannot express a complex parameter's constraints (value ranges, formats, mutual exclusions). Those must go into the docstring or be added via pydantic's Field descriptions. In other words, auto-generated schema saves boilerplate, not the cognitive work of spelling out constraints for the model, and that work is exactly the dividing line of server quality. A server that writes each parameter's bounds, units, and examples into its descriptions will generally see a higher call-success rate than one with bare type names.
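To make the contrast concrete, here is a hand-written sketch of two input schemas a model might be shown for the same forecast tool: one carrying only type names, one carrying bounds and descriptions. Both schemas are illustrative and were not captured from FastMCP's actual output; `check_args` is a hypothetical helper that enforces only the few JSON Schema keywords used here.

```python
# What the model sees when only type names are available.
BARE_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "days": {"type": "integer"}},
    "required": ["city"],
}

# The same parameters with constraints and examples spelled out.
DESCRIBED_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name in English, e.g. 'Tokyo'"},
        "days": {
            "type": "integer",
            "minimum": 1,
            "maximum": 7,
            "default": 3,
            "description": "Forecast length in days, 1-7",
        },
    },
    "required": ["city"],
}


def check_args(schema: dict, args: dict) -> list[str]:
    """Return violations of required keys and integer bounds (nothing else)."""
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = schema["properties"].get(key, {})
        if spec.get("type") == "integer" and isinstance(value, int):
            if "minimum" in spec and value < spec["minimum"]:
                problems.append(f"{key} must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                problems.append(f"{key} must be <= {spec['maximum']}")
    return problems
```

With the bare schema, a call such as `{"city": "Tokyo", "days": 30}` looks valid to both the model and the validator; with the described schema the bound is visible before the call and enforceable after it.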
# pip install "mcp[cli]"
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")


def _call_weather_api(city: str, days: int) -> str:
    # Placeholder stub (not in the original): swap in a real weather API call.
    return f"{days}-day forecast for {city}: (stub)"


@mcp.tool()
def get_forecast(city: str, days: int = 3) -> str:
    """Get the weather forecast for a city.

    Args:
        city: City name in English, e.g. 'Tokyo'
        days: Number of days, 1-7, default 3
    """
    return _call_weather_api(city, days)


@mcp.resource("config://units")
def units() -> str:
    """Current temperature unit preference (read-only, host-injected context)"""
    return "celsius"


@mcp.prompt()
def trip_brief(city: str) -> str:
    """Generate a travel weather brief (user-triggered template)"""
    return f"In one sentence, is {city} good for outdoor activity?"


if __name__ == "__main__":
    # stdio transport: the host launches this file as a subprocess
    mcp.run(transport="stdio")
MCP uses JSON-RPC 2.0 over two standard transports:
stdio: the client launches the server as a local subprocess and exchanges newline-delimited JSON-RPC messages over its stdin and stdout. Simple and local, with no network exposure.
Streamable HTTP: a single /mcp endpoint accepting both POST and GET, optionally using SSE to stream multiple server messages. Scalable and multi-tenant, but you handle auth yourself. Note: the old HTTP+SSE two-endpoint transport was deprecated in 2025-03; new projects use Streamable HTTP directly.

Both transports run the same protocol lifecycle: after connecting, an initialize handshake runs, in which client and server exchange protocol versions and negotiate the capabilities each supports (the server declares whether it has tools / resources / prompts, and whether it supports dynamic list-change notifications). This means different clients connecting to the same server may see different capability surfaces; it also means your server must honestly declare its capabilities, or the client won't fetch the corresponding lists. Capability negotiation is the root of MCP's "N+M reuse": the client needn't know in advance what a server looks like, because it asks once at handshake.
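The handshake can be sketched as plain JSON-RPC messages. The message shapes below follow the `initialize` request and result described in the MCP specification; the client and server names, the version strings, and the `lists_to_fetch` helper are illustrative assumptions, not output from a real session.

```python
# Client -> server: first message on any transport.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

# Server -> client: declares which primitives it actually offers.
initialize_result = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-03-26",
        "capabilities": {"tools": {"listChanged": True}, "resources": {}},
        "serverInfo": {"name": "weather", "version": "0.1.0"},
    },
}

# Capability key -> the list method a client would call for it.
LIST_METHODS = {
    "tools": "tools/list",
    "resources": "resources/list",
    "prompts": "prompts/list",
}


def lists_to_fetch(result: dict) -> list[str]:
    """Return the list methods worth calling, given the declared capabilities."""
    declared = result["result"]["capabilities"]
    return [method for cap, method in LIST_METHODS.items() if cap in declared]


if __name__ == "__main__":
    print(lists_to_fetch(initialize_result))
```

Because this hypothetical server never declares `prompts`, the client has no reason to call `prompts/list`, which is why an undeclared capability is effectively invisible.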
// Local stdio: Claude Desktop / Claude Code config
{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["weather_server.py"]
    }
  }
}
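One practical consequence of launching a server this way: its stdout is the JSON-RPC channel, so diagnostics belong on stderr. The sketch below uses only the standard library to show the separation and what a stray write does to a newline-delimited stream. `build_stderr_logger`, `write_message`, and `parse_stream` are hypothetical helpers for illustration, not MCP SDK functions.

```python
import io
import json
import logging
import sys


def build_stderr_logger(name: str = "weather-server", stream=None) -> logging.Logger:
    """Create a logger that writes to stderr (or a given stream), never stdout."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from root handlers
    return logger


def write_message(message: dict, out) -> None:
    """Write one JSON-RPC message as a single line on the protocol stream."""
    out.write(json.dumps(message) + "\n")
    out.flush()


def parse_stream(raw: str) -> list[dict]:
    """Parse a newline-delimited stream the way a strict client would."""
    return [json.loads(line) for line in raw.splitlines() if line]


if __name__ == "__main__":
    protocol, diagnostics = io.StringIO(), io.StringIO()
    log = build_stderr_logger(stream=diagnostics)
    log.info("fetching forecast")  # goes to the diagnostics stream only
    write_message({"jsonrpc": "2.0", "id": 1, "result": {}}, protocol)
    print(parse_stream(protocol.getvalue()))

    protocol.write("debug\n")  # simulates a stray print() on stdout
    try:
        parse_stream(protocol.getvalue())
    except json.JSONDecodeError as exc:
        print("stream corrupted:", exc.msg)
```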
In stdio mode the server's stdout is the protocol channel, so anything else written to it is a bug: a stray print("debug") or a library's progress bar silently corrupts the entire protocol stream, and the client reports "invalid JSON" with no traceable source. Iron rule: stdio server logs always go to stderr (logging defaults to stderr, but don't slip a print in).

MCP's convenience has a hidden cost, and the cost is twofold. First, preloading: clients typically dump the full tool schemas of every connected server into context at the start, a fixed overhead independent of what the user asks. Second, and more insidious, intermediate-result accumulation: in an agentic loop every tool call's return value stays in context for later turns, so one call to a tool with a large return permanently nails thousands of lines of JSON into every subsequent turn's input. The two compound: more tools make the opening expensive, more calls make the process bloat, and by the late stage of a long task the context is packed with schemas and intermediate data that are no longer relevant, burning money and diluting the model's attention (back to Day 2 lost-in-the-middle).

Anthropic's 2025-11 engineering blog post "Code execution with MCP" offers a counterintuitive fix: expose MCP servers as code APIs, let the agent write code to call them, import only the tools it needs on demand, and process intermediate data in the execution environment before returning. They report a workflow that previously consumed about 150k tokens dropping to about 2k tokens (~98.7% reduction). Core insight: the tool catalog shouldn't live in context; it should be progressively disclosed (discovered on demand).
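A rough way to see why on-demand discovery helps is to compare the two loading strategies on a toy catalog. Everything here is an assumption for illustration: the catalog is synthetic, `approx_tokens` uses a crude four-characters-per-token heuristic rather than a real tokenizer, and the resulting numbers are unrelated to the figures reported in the blog post.

```python
import json


def approx_tokens(obj) -> int:
    """Very rough token estimate: serialized length divided by four."""
    return len(json.dumps(obj)) // 4


def make_catalog(n_tools: int) -> dict:
    """Build a synthetic tool catalog keyed by tool name."""
    return {
        f"tool_{i:02d}": {
            "name": f"tool_{i:02d}",
            "description": "Synthetic tool used only to estimate schema size.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Free-text query"}
                },
                "required": ["query"],
            },
        }
        for i in range(n_tools)
    }


def preload_cost(catalog: dict) -> int:
    """Cost of putting every tool definition into context up front."""
    return sum(approx_tokens(schema) for schema in catalog.values())


def on_demand_cost(catalog: dict, needed: list[str]) -> int:
    """Cost of a name-only index plus full definitions for the tools used."""
    index_cost = approx_tokens(sorted(catalog))
    return index_cost + sum(approx_tokens(catalog[name]) for name in needed)


if __name__ == "__main__":
    catalog = make_catalog(30)
    print("preload:  ", preload_cost(catalog))
    print("on demand:", on_demand_cost(catalog, ["tool_03"]))
```

The index is not free, so on-demand loading only wins when a task touches a small fraction of the catalog, which is the usual case when many servers are connected.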
# Three savings you can make today without code execution:
# 1. Connect only the servers this task needs; close when done
# (Claude Code: don't put all MCP servers in global settings)
# 2. Don't pile dozens of tools in one server — split by
# responsibility, or merge similar ones
# 3. Trim/summarize large return values server-side; don't let
# thousands of lines of raw JSON pass through context
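The third saving can be sketched as a small server-side helper that returns a bounded, field-filtered view of a large result instead of the raw payload. `compact_result` and its field names are hypothetical; a real server would pick the fields and row limit that match what the model needs for the next step.

```python
import json


def compact_result(rows: list[dict], keep_fields: list[str], max_rows: int = 5) -> str:
    """Return a small JSON summary: total count plus a trimmed sample of rows."""
    sample = [
        {key: row[key] for key in keep_fields if key in row}
        for row in rows[:max_rows]
    ]
    return json.dumps(
        {
            "total_rows": len(rows),
            "returned_rows": len(sample),
            "truncated": len(rows) > max_rows,
            "rows": sample,
        }
    )


if __name__ == "__main__":
    # Simulated upstream response with a bulky field the model does not need.
    raw_rows = [{"id": i, "name": f"item{i}", "blob": "x" * 500} for i in range(1000)]
    summary = compact_result(raw_rows, keep_fields=["id", "name"])
    print(len(json.dumps(raw_rows)), "chars raw vs", len(summary), "chars returned")
```

Reporting `total_rows` and `truncated` lets the model know more data exists and ask for it explicitly, rather than carrying all of it through every later turn.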