Ship an MCP Server in Python That Exposes Your Internal API to LLMs
Wrap a corporate REST API in three typed tools using FastMCP, inspect them locally, and connect them to Claude Desktop—without ever exposing credentials to the model.
What You'll Build
A Python MCP server using FastMCP that wraps a corporate REST API as three structured tools—search_customers, get_order, and create_support_ticket. Any MCP-compatible client (Claude Desktop, Cursor, custom agents) can call your API with full type safety, without the model ever seeing credentials or constructing raw URLs.
Prerequisites
- Python 3.10+ (the minimum the mcp SDK supports; the code also uses built-in generic types like list[dict])
- pip or uv for package management
- Node.js 18+ (mcp dev invokes npx @modelcontextprotocol/inspector under the hood)
- Latest Claude Desktop (for end-to-end testing; optional if using only the inspector)
- A REST API with a bearer token — a mock URL works fine to follow along
- Comfortable with async/await in Python
1. Set Up the Project
mkdir mcp-internal-api && cd mcp-internal-api
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install "mcp[cli]" httpx python-dotenv
mcp[cli] installs the mcp CLI used for local inspection. httpx handles async HTTP to your backend, and python-dotenv loads the .env file created next.
Create .env for local credentials — add it to .gitignore now:
API_BASE_URL=https://api.corp.example.com
API_KEY=sk-your-real-token-here
2. Write the Server
Create server.py:
import os

import httpx
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

load_dotenv()

mcp = FastMCP("internal-api")

_BASE = os.environ["API_BASE_URL"]
_KEY = os.environ["API_KEY"]


def _auth_headers() -> dict[str, str]:
    return {"Authorization": f"Bearer {_KEY}", "Accept": "application/json"}


@mcp.tool()
async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Search customers by name or email. Returns a list of customer records."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/customers",
            headers=_auth_headers(),
            params={"q": query, "limit": limit},
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def get_order(order_id: str) -> dict:
    """Fetch a single order by its ID."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/orders/{order_id}",
            headers=_auth_headers(),
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def create_support_ticket(
    customer_id: str,
    subject: str,
    body: str,
    priority: str = "normal",
) -> dict:
    """Open a support ticket for a customer.

    Args:
        customer_id: The customer's UUID.
        subject: One-line summary (max 120 chars).
        body: Full description of the issue.
        priority: 'low', 'normal', or 'high'.
    """
    if priority not in {"low", "normal", "high"}:
        raise ValueError(f"priority must be low/normal/high, got '{priority}'")
    async with httpx.AsyncClient() as client:
        r = await client.post(
            f"{_BASE}/tickets",
            headers=_auth_headers(),
            json={
                "customer_id": customer_id,
                "subject": subject,
                "body": body,
                "priority": priority,
            },
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


if __name__ == "__main__":
    mcp.run()
Why each decision matters:
| Detail | Reason |
|---|---|
| Type annotations | FastMCP auto-generates JSON Schema from them — the LLM receives exact parameter types, not free-form text |
| Docstrings | Become the tool description the model reads before calling; write them like an API spec |
| raise_for_status() + ValueError | Exceptions surface to the LLM as structured tool errors rather than crashing the server process |
| Credentials in env vars | Never passed as tool arguments, never echoed in responses, never in source control |
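To make the "type annotations become JSON Schema" row concrete, here is a simplified, standard-library-only sketch of the idea. It is not FastMCP's implementation (the SDK builds its schemas through Pydantic and handles far more types); sketch_input_schema and create_ticket_stub are hypothetical names used only for illustration.

```python
import inspect
from typing import Literal, get_args, get_origin, get_type_hints

# Simplified mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict:
    """Build a rough JSON Schema for func's parameters (illustration only)."""
    hints = get_type_hints(func)
    properties: dict[str, dict] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name, str)
        if get_origin(annotation) is Literal:
            # Literal[...] maps to an enum, so allowed values reach the model.
            prop = {"enum": list(get_args(annotation))}
        else:
            prop = {"type": _JSON_TYPES.get(annotation, "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the argument is required
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Same signature as the tool in server.py; the body is a stub."""
    return []


def create_ticket_stub(
    subject: str, priority: Literal["low", "normal", "high"] = "normal"
) -> dict:
    """Hypothetical variant that encodes the allowed priorities in the type."""
    return {}


schema = sketch_input_schema(search_customers)
ticket_schema = sketch_input_schema(create_ticket_stub)
print(schema)
print(ticket_schema)
```

The second stub suggests one possible refinement: annotating priority as a Literal puts the allowed values into the schema itself, so the model sees them before calling rather than learning them from a ValueError afterwards.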
mcp.run() defaults to stdio transport, which is what Claude Desktop and most local clients expect — the client spawns your server as a subprocess and talks JSON-RPC over stdin/stdout.
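For a feel of what travels over that pipe, the sketch below builds the kind of JSON-RPC 2.0 request a client sends to invoke a tool. The tools/call method and the name/arguments params follow the MCP specification; build_tool_call is a hypothetical helper, and a real client also performs an initialization handshake that is omitted here.

```python
import json


def build_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport separates messages with newlines, so each message
    # must fit on one line. json.dumps without indent emits no raw newlines.
    return json.dumps(message)


line = build_tool_call(1, "search_customers", {"query": "alice", "limit": 5})
print(line)
```

Because stdout carries the protocol stream, stray print() calls inside server.py can corrupt it; send debug output to stderr or a logger instead.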
3. Inspect Locally with mcp dev
Before touching any LLM, validate the wiring in a browser UI:
mcp dev server.py
This starts your server and opens the MCP Inspector (the URL is printed in your terminal). Navigate to Tools — you'll see all three tools with auto-generated input forms matching your Python signatures. Call search_customers with query = "alice" and confirm a JSON response or a typed upstream error.
Tip: Set API_BASE_URL=https://httpbin.org temporarily to exercise the async/auth plumbing without a live internal API. You'll get a 404 back, which correctly surfaces as an httpx.HTTPStatusError tool error.
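If you would rather see successful responses than a 404, a small local mock backend is another option. The sketch below uses only the standard library. The fixture data, response shapes, and status codes are assumptions made for illustration, so adjust them to resemble your real API.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical fixture data; replace with shapes that match your real API.
CUSTOMERS = [
    {"id": "c-1", "name": "Alice Smith", "email": "alice@example.com"},
    {"id": "c-2", "name": "Bob Jones", "email": "bob@example.com"},
]


class MockAPI(BaseHTTPRequestHandler):
    def _send(self, status: int, payload) -> None:
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def _authorized(self) -> bool:
        # Accept any bearer token; only check that one was sent.
        return self.headers.get("Authorization", "").startswith("Bearer ")

    def do_GET(self) -> None:
        if not self._authorized():
            return self._send(401, {"error": "missing bearer token"})
        url = urlparse(self.path)
        if url.path == "/customers":
            params = parse_qs(url.query)
            q = params.get("q", [""])[0].lower()
            limit = int(params.get("limit", ["10"])[0])
            hits = [c for c in CUSTOMERS
                    if q in c["name"].lower() or q in c["email"].lower()]
            return self._send(200, hits[:limit])
        if url.path.startswith("/orders/"):
            order_id = url.path.rsplit("/", 1)[-1]
            return self._send(200, {"id": order_id, "status": "shipped"})
        self._send(404, {"error": "not found"})

    def do_POST(self) -> None:
        if not self._authorized():
            return self._send(401, {"error": "missing bearer token"})
        if self.path != "/tickets":
            return self._send(404, {"error": "not found"})
        length = int(self.headers.get("Content-Length", 0))
        ticket = json.loads(self.rfile.read(length) or b"{}")
        self._send(201, {"id": "t-1", **ticket})

    def log_message(self, format, *args) -> None:
        pass  # keep the terminal quiet


def start_mock_api(port: int = 0) -> ThreadingHTTPServer:
    """Serve the mock API on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), MockAPI)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Quick self-check: start the mock, run one search, then shut it down.
server = start_mock_api()
base_url = f"http://127.0.0.1:{server.server_port}"
request = urllib.request.Request(
    f"{base_url}/customers?q=alice&limit=5",
    headers={"Authorization": "Bearer test-token"},
)
# An empty ProxyHandler bypasses any system proxy for this local request.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request) as response:
    result = json.loads(response.read())
print(result)
server.shutdown()
server.server_close()
```

To keep the mock running while you use mcp dev, replace the self-check with a blocking call such as ThreadingHTTPServer(("127.0.0.1", 8765), MockAPI).serve_forever() (the port is arbitrary) and point API_BASE_URL at that address.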
4. Wire to Claude Desktop
Locate the config file:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
Add your server entry. Use absolute paths: Claude Desktop launches the command directly rather than through your interactive shell, so your virtualenv is never activated and relative paths may not resolve.
{
  "mcpServers": {
    "internal-api": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"],
      "env": {
        "API_BASE_URL": "https://api.corp.example.com",
        "API_KEY": "sk-your-real-token-here"
      }
    }
  }
}
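Typing absolute paths by hand is error-prone, so one option is to generate the entry. The sketch below assumes you run it from the project directory with the virtualenv activated, so that sys.executable points at the venv interpreter. build_server_entry and merge_into_config are hypothetical helpers, and the script only prints the merged JSON rather than writing to the real config file.

```python
import json
import sys
from pathlib import Path


def build_server_entry(server_script: Path, env: dict[str, str]) -> dict:
    """Return an mcpServers entry that uses absolute paths only."""
    return {
        # sys.executable is the interpreter running this script; with the
        # virtualenv activated, that is the venv's python.
        "command": sys.executable,
        "args": [str(server_script.resolve())],
        "env": env,
    }


def merge_into_config(config: dict, name: str, entry: dict) -> dict:
    """Add or replace one server entry without touching the others."""
    merged = dict(config)
    merged["mcpServers"] = {**config.get("mcpServers", {}), name: entry}
    return merged


# Stand-in for the parsed contents of an existing claude_desktop_config.json.
existing = {"mcpServers": {"other-server": {"command": "/usr/bin/true", "args": []}}}

entry = build_server_entry(
    Path("server.py"),
    {
        "API_BASE_URL": "https://api.corp.example.com",
        "API_KEY": "sk-your-real-token-here",
    },
)
print(json.dumps(merge_into_config(existing, "internal-api", entry), indent=2))
```

Merging rather than overwriting matters if you already have other servers configured; paste the printed JSON into the config file after reviewing it.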
Restart Claude Desktop, then in a new conversation:
"Search for customers named 'smith', then open a high-priority support ticket for the first result explaining their order is delayed."
Claude will call search_customers, inspect the output, then call create_support_ticket — tool calls appear inline in the UI with their arguments and responses visible.
Verify It Works
Inspector: After mcp dev server.py, the Tools tab lists all three tools with correct schemas and no import errors in the terminal.
Claude Desktop: Open Settings → Developer. Your server appears as internal-api with a green connected indicator. If it shows an error state, restart Claude Desktop after editing the config.
Schema sanity-check — confirm FastMCP generated correct schemas without starting a full client:
python -c "
import os; os.environ['API_BASE_URL']='http://x'; os.environ['API_KEY']='x'
import asyncio
import server

async def main():
    for tool in await server.mcp.list_tools():
        print(tool.name, tool.inputSchema)

asyncio.run(main())
"
You should see each tool name alongside its JSON Schema inputSchema dict.
Troubleshooting
ModuleNotFoundError: No module named 'mcp' — Claude Desktop uses a clean shell; your virtualenv isn't activated. Confirm "command" points to the venv interpreter: /path/to/.venv/bin/python, not the system python.
Tools don't appear in Claude Desktop — Run mcp dev server.py first; import errors or missing env vars appear there immediately. Also check ~/Library/Logs/Claude/ on macOS — Claude Desktop writes a per-server log file named after your server key (internal-api).
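KeyError: 'API_BASE_URL'. Claude Desktop does not pass your interactive shell's environment to the server, so variables exported in your terminal are not visible to it. Set all required keys explicitly in the env block of claude_desktop_config.json. As a fallback, check that .env sits in the same directory as server.py, which is where load_dotenv() starts looking when called from that file.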
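A bare KeyError at import time is easy to misread in a log file. One possible refinement, sketched here with a hypothetical load_settings helper, is to check every required variable up front and report all missing names in a single message, without ever printing their values.

```python
import os
from collections.abc import Mapping

REQUIRED_VARS = ("API_BASE_URL", "API_KEY")


def load_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings or fail with one readable message."""
    # Treat unset and empty variables the same way.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
            + ". Set them in .env or in the env block of the client config."
        )
    return {name: environ[name] for name in REQUIRED_VARS}


# Demonstrate the failure path with a deliberately incomplete environment.
try:
    load_settings({"API_BASE_URL": "https://api.corp.example.com"})
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

In server.py this could replace the two os.environ[...] lookups, for example settings = load_settings() followed by reading settings["API_BASE_URL"] and settings["API_KEY"].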
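httpx.ReadTimeout: Your backend is slow. Raise the timeout (for example timeout=30.0). A tool returns a single result rather than streaming partial ones, so for long-running operations consider accepting FastMCP's Context parameter and calling ctx.report_progress() to keep the client informed while it waits.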
Next Steps
- Resources: Expose read-only context (OpenAPI specs, internal wikis) via @mcp.resource() so the LLM can pull reference material without consuming tool-call budget.
- HTTP/SSE transport: For multi-user or remote deployments, replace mcp.run() with mcp.run(transport="sse") and mount it behind a secured reverse proxy; validate per-request tokens in middleware rather than a static env var.
- Rate limiting: Wrap _auth_headers() with a token-bucket limiter (aiolimiter is async-native) to prevent an agentic loop from flooding your upstream API.
- Richer schemas: Replace dict return types with Pydantic models; FastMCP generates detailed JSON Schema from them, giving the model better guidance on what fields to expect and use.
- Spec & SDK: modelcontextprotocol.io/docs and the Python SDK on GitHub.
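To illustrate the rate-limiting idea from the list above, here is a minimal token-bucket sketch built on asyncio alone. It is a stand-in for a maintained library such as aiolimiter, not a replacement for one; the TokenBucket class and the rate and capacity values are assumptions chosen for the demo.

```python
import asyncio
import time


class TokenBucket:
    """Allow about `rate` acquisitions per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._updated = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        # Holding the lock while waiting serializes callers, so they are
        # served one at a time as tokens become available.
        async with self._lock:
            while True:
                now = time.monotonic()
                # Refill in proportion to the time elapsed since the last check.
                refill = (now - self._updated) * self.rate
                self._tokens = min(self.capacity, self._tokens + refill)
                self._updated = now
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                # Sleep just long enough for one token to accumulate.
                await asyncio.sleep((1 - self._tokens) / self.rate)


async def demo() -> float:
    bucket = TokenBucket(rate=20, capacity=2)
    start = time.monotonic()
    for _ in range(4):
        await bucket.acquire()  # in a tool, call this before each upstream request
    return time.monotonic() - start


elapsed = asyncio.run(demo())
print(f"4 calls took {elapsed:.2f}s")
```

With a capacity of 2 and a rate of 20 per second, the first two calls pass immediately and the next two each wait for a refill. In server.py, a single module-level bucket shared by all three tools would cap the combined request rate.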
Mariana covers the fast-moving world of machine learning and generative AI, with a particular focus on how these technologies are reshaping development workflows. When she isn't stress-testing the latest foundation models, she's usually at a local hackathon.