
Ship an MCP Server in Python That Exposes Your Internal API to LLMs

Wrap a corporate REST API in three typed tools using FastMCP, inspect them locally, and connect them to Claude Desktop—without ever exposing credentials to the model.

Mariana Souza
Senior Editor · Jun 13, 2026 · 8 min read

What You'll Build

A Python MCP server using FastMCP that wraps a corporate REST API as three structured tools—search_customers, get_order, and create_support_ticket. Any MCP-compatible client (Claude Desktop, Cursor, custom agents) can call your API with full type safety, without the model ever seeing credentials or constructing raw URLs.

Prerequisites

  • Python 3.10+ (required by the mcp SDK; built-in generic types like list[dict] alone need only 3.9)
  • pip or uv for package management
  • Node.js 18+ — mcp dev invokes npx @modelcontextprotocol/inspector under the hood
  • Latest Claude Desktop (for end-to-end testing; optional if using only the inspector)
  • A REST API with a bearer token — a mock URL works fine to follow along
  • Comfortable with async/await Python

1. Set Up the Project

mkdir mcp-internal-api && cd mcp-internal-api
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate
pip install "mcp[cli]" httpx python-dotenv

mcp[cli] installs the SDK plus the mcp CLI used for local inspection. httpx handles async HTTP to your backend, and python-dotenv loads .env at startup.

Create .env for local credentials — add it to .gitignore now:

API_BASE_URL=https://api.corp.example.com
API_KEY=sk-your-real-token-here

2. Write the Server

Create server.py:

import os
import httpx
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

load_dotenv()

mcp = FastMCP("internal-api")

_BASE = os.environ["API_BASE_URL"]
_KEY  = os.environ["API_KEY"]


def _auth_headers() -> dict[str, str]:
    return {"Authorization": f"Bearer {_KEY}", "Accept": "application/json"}


@mcp.tool()
async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Search customers by name or email. Returns a list of customer records."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/customers",
            headers=_auth_headers(),
            params={"q": query, "limit": limit},
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def get_order(order_id: str) -> dict:
    """Fetch a single order by its ID."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/orders/{order_id}",
            headers=_auth_headers(),
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def create_support_ticket(
    customer_id: str,
    subject: str,
    body: str,
    priority: str = "normal",
) -> dict:
    """Open a support ticket for a customer.

    Args:
        customer_id: The customer's UUID.
        subject: One-line summary (max 120 chars).
        body: Full description of the issue.
        priority: 'low', 'normal', or 'high'.
    """
    if priority not in {"low", "normal", "high"}:
        raise ValueError(f"priority must be low/normal/high, got '{priority}'")

    async with httpx.AsyncClient() as client:
        r = await client.post(
            f"{_BASE}/tickets",
            headers=_auth_headers(),
            json={
                "customer_id": customer_id,
                "subject": subject,
                "body": body,
                "priority": priority,
            },
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


if __name__ == "__main__":
    mcp.run()

Why each decision matters:

Detail | Reason
Type annotations | FastMCP auto-generates JSON Schema from them, so the LLM receives exact parameter types, not free-form text
Docstrings | Become the tool description the model reads before calling; write them like an API spec
raise_for_status() + ValueError | Exceptions surface to the LLM as structured tool errors rather than crashing the server process
Credentials in env vars | Never passed as tool arguments, never echoed in responses, never in source control
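One gap the listing leaves open: get_order interpolates order_id straight into the URL path, so a model-supplied value such as "../admin" could reach a different endpoint. A minimal sketch of one way to close it, assuming percent-encoding is acceptable to your backend (safe_path_segment is a hypothetical helper, not part of FastMCP or httpx):

```python
from urllib.parse import quote


def safe_path_segment(value: str) -> str:
    """Percent-encode a value so it stays inside a single URL path segment."""
    if not value or value in {".", ".."}:
        raise ValueError(f"invalid identifier: {value!r}")
    # safe="" also encodes "/", so "../admin" cannot climb out of /orders/
    return quote(value, safe="")


# Hypothetical use inside get_order:
#     f"{_BASE}/orders/{safe_path_segment(order_id)}"
print(safe_path_segment("ord_123"))   # ord_123
print(safe_path_segment("../admin"))  # ..%2Fadmin
```

If your order IDs follow a known format, a stricter allow-list (for example a regular expression) would be a reasonable alternative to encoding.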

mcp.run() defaults to stdio transport, which is what Claude Desktop and most local clients expect — the client spawns your server as a subprocess and talks JSON-RPC over stdin/stdout.
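To make the stdio exchange concrete, here is a rough sketch of the kind of message a client writes to the server's stdin when it invokes a tool. The helper name and the request id are illustrative assumptions; a real client also performs an initialize handshake first, which is omitted here.

```python
import json


def tools_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport frames messages as newline-delimited JSON
    return json.dumps(message) + "\n"


line = tools_call_request(1, "search_customers", {"query": "alice", "limit": 5})
print(line, end="")
```

Note that the arguments object carries only the tool's declared parameters; the bearer token never appears in this traffic because the server adds it on the outbound HTTP request.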

3. Inspect Locally with mcp dev

Before touching any LLM, validate the wiring in a browser UI:

mcp dev server.py

This starts your server and opens the MCP Inspector (the URL is printed in your terminal). Navigate to Tools — you'll see all three tools with auto-generated input forms matching your Python signatures. Call search_customers with query = "alice" and confirm a JSON response or a typed upstream error.

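Tip: Set API_BASE_URL=https://httpbin.org temporarily to exercise the async/auth plumbing without a live internal API. Swap API_KEY for a dummy value while you do, since the bearer header is sent to whatever host the base URL names. You'll get a 404 back, which correctly surfaces as an httpx.HTTPStatusError tool error.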
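If you would rather see successful responses than a 404, a local stand-in for the backend is another option. The sketch below uses only the standard library; the sample records, the response fields, and the port are invented for illustration and say nothing about what your real API returns.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# Invented sample data; the real API's fields are unknown
CUSTOMERS = [
    {"id": "c-1", "name": "Alice Smith", "email": "alice@example.com"},
    {"id": "c-2", "name": "Bob Jones", "email": "bob@example.com"},
]


class MockAPI(BaseHTTPRequestHandler):
    def _send(self, status: int, payload) -> None:
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def _authorized(self) -> bool:
        # Accept any bearer token; reject requests that send none
        if self.headers.get("Authorization", "").startswith("Bearer "):
            return True
        self._send(401, {"error": "missing bearer token"})
        return False

    def do_GET(self) -> None:
        if not self._authorized():
            return
        url = urlparse(self.path)
        if url.path == "/customers":
            params = parse_qs(url.query)
            q = params.get("q", [""])[0].lower()
            limit = int(params.get("limit", ["10"])[0])
            hits = [c for c in CUSTOMERS
                    if q in c["name"].lower() or q in c["email"].lower()]
            self._send(200, hits[:limit])
        elif url.path.startswith("/orders/"):
            self._send(200, {"id": url.path.split("/")[-1], "status": "shipped"})
        else:
            self._send(404, {"error": "not found"})

    def do_POST(self) -> None:
        if not self._authorized():
            return
        if self.path != "/tickets":
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        ticket = json.loads(self.rfile.read(length) or b"{}")
        self._send(201, {"id": "t-1", **ticket})

    def log_message(self, *args) -> None:
        pass  # keep the terminal quiet


def make_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind the mock API to localhost; port 0 picks any free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), MockAPI)


# To run it standalone: make_server(8001).serve_forever()
# Then set API_BASE_URL=http://127.0.0.1:8001 in .env
```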

4. Wire to Claude Desktop

Locate the config file:

OS | Path
macOS | ~/Library/Application Support/Claude/claude_desktop_config.json
Windows | %APPDATA%\Claude\claude_desktop_config.json

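Add your server entry. Use absolute paths: Claude Desktop launches the command directly, without your shell profile or an activated virtualenv, so relative paths and a bare python will not resolve the way they do in your terminal. On Windows the venv interpreter is at .venv\Scripts\python.exe rather than .venv/bin/python.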

{
  "mcpServers": {
    "internal-api": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"],
      "env": {
        "API_BASE_URL": "https://api.corp.example.com",
        "API_KEY": "sk-your-real-token-here"
      }
    }
  }
}
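Typing absolute paths by hand is error-prone, so one option is to generate the entry from the project directory. This is a sketch that assumes the .venv and server.py layout from step 1; build_server_entry is a hypothetical helper, and you would still paste its output into the config file yourself.

```python
import json
import sys
from pathlib import Path


def build_server_entry(project_dir: Path, env: dict[str, str]) -> dict:
    """Build one mcpServers entry with absolute interpreter and script paths."""
    # Resolve only the project root: resolving the interpreter itself could
    # follow the venv's symlink back to the system Python
    root = project_dir.resolve()
    # venv layout differs by platform
    python = (
        root / ".venv" / "Scripts" / "python.exe"
        if sys.platform == "win32"
        else root / ".venv" / "bin" / "python"
    )
    return {"command": str(python), "args": [str(root / "server.py")], "env": env}


entry = build_server_entry(
    Path("."),
    {"API_BASE_URL": "https://api.corp.example.com", "API_KEY": "sk-your-real-token-here"},
)
print(json.dumps({"mcpServers": {"internal-api": entry}}, indent=2))
```

Run it from inside mcp-internal-api so that Path(".") resolves to the project root.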

Restart Claude Desktop, then in a new conversation:

"Search for customers named 'smith', then open a high-priority support ticket for the first result explaining their order is delayed."

Claude will call search_customers, inspect the output, then call create_support_ticket — tool calls appear inline in the UI with their arguments and responses visible.

Verify It Works

Inspector: After mcp dev server.py, the Tools tab lists all three tools with correct schemas and no import errors in the terminal.

Claude Desktop: Open Settings → Developer. Your server appears as internal-api with a green connected indicator. If it shows an error state, restart Claude Desktop after editing the config.

Schema sanity-check — confirm FastMCP generated correct schemas without starting a full client:

python -c "
import os; os.environ['API_BASE_URL']='http://x'; os.environ['API_KEY']='x'
import asyncio
import server
async def main():
    for tool in await server.mcp.list_tools():
        print(tool.name, tool.inputSchema)
asyncio.run(main())
"

You should see each tool name alongside its JSON Schema inputSchema dict.
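If you want that check to fail loudly rather than rely on reading the output, the printed schemas can be compared against expectations. The sketch below works on plain dicts; the sample schema is a hand-written stand-in rather than FastMCP's actual output, and missing_required is a hypothetical helper.

```python
def missing_required(schema: dict, expected: set[str]) -> set[str]:
    """Return expected parameter names that the schema does not mark as required."""
    return expected - set(schema.get("required", []))


# Hand-written stand-in for a tool's inputSchema (an assumption for illustration)
sample = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 10},
    },
    "required": ["query"],
}

# query has no default in the Python signature, so it should be required
assert missing_required(sample, {"query"}) == set()
# limit has a default, so it should not be listed as required
print(missing_required(sample, {"query", "limit"}))  # {'limit'}
```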

Troubleshooting

ModuleNotFoundError: No module named 'mcp' — Claude Desktop uses a clean shell; your virtualenv isn't activated. Confirm "command" points to the venv interpreter: /path/to/.venv/bin/python, not the system python.

Tools don't appear in Claude Desktop — Run mcp dev server.py first; import errors or missing env vars appear there immediately. Also check ~/Library/Logs/Claude/ on macOS — Claude Desktop writes a per-server log file named after your server key (internal-api).

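KeyError: 'API_BASE_URL' means neither the env block nor a .env file supplied the variable. Claude Desktop does not inherit variables you exported in your terminal, and load_dotenv() only helps if it can locate a .env file from the server's location. Setting every required key explicitly in the JSON config is the most reliable fix.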
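A bare KeyError reports only the first missing variable. A small startup check can name all of them at once; this is a sketch under the assumption that the two variables from the .env file are the only required ones, and require_env is a hypothetical helper.

```python
import os
from collections.abc import Mapping

REQUIRED = ("API_BASE_URL", "API_KEY")


def require_env(names=REQUIRED, environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise one error naming every missing key."""
    # Treat empty strings as missing too
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(missing)
            + " (set them in .env or in the client's env block)"
        )
    return {name: environ[name] for name in names}


# Demonstration with an explicit mapping instead of the real environment
try:
    require_env(environ={"API_BASE_URL": "http://x"})
except RuntimeError as exc:
    print(exc)
```

In server.py you would call it once after load_dotenv() and let the exception propagate, so the message lands in the client's server log.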

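httpx.ReadTimeout means your backend took longer than the 10-second timeout. Raise it, for example to timeout=30.0. A tool returns a single result rather than a stream, so for long-running operations consider accepting FastMCP's Context parameter and calling ctx.report_progress() to keep the client informed while the request completes.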

Next Steps

  • Resources: Expose read-only context (OpenAPI specs, internal wikis) via @mcp.resource() so the LLM can pull reference material without consuming tool-call budget.
  • HTTP/SSE transport: For multi-user or remote deployments, replace mcp.run() with mcp.run(transport="sse") and mount it behind a secured reverse proxy; validate per-request tokens in middleware rather than relying on a static env var. Recent SDK releases also offer transport="streamable-http", which the MCP spec now favors over SSE.
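  • Rate limiting: Put a rate limiter around the outbound httpx calls (limiting _auth_headers() alone would not throttle the requests themselves) to prevent an agentic loop from flooding your upstream API. aiolimiter is async-native and implements a leaky bucket.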
  • Richer schemas: Replace dict return types with Pydantic models — FastMCP generates detailed JSON Schema from them, giving the model better guidance on what fields to expect and use.
  • Spec & SDK: modelcontextprotocol.io/docs and the Python SDK on GitHub.
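To illustrate the rate-limiting idea without adding a dependency, here is a minimal token-bucket sketch built on asyncio. It is an assumption-laden stand-in, not aiolimiter's implementation: the TokenBucket class, the rate, and the capacity are all hypothetical choices.

```python
import asyncio
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._updated = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        # Holding the lock while sleeping makes waiters queue up in order
        async with self._lock:
            while True:
                now = time.monotonic()
                # Refill in proportion to elapsed time, capped at capacity
                self._tokens = min(
                    self.capacity, self._tokens + (now - self._updated) * self.rate
                )
                self._updated = now
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                # Sleep just long enough for one token to accumulate
                await asyncio.sleep((1 - self._tokens) / self.rate)


async def demo() -> float:
    bucket = TokenBucket(rate=20, capacity=2)
    start = time.monotonic()
    for _ in range(4):  # two calls pass at once, two wait for refills
        await bucket.acquire()
    return time.monotonic() - start


elapsed = asyncio.run(demo())
print(f"4 calls took {elapsed:.2f}s")
```

In the server, each tool would call await bucket.acquire() on a shared bucket immediately before its client.get or client.post, so the limit applies to the upstream requests themselves.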