API Design for MCP Server Boundaries

Outcome focus: Turned general API design guidance into a practical standard for HTTP APIs that back MCP servers, with current protocol corrections, checklists, and source links.

APIs are where software makes promises.

That sounds heavier than most API work feels at the beginning. At first, an API is just a route, a handler, a schema, a function call, a tool definition, or a quick integration so another system can get something done.

Then people build on it.

They hard-code fields. They retry requests. They page through records. They depend on error shapes. They write scripts that run every morning. They wire dashboards, workflows, automations, agents, and other products into the boundary you exposed.

At that point the API is no longer just an implementation detail.

It is a contract.

That is even more true when the API powers a Model Context Protocol server. An MCP server is not merely another client of your API. It is a layer that exposes your system to AI applications through tools, resources, and prompts. The API underneath has to be predictable enough for software, and the MCP layer has to be explicit enough for a model-mediated workflow.

This document is written as a Confluence-ready guide. It has two layers:

General API design for HTTP and JSON APIs.
Additional design rules for APIs that back MCP servers.

It incorporates Sean Goedecke's excellent post, Everything I know about good API design, community feedback around that post, the current MCP architecture docs, the MCP server concepts docs, and the current MCP versioning page.

Six rules cover most of it:

Be boring.

Do not break userspace.

Design for retries.

Document the contract.

Make expensive work explicit.

Give agents narrow, typed, permission-aware capabilities.

Validation notes#

Before using this as an internal standard, a few current-state corrections matter.

The old MCP quickstart URL now redirects to the current "Build an MCP server" docs. The guidance about stdio logging is still important: stdio servers must not write logs to stdout because stdout carries JSON-RPC messages. Write logs to stderr or a file.

The current MCP protocol version listed in the official versioning docs is 2025-11-25, not 2025-06-18. MCP uses date-based version identifiers in YYYY-MM-DD format, and clients and servers negotiate a single protocol version during initialization.

The current architecture docs show MCP over two supported transports: stdio for local process communication and Streamable HTTP for remote communication. Streamable HTTP uses HTTP POST for client-to-server messages and can use Server-Sent Events for streaming.

The current docs show tool list change notification examples as notifications/tools/list_changed, not just tools/list_changed.

The official modelcontextprotocol/servers repository is a reference implementation repository. Its README warns that those servers are educational examples and not production-ready solutions. Treat them as examples of SDK usage and protocol behavior, not as your security baseline.

1. Non-negotiables#

Be boring#

The best API is usually the one consumers understand before reading much documentation.

Sean Goedecke's core point is that API consumers are trying to accomplish some other goal. They are not trying to admire your API. Familiar REST-ish resource shapes, predictable names, normal JSON, clear examples, and boring authentication are features.

Avoid clever abstractions that make sense only to the team that built the system.

Do:

Use familiar resource names.
Use stable field names.
Make common operations obvious.
Provide copy-paste examples.
Prefer consistency over novelty.

Avoid:

Hidden modes behind generic endpoints.
Clever naming.
Transport tricks that make debugging harder.
Requiring consumers to understand your internal data model.

Do not break userspace#

Once consumers depend on your API, changes are expensive.

Additive changes are usually safe. Removing fields, renaming fields, changing field types, changing nesting, changing pagination behavior, or changing error shapes can break clients immediately.

Default rule:

Add fields.
Add endpoints.
Add optional parameters.
Do not remove or reshape existing contract elements.

If you must break the contract, create a versioned escape path and run old and new behavior in parallel long enough for real consumers to migrate.
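One way to make the additive-only rule enforceable is a contract check that runs in CI. The sketch below compares two simplified field maps (field name to type name) and reports removals and type changes. The `breaking_changes` helper and the flat field-map format are assumptions made for illustration; a real check would more likely walk an OpenAPI or JSON Schema document.

```python
def breaking_changes(old_fields: dict, new_fields: dict) -> list:
    """Return descriptions of changes that can break existing clients."""
    problems = []
    for name, old_type in old_fields.items():
        if name not in new_fields:
            problems.append(f"removed field: {name}")
        elif new_fields[name] != old_type:
            problems.append(f"type changed: {name} ({old_type} -> {new_fields[name]})")
    # Fields that exist only in new_fields are additive and are not reported.
    return problems


# Hypothetical response contracts used only for this example.
v1 = {"id": "string", "name": "string", "created_at": "string"}
v2_additive = {**v1, "status": "string"}
v2_breaking = {"id": "integer", "name": "string"}

print(breaking_changes(v1, v2_additive))
print(breaking_changes(v1, v2_breaking))
```

The additive version produces no findings, while the second version is flagged for both a type change and a removed field.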

Design for retries#

Distributed systems fail ambiguously.

A timeout does not tell the client whether the server did nothing, did half the work, or completed the work but failed to return the response. Stripe's idempotency guidance is still one of the clearest practical references here: retries are safe only when the server can recognize repeated attempts at the same operation.

For mutating operations, especially create operations, support idempotency keys.

Do:

Accept an Idempotency-Key header or equivalent.
Store the request fingerprint and final result for a defined retention period.
Return the same result for safe retries of the same operation.
Reject reused keys with different parameters.
Use backoff and jitter guidance for client retry behavior.

Avoid:

Making clients guess whether a timed-out mutation succeeded.
Creating duplicate records on retry.
Treating idempotency as a payments-only concern.
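The backoff and jitter guidance can be as small as the following sketch, which computes capped exponential delays with full jitter. The function name, the default base and cap, and the injectable random generator are assumptions chosen for illustration, not values taken from any particular provider's guidance.

```python
import random
from typing import Optional


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> list:
    """Return one sleep duration (seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles on each attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly between zero and the ceiling so that
        # many clients retrying at once do not synchronize.
        delays.append(rng.uniform(0, ceiling))
    return delays


example_delays = backoff_delays(5, base=0.5, cap=4.0, rng=random.Random(1))
print(example_delays)
```

A client would sleep for each delay before the next attempt and reuse the same idempotency key on every attempt of the same operation.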

Keep onboarding easy#

For plain HTTP APIs, long-lived API keys or personal access tokens are often the fastest way to let someone write a first script.

That does not mean API keys are enough for every production use case. It means the first successful request should be easy.

For production:

Make keys revocable.
Make keys rotatable.
Scope keys.
Audit key usage.
Prefer OAuth or mTLS where the risk profile requires it.

For remote MCP servers, current MCP authorization guidance is more specific: authorization is optional, but when HTTP-based MCP authorization is supported, implementations should follow the MCP authorization specification, which is based on OAuth-related standards. Stdio servers should generally get credentials from the environment rather than trying to run the HTTP authorization flow.

2. Functional scope and responsibility#

Every API should have a sentence that explains what it owns.

If the sentence is vague, the boundary is probably vague.

Good:

This API manages customer eligibility decisions for offer targeting.

Bad:

This API handles customer stuff.

Document:

Business domain.
Primary consumers.
Supported operations.
Explicitly out-of-scope operations.
Source of truth for each resource.
Ownership and escalation path.

Single responsibility applies at the operation level too. If one endpoint behaves like five endpoints based on mode, type, or action, the API is probably hiding multiple responsibilities behind one route.

That same rule applies to MCP tools.

A tool named manage_customer is too broad.

Tools such as get_customer_profile, list_customer_orders, create_support_note, and check_offer_eligibility are easier for models and humans to reason about.

3. Data model honesty#

Do not leak internal awkwardness unless there is no better option.

Bad internal models produce bad APIs when the API mirrors them too literally. If your database stores comments as a linked list, consumers should not have to traverse the list one node at a time. If your internal system has historical table names, consumers should not inherit that vocabulary. If your workflow needs background jobs for large exports, expose a clean job resource rather than making clients reverse-engineer the implementation.

The API should express the product model, not the storage accident.

Ask:

What resource does the consumer think they are using?
What fields are stable enough to expose?
Which fields are implementation details?
Which operations are synchronous?
Which operations need jobs, polling, events, or webhooks?

For MCP, ask one more question:

Should this be a tool, a resource, or a prompt?

Use a tool when the model may perform an action.

Use a resource when the application or user needs context data.

Use a prompt when you want to publish a reusable workflow template.

4. Contract: inputs and outputs#

Every request should be schematized.

Every response should be predictable.

For HTTP APIs:

Use OpenAPI or JSON Schema where possible.
Mark fields required or optional.
Document defaults.
Document enum values.
Document date, time, currency, locale, and timezone behavior.
Document nullability.
Document sorting and filtering semantics.
Document pagination behavior.

For MCP tools:

Define inputSchema with JSON Schema.
Keep tool arguments minimal.
Use specific types and descriptions.
Prefer explicit optional parameters over open-ended blobs.
Avoid passing raw natural language where structured fields are available.

Bad tool schema:

{
  "type": "object",
  "properties": {
    "query": { "type": "string" }
  }
}

Better tool schema:

{
  "type": "object",
  "properties": {
    "customer_id": {
      "type": "string",
      "description": "Stable customer identifier."
    },
    "include_inactive": {
      "type": "boolean",
      "default": false,
      "description": "Whether inactive accounts should be included."
    }
  },
  "required": ["customer_id"]
}

The second version gives the model less room to guess.
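To show what the stricter schema buys the server, here is a hand-rolled check for the small subset of JSON Schema used above (`type`, `required`, `default`). It is a sketch, not a JSON Schema implementation; `validate_arguments` is a hypothetical helper, and applying defaults during validation is a design choice of this example rather than something the schema language requires.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "customer_id": {"type": "string"},
        "include_inactive": {"type": "boolean", "default": False},
    },
    "required": ["customer_id"],
}

# Maps the JSON Schema type names used above to Python types.
PYTHON_TYPES = {"string": str, "boolean": bool}


def validate_arguments(schema: dict, arguments: dict) -> tuple:
    """Return (arguments with defaults applied, list of validation errors)."""
    errors = []
    cleaned = {}
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"{name} is required")
    for name, spec in schema["properties"].items():
        if name in arguments:
            value = arguments[name]
            if isinstance(value, PYTHON_TYPES[spec["type"]]):
                cleaned[name] = value
            else:
                errors.append(f"{name} must be of type {spec['type']}")
        elif "default" in spec:
            cleaned[name] = spec["default"]
    return cleaned, errors


print(validate_arguments(SCHEMA, {"customer_id": "cus_123"}))
print(validate_arguments(SCHEMA, {}))
```

With a free-form `query` string, none of these checks are possible; the server can only pass the text along and hope.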

5. Idempotency and mutation design#

Reads should be safe.

Writes should be retryable.

Deletes should have clear semantics.

For create operations, use idempotency keys when duplicate creation would matter. This includes payments, tickets, records, messages, workflow launches, issue creation, file uploads, orders, and any MCP tool that triggers a real-world side effect.

Recommended behavior:

Client sends Idempotency-Key.
Server associates the key with tenant, operation, request fingerprint, and result.
Same key and same request returns the original result.
Same key and different request returns a conflict.
Keys expire after a documented retention period.
High-risk systems use durable transactional storage, not a best-effort cache.
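The recommended behavior can be sketched as a small in-memory store. Everything here is illustrative: the class and method names are hypothetical, the store is a plain dictionary with no locking, and a high-risk system would replace it with durable transactional storage as the last rule says.

```python
import hashlib
import json
import time
from typing import Callable


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


class IdempotencyStore:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # (tenant, operation, key) -> (fingerprint, result, expires_at)
        self._entries = {}

    @staticmethod
    def fingerprint(request: dict) -> str:
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, tenant: str, operation: str, key: str,
                request: dict, handler: Callable[[dict], dict]) -> dict:
        scope = (tenant, operation, key)
        fp = self.fingerprint(request)
        entry = self._entries.get(scope)
        if entry is not None and entry[2] > self._clock():
            if entry[0] != fp:
                raise IdempotencyConflict(f"key {key!r} was used with a different request")
            return entry[1]  # Replay the stored result; do not run the handler again.
        result = handler(request)
        self._entries[scope] = (fp, result, self._clock() + self._ttl)
        return result


created = []


def create_ticket(request: dict) -> dict:
    created.append(request)
    return {"ticket_id": f"tkt_{len(created)}"}


now = [0.0]  # Fake clock so the example is deterministic.
store = IdempotencyStore(ttl_seconds=60, clock=lambda: now[0])
first = store.execute("tenant_a", "create_ticket", "key-1", {"subject": "Refund"}, create_ticket)
retry = store.execute("tenant_a", "create_ticket", "key-1", {"subject": "Refund"}, create_ticket)
print(first, retry, len(created))
```

The retry returns the original ticket and the handler runs only once. Reusing `key-1` with a different body raises `IdempotencyConflict`, which an HTTP layer would typically translate into a conflict response.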

Sean Goedecke notes that a short-lived store such as Redis can be a pragmatic improvement for many low-risk systems, while the Hacker News (HN) discussion of his post correctly points out that payments or high-risk operations need stronger atomicity than a bolt-on cache can provide.

For MCP tools, include idempotency in the tool design when the tool mutates state.

Example:

{
  "name": "create_support_ticket",
  "description": "Create one support ticket for a customer issue.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "customer_id": { "type": "string" },
      "subject": { "type": "string" },
      "body": { "type": "string" },
      "idempotency_key": {
        "type": "string",
        "description": "Unique key used to safely retry ticket creation."
      }
    },
    "required": ["customer_id", "subject", "body", "idempotency_key"]
  }
}

If a model or client retries this tool call, the system should not create duplicate tickets.

6. Operations and usage patterns#

Do not design only for the first happy path.

Document how the API will be used:

Single-record reads.
List reads.
Search.
Filter.
Sort.
Bulk create.
Bulk update.
Export.
Import.
Async jobs.
Webhooks or notifications.
Long-running workflows.

Avoid forcing consumers into N round trips when the use case is naturally bulk.

Examples:

GET /customers/{id} for one record.
POST /customers/batch-get for known IDs.
GET /customers?status=active&cursor=... for filtered lists.
POST /exports/customers for large export jobs.

For MCP:

Use tools for bounded operations.
Use resources for context the application can browse or select.
Avoid tools that return huge unbounded lists.
Prefer a resource template or paginated tool for large collections.
Expose prompts for common workflows that combine multiple tools.

7. Pagination#

Pagination should protect the server without punishing the client.

Cursor-based pagination should be the default for datasets that may grow large. Offset pagination is easier to implement, but queries get slower as offsets grow, and clients can skip or repeat records when rows are inserted or deleted during iteration.

Good list response:

{
  "data": [
    { "id": "cus_123", "name": "Example Customer" }
  ],
  "next_cursor": "opaque_cursor_value",
  "has_more": true
}

Rules:

Prefer opaque cursors.
Include next_cursor or next_page.
Support a sensible default page size.
Let clients request larger page sizes within documented caps.
Do not make page sizes so tiny that clients spend most of their time on round-trip latency.

The HN discussion adds a useful practical correction: pagination is necessary, but tiny hard caps are hostile to programmatic consumers. Big pages with pagination are often a better compromise than no pagination or tiny pages.
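A minimal sketch of these rules follows, using an in-memory list in place of a database. The cap values, function names, and the base64-encoded JSON cursor format are assumptions for illustration; the only requirement from the rules above is that clients treat the cursor as opaque.

```python
import base64
import json
from typing import Optional

DEFAULT_PAGE_SIZE = 100   # Hypothetical default.
MAX_PAGE_SIZE = 1000      # Hypothetical documented cap.


def encode_cursor(last_id: str) -> str:
    raw = json.dumps({"after": last_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> str:
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))["after"]


def list_page(records: list, cursor: Optional[str] = None,
              limit: int = DEFAULT_PAGE_SIZE) -> dict:
    """Return one page ordered by id, starting after the cursor position."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # Clamp to the documented cap.
    ordered = sorted(records, key=lambda r: r["id"])
    if cursor is not None:
        after = decode_cursor(cursor)
        # A database would express this as: WHERE id > :after ORDER BY id.
        ordered = [r for r in ordered if r["id"] > after]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    return {
        "data": page,
        "next_cursor": encode_cursor(page[-1]["id"]) if has_more else None,
        "has_more": has_more,
    }


CUSTOMERS = [{"id": f"cus_{n:03d}"} for n in range(1, 6)]
first_page = list_page(CUSTOMERS, limit=2)
second_page = list_page(CUSTOMERS, cursor=first_page["next_cursor"], limit=2)
third_page = list_page(CUSTOMERS, cursor=second_page["next_cursor"], limit=2)
print([r["id"] for r in second_page["data"]], third_page["has_more"])
```

Because the cursor records a position in the ordering rather than a row offset, inserts and deletes elsewhere in the collection do not shift the next page.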

For MCP tools:

Include cursor arguments when returning lists.
Keep output sizes bounded.
Prefer summaries plus follow-up retrieval for large objects.
Consider resources for browsing large datasets.

8. Performance, throttling, and backpressure#

An API can be called at the speed of code.

That changes the threat model.

Operations that are harmless in the UI can become dangerous through an API. A user can click a button a few times. A script can call the endpoint thousands of times.

Document:

Target p50, p95, and p99 latency by operation class.
Timeouts.
Retry guidance.
Rate limits.
Burst limits.
Tenant quotas.
Bulk limits.
Async job limits.

Use:

429 Too Many Requests for rate limits.
Retry-After when clients should wait.
Exponential backoff and jitter guidance.
Kill switches or temporary tenant-level disables for abusive integrations.
Deadlines or cancellation fields for work that stops mattering after a point.
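As one possible shape for the 429 and Retry-After behavior, the sketch below implements a token bucket that returns a status code and headers for each request. The class name, the rate and burst values, and the injectable clock are assumptions for the example; production limiters are usually shared across processes rather than held in local memory.

```python
import math
import time
from typing import Callable


class TokenBucket:
    def __init__(self, rate_per_second: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_second
        self._capacity = burst
        self._tokens = float(burst)
        self._clock = clock
        self._updated = clock()

    def check(self) -> tuple:
        """Return (HTTP status, headers) for one incoming request."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above the burst capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._updated) * self._rate)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return 200, {}
        # Tell the client how many whole seconds until one token is available.
        wait_seconds = math.ceil((1 - self._tokens) / self._rate)
        return 429, {"Retry-After": str(wait_seconds)}


fake_now = [0.0]  # Fake clock so the example is deterministic.
bucket = TokenBucket(rate_per_second=2, burst=2, clock=lambda: fake_now[0])
results = [bucket.check() for _ in range(3)]
print(results)
```

The first two requests consume the burst allowance and the third is rejected with a wait hint. After one second of the fake clock the bucket has refilled and requests succeed again.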

HN feedback called out deadlines and backpressure as missing pieces in many API design discussions. That is right. An API should avoid overloading its own dependencies with useless work after the caller has already given up.

For MCP:

Set bounded runtimes for tools.
Return progress or async task handles for long work when supported by the stack.
Avoid giving a model a tool that can accidentally launch unbounded work.
Make expensive tools explicit in name and description.
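A bounded runtime can be enforced in the server wrapper rather than inside each tool. The sketch below uses `asyncio.wait_for` from the standard library; the `run_with_deadline` helper, the result shape, and the two example coroutines are hypothetical.

```python
import asyncio


async def run_with_deadline(tool_coro, timeout_seconds: float) -> dict:
    """Await a tool coroutine and cancel it if it exceeds the time budget."""
    try:
        result = await asyncio.wait_for(tool_coro, timeout=timeout_seconds)
        return {"ok": True, "result": result}
    except asyncio.TimeoutError:
        # wait_for cancels the wrapped coroutine before raising, so the
        # work does not keep running after the caller has given up.
        return {"ok": False, "error": "deadline_exceeded"}


async def quick_lookup() -> str:
    await asyncio.sleep(0)
    return "active"


async def slow_export() -> str:
    await asyncio.sleep(10)  # Stands in for unbounded backend work.
    return "done"


fast = asyncio.run(run_with_deadline(quick_lookup(), timeout_seconds=1.0))
slow = asyncio.run(run_with_deadline(slow_export(), timeout_seconds=0.05))
print(fast, slow)
```

Cancellation only helps if the downstream calls honor it, so the same deadline should be passed on to database and HTTP clients where they accept one.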

9. Error handling#

Errors are part of the contract.

For HTTP APIs, use Problem Details for HTTP APIs (RFC 9457) when you need structured error bodies. RFC 9457 obsoletes RFC 7807 and defines the application/problem+json format.

Recommended HTTP error shape:

{
  "type": "https://api.example.com/problems/validation-error",
  "title": "Request validation failed.",
  "status": 422,
  "detail": "One or more fields were invalid.",
  "instance": "req_123",
  "errors": [
    {
      "pointer": "/email",
      "message": "Must be a valid email address."
    }
  ]
}

Rules:

Use status codes consistently.
Include stable machine-readable error types.
Include field-level validation errors.
Include a request or trace identifier.
Avoid leaking internal implementation details.
Do not require clients to parse human-readable strings.
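A small helper keeps every error response in the same shape. This is a sketch under assumptions: `problem_response` is a hypothetical function, it returns a plain tuple rather than a framework response object, and the `errors` extension member mirrors the example body above.

```python
import json


def problem_response(status: int, type_uri: str, title: str, detail: str,
                     instance: str, field_errors=None) -> tuple:
    """Return (status, headers, body) for a problem details error response."""
    body = {
        "type": type_uri,    # Stable, machine-readable error type.
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,  # Request or trace identifier for support.
    }
    if field_errors:
        body["errors"] = [{"pointer": p, "message": m} for p, m in field_errors]
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem_response(
    422,
    "https://api.example.com/problems/validation-error",
    "Request validation failed.",
    "One or more fields were invalid.",
    "req_123",
    field_errors=[("/email", "Must be a valid email address.")],
)
print(status, headers["Content-Type"])
```

Clients can then branch on `type` and `status` and show `detail` to humans, without parsing message strings.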

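For MCP, remember that the data layer is JSON-RPC 2.0. JSON-RPC errors use an error object with code, message, and optional data. Reserve them for protocol-level failures such as malformed requests or unknown methods. A tool that runs and then fails reports the failure inside its normal result by setting isError to true, which lets the model see the error and adjust.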

Recommended MCP error shape:

{
  "jsonrpc": "2.0",
  "id": 7,
  "error": {
    "code": -32602,
    "message": "Invalid params",
    "data": {
      "request_id": "req_123",
      "retryable": false,
      "fields": [
        {
          "path": "/customer_id",
          "message": "customer_id is required"
        }
      ]
    }
  }
}

Use JSON-RPC reserved error codes correctly:

-32700 for parse errors.
-32600 for invalid request.
-32601 for method not found.
-32602 for invalid params.
-32603 for internal error.
-32000 to -32099 for server-defined errors.
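The codes above can be kept as named constants with one constructor for error responses. The helper names are hypothetical. The second function shows the other MCP error path, a tool result flagged with `isError`, so that execution failures stay visible to the model instead of surfacing as protocol errors.

```python
PARSE_ERROR = -32700
INVALID_REQUEST = -32600
METHOD_NOT_FOUND = -32601
INVALID_PARAMS = -32602
INTERNAL_ERROR = -32603


def jsonrpc_error(request_id, code: int, message: str, data=None) -> dict:
    """Build a JSON-RPC 2.0 error response for a protocol-level failure."""
    error = {"code": code, "message": message}
    if data is not None:
        error["data"] = data
    return {"jsonrpc": "2.0", "id": request_id, "error": error}


def is_server_defined(code: int) -> bool:
    """True for codes in the range reserved for implementation-defined server errors."""
    return -32099 <= code <= -32000


def tool_error_result(request_id, text: str) -> dict:
    """Build a successful JSON-RPC response whose tool result reports a failure."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {"content": [{"type": "text", "text": text}], "isError": True},
    }


unknown_method = jsonrpc_error(7, METHOD_NOT_FOUND, "Method not found")
upstream_failure = tool_error_result(8, "Ticket system is unavailable. Try again later.")
print(unknown_method)
print(upstream_failure)
```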

10. Observability#

Every API response should be traceable.

Minimum:

request_id in every response.
Structured logs with request_id.
Tenant or account identifier.
Consumer identifier.
Operation name.
Latency.
Status or error type.
Retry count when known.
Rate-limit state.

For MCP servers, also track:

Protocol version negotiated.
Transport type.
Tool list calls.
Tool execution counts.
Tool latency.
Tool errors by code.
Resource reads.
Prompt retrievals.
Notification counts.
Authorization failures.
User approval outcomes if available from the host or application layer.

Do not log secrets, access tokens, API keys, raw PII, or full prompt contents unless your governance model explicitly permits it.
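A minimal sketch of structured logging with redaction, written to stderr so it is also safe for stdio servers. The list of sensitive keys and the field names are assumptions for illustration; a real deployment would derive them from its governance policy.

```python
import json
import logging
import sys

# Hypothetical list of field names that must never reach the logs.
SENSITIVE_KEYS = {"authorization", "api_key", "access_token", "password"}


def redact(fields: dict) -> dict:
    """Replace the values of sensitive fields with a fixed marker."""
    return {k: ("[REDACTED]" if k.lower() in SENSITIVE_KEYS else v) for k, v in fields.items()}


def format_event(operation: str, request_id: str, **fields) -> str:
    """Render one log event as a single JSON line."""
    event = {"operation": operation, "request_id": request_id, **redact(fields)}
    return json.dumps(event, sort_keys=True)


logging.basicConfig(stream=sys.stderr, level=logging.INFO, format="%(message)s")
line = format_event("tools/call", "req_123", tool="lookup_customer",
                    latency_ms=42, api_key="secret-value")
logging.info(line)
```

Redacting by key name is a floor, not a guarantee: secrets embedded in free-text fields still need separate handling.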

11. Versioning and change management#

Versioning is a tool, not a hobby.

Default stance:

Prefer additive changes.
Avoid breaking changes.
Use versioning only when compatibility cannot be preserved.

When versioning is required:

Run old and new behavior in parallel.
Publish migration notes.
Provide deprecation headers or warnings.
Contact known consumers.
Keep a translation layer where possible.
Set a realistic sunset date.

URL versioning such as /v1 is discoverable and pragmatic. Header or media-type versioning can work well when client tooling and ecosystem expectations support it. Pick one strategy and be consistent.

HN feedback adds nuance here: you do not always need to version the entire API when one endpoint changes. Version the thing that is changing when the boundary allows it.

For MCP, separate two kinds of versioning:

MCP protocol version, negotiated during initialization.
Your server and tool contract version, owned by your implementation.

The current MCP protocol version is 2025-11-25. Your server may also expose its own serverInfo.version, tool descriptions, resource schemas, or changelog. Do not confuse those with the protocol version.
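The separation can be made visible in the initialization response. The sketch below assumes a server that supports the two protocol versions mentioned in this guide and a hypothetical server contract version of 1.4.0; the negotiation rule shown (echo the requested version if supported, otherwise offer the newest supported one) is a simplified reading of the lifecycle docs.

```python
# Newest first. Which versions a server supports is an assumption of this example.
SUPPORTED_PROTOCOL_VERSIONS = ["2025-11-25", "2025-06-18"]

# Hypothetical server identity; this version belongs to the implementation,
# not to the MCP protocol.
SERVER_INFO = {"name": "customer_support", "version": "1.4.0"}


def negotiate_protocol_version(requested: str) -> str:
    """Echo the requested version when supported, else offer the newest supported one."""
    if requested in SUPPORTED_PROTOCOL_VERSIONS:
        return requested
    return SUPPORTED_PROTOCOL_VERSIONS[0]


def initialize_result(requested: str) -> dict:
    """Build the result payload for an initialize request."""
    return {
        "protocolVersion": negotiate_protocol_version(requested),
        "capabilities": {"tools": {"listChanged": True}},
        "serverInfo": SERVER_INFO,
    }


print(initialize_result("2025-06-18"))
print(negotiate_protocol_version("1999-01-01"))
```

A client that does not support the version the server offers is expected to disconnect, so the server never has to guess at a common version.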

12. Security and trust#

API security is not just authentication.

Minimum:

Use least privilege scopes.
Make credentials revocable and rotatable.
Never log secrets.
Use TLS.
Validate input.
Enforce authorization at the resource level.
Rate-limit expensive operations.
Keep audit logs for mutating operations.
Treat exports, searches, and bulk endpoints as high risk.

For MCP:

Treat tools as capability grants.
Keep tool names and descriptions precise.
Separate read-only tools from mutating tools.
Require explicit confirmation through the host or client UX for risky actions where appropriate.
Design as if tool descriptions may be shown to users.
Return outputs that are safe to put in a model context.
Avoid leaking hidden instructions, secrets, internal credentials, or unrelated customer data through tool output.

The MCP server concepts docs describe tools as model-controlled and note that applications can provide user oversight such as approval dialogs, permission settings, and activity logs. That means approval is usually enforced by the host or application, not by your server alone. Your server still needs authorization and safe behavior.
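Server-side authorization can be a deny-by-default lookup that runs before any tool body. In this sketch the tool names come from this guide, while the scope strings, the `TOOL_SCOPES` table, and the helper names are hypothetical.

```python
# Hypothetical mapping from tool name to the scope it requires.
# Read-only and mutating tools carry different scopes.
TOOL_SCOPES = {
    "get_customer_profile": "customers:read",
    "list_customer_orders": "customers:read",
    "create_support_note": "support:write",
}


class PermissionDenied(Exception):
    """Raised when the caller's credentials do not cover the tool."""


def authorize_tool_call(tool_name: str, granted_scopes: set) -> None:
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        # Deny by default: an unregistered tool is never callable.
        raise PermissionDenied(f"unknown tool: {tool_name}")
    if required not in granted_scopes:
        raise PermissionDenied(f"{tool_name} requires scope {required}")


def is_allowed(tool_name: str, granted_scopes: set) -> bool:
    try:
        authorize_tool_call(tool_name, granted_scopes)
    except PermissionDenied:
        return False
    return True


read_only_caller = {"customers:read"}
print(is_allowed("get_customer_profile", read_only_caller))
print(is_allowed("create_support_note", read_only_caller))
```

The check runs whether or not the host showed an approval dialog, which is the point: host-side approval and server-side authorization are separate layers.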

13. API to MCP: what changes#

An MCP server wraps capabilities for AI applications. That changes the interface design.

The AI application discovers capabilities dynamically. The model may choose a tool based on your tool name, description, and schema. The user may see the tool call. The host may ask for approval. The server may be local over stdio or remote over Streamable HTTP.

MCP has three server-side building blocks:

Tools: executable functions the model can call.
Resources: passive context sources the application can read.
Prompts: reusable templates the user can invoke.

This means an HTTP endpoint does not always map one-to-one to an MCP tool.

Example:

GET /customers/{id} could become a resource template: customer://{customer_id}.
POST /tickets could become a tool: create_support_ticket.
A multi-step support triage workflow could become a prompt: triage_customer_issue.

Use the MCP primitive that matches the interaction.

14. MCP lifecycle and transport#

MCP uses a JSON-RPC 2.0 data layer.

Initialization includes:

Protocol version negotiation.
Capability negotiation.
Client and server identity exchange.
notifications/initialized when ready.

Tool discovery:

Client calls tools/list.
Server returns tool definitions with names, descriptions, titles, and input schemas.

Tool execution:

Client calls tools/call.
Server returns a content array with typed content.

Notifications:

Server can notify clients of changes when capabilities were declared.
Current docs show tool list changes as notifications/tools/list_changed.

Transports:

Stdio uses stdin and stdout between local processes.
Streamable HTTP uses HTTP POST and optional SSE.

Stdio warning:

Never write logs to stdout. Use stderr or files. stdout belongs to JSON-RPC messages.
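To make the sequence concrete, the sketch below builds the client-side messages for initialization, discovery, and one tool call, then frames them as newline-delimited JSON the way the stdio transport carries them. The client name, request ids, and the `frame` helper are assumptions; `lookup_customer` is a hypothetical tool.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line has no raw newline.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


initialize = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {
        "protocolVersion": "2025-11-25",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}
# Notifications carry no id because no response is expected.
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
call_tool = {
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "lookup_customer", "arguments": {"customer_id": "cus_123"}},
}

session = b"".join(frame(m) for m in (initialize, initialized, list_tools, call_tool))
print(session.decode("utf-8"))
```

Anything else written to stdout, such as a stray log line, would land between these frames and corrupt the stream, which is why the stdio warning matters.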

15. MCP server skeleton#

Minimal Python stdio server:

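from mcp.server.fastmcp import FastMCP

# Name the server; FastMCP derives tool schemas from type hints and docstrings.
mcp = FastMCP("customer_support")


@mcp.tool()
async def lookup_customer(customer_id: str) -> dict:
    """Retrieve a customer summary by stable customer id."""
    # Placeholder response; a real server would call the backing API here.
    return {
        "customer_id": customer_id,
        "status": "active",
    }


if __name__ == "__main__":
    # stdio transport: stdout carries JSON-RPC messages, so nothing else may write to it.
    mcp.run(transport="stdio")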

Logging rule:

import logging
import sys

# Send all log output to stderr so stdout stays reserved for JSON-RPC.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("Server started")

Do not use plain print() for stdio server logs unless you explicitly direct it to stderr, for example print("message", file=sys.stderr).

16. Testing and debugging#

Use the MCP Inspector during development. It can connect to local servers, inspect resources, prompts, and tools, test tool inputs, view results, and monitor notifications.

Test:

Initialization.
Protocol negotiation.
Capability negotiation.
tools/list.
Valid tool calls.
Invalid tool calls.
Missing arguments.
Permission failures.
Long-running operations.
Concurrent operations.
Notification behavior.
Stdio logging safety.
HTTP authentication.
Rate limits.
Retry behavior.

The Inspector docs explicitly call out edge case testing for invalid inputs, missing prompt arguments, concurrent operations, and error responses.
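Stdio logging safety is one of the items above that is easy to automate. The sketch below captures stdout while a handler runs and reports whether anything was written. Both handlers and the `writes_to_stdout` helper are hypothetical; a real test would exercise the server's actual tool functions.

```python
import contextlib
import io
import logging
import sys

logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("customer_support")


def well_behaved_handler() -> dict:
    logger.info("lookup started")  # Goes to stderr.
    return {"status": "active"}


def leaky_handler() -> dict:
    print("lookup started")  # Goes to stdout and would corrupt the stream.
    return {"status": "active"}


def writes_to_stdout(handler) -> bool:
    """Run the handler with stdout captured and report whether it wrote anything."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        handler()
    return captured.getvalue() != ""


print(writes_to_stdout(well_behaved_handler), file=sys.stderr)
print(writes_to_stdout(leaky_handler), file=sys.stderr)
```

A check like this catches stray print() calls in tool code and in imported helpers before they reach a running stdio server.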

17. General API checklist#

18. MCP server checklist#

19. Decisions the team should make up front#

Where do versions live?#

Recommendation:

Use additive changes by default. Use URL versioning for major public HTTP API breaks if your ecosystem values discoverability. Use header or media-type versioning only if client tooling supports it well.

For MCP, support the current protocol version and document your server/tool contract separately.

What gets an idempotency key?#

Recommendation:

Require idempotency keys for create operations and high-risk mutations. Optional idempotency is acceptable for low-risk mutations, but the mechanism should exist.

How large can pages be?#

Recommendation:

Use cursor pagination and allow larger page sizes within caps. Set caps by response size and backend cost, not by arbitrary item counts alone.

What belongs as a tool versus a resource?#

Recommendation:

If it takes action, make it a tool. If it provides context, make it a resource. If it guides a workflow, make it a prompt.

Can the MCP server call private APIs?#

Recommendation:

Yes, but it must enforce the same authorization, audit, and tenancy boundaries as any other client. Do not let the MCP layer become a bypass around the platform API.
