
The Model Context Protocol (MCP): A Deep‑Dive Into Its Core Features in 2025


Written by: Priyansh Khodiyar


TL;DR: MCP is the “USB‑C for AI tools.” It defines a universal JSON‑over‑HTTP interface that lets any language model dynamically discover, describe, and invoke external capabilities without bespoke adapters. In this post we unpack each feature of MCP, share sample configs, and offer writing tips to help you explain MCP clearly and convincingly.

1  Why MCP Exists

Connecting language models to real‑world data and actions is powerful, but today it’s messy. Every model speaks its own dialect; every tool exposes a different API. Multiply that by the number of models and tools and you get the infamous N × M integration nightmare:

N models × M tools ≈ Endless glue code & maintenance

MCP ends that pain by standardising the handshake between models (callers) and tools (callees). Think of it like HTTP for the web or USB‑C for hardware: a single plug, infinite possibilities.

Refer to our MCP docs for more.

2  Core Features of MCP

| # | Feature | What It Delivers | Why It Matters |
| --- | --- | --- | --- |
| 2.1 | Universal JSON‑over‑HTTP Interface | Every tool exposes two endpoints, /schema and /invoke, speaking plain JSON. | Zero SDK lock‑in; works with cURL or any HTTP client. |
| 2.2 | Tool Discovery & Introspection | The /schema endpoint returns an OpenAPI‑like manifest describing tool names, arguments, examples, and auth scopes. | Models can self‑discover capabilities at runtime; no hard‑coding. |
| 2.3 | Strongly‑Typed Schemas | Parameters use JSON Schema Draft 07 with enums, ranges, nullable, and regex patterns. | Reduces hallucinated arguments and improves error surfaces. |
| 2.4 | Context‑Aware Retrieval (RAG Mode) | Optional /search sub‑spec lets tools fetch domain data on demand, returning citations and chunks. | Gives models up‑to‑date context without fine‑tuning. |
| 2.5 | Streaming & Partial Responses | Chunked HTTP or Server‑Sent Events (text/event-stream). | Models (and UIs) can show incremental progress; no 30‑second timeout woes. |
| 2.6 | Batch & Async Invocations | invocationMode: "async" plus a /status/{id} endpoint. | Long‑running jobs (video transcodes, analytic queries) fit in the same protocol. |
| 2.7 | Fine‑Grained Authentication | Supports API Key, OAuth 2.0 Bearer, or custom header fields declared in the schema. | Tool owners keep control; callers negotiate scopes programmatically. |
| 2.8 | Version Negotiation | Each schema has mcpVersion and schemaVersion. | New features roll out without breaking old clients. |
| 2.9 | Error Taxonomy | Standard codes (MCP‑4001 InvalidParam, MCP‑4290 RateLimited, etc.). | Callers can remediate intelligently: retry vs. ask the user. |
| 2.10 | Open Ecosystem & Reference SDKs | Official SDKs exist for Python, TypeScript, Go, Java; community ports in Rust and Swift. | Lowers the barrier to adoption across stacks. |
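Feature 2.6 can be sketched as a simple poll loop against the /status/{id} endpoint. The sketch below stubs out the HTTP transport so it stays self-contained and runnable offline; in a real client the two stub functions would be requests.post/requests.get calls against the server's invoke and status URLs, and the tool name and payload here are hypothetical.

```python
import time

# Stub transport standing in for real HTTP calls to an MCP server.
_jobs = {}

def invoke_async(tool, args):
    """Start a long-running job; returns a job id (simulated)."""
    job_id = f"job-{len(_jobs) + 1}"
    _jobs[job_id] = {"state": "running", "ticks": 0}
    return job_id

def get_status(job_id):
    """Poll /status/{id}; this stub finishes after three polls."""
    job = _jobs[job_id]
    job["ticks"] += 1
    if job["ticks"] >= 3:
        job["state"] = "done"
        job["result"] = {"summary": "transcode complete"}
    return job

def wait_for_result(job_id, interval=0.01, timeout=5.0):
    """Poll until the job reports done, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status["state"] == "done":
            return status["result"]
        time.sleep(interval)
    raise TimeoutError(f"{job_id} did not finish in {timeout}s")

job = invoke_async("transcodeVideo", {"url": "https://example.com/a.mp4"})
print(wait_for_result(job))  # {'summary': 'transcode complete'}
```

The point of the pattern is that the caller never blocks the original HTTP request: the invoke call returns immediately with an id, and the result is collected later through the same protocol.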

Pro tip: When writing about MCP, anchor each feature in a practical pain point: “Before MCP you had to…” vs. “With MCP you simply…”.

3  Example Tool Schema

Below is a trimmed‑down manifest for a hypothetical Weather tool:

{
  "mcpVersion": "1.0.0",
  "schemaVersion": "2025-04-10",
  "id": "com.example.weather",
  "name": "Weather API",
  "description": "Get current weather & 7‑day forecast.",
  "auth": {
    "type": "oauth2",
    "tokenUrl": "https://auth.example.com/token"
  },
  "functions": [
    {
      "name": "getForecast",
      "description": "Retrieve a forecast for a given city.",
      "arguments": {
        "type": "object",
        "required": ["city"],
        "properties": {
          "city": {"type": "string", "minLength": 2},
          "units": {"type": "string", "enum": ["metric", "imperial"], "default": "metric"}
        }
      },
      "responses": {
        "200": {"$ref": "#/components/schemas/WeatherResult"}
      }
    }
  ]
}

4  Calling a Tool From Python (Synchronous)

import os
import json
import requests

SCHEMA_URL = "https://weather.example.com/schema"

# Discover the tool
schema = requests.get(SCHEMA_URL, timeout=10).json()

# Prepare arguments (validated client-side using JSON Schema, omitted here)
args = {"city": "Berlin", "units": "metric"}

# OAuth 2.0 bearer token obtained out of band via the tokenUrl in the
# manifest's auth block (hypothetical environment variable name)
token = os.environ["WEATHER_API_TOKEN"]

# Invoke
resp = requests.post(schema["invokeUrl"],
                     json={"tool": "getForecast", "args": args},
                     headers={"Authorization": f"Bearer {token}"},
                     timeout=10)
print(json.dumps(resp.json(), indent=2))
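The client-side validation the snippet above skips can be sketched with a few dependency-free checks against the getForecast argument schema from section 3. This is a minimal illustration covering only required, type, enum, and minLength, not a full JSON Schema Draft 07 validator; production code should use a real validator library.

```python
# Inline copy of the getForecast `arguments` object from the manifest in section 3.
SCHEMA = {
    "type": "object",
    "required": ["city"],
    "properties": {
        "city": {"type": "string", "minLength": 2},
        "units": {"type": "string", "enum": ["metric", "imperial"], "default": "metric"},
    },
}

def validate_args(args, schema):
    """Return a list of error messages; an empty list means the args pass."""
    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required argument: {name}")
    for name, value in args.items():
        prop = schema["properties"].get(name)
        if prop is None:
            errors.append(f"unknown argument: {name}")
            continue
        if prop.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{name}: expected string")
            continue
        if "enum" in prop and value not in prop["enum"]:
            errors.append(f"{name}: must be one of {prop['enum']}")
        if "minLength" in prop and len(value) < prop["minLength"]:
            errors.append(f"{name}: shorter than minLength {prop['minLength']}")
    return errors

print(validate_args({"city": "Berlin", "units": "metric"}, SCHEMA))  # []
print(validate_args({"units": "kelvin"}, SCHEMA))
```

Catching a bad enum or a missing required field before the request leaves the client is exactly what feature 2.3 is buying you: the error surfaces as a precise message instead of a round trip.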

5  MCP in Real‑World Workflows

  1. Agent Executors – LangChain, Haystack, and LlamaIndex adaptors let agents auto‑route queries to MCP tools.
  2. Voice Assistants – “What’s my flight status?” triggers a call to an MCP flight tool, spoken back via TTS.
  3. Automation Pipelines – n8n/Zapier nodes expose MCP endpoints so non‑devs can chain LLM insight to emails, CRMs, or dashboards.
  4. Low‑Code RAG Apps – Build internal search portals that pull context via MCP /search, ensuring freshness without re‑indexing.

Note: MCP doesn’t mandate hosting. You can self‑host or choose a vendor. The protocol stays the same.

6  Best Practices for Implementers

| Do | Why |
| --- | --- |
| Validate schemas with CI pipelines. | Catch breaking changes before deploy. |
| Implement exponential back‑off for MCP‑4290 (RateLimited). | Play nice with shared infra. |
| Use short‑lived OAuth tokens. | Improves security posture. |
| Stream long responses if render time exceeds 3 s. | Prevent model caller timeouts. |
| Version using Semantic Versioning. | Signals compatibility expectations. |
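The back-off advice can be sketched as a retry wrapper. The error shape (`{"error": {"code": "MCP-4290"}}`) follows the error taxonomy in section 2; the flaky tool and the injectable `sleep` are test scaffolding so the sketch runs instantly, not part of the protocol.

```python
import random

def call_with_backoff(invoke, max_retries=5, base_delay=0.5, sleep=lambda s: None):
    """Retry `invoke` on MCP-4290 (RateLimited) with exponential back-off + jitter.

    `invoke` is any zero-argument callable returning the tool's JSON response.
    `sleep` is injectable so tests do not actually wait.
    """
    for attempt in range(max_retries):
        resp = invoke()
        code = resp.get("error", {}).get("code")
        if code != "MCP-4290":
            return resp
        # Full jitter: wait a random slice of an exponentially growing window.
        sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("still rate-limited after retries")

# Simulated tool that is rate-limited twice before succeeding.
calls = {"n": 0}
def flaky_invoke():
    calls["n"] += 1
    if calls["n"] < 3:
        return {"error": {"code": "MCP-4290"}}
    return {"result": {"temp": 21}}

print(call_with_backoff(flaky_invoke))  # {'result': {'temp': 21}}
```

Jitter matters as much as the exponential growth: if every caller backs off on the same schedule, they all retry at the same instant and hammer the shared infra again.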

7  Words to Skip / Replace (Humanise Your Blog)

| AI‑ish Word | Swap For |
| --- | --- |
| Synergy | Teamwork |
| Leverage | Use |
| Cutting‑edge | Leading |
| Robust | Solid |
| Paradigm | Approach |

8  Suggested Post Structure (Recap)

  1. Hook – present the integration challenge.
  2. Define MCP – one‑sentence definition + standard link.
  3. Feature Walkthrough – each core feature with pain‑point and benefit.
  4. Schema & Code – concrete manifest + client snippet.
  5. Usage Scenarios – highlight at least three.
  6. Best Practices – quick checklist.
  7. Roadmap / Community – invite readers to spec repo, Slack, or GitHub discussion.
  8. CTA – encourage readers to try MCP with their own tool or agent.

9  Further Reading & Resources

💡 Ready to explore? Fork the reference server, register a simple “Hello World” tool, and watch your LLM discover and run it, no glue code required.

Frequently Asked Questions

How do you integrate MCP with an existing internal API without rewriting everything?

You usually integrate MCP by adding a thin adapter in front of your existing API, not by rewriting it. The MCP server is that adapter: the model calls MCP actions, not your raw internal endpoints, and support for connecting to external MCP servers depends on your product configuration.

For example, keep POST /tickets as is and expose a create_ticket action with subject, priority, and description. The adapter validates inputs, maps them to your API, enforces auth, rate limits, and audit logs, then returns predictable JSON and explicit errors. Start with 1 to 3 high-value, narrowly scoped actions, often read-only or low-risk first, before write actions like ticket creation. Many teams define actions with JSON Schema or OpenAPI so validators and client stubs can be generated automatically. In CustomGPT.ai, this follows patterns also used with Anthropic and OpenAI. Ontop reports cutting HR response time from 20 minutes to 20 seconds after connecting AI to internal workflows.

What is the difference between an MCP client and an MCP server in production?

An MCP client decides which tool to call and manages the workflow. An MCP server publishes tools, runs them, and enforces the security and operational rules for each call.

In CustomGPT.ai today, MCP support means the product can connect to external MCP servers as a client, but it cannot yet expose its own tools as an MCP server for other clients. If you are deciding where logic lives in production, put tool selection, retries, and workflow sequencing in the client, and keep authentication, permission checks, rate limits, audit logging, and tool code on each server. Per Anthropic’s Model Context Protocol specification, servers may run locally over stdio or remotely over HTTP, so production designs must account for transport-specific security, networking, and trust boundaries. MCP servers also commonly describe tool inputs with JSON Schema, which helps clients validate arguments before execution. Claude Desktop and Zapier follow the same split.

Can MCP support strict rule-based automations, or is it only for AI-driven tool selection?

Yes. MCP can support strict rule-based automations, not just AI-driven tool selection. In CustomGPT.ai, rule-based MCP automations are limited to the MCP modes the product supports today, such as acting as an MCP client connected to a supported external MCP server.

Use deterministic MCP calls when the trigger, tool, and parameters are known in advance. For example, if a form submission arrives with status="approved", your app can call the "create_invoice" tool with fixed fields like customer_id and amount, without asking a model to choose anything. Under the MCP spec, tool inputs are typically defined with JSON Schema, so required fields can be validated before execution. Use a model only for open-ended requests where intent or tool choice is unknown. Ontop’s reported drop from 20 minutes to 20 seconds fits this kind of standardized, deterministic workflow. Similar patterns also show up in Anthropic Claude and OpenAI tool calling.
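That deterministic flow can be sketched with no model anywhere in the loop. The tool and field names follow the hypothetical invoice example above, and the invoke call is stubbed so the sketch stays self-contained:

```python
def invoke_tool(name, args):
    # Stub for an MCP tool call; in production this would be an HTTP
    # request to the MCP server's invoke endpoint.
    return {"tool": name, "args": args, "status": "ok"}

def on_form_submission(form):
    """Deterministic rule: approved submissions create an invoice directly."""
    if form.get("status") != "approved":
        return None  # no tool call, and nothing for a model to decide either
    # Trigger, tool, and parameters are all known in advance.
    return invoke_tool("create_invoice", {
        "customer_id": form["customer_id"],
        "amount": form["amount"],
    })

print(on_form_submission({"status": "approved", "customer_id": "c-42", "amount": 99.0}))
```

Because the branch is plain code, the automation is auditable and repeatable: the same form always produces the same tool call, with no sampling variance in between.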

What authentication controls does MCP support when exposing tools?

As of the latest documented product behavior, MCP-exposed tools support API key authentication. OAuth 2.0, custom headers, and discoverable auth scopes are not currently documented as supported.

In practice, the caller sends an API key with the tool request, and requests without a valid key should be rejected. A useful design detail: key rotation, revocation, rate limits, and IP allowlisting are usually enforced by the API gateway or host platform, not by the MCP tool schema itself. API keys are simple bearer secrets, so they do not provide delegated user consent or standardized scope discovery the way OAuth 2.0 can. If your integration in CustomGPT.ai requires bearer-token OAuth or scope discovery, treat MCP tool auth as insufficient until the docs or release notes explicitly add it. Do not assume feature parity with Zapier or Postman connector auth.
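That check can be sketched as a guard that runs before any tool code. The header name X-API-Key and the key format are illustrative assumptions, not something mandated by MCP, and a real key store would live in a secrets manager rather than in code:

```python
VALID_KEYS = {"sk-demo-123"}  # hypothetical key store for illustration only

def handle_tool_request(headers, payload):
    """Reject tool calls without a valid API key before the tool runs."""
    key = headers.get("X-API-Key")
    if key not in VALID_KEYS:
        # 401 with an explicit error body; never fall through to the tool.
        return {"status": 401, "error": "invalid or missing API key"}
    return {"status": 200, "result": {"echo": payload}}

print(handle_tool_request({"X-API-Key": "sk-demo-123"}, {"city": "Berlin"})["status"])  # 200
print(handle_tool_request({}, {"city": "Berlin"})["status"])  # 401
```

Keeping the guard outside the tool body means every tool behind the server gets the same rejection behaviour for free.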

How do you build a custom MCP connector for a third-party platform with unstable APIs?

The safest way to connect a third-party platform with an unstable API is to put a stable adapter in front of it. In CustomGPT.ai today, that adapter is the integration point: the model and the MCP layer see only the adapter's stable schema, and the adapter alone absorbs the upstream instability.

Example: if a CRM vendor renames customer_email to primary_email in v3, keep your MCP schema unchanged and translate the field inside the adapter. If cursor pagination sometimes skips records, persist the last good cursor and a high-water timestamp, retry up to 3 times with exponential backoff and jitter, then return a retryable ADAPTER_UPSTREAM_TIMEOUT. These are implementation choices, not MCP requirements: the Model Context Protocol spec covers tool and schema exchange, not retries or error policy. Pin each upstream API version, publish every schemaVersion change, and map vendor failures to stable internal codes. The AWS Architecture Blog recommends backoff with jitter to prevent retry storms; Merge and Apideck follow similar adapter patterns.
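The field-rename scenario can be sketched as a translation table inside the adapter. The vendor field names follow the hypothetical CRM example above; when the vendor renames a field, only this table changes, never the MCP-facing schema:

```python
# Map the stable MCP-facing field names to whatever the pinned vendor API
# version currently calls them.
VENDOR_FIELD_MAP_V3 = {
    "customer_email": "primary_email",  # renamed by the vendor in v3
    "customer_name": "full_name",
}

def to_vendor_payload(mcp_args, field_map):
    """Translate MCP argument names to the vendor's current names."""
    return {field_map.get(k, k): v for k, v in mcp_args.items()}

def from_vendor_record(record, field_map):
    """Translate a vendor record back to the stable MCP schema."""
    reverse = {v: k for k, v in field_map.items()}
    return {reverse.get(k, k): v for k, v in record.items()}

payload = to_vendor_payload(
    {"customer_email": "a@example.com", "plan": "pro"}, VENDOR_FIELD_MAP_V3
)
print(payload)  # {'primary_email': 'a@example.com', 'plan': 'pro'}
```

Fields absent from the map pass through unchanged, so the table only ever lists the names the vendor has actually churned.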

Does MCP add latency?

Yes, MCP adds some latency, mainly when a connection is opened for the first time. After that, the extra delay is usually small compared with the external tool or API itself.

The main cost is setup: a fresh TLS 1.3 session adds roughly one extra network round trip before the first tool call, so same-region traffic may add tens of milliseconds, while cross-region links can add 100 ms or more. Reused connections, cached capabilities, and transports like HTTP/2 or QUIC can make repeat calls close to zero overhead. In practice, backend execution time, rate limits, and slow tools dominate user-perceived speed. At GEMA, the AI assistant handles 248,000+ inquiries with an 88% success rate, which shows why keeping sessions warm matters at scale. For tiny, high-frequency requests, direct function calling in OpenAI or LangChain can be slightly faster than MCP in CustomGPT.ai.

What is the difference between MCP and A2A?

MCP and A2A do different jobs. MCP lets an AI app use tools or data. A2A lets one agent delegate work, context, and results to another.

An MCP client is the AI application requesting tool use; an MCP server exposes the tools or data the client can call. Per Anthropic’s MCP spec, MCP focuses on capability discovery and action calls, commonly over stdio or HTTP. Google’s A2A spec focuses on agent identity, task handoff, message threads, and partial results, often via JSON-RPC. In CustomGPT.ai, MCP means connecting your agent to external tools and data systems; it does not by itself provide agent-to-agent delegation. If you are evaluating integrations, ask whether the product can connect to external MCP servers today and which actions are supported. VdW Bayern reports a 50 to 60 percent task reduction from AI connected to internal knowledge, which is an MCP-style problem. Similar patterns also appear in OpenAI and Microsoft agent stacks.
