MCP Enterprise Deployment: Production Guide for 2026
Everything you need to run MCP servers at enterprise scale: container orchestration, mutual TLS, OAuth 2.1 with Azure Entra ID, audit logging, multi-tenant isolation, and production observability.
TL;DR
- Run MCP servers as Kubernetes Deployments behind a gateway — never expose stdio servers over the public internet
- Use mTLS between the AI gateway and MCP pods; terminate public TLS at the ingress
- Authenticate users with OAuth 2.1 + PKCE; integrate with Azure Entra ID for enterprise SSO
- Emit structured JSON logs to stdout; collect with Fluentd and ship to Datadog or Elastic
- Isolate tenants via Kubernetes namespaces — one namespace per org, RBAC-enforced
- Gate deployments behind CircleCI approval jobs; never auto-push MCP server changes to production
Architecture Overview
A production MCP deployment has three layers: an AI gateway that handles OAuth, rate limiting, and routing; a pool of MCP server pods that expose tools via SSE transport (the MCP specification has since deprecated HTTP+SSE in favor of Streamable HTTP, so new servers may prefer that); and the downstream data systems (databases, APIs, object storage) that the tools actually reach.
┌─────────────────────────────────────┐
│           AI Application            │
│        (Claude, GPT-4, etc.)        │
└────────────────┬────────────────────┘
                 │ HTTPS / OAuth 2.1
┌────────────────▼────────────────────┐
│     API Gateway (Kong / nginx)      │
│   Rate limiting • Auth • Routing    │
└────────────────┬────────────────────┘
                 │ mTLS
┌────────────────▼────────────────────┐
│        MCP Server Pods (k8s)        │
│    SSE transport • Tool handlers    │
└────────┬───────────────┬────────────┘
         │               │
┌────────▼──────┐ ┌──────▼────────────┐
│ PostgreSQL    │ │ External APIs     │
│ Redis Cache   │ │ (HubSpot, etc.)   │
└───────────────┘ └───────────────────┘

1. Docker Setup
Package each MCP server as a minimal Docker image. Use multi-stage builds to keep image size under 200 MB. Run as a non-root user: most enterprise security policies require it, as does the restricted profile of the Kubernetes Pod Security Standards.
Dockerfile — TypeScript MCP server
# Build stage
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the TypeScript build needs devDependencies
RUN npm ci
COPY tsconfig.json ./
COPY src ./src
RUN npm run build
# Drop devDependencies so only production modules reach the runtime stage
RUN npm prune --omit=dev

# Runtime stage
FROM node:22-alpine AS runtime
WORKDIR /app
# Non-root user (UID 1001 to avoid conflicts)
RUN addgroup -g 1001 mcp && adduser -u 1001 -G mcp -s /bin/sh -D mcp
USER mcp
COPY --from=builder --chown=mcp:mcp /app/dist ./dist
COPY --from=builder --chown=mcp:mcp /app/node_modules ./node_modules
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "dist/server.js"]
2. Kubernetes Deployment
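Deploy MCP servers as Kubernetes Deployments with at least 2 replicas for high availability (the manifest uses 3, so that minAvailable: 2 still leaves room to evict one pod). Set resource requests and limits to prevent noisy-neighbour issues. Add a PodDisruptionBudget so that voluntary disruptions such as node drains never evict too many replicas at once; the Deployment's rolling-update strategy separately controls how many pods are replaced at a time during a rollout.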
mcp-server-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-github-server
  namespace: mcp-prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-github-server
  template:
    metadata:
      labels:
        app: mcp-github-server
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        fsGroup: 1001
      containers:
        - name: server
          image: yourregistry/mcp-github-server:v1.4.2
          ports:
            - containerPort: 3000
          env:
            - name: GITHUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: github-token
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mcp-github-server-pdb
  namespace: mcp-prod
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mcp-github-server

3. TLS and mTLS Configuration
Terminate public TLS at the ingress controller (nginx or AWS ALB). Between the AI gateway and the MCP pods, use mutual TLS (mTLS): both sides present certificates, so a rogue pod can neither impersonate an MCP server nor call one as if it were the gateway. Use cert-manager to automate certificate issuance and rotation. Let's Encrypt works for the public ingress certificate, but cluster-internal names such as *.svc.cluster.local need an internal CA.
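On the pod side, enforcing mTLS comes down to asking for a client certificate and rejecting any handshake that is not signed by the trusted CA. The following is a minimal sketch assuming a Node.js server; `buildMtlsOptions` is a hypothetical helper name, and the PEM strings are assumed to be read from wherever the certificate secret is mounted.

```typescript
import type { ServerOptions } from "node:https";

// Hypothetical helper: builds HTTPS server options that enforce mutual TLS.
// certPem/keyPem identify this server; caPem is the CA that signs gateway certs.
export function buildMtlsOptions(
  certPem: string,
  keyPem: string,
  caPem: string
): ServerOptions {
  return {
    cert: certPem,
    key: keyPem,
    ca: caPem,                // trust only this CA for client certificates
    requestCert: true,        // ask every client for a certificate
    rejectUnauthorized: true, // fail the handshake if it is missing or untrusted
    minVersion: "TLSv1.2",
  };
}
```

The result can be passed straight to `https.createServer(options, handler)`. cert-manager writes `tls.crt`, `tls.key`, and `ca.crt` into the target secret, so those three files are the natural inputs; the mount path is deployment-specific.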
cert-manager Certificate resource
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: mcp-server-tls
namespace: mcp-prod
spec:
secretName: mcp-server-tls-secret
issuerRef:
name: internal-ca-issuer
kind: ClusterIssuer
commonName: mcp-github-server.mcp-prod.svc.cluster.local
dnsNames:
- mcp-github-server.mcp-prod.svc.cluster.local
duration: 720h # 30 days
renewBefore: 168h # Renew 7 days before expiry4. OAuth 2.1 with Azure Entra ID
Enterprise deployments require SSO integration. OAuth 2.1 with PKCE (Proof Key for Code Exchange) is the current recommended flow — it eliminates client secrets from the browser and prevents authorization code interception attacks. Register an app in Azure Entra ID (formerly Azure AD) and configure your MCP gateway to validate Entra-issued JWTs.
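The PKCE part of the flow is small enough to sketch directly. This assumes a Node.js client and the built-in crypto module; the function names are illustrative and not taken from any SDK. The S256 transform itself is the one defined in RFC 7636.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url string,
// the minimum verifier length RFC 7636 allows.
export function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 challenge: BASE64URL(SHA-256(ASCII(code_verifier))), without padding.
export function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}
```

The client sends the challenge (with `code_challenge_method=S256`) on the authorization request and the original verifier on the token request, so an intercepted authorization code is useless without the verifier.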
| Step | Where | What to configure |
|---|---|---|
| Register app | Entra ID portal | Single-page app, redirect URIs, API scopes |
| Define scopes | App registration → Expose an API | mcp.read, mcp.write, mcp.admin |
| JWKS endpoint | MCP gateway config | https://login.microsoftonline.com/{tenant}/discovery/v2.0/keys |
| Validate tokens | Gateway middleware | Audience, issuer, expiry, required scopes |
| Propagate identity | MCP server | Forward X-User-ID and X-User-Scopes headers |
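The "Validate tokens" row can be sketched as a pure function over already-decoded claims. This is an illustration under stated assumptions, not the gateway's actual middleware: signature verification against the JWKS endpoint is assumed to have happened first (for example in a JOSE library), and the sketch relies on Entra ID v2.0 access tokens carrying delegated scopes in a space-separated `scp` claim and the user's object ID in `oid`. The type and function names are hypothetical.

```typescript
// Claims of a decoded access token (signature already verified elsewhere).
interface AccessTokenClaims {
  iss: string;
  aud: string;
  exp: number;  // expiry, seconds since the Unix epoch
  scp?: string; // space-separated delegated scopes
  oid?: string; // object ID of the signed-in user
}

interface ExpectedToken {
  issuer: string;
  audience: string;
  requiredScopes: string[];
}

type ClaimCheck =
  | { ok: true; userId: string; scopes: string[] }
  | { ok: false; reason: string };

// Checks issuer, audience, expiry, and required scopes, in that order.
export function checkClaims(
  claims: AccessTokenClaims,
  expected: ExpectedToken,
  nowSeconds: number
): ClaimCheck {
  if (claims.iss !== expected.issuer) return { ok: false, reason: "issuer mismatch" };
  if (claims.aud !== expected.audience) return { ok: false, reason: "audience mismatch" };
  if (claims.exp <= nowSeconds) return { ok: false, reason: "token expired" };

  const scopes = (claims.scp ?? "").split(" ").filter(Boolean);
  const missing = expected.requiredScopes.filter((s) => !scopes.includes(s));
  if (missing.length > 0) {
    return { ok: false, reason: `missing scopes: ${missing.join(", ")}` };
  }
  if (!claims.oid) return { ok: false, reason: "no user identifier" };
  return { ok: true, userId: claims.oid, scopes };
}
```

On success, the gateway would set `X-User-ID` to `userId` and `X-User-Scopes` to the joined scopes before forwarding the request; on failure it would reject the call without contacting the MCP server.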
5. Rate Limiting
Implement rate limiting at two layers: the gateway (to protect against API abuse) and inside each MCP server (to protect downstream systems). Use a sliding-window algorithm keyed on user ID — not IP address — so legitimate users behind corporate NAT are not collectively throttled.
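For the second layer, inside the MCP server, a sliding-window limiter keyed on user ID can be sketched in a few lines. This is an in-memory illustration only: the class name is hypothetical, state is per pod (so the effective limit scales with the replica count unless a shared store such as Redis is used), and idle users are never evicted from the map.

```typescript
// Sliding-window log limiter: remembers the timestamps of each user's
// recent calls and allows a new call only if fewer than `limit` of them
// fall inside the last `windowMs` milliseconds.
export class SlidingWindowLimiter {
  private readonly calls = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if allowed; returns false otherwise.
  allow(userId: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.calls.get(userId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.calls.set(userId, recent);
    return allowed;
  }
}
```

A tool handler would call `limiter.allow(userId)` before touching the downstream system and return an error result when it is false. Because the key is the user ID, two users behind the same NAT address never share a budget.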
Kong rate-limiting plugin config
plugins:
  - name: rate-limiting
    config:
      minute: 120 # 120 tool calls per user per minute
      hour: 2000
      policy: redis # Shared state across gateway pods
      redis_host: redis.mcp-prod.svc.cluster.local
      limit_by: consumer # Key by authenticated user, not IP
      hide_client_headers: false

6. Audit Logging
Every tool invocation must produce a structured audit log entry. At minimum, capture who called which tool, with what arguments, at what time, and whether it succeeded. Strip sensitive values (tokens, passwords, PII) before logging. Retain audit logs for at least 90 days; SOC 2 and ISO 27001 do not fix a single retention period, so confirm the figure against your own compliance policy.
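Tool arguments are often nested, so redaction by key name has to walk the whole structure, whereas the middleware for this section checks top-level keys only. A recursive variant might look like the sketch below. The key list is illustrative, `redactDeep` is a hypothetical name, and the function assumes JSON-like input without circular references. It also only matches on key names, so PII inside free-text values needs separate handling.

```typescript
// Lowercase key names whose values must never reach the log.
const SENSITIVE_KEYS = new Set(["token", "password", "secret", "api_key", "apikey"]);

// Returns a copy of `value` with every sensitive key replaced, at any depth.
export function redactDeep(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => redactDeep(item));
  if (typeof value !== "object" || value === null) return value;
  return Object.fromEntries(
    Object.entries(value as Record<string, unknown>).map(([key, v]) => [
      key,
      SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redactDeep(v),
    ])
  );
}
```

Keys are lowercased before the lookup, which is why the set holds `apikey` rather than `apiKey`.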
TypeScript — structured audit logging middleware
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

// Returns a replacement for server.tool() that wraps each handler
// and emits one JSON audit line per invocation.
function withAuditLog(server: McpServer) {
  const original = server.tool.bind(server);
  return function auditedTool(
    name: string,
    schema: Record<string, unknown>,
    handler: Function
  ) {
    return original(name, schema, async (args: unknown, extra: unknown) => {
      const start = Date.now();
      const logEntry = {
        timestamp: new Date().toISOString(),
        tool: name,
        // Identity fields are read from extra.meta; "unknown" if absent
        userId: (extra as any)?.meta?.userId ?? "unknown",
        sessionId: (extra as any)?.meta?.sessionId,
        args: redactSensitiveFields(args),
        status: "pending" as string,
        durationMs: 0,
      };
      try {
        const result = await handler(args, extra);
        logEntry.status = "success";
        logEntry.durationMs = Date.now() - start;
        console.log(JSON.stringify(logEntry));
        return result;
      } catch (err) {
        logEntry.status = "error";
        logEntry.durationMs = Date.now() - start;
        (logEntry as any).error = (err as Error).message;
        console.log(JSON.stringify(logEntry));
        throw err; // Re-throw so the caller still sees the failure
      }
    });
  };
}

// Replaces the values of sensitive top-level keys (nested keys are not visited).
function redactSensitiveFields(args: unknown): unknown {
  // Entries must be lowercase: keys are lowercased before the lookup
  const REDACT = new Set(["token", "password", "secret", "api_key", "apikey"]);
  if (typeof args !== "object" || args === null) return args;
  return Object.fromEntries(
    Object.entries(args as Record<string, unknown>).map(([k, v]) => [
      k,
      REDACT.has(k.toLowerCase()) ? "[REDACTED]" : v,
    ])
  );
}

7. Multi-Tenant Isolation
For SaaS platforms serving multiple organizations, isolate tenants at the Kubernetes namespace level. Each organization gets its own namespace with dedicated secrets, resource quotas, and RBAC policies. MCP servers in one namespace cannot reach secrets or persistent volumes in another.
- One Kubernetes namespace per organization (mcp-org-acme, mcp-org-initech)
- Kubernetes NetworkPolicy to deny cross-namespace pod communication
- Separate database schemas per tenant, never a shared schema with a tenant_id column
- Resource quotas (ResourceQuota) to prevent one tenant from starving others
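When namespaces are created per organization, the organization identifier has to be turned into a valid namespace name. Kubernetes namespace names must be DNS labels: at most 63 characters, lowercase alphanumerics and hyphens, starting and ending with an alphanumeric. A small sketch of that mapping, following the mcp-org- naming pattern from the list above (the function name and the decision to reject rather than sanitize bad input are assumptions):

```typescript
const NAMESPACE_PREFIX = "mcp-org-";
// RFC 1123 DNS label: lowercase alphanumerics and "-", alphanumeric at both ends.
const DNS_LABEL = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/;

// Maps an organization slug to its namespace name, or throws if the
// result would not be accepted by the Kubernetes API server.
export function tenantNamespace(orgSlug: string): string {
  const name = NAMESPACE_PREFIX + orgSlug.trim().toLowerCase();
  if (name.length > 63 || !DNS_LABEL.test(name)) {
    throw new Error(`invalid namespace name: ${name}`);
  }
  return name;
}
```

Rejecting invalid slugs, instead of silently rewriting them, avoids two different organizations collapsing onto the same namespace.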
8. Monitoring with Prometheus and Datadog
Expose Prometheus metrics from each MCP server pod on /metrics. Track four golden signals: latency, traffic (tool calls/sec), errors (5xx rate), and saturation (CPU/memory usage). Forward to Datadog via the Datadog Agent DaemonSet for alerting and dashboards.
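Traffic, errors, and latency can all be captured at one point by wrapping each tool handler. The sketch below keeps the metrics backend behind a small interface so it runs on its own; `ToolMetrics` and `instrument` are hypothetical names. In a real server the two methods would typically delegate to a prom-client Counter's `inc` and a Histogram's `observe`.

```typescript
// Minimal recorder interface, to be backed by real metric objects.
interface ToolMetrics {
  countCall(tool: string, status: "success" | "error"): void;
  observeSeconds(tool: string, seconds: number): void;
}

type Handler<A, R> = (args: A) => Promise<R>;

// Wraps a handler so that every call is counted by outcome and timed,
// whether it resolves or throws.
export function instrument<A, R>(
  tool: string,
  handler: Handler<A, R>,
  metrics: ToolMetrics
): Handler<A, R> {
  return async (args: A) => {
    const start = Date.now();
    let status: "success" | "error" = "success";
    try {
      return await handler(args);
    } catch (err) {
      status = "error";
      throw err; // the caller still sees the original failure
    } finally {
      metrics.countCall(tool, status);
      metrics.observeSeconds(tool, (Date.now() - start) / 1000);
    }
  };
}
```

Recording in `finally` keeps the success and error paths symmetrical, so a failing tool still contributes to the latency histogram.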
TypeScript — Prometheus metrics with prom-client
import { Counter, Histogram, register } from "prom-client";
import express from "express";

const toolCallsTotal = new Counter({
  name: "mcp_tool_calls_total",
  help: "Total MCP tool invocations",
  labelNames: ["tool", "status"],
});

const toolLatencySeconds = new Histogram({
  name: "mcp_tool_duration_seconds",
  help: "MCP tool call latency",
  labelNames: ["tool"],
  buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});

// Metrics endpoint for Prometheus scraping
const metricsApp = express();
metricsApp.get("/metrics", async (_req, res) => {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});
metricsApp.listen(9090);

9. CI/CD with CircleCI
Never deploy MCP server changes directly to production. Gate every deployment behind an automated test suite and a manual approval step. The pipeline below runs unit tests, builds the Docker image, pushes it to ECR, and deploys to staging automatically. Promotion to production requires a human approval in the CircleCI UI.
.circleci/config.yml (abbreviated)
version: 2.1
jobs:
  test:
    docker:
      - image: node:22-alpine
    steps:
      - checkout
      - run: npm ci
      - run: npm test
      - run: npm run lint
  build-and-push:
    machine: true
    steps:
      - checkout
      - aws-ecr/build-and-push-image:
          repo: mcp-github-server
          tag: "${CIRCLE_SHA1}"
  deploy-staging:
    docker:
      - image: bitnami/kubectl:latest
    steps:
      - run: |
          kubectl set image deployment/mcp-github-server server=yourregistry/mcp-github-server:${CIRCLE_SHA1} -n mcp-staging
  approve-production:
    type: approval # Pauses pipeline; requires manual click
  deploy-production:
    docker:
      - image: bitnami/kubectl:latest
    steps:
      - run: |
          kubectl set image deployment/mcp-github-server server=yourregistry/mcp-github-server:${CIRCLE_SHA1} -n mcp-prod
workflows:
  deploy:
    jobs:
      - test
      - build-and-push:
          requires: [test]
      - deploy-staging:
          requires: [build-and-push]
      - approve-production:
          requires: [deploy-staging]
      - deploy-production:
          requires: [approve-production]

Production Readiness Checklist
| Area | Requirement | Tool |
|---|---|---|
| Containers | Non-root user, read-only rootfs | Docker, Kubernetes PodSecurityStandards |
| Encryption | TLS in transit, mTLS internal | cert-manager, Istio |
| Identity | OAuth 2.1 + PKCE, Entra ID SSO | Kong, Auth0, Entra ID |
| Rate limiting | Per-user sliding window, Redis-backed | Kong, nginx-ingress |
| Observability | Metrics, logs, traces | Prometheus, Datadog, Jaeger |
| Availability | 3+ replicas, PDB, health probes | Kubernetes Deployment |
| CI/CD | Automated tests + manual prod approval | CircleCI, GitHub Actions |