MCP Enterprise Deployment: Production Guide for 2026

Everything you need to run MCP servers at enterprise scale: container orchestration, mutual TLS, OAuth 2.1 with Azure Entra ID, audit logging, multi-tenant isolation, and production observability.

TL;DR

  • Run MCP servers as Kubernetes Deployments behind a gateway — never expose stdio servers over the public internet
  • Use mTLS between the AI gateway and MCP pods; terminate public TLS at the ingress
  • Authenticate users with OAuth 2.1 + PKCE; integrate with Azure Entra ID for enterprise SSO
  • Emit structured JSON logs to stdout; collect with Fluentd and ship to Datadog or Elastic
  • Isolate tenants via Kubernetes namespaces — one namespace per org, RBAC-enforced
  • Gate deployments behind CircleCI approval jobs; never auto-push MCP server changes to production

Architecture Overview

A production MCP deployment has three layers: an AI gateway that handles OAuth, rate limiting, and routing; a pool of MCP server pods that expose tools via SSE transport; and the downstream data systems (databases, APIs, object storage) that the tools actually reach.

┌─────────────────────────────────────┐
│          AI Application             │
│     (Claude, GPT-4, etc.)           │
└────────────────┬────────────────────┘
                 │ HTTPS / OAuth 2.1
┌────────────────▼────────────────────┐
│         API Gateway (Kong / nginx)  │
│  Rate limiting • Auth • Routing     │
└────────────────┬────────────────────┘
                 │ mTLS
┌────────────────▼────────────────────┐
│        MCP Server Pods (k8s)        │
│  SSE transport • Tool handlers      │
└────────┬───────────────┬────────────┘
         │               │
┌────────▼──────┐ ┌──────▼────────────┐
│  PostgreSQL   │ │  External APIs    │
│  Redis Cache  │ │  (HubSpot, etc.)  │
└───────────────┘ └───────────────────┘

1. Docker Setup

Package each MCP server as a minimal Docker image. Use multi-stage builds to keep image size under 200MB. Run as a non-root user — this is required by most enterprise security policies and Kubernetes PodSecurityStandards.

Dockerfile — TypeScript MCP server

# Build stage
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                    # Install dev deps too; tsc is needed for the build
COPY tsconfig.json ./
COPY src ./src
RUN npm run build && npm prune --omit=dev   # Drop dev deps before the runtime copy

# Runtime stage
FROM node:22-alpine AS runtime
WORKDIR /app

# Non-root user (UID 1001 to avoid conflicts)
RUN addgroup -g 1001 mcp && adduser -u 1001 -G mcp -s /bin/sh -D mcp
USER mcp

COPY --from=builder --chown=mcp:mcp /app/dist ./dist
COPY --from=builder --chown=mcp:mcp /app/node_modules ./node_modules

EXPOSE 3000
ENV NODE_ENV=production

CMD ["node", "dist/server.js"]

2. Kubernetes Deployment

Deploy MCP servers as Kubernetes Deployments with at least 2 replicas for high availability. Set resource requests and limits to prevent noisy-neighbour issues. Use a PodDisruptionBudget to ensure rolling updates never take all replicas down simultaneously.

mcp-server-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-github-server
  namespace: mcp-prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-github-server
  template:
    metadata:
      labels:
        app: mcp-github-server
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        fsGroup: 1001
      containers:
        - name: server
          image: yourregistry/mcp-github-server:v1.4.2
          ports:
            - containerPort: 3000
          env:
            - name: GITHUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: github-token
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mcp-github-server-pdb
  namespace: mcp-prod
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mcp-github-server
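The readinessProbe above assumes the server exposes a /health endpoint. A minimal sketch using Node's built-in http module — the dependency checks here are placeholders; a real server would ping its actual downstream systems:

```typescript
import http from "node:http";

// One entry per downstream dependency the pod needs.
type DependencyCheck = { name: string; ok: boolean };

// Pure aggregation: all checks passing -> 200, anything failing -> 503,
// so Kubernetes pulls the pod out of the Service until it recovers.
export function healthStatus(checks: DependencyCheck[]) {
  const failing = checks.filter((c) => !c.ok).map((c) => c.name);
  return {
    code: failing.length === 0 ? 200 : 503,
    body: { status: failing.length === 0 ? "ok" : "degraded", failing },
  };
}

// Placeholder checks -- swap in real pings to Postgres, Redis, etc.
async function runChecks(): Promise<DependencyCheck[]> {
  return [
    { name: "postgres", ok: true },
    { name: "redis", ok: true },
  ];
}

const server = http.createServer(async (req, res) => {
  if (req.url === "/health") {
    const { code, body } = healthStatus(await runChecks());
    res.writeHead(code, { "Content-Type": "application/json" });
    res.end(JSON.stringify(body));
    return;
  }
  res.writeHead(404);
  res.end();
});

// unref() so an open listener never pins the process
server.listen(3000).unref();
```

Returning 503 (rather than crashing) lets the probe distinguish "temporarily degraded" from "dead", which keeps rolling updates smooth.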

3. TLS and mTLS Configuration

Terminate public TLS at the ingress controller (nginx or AWS ALB). Between your AI gateway and the MCP pods, use mutual TLS (mTLS) — both sides present certificates — to prevent any rogue pod from impersonating an MCP server. Use cert-manager to automate certificate issuance and rotation via Let's Encrypt or your internal CA.

cert-manager Certificate resource

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: mcp-server-tls
  namespace: mcp-prod
spec:
  secretName: mcp-server-tls-secret
  issuerRef:
    name: internal-ca-issuer
    kind: ClusterIssuer
  commonName: mcp-github-server.mcp-prod.svc.cluster.local
  dnsNames:
    - mcp-github-server.mcp-prod.svc.cluster.local
  duration: 720h   # 30 days
  renewBefore: 168h # Renew 7 days before expiry
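On the gateway side, the issued certificate pair is what the proxy presents to the MCP pods. A sketch of the nginx directives for the gateway-to-pod hop — the file paths and location prefix are illustrative, and Kong exposes equivalent service-level TLS settings:

```nginx
location /mcp/ {
    proxy_pass https://mcp-github-server.mcp-prod.svc.cluster.local:3000/;

    # Present the gateway's client certificate to the MCP pod (mTLS client side)
    proxy_ssl_certificate         /etc/nginx/tls/tls.crt;
    proxy_ssl_certificate_key     /etc/nginx/tls/tls.key;

    # Verify the pod's server certificate against the internal CA
    proxy_ssl_trusted_certificate /etc/nginx/tls/ca.crt;
    proxy_ssl_verify              on;
}
```

The MCP pod side enforces the other half of the handshake by requiring and verifying the gateway's client certificate against the same CA.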

4. OAuth 2.1 with Azure Entra ID

Enterprise deployments require SSO integration. OAuth 2.1 with PKCE (Proof Key for Code Exchange) is the current recommended flow — it eliminates client secrets from the browser and prevents authorization code interception attacks. Register an app in Azure Entra ID (formerly Azure AD) and configure your MCP gateway to validate Entra-issued JWTs.

Step               | Where                            | What to configure
Register app       | Entra ID portal                  | Single-page app, redirect URIs, API scopes
Define scopes      | App registration → Expose an API | mcp.read, mcp.write, mcp.admin
JWKS endpoint      | MCP gateway config               | https://login.microsoftonline.com/{tenant}/discovery/v2.0/keys
Validate tokens    | Gateway middleware               | Audience, issuer, expiry, required scopes
Propagate identity | MCP server                       | Forward X-User-ID and X-User-Scopes headers
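The token-validation step can be sketched as a pure claim check. This assumes the JWT signature has already been verified against the JWKS endpoint (e.g. with a JOSE library); the function below only checks the claims listed above, and the `scp` field follows Entra's space-separated scope convention:

```typescript
// Decoded JWT payload (signature already verified upstream).
interface TokenClaims {
  aud: string;   // audience: your app's client ID
  iss: string;   // issuer: your Entra tenant
  exp: number;   // expiry, seconds since epoch
  scp?: string;  // space-separated scopes, per Entra convention
}

export function validateClaims(
  claims: TokenClaims,
  opts: { audience: string; issuer: string; requiredScopes: string[] },
  nowSeconds = Math.floor(Date.now() / 1000)
): { valid: boolean; reason?: string } {
  if (claims.aud !== opts.audience) return { valid: false, reason: "bad audience" };
  if (claims.iss !== opts.issuer) return { valid: false, reason: "bad issuer" };
  if (claims.exp <= nowSeconds) return { valid: false, reason: "token expired" };

  // Every required scope must appear in the granted set
  const granted = new Set((claims.scp ?? "").split(" "));
  const missing = opts.requiredScopes.filter((s) => !granted.has(s));
  if (missing.length > 0) {
    return { valid: false, reason: `missing scopes: ${missing.join(", ")}` };
  }
  return { valid: true };
}
```

Rejecting with a reason (rather than a bare boolean) makes gateway 401/403 responses and audit logs far easier to debug.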

5. Rate Limiting

Implement rate limiting at two layers: the gateway (to protect against API abuse) and inside each MCP server (to protect downstream systems). Use a sliding-window algorithm keyed on user ID — not IP address — so legitimate users behind corporate NAT are not collectively throttled.

Kong rate-limiting plugin config

plugins:
  - name: rate-limiting
    config:
      minute: 120          # 120 tool calls per user per minute
      hour: 2000
      policy: redis        # Shared state across gateway pods
      redis_host: redis.mcp-prod.svc.cluster.local
      identifier: consumer # Key by authenticated user, not IP
      hide_client_headers: false
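For the second layer, inside each MCP server, a small sliding-window limiter keyed on user ID can sit in front of tool handlers. A minimal in-memory sketch — a production deployment would back this with Redis, as the Kong config does, so state survives pod restarts and is shared across replicas:

```typescript
// Sliding-window rate limiter keyed on user ID, in-memory.
export class SlidingWindowLimiter {
  private calls = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the call is allowed, false if the user is throttled.
  allow(userId: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps still inside the window
    const recent = (this.calls.get(userId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.calls.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.calls.set(userId, recent);
    return true;
  }
}
```

A `new SlidingWindowLimiter(120, 60_000)` mirrors the 120-per-minute Kong setting above, so the two layers stay consistent.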

6. Audit Logging

Every tool invocation must produce a structured audit log entry. At minimum, capture: who called what tool, with what arguments, at what time, and whether it succeeded. Strip sensitive values (tokens, passwords, PII) before logging. Retain audit logs for 90 days minimum — many compliance frameworks (SOC 2, ISO 27001) require this.

TypeScript — structured audit logging middleware

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

function withAuditLog(server: McpServer) {
  const original = server.tool.bind(server);

  return function auditedTool(
    name: string,
    schema: Record<string, unknown>,
    handler: (args: unknown, extra: unknown) => Promise<unknown>
  ) {
    return original(name, schema, async (args: unknown, extra: unknown) => {
      const start = Date.now();
      const logEntry: Record<string, unknown> = {
        timestamp: new Date().toISOString(),
        tool: name,
        userId: (extra as any)?.meta?.userId ?? "unknown",
        sessionId: (extra as any)?.meta?.sessionId,
        args: redactSensitiveFields(args),
        status: "pending",
        durationMs: 0,
      };

      try {
        const result = await handler(args, extra);
        logEntry.status = "success";
        logEntry.durationMs = Date.now() - start;
        console.log(JSON.stringify(logEntry));
        return result;
      } catch (err) {
        logEntry.status = "error";
        logEntry.durationMs = Date.now() - start;
        (logEntry as any).error = (err as Error).message;
        console.log(JSON.stringify(logEntry));
        throw err;
      }
    });
  };
}

function redactSensitiveFields(args: unknown): unknown {
  const REDACT = new Set(["token", "password", "secret", "api_key", "apiKey"]);
  if (typeof args !== "object" || args === null) return args;
  return Object.fromEntries(
    Object.entries(args as Record<string, unknown>).map(([k, v]) => [
      k,
      REDACT.has(k.toLowerCase()) ? "[REDACTED]" : v,
    ])
  );
}

7. Multi-Tenant Isolation

For SaaS platforms serving multiple organizations, isolate tenants at the Kubernetes namespace level. Each organization gets its own namespace with dedicated secrets, resource quotas, and RBAC policies. MCP servers in one namespace cannot reach secrets or persistent volumes in another.

  • One Kubernetes namespace per organization (mcp-org-acme, mcp-org-initech)
  • Kubernetes NetworkPolicy to deny cross-namespace pod communication
  • Separate database schemas per tenant — never a shared schema with a tenant_id column
  • Resource quotas (ResourceQuota) to prevent one tenant from starving others
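The cross-namespace deny in the second bullet can be expressed as an ingress NetworkPolicy that admits only same-namespace traffic plus the gateway. The namespace names follow the examples above and are illustrative, as is the assumption that the gateway lives in an `mcp-gateway` namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: mcp-org-acme
spec:
  podSelector: {}            # Applies to every pod in the tenant namespace
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector: {}    # Allow same-namespace traffic
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: mcp-gateway  # Allow the AI gateway
```

Because NetworkPolicy is default-deny once any policy selects a pod, everything not explicitly listed — including pods in mcp-org-initech — is blocked.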

8. Monitoring with Prometheus and Datadog

Expose Prometheus metrics from each MCP server pod on /metrics. Track four golden signals: latency, traffic (tool calls/sec), errors (5xx rate), and saturation (CPU/memory usage). Forward to Datadog via the Datadog Agent DaemonSet for alerting and dashboards.

TypeScript — Prometheus metrics with prom-client

import { Counter, Histogram, register } from "prom-client";
import express from "express";

const toolCallsTotal = new Counter({
  name: "mcp_tool_calls_total",
  help: "Total MCP tool invocations",
  labelNames: ["tool", "status"],
});

const toolLatencySeconds = new Histogram({
  name: "mcp_tool_duration_seconds",
  help: "MCP tool call latency",
  labelNames: ["tool"],
  buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});

// Metrics endpoint for Prometheus scraping
const metricsApp = express();
metricsApp.get("/metrics", async (_req, res) => {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});
metricsApp.listen(9090);

9. CI/CD with CircleCI

Never deploy MCP server changes directly to production. Gate every deployment behind an automated test suite and a manual approval step. The pipeline below runs unit tests, builds the Docker image, pushes it to ECR, and deploys to staging automatically. Promotion to production requires a human approval in the CircleCI UI.

.circleci/config.yml (abbreviated)

version: 2.1
orbs:
  aws-ecr: circleci/aws-ecr@8.2   # Required for aws-ecr/build-and-push-image; pin your own version
jobs:
  test:
    docker:
      - image: node:22-alpine
    steps:
      - checkout
      - run: npm ci
      - run: npm test
      - run: npm run lint

  build-and-push:
    machine:
      image: ubuntu-2204:current   # bare `machine: true` is deprecated
    steps:
      - checkout
      - aws-ecr/build-and-push-image:
          repo: mcp-github-server
          tag: "${CIRCLE_SHA1}"

  deploy-staging:
    docker:
      - image: bitnami/kubectl:latest
    steps:
      - run: |
          kubectl set image deployment/mcp-github-server \
            server=yourregistry/mcp-github-server:${CIRCLE_SHA1} \
            -n mcp-staging

  approve-production:
    type: approval   # Pauses pipeline; requires manual click

  deploy-production:
    docker:
      - image: bitnami/kubectl:latest
    steps:
      - run: |
          kubectl set image deployment/mcp-github-server \
            server=yourregistry/mcp-github-server:${CIRCLE_SHA1} \
            -n mcp-prod

workflows:
  deploy:
    jobs:
      - test
      - build-and-push:
          requires: [test]
      - deploy-staging:
          requires: [build-and-push]
      - approve-production:
          requires: [deploy-staging]
      - deploy-production:
          requires: [approve-production]

Production Readiness Checklist

Area          | Requirement                            | Tool
Containers    | Non-root user, read-only rootfs        | Docker, Kubernetes PodSecurityStandards
Encryption    | TLS in transit, mTLS internal          | cert-manager, Istio
Identity      | OAuth 2.1 + PKCE, Entra ID SSO         | Kong, Auth0, Entra ID
Rate limiting | Per-user sliding window, Redis-backed  | Kong, nginx-ingress
Observability | Metrics, logs, traces                  | Prometheus, Datadog, Jaeger
Availability  | 3+ replicas, PDB, health probes        | Kubernetes Deployment
CI/CD         | Automated tests + manual prod approval | CircleCI, GitHub Actions

Have Questions?

Join the MCP community on GitHub or Discord for help and discussion.