MCP Audit Logging: What to Capture for Every Tool Invocation

The MCP spec treats audit logging as optional. SOC 2, HIPAA, and PCI-DSS don't. Here's exactly what to capture - and how to do it safely.

Mohammed Kafeel

Machine Learning Researcher

June 13, 2026 · 15 min read

Your AI agent just deleted a production record. Do you know which tool call did it - and why?

If you're running MCP servers without structured audit logging, the honest answer is probably no. The MCP spec doesn't require an audit trail. SOC 2, HIPAA, and PCI-DSS do. (Audit logging is one item on the broader MCP security checklist.)


⚡ Key Takeaways

  • MCP treats audit logging as optional (SHOULD, not MUST) - but SOC 2, HIPAA, and PCI-DSS 4.0 all expect audit trails, and GDPR requires records of processing activities.
  • Every tool invocation must capture 7 fields: timestamp, principal/identity, tool name + server, input arguments (redacted), output/result, outcome + latency, and a trace/correlation ID.
  • Secret leakage is a real risk: tool arguments and responses can carry credentials, API keys, tokens, or PII straight into your logs unless you redact them first.
  • Log at the gateway/proxy layer, not just the MCP server.
  • Structured JSON to stdout is the right default; pipe to Fluent Bit, Grafana Alloy, or the OpenTelemetry Collector.
  • Hot storage (7–90 days) + cold storage (1–7 years) with append-only writes covers most compliance retention.

Why MCP Audit Logging Is Not Optional

MCP is the emerging standard for connecting AI agents to tools. Anthropic introduced it in November 2024. Every tool call flows through the MCP JSON-RPC 2.0 message layer.

Here's the uncomfortable truth: the MCP specification (as of the 2025-06-18 release) lists logging under "Additional Utilities" and uses SHOULD-level language, not MUST.

The result is three concrete problems:

  • Inconsistent coverage. One server logs every tool call. Another logs errors only. A third logs nothing.
  • Secret leakage. Tool responses often contain credentials, API keys, tokens, or PII.
  • Tamper risk. Logs in flat files or app databases can be modified or deleted. Forensic value: zero.

Compliance frameworks that require audit trails

| Compliance Framework | Key Requirement | What MCP Must Log |
| --- | --- | --- |
| SOC 2 Type II | Log all customer data access with user attribution | Principal, tool name, arguments, outcome, timestamp |
| HIPAA 45 CFR § 164.312(b) | Audit controls for all ePHI access | Who accessed what, when, what the tool returned |
| PCI-DSS 4.0 Req. 10 | Immutable logs of cardholder data environment access | Tamper-evident logs of every tool call touching payment data |
| GDPR Art. 30 | Records of processing activities | Mechanism to identify and log personal data |

The 7 Things You Must Capture for Every Tool Invocation

1. Timestamp

Capture: ISO 8601 format, millisecond precision, UTC. Example: 2025-06-23T14:32:07.841Z

Why: Millisecond precision is non-negotiable for replay attack analysis and incident timelines.

How: Stamp at the gateway layer when the request arrives.
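As a minimal sketch of that format, the helper below produces an ISO 8601 UTC timestamp with millisecond precision using only the Python standard library. The function name audit_timestamp is made up for illustration; call it at the point where the gateway first sees the request.

```python
from datetime import datetime, timezone
from typing import Optional


def audit_timestamp(now: Optional[datetime] = None) -> str:
    """Return an ISO 8601 UTC timestamp with millisecond precision."""
    moment = now if now is not None else datetime.now(timezone.utc)
    # Convert to UTC, keep three fractional digits, and use the "Z" suffix.
    # Note: a naive datetime passed in is interpreted as local time by astimezone().
    return (
        moment.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )


if __name__ == "__main__":
    print(audit_timestamp())
```

Passing an explicit datetime is only there to make the helper testable; production code would call it with no arguments.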

2. Principal / Identity

Capture: User ID, agent ID, or virtual key. In OIDC environments, fall back through the claims in order: name → preferred_username → email → "anonymous" as a last resort.

Why: Without identity, you can't answer "who did this?" - and identity is also what per-tool access control keys off when deciding whether a call is allowed.
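The OIDC fallback can be sketched in a few lines, reading the order as name, then preferred_username, then email, then "anonymous". The function name resolve_principal is hypothetical, and the claims are assumed to arrive as an already-validated dictionary (token verification is out of scope here).

```python
from typing import Any, Mapping

# Fallback order: name, then preferred_username, then email.
CLAIM_ORDER = ("name", "preferred_username", "email")


def resolve_principal(claims: Mapping[str, Any]) -> str:
    """Pick the first non-empty identity claim, or "anonymous" if none is usable."""
    for claim in CLAIM_ORDER:
        value = claims.get(claim)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return "anonymous"


if __name__ == "__main__":
    print(resolve_principal({"preferred_username": "alice", "email": "alice@company.com"}))
```

If "anonymous" shows up in production logs, treat it as a signal that authentication is misconfigured rather than as a normal value.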

3. Tool Name + MCP Server

Capture: The exact tool invoked (e.g., github_create_pr) and which MCP server hosted it (e.g., backend_name: "github").

Why: In multi-server environments, you need server-level attribution.

4. Input Arguments

Capture: Full JSON parameters - with sensitive fields redacted before storage.

The leak example:

{
  "tool": "database_query",
  "input": {
    "connection_string": "postgresql://admin:xK9$mP2zQ@prod-db.internal:5432/users",
    "query": "SELECT * FROM users LIMIT 10"
  }
}

That credential is now in your logs, your SIEM, and your cold storage archive - for years.

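How: Schema-driven redaction. Fields that a tool's input schema marks with "sensitive": true are replaced with "[REDACTED]" before the entry is written. Note that "sensitive" is a custom annotation your gateway and tool authors have to agree on, not a standard JSON Schema keyword. Add pattern-based detection (regex for connection strings, API key formats) as a safety net for fields nobody annotated.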
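Here is a rough sketch of both layers working together. It assumes the tool's input schema carries a custom "sensitive": true flag under properties, and the regex list is illustrative rather than exhaustive. redact_arguments and SECRET_PATTERNS are names invented for this example, and only top-level fields are handled.

```python
import re
from typing import Any, Dict

REDACTED = "[REDACTED]"

# Safety-net patterns (illustrative only; extend for your own secret formats).
SECRET_PATTERNS = [
    re.compile(r"\b[a-z][a-z0-9+]*://[^\s/:@]+:[^\s@]+@", re.IGNORECASE),  # URI with user:password@
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                                   # AWS access key ID
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*"),                        # bearer token
]


def redact_arguments(args: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of args that is safe to log (top-level fields only)."""
    properties = schema.get("properties", {})
    safe: Dict[str, Any] = {}
    for key, value in args.items():
        if properties.get(key, {}).get("sensitive") is True:
            # Layer 1: the schema says this field is sensitive.
            safe[key] = REDACTED
        elif isinstance(value, str) and any(p.search(value) for p in SECRET_PATTERNS):
            # Layer 2: the value looks like a secret even though nobody flagged it.
            safe[key] = REDACTED
        else:
            safe[key] = value
    return safe


if __name__ == "__main__":
    schema = {"properties": {"connection_string": {"type": "string", "sensitive": True}}}
    args = {
        "connection_string": "postgresql://admin:xK9$mP2zQ@prod-db.internal:5432/users",
        "query": "SELECT * FROM users LIMIT 10",
    }
    print(redact_arguments(args, schema))
```

Run the redaction before the entry leaves the process; once a secret reaches the collector, it is already in places you can't easily clean.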

5. Output / Result

Capture: What the tool returned - with configurable payload logging and a maxDataSize truncation limit.

Why: The response is where secret leakage most often occurs.

How: Default to includeResponseData: false. Set maxDataSize (e.g., 4096 bytes).
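To make the two settings concrete, here is a small sketch of how a gateway might apply them. The parameter names mirror includeResponseData and maxDataSize, but capture_response itself is a hypothetical helper, and it assumes redaction has already been applied to the result.

```python
import json
from typing import Any, Optional

TRUNCATION_MARKER = "...[truncated]"


def capture_response(
    result: Any,
    include_response_data: bool = False,
    max_data_size: int = 4096,
) -> Optional[str]:
    """Return the loggable form of a tool result, or None when payload logging is off."""
    if not include_response_data:
        return None
    encoded = json.dumps(result, ensure_ascii=False, default=str).encode("utf-8")
    if len(encoded) <= max_data_size:
        return encoded.decode("utf-8")
    # Cut at the byte limit; errors="ignore" drops a partial UTF-8 character at the cut.
    return encoded[:max_data_size].decode("utf-8", errors="ignore") + TRUNCATION_MARKER


if __name__ == "__main__":
    print(capture_response({"rows": 10}))                              # None: logging is off by default
    print(capture_response({"rows": 10}, include_response_data=True))
```

A truncated payload is no longer valid JSON, which is why this sketch stores it as a string with an explicit marker instead of a nested object.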

6. Outcome + Latency

Capture: success, failure, timeout, denied, or error - plus execution duration in milliseconds.

Why: Outcome drives alerting. Latency matters for SLA monitoring.
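A sketch of how a wrapper might derive both values is shown below. The mapping is an assumption made for illustration: Python's TimeoutError maps to timeout, PermissionError to denied, any other exception to error, and a returned result with isError set (the flag MCP tool results use for tool-level failures) to failure. The name call_with_outcome is hypothetical.

```python
import time
from typing import Any, Callable, Tuple


def call_with_outcome(tool: Callable[[], Any]) -> Tuple[Any, str, int]:
    """Run a tool call and return (result, outcome, duration_ms)."""
    start = time.perf_counter()
    result = None
    try:
        result = tool()
        # A tool can return normally and still report a failure in its result.
        if isinstance(result, dict) and result.get("isError"):
            outcome = "failure"
        else:
            outcome = "success"
    except TimeoutError:
        outcome = "timeout"
    except PermissionError:
        outcome = "denied"
    except Exception:
        outcome = "error"
    duration_ms = int((time.perf_counter() - start) * 1000)
    return result, outcome, duration_ms


if __name__ == "__main__":
    print(call_with_outcome(lambda: {"content": [], "isError": False}))
```

This sketch swallows the exception to keep the example short; a real gateway would log the outcome and then re-raise or convert the error into a JSON-RPC response.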

7. Trace / Correlation ID

Capture: A unique ID that links this tool call to the parent LLM request, session, and broader workflow.

Why: AI agents make chains of calls. Without a correlation ID, you can't reconstruct the full sequence.
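One way to carry a correlation ID through a chain of calls in Python is a context variable, sketched below. The corr- prefix and the function names are assumptions for illustration; in a real deployment you would more likely reuse the trace ID from your tracing system than mint your own.

```python
import contextvars
import uuid
from typing import Any, Dict, Optional

# Holds the correlation ID for the current request/task context.
_correlation_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "correlation_id", default=None
)


def start_request() -> str:
    """Mint a new correlation ID when a parent LLM request or session begins."""
    cid = f"corr-{uuid.uuid4()}"
    _correlation_id.set(cid)
    return cid


def current_correlation_id() -> str:
    """Return the active correlation ID, creating one if none has been set."""
    cid = _correlation_id.get()
    return cid if cid is not None else start_request()


def tag_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the audit event with the active correlation ID attached."""
    return {**event, "correlation_id": current_correlation_id()}


if __name__ == "__main__":
    start_request()
    print(tag_event({"tool": "database_query"}))
    print(tag_event({"tool": "github_create_pr"}))  # same correlation_id as above
```

Context variables follow asyncio tasks, so concurrent requests in the same process keep separate IDs without passing them through every function signature.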

Summary

| Field | Example Value | Why It Matters |
| --- | --- | --- |
| timestamp | 2025-06-23T14:32:07.841Z | Incident timelines |
| principal | alice@company.com / sub-alice-123 | Identity attribution |
| tool_name + server | database_query / prod-db-server | Multi-server attribution |
| input_args (redacted) | {"query": "SELECT...", "connection_string": "[REDACTED]"} | What was requested |
| output (configurable) | {"rows": 10} (truncated) | What the tool returned |
| outcome + duration_ms | success / 234 | SLA monitoring, anomalies |
| trace_id | corr-abc123-xyz789 | Multi-step agent reconstruction |

The Audit Log Schema - A Practical Template

{
  "time": "2025-06-23T14:32:07.841Z",
  "level": "AUDIT",
  "msg": "audit_event",
  "audit_id": "a3f2b8d1-4c5e-6789-abcd-ef0123456789",
  "type": "mcp_tool_call",
  "logged_at": "2025-06-23T14:32:07.838Z",
  "outcome": "success",
  "component": "vmcp-production",
  "source": {
    "type": "network",
    "value": "10.0.1.50",
    "extra": {
      "user_agent": "Claude/3.5"
    }
  },
  "subjects": {
    "user": "alice@company.com",
    "user_id": "sub-alice-123",
    "client_name": "Claude Desktop",
    "client_version": "1.0.0"
  },
  "target": {
    "endpoint": "/mcp",
    "method": "tools/call",
    "type": "tool",
    "name": "database_query"
  },
  "metadata": {
    "extra": {
      "duration_ms": 187,
      "transport": "http",
      "backend_name": "prod-db-server"
    }
  },
  "data": {
    "request": {
      "query": "SELECT id, name FROM users LIMIT 10",
      "connection_string": "[REDACTED]"
    },
    "correlation_id": "corr-abc123-xyz789"
  }
}
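If you assemble events in code, a builder function keeps the shape consistent across call sites. The sketch below mirrors the template's field names; the function itself is hypothetical, it leaves out the source and client fields for brevity, and it stamps time and logged_at with the same value, which is a simplification.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict


def build_tool_call_event(
    *,
    user: str,
    user_id: str,
    tool_name: str,
    backend_name: str,
    request: Dict[str, Any],  # must already be redacted
    outcome: str,
    duration_ms: int,
    correlation_id: str,
    component: str = "vmcp-production",
) -> Dict[str, Any]:
    """Assemble an mcp_tool_call audit event in the template's shape."""
    now = (
        datetime.now(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "time": now,
        "level": "AUDIT",
        "msg": "audit_event",
        "audit_id": str(uuid.uuid4()),
        "type": "mcp_tool_call",
        "logged_at": now,  # simplification: same instant as "time"
        "outcome": outcome,
        "component": component,
        "subjects": {"user": user, "user_id": user_id},
        "target": {
            "endpoint": "/mcp",
            "method": "tools/call",
            "type": "tool",
            "name": tool_name,
        },
        "metadata": {"extra": {"duration_ms": duration_ms, "backend_name": backend_name}},
        "data": {"request": request, "correlation_id": correlation_id},
    }
```

Keyword-only arguments are a deliberate choice here: they make it hard to swap user and user_id by accident at a call site.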

What NOT to Log (and How to Redact Safely)

Logging too much is a security risk. Logging too little is a compliance risk.

Redaction strategies

Schema-driven redaction - tools declare "sensitive": true on fields → automatic masking before logging.

Pattern-based detection - regex for connection strings (postgresql://, mongodb://), API keys (sk-..., Bearer ...), AWS credentials (AKIA...).

Configurable payload logging - a toggle (includeResponseData: true/false) plus a size cap (maxDataSize: 4096) so payloads are only stored when you opt in, and never in full.

Allowlist logging - only log fields explicitly approved. Strictest approach for high-compliance environments.
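Allowlist logging is the easiest of the four strategies to show in code. In this sketch, allowlist_fields is a hypothetical helper that keeps only approved top-level fields and records the names (never the values) of everything it dropped, so an auditor can see that something was withheld.

```python
from typing import Any, Dict, Iterable


def allowlist_fields(payload: Dict[str, Any], allowed: Iterable[str]) -> Dict[str, Any]:
    """Keep only explicitly approved fields; note the names of the rest."""
    allowed_set = set(allowed)
    kept = {key: value for key, value in payload.items() if key in allowed_set}
    omitted = sorted(key for key in payload if key not in allowed_set)
    if omitted:
        # Names only: the dropped values never reach the log.
        kept["_omitted_fields"] = omitted
    return kept


if __name__ == "__main__":
    args = {"query": "SELECT id FROM users", "connection_string": "postgresql://..."}
    print(allowlist_fields(args, allowed=["query"]))
```

The trade-off versus redaction is maintenance: every new tool parameter stays invisible in the logs until someone approves it, which is exactly the failure mode you want in a high-compliance environment.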


MCP Event Types - Beyond Tool Calls

| Event Type | When It Fires | Why Log It |
| --- | --- | --- |
| sse_connection | SSE transport established | Detect unauthorized clients |
| mcp_initialize | Client initializes MCP connection | Session start anchor |
| mcp_ping | Health check pings | Usually excluded (noise) |
| mcp_tool_call | Tool invoked | Core audit event |
| mcp_tools_list | Agent lists available tools | Reconnaissance detection |
| mcp_resource_read | Agent reads a resource | Data access logging (HIPAA, GDPR) |
| mcp_prompt_get | Agent retrieves prompt template | Prompt injection surface |
| vmcp_workflow_started | Multi-step workflow begins | Workflow-level attribution |
| vmcp_workflow_completed | Workflow finishes | End-to-end latency tracking |
| vmcp_workflow_failed | Workflow fails | Immediate alert trigger |

Start with mcp_tool_call and vmcp_workflow_* events. Exclude mcp_ping by default.
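A filter along those lines might look like the sketch below, which uses shell-style wildcards so vmcp_workflow_* covers all three workflow events. The include and exclude defaults follow the recommendation above; should_log is an illustrative name, not part of any MCP gateway's API.

```python
from fnmatch import fnmatchcase
from typing import Sequence

INCLUDE: Sequence[str] = ("mcp_tool_call", "vmcp_workflow_*")
EXCLUDE: Sequence[str] = ("mcp_ping",)


def should_log(
    event_type: str,
    include: Sequence[str] = INCLUDE,
    exclude: Sequence[str] = EXCLUDE,
) -> bool:
    """Return True if the event type matches the allowlist and is not excluded."""
    # Exclusions win, so a broad include pattern can't re-enable a noisy event.
    if any(fnmatchcase(event_type, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(event_type, pattern) for pattern in include)


if __name__ == "__main__":
    for name in ("mcp_tool_call", "vmcp_workflow_failed", "mcp_ping", "mcp_tools_list"):
        print(name, should_log(name))
```

Widening coverage later is then a one-line change, for example adding "mcp_resource_read" to the include list once data access logging is in scope.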


Retention, Storage, and Tamper-Proofing

Hot storage: 7–90 days

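Fast, queryable, expensive. Use Elasticsearch, ClickHouse, or your SIEM's hot tier. If PCI-DSS applies, size this tier at the upper end of the range: the standard expects the most recent three months of logs to be immediately available.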

Cold storage: 1–7 years

Cheap, durable, compliance-oriented. Use S3 with Object Lock (WORM), GCS retention policies, or Azure Immutable Blob Storage.

Tamper-proofing

PCI-DSS 4.0 Requirement 10 requires audit logs to be protected from destruction and unauthorized modification, which in practice means tamper-evident storage.

Options:

  • Cloud WORM storage (S3 Object Lock, GCS retention) - simplest
  • Cryptographic chaining - each log entry hashes the previous
  • Dedicated audit log services - AWS CloudTrail, Azure Monitor Logs
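Cryptographic chaining is the least familiar of these options, so here is a minimal sketch of the idea using SHA-256 from the standard library. The entry layout (event, prev_hash, hash) and the all-zero genesis value are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(chain: List[Dict[str, Any]], event: Dict[str, Any]) -> Dict[str, Any]:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: List[Dict[str, Any]] = []
    append_entry(log, {"tool": "database_query", "outcome": "success"})
    append_entry(log, {"tool": "github_create_pr", "outcome": "denied"})
    print(verify_chain(log))
    log[0]["event"]["outcome"] = "denied"  # simulate tampering
    print(verify_chain(log))
```

A chain on its own only detects edits in the middle. Someone with write access could still drop the newest entries or rebuild the whole chain, so periodically publish the latest hash somewhere they can't reach, such as WORM storage.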

Access control

  • Read-only for security teams and auditors
  • No modification rights for anyone
  • Log access to the logs

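Integration

Emit audit events as structured JSON on stdout and let a log collector (Fluent Bit, Grafana Alloy, or the OpenTelemetry Collector) ship them to hot and cold storage. Filter on "level": "AUDIT" or "msg": "audit_event" to separate audit events from ordinary application logs.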
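For a service written in Python, one way to get audit events onto stdout as one JSON object per line is a small formatter on top of the standard logging module. The logger name mcp.audit and the audit_event attribute are choices made for this sketch, not a convention of any particular gateway.

```python
import json
import logging
import sys
from typing import Any, Dict, Optional, TextIO


class AuditJsonFormatter(logging.Formatter):
    """Render the attached audit event as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        event = dict(getattr(record, "audit_event", {}))
        # Markers the collector can filter on.
        event.setdefault("level", "AUDIT")
        event.setdefault("msg", "audit_event")
        return json.dumps(event, separators=(",", ":"), default=str)


def make_audit_logger(stream: Optional[TextIO] = None) -> logging.Logger:
    """Create a logger that writes audit events to stdout (or a given stream)."""
    logger = logging.getLogger("mcp.audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit lines out of the application log
    logger.handlers.clear()
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(AuditJsonFormatter())
    logger.addHandler(handler)
    return logger


def emit(logger: logging.Logger, event: Dict[str, Any]) -> None:
    """Write one audit event as one line of JSON."""
    logger.info("audit_event", extra={"audit_event": event})


if __name__ == "__main__":
    audit_log = make_audit_logger()
    emit(audit_log, {"type": "mcp_tool_call", "outcome": "success"})
```

Keeping one event per line matters more than the exact field layout: line-oriented collectors can then parse and route each event without multi-line handling.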


A Practical Checklist - Audit Logging Setup in 6 Steps

Step 1: Enable audit logging at the MCP gateway/proxy layer. Don't rely on individual MCP server implementations.

Step 2: Define your event type allowlist. Start with mcp_tool_call, mcp_initialize, vmcp_workflow_*. Exclude mcp_ping.

Step 3: Configure payload logging with redaction rules. Set includeRequestData: true, includeResponseData: false. Set maxDataSize: 4096.

Step 4: Set up structured JSON output to stdout → pipe to log collector. Filter for "level": "AUDIT" or "msg": "audit_event".

Step 5: Define retention tiers and automated archival. 30–90 days hot, 1–7 years cold (S3 Object Lock).

Step 6: Test with a simulated tool call and verify all 7 fields appear.
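Step 6 is easy to automate. The sketch below checks a captured event against the field paths used in this article's template schema; adjust the paths if your gateway emits a different shape. The output field is left out of the required set on purpose, because it is absent whenever includeResponseData is false. missing_fields and REQUIRED_PATHS are illustrative names.

```python
from typing import Any, Dict, List, Tuple

# Where each required field lives in the template schema.
REQUIRED_PATHS: Dict[str, Tuple[str, ...]] = {
    "timestamp": ("time",),
    "principal": ("subjects", "user"),
    "tool_name": ("target", "name"),
    "server": ("metadata", "extra", "backend_name"),
    "input_args": ("data", "request"),
    "outcome": ("outcome",),
    "duration_ms": ("metadata", "extra", "duration_ms"),
    "correlation_id": ("data", "correlation_id"),
}


def missing_fields(event: Dict[str, Any]) -> List[str]:
    """Return the names of required fields that are absent or null."""
    missing = []
    for name, path in REQUIRED_PATHS.items():
        node: Any = event
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        # Compare against None so legitimate values like 0 ms still count as present.
        if node is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    sample = {
        "time": "2025-06-23T14:32:07.841Z",
        "outcome": "success",
        "subjects": {"user": "alice@company.com"},
        "target": {"name": "database_query"},
        "metadata": {"extra": {"duration_ms": 187, "backend_name": "prod-db-server"}},
        "data": {"request": {"query": "SELECT 1"}},
    }
    print(missing_fields(sample))  # the sample has no correlation_id
```

Wire a check like this into CI against a simulated tool call so a gateway upgrade that silently drops a field fails the build instead of failing the audit.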


FAQ

What is MCP audit logging?

The practice of recording a structured, tamper-evident log entry for every operation performed by an MCP server - designed to satisfy compliance requirements (SOC 2, HIPAA, PCI-DSS) and support incident investigation.

What should every MCP tool invocation log capture?

Seven fields: timestamp (ISO 8601 UTC, ms precision), principal/identity, tool name + MCP server, input arguments (with sensitive fields redacted), output/result, outcome + execution duration, and trace/correlation ID.

Is MCP audit logging required for compliance?

Yes, if you're operating in a regulated environment. SOC 2 Type II requires user-attributed access logs. HIPAA §164.312(b) mandates audit controls. PCI-DSS 4.0 Req. 10 requires immutable logs. GDPR Art. 30 requires records of processing activities. (Our guide to compliance logging requirements covers HIPAA and GDPR for AI agents in depth.)

How do you prevent secrets from leaking into MCP audit logs?

Schema-driven redaction (tools declare "sensitive": true) combined with pattern-based detection (regex for connection strings, API keys, bearer tokens). Set includeResponseData: false by default. Always set a maxDataSize cap.

What's the difference between MCP logging and MCP audit logging?

MCP logging (in the protocol spec) refers to the notifications/message mechanism for debugging. MCP audit logging is a separate, security-oriented practice: structured, tamper-evident records of every protocol operation.

How long should MCP audit logs be retained?

Tiered: 30–90 days in hot storage, 1–7 years in cold immutable storage. PCI-DSS 4.0 requires ≥12 months with 3 months immediately available; HIPAA typically requires 6 years; SOC 2 auditors generally want ≥12 months.

Can MCP audit logs be used for debugging AI agent behavior?

Yes. With correlation IDs linking every tool call to a parent LLM request, you can reconstruct the exact sequence of tool invocations an agent made during a session.

