MCP Streaming and Triggers: Enabling Real-Time Events for AI Agents

MCP Streaming and Triggers let AI agents react to live data instead of waiting on polling cycles. This guide covers Streamable HTTP, SSE deprecation, MCP Triggers, and code examples.

Mohammed Kafeel

Machine Learning Researcher

June 24, 2026 · 13 min read

Your AI agent just missed a critical stock price drop - because it was still waiting for its next polling cycle.

That's not a hypothetical. It's the default behavior of any agent built on a request/response loop without real-time event support. And it's exactly the problem that MCP Streaming and MCP Triggers are designed to fix.

In this guide, you'll learn:

  • What MCP Streaming is and how Streamable HTTP works under the hood
  • Why the old SSE transport was deprecated and what replaced it
  • What MCP Triggers are, how they differ from Tools and Resources, and why they matter
  • Working TypeScript and Python code examples you can run today
  • How streaming and triggers work together to build truly reactive AI agents

What Is MCP (Model Context Protocol)?

MCP is an open-source standard that lets AI models connect to external tools and data sources through a single, universal protocol. Think of it as the USB-C port for AI: instead of building a custom connector for every combination of model and tool, you build once to the MCP standard and everything just works. (New here? Start with what MCP is.)

Without MCP, you face the M×N problem: M AI models × N external tools = M×N custom integrations to build and maintain. MCP collapses that to M+N - one MCP client per model, one MCP server per tool.
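To make the arithmetic concrete, here is a small sketch. The counts of 5 models and 20 tools are made-up illustration values, and the function names are hypothetical.

```python
def integrations_without_mcp(models: int, tools: int) -> int:
    """Every model needs its own connector to every tool: M x N."""
    return models * tools


def integrations_with_mcp(models: int, tools: int) -> int:
    """One MCP client per model plus one MCP server per tool: M + N."""
    return models + tools


# Hypothetical example: 5 models and 20 tools.
print(integrations_without_mcp(5, 20))  # 100
print(integrations_with_mcp(5, 20))     # 25
```

The gap widens as either side grows, because the first count is a product and the second is a sum.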

Anthropic engineers David Soria Parra and Justin Spahr-Summers created MCP, which was publicly announced on November 25, 2024. In December 2025, Anthropic donated the protocol to the Agentic AI Foundation (AAIF), a directed fund under the Linux Foundation, alongside OpenAI's AGENTS.md and Block's Goose. That move made MCP a vendor-neutral industry standard. (We cover that handover in what the AAIF means for MCP.)

Current stable spec: 2025-11-25. The upcoming 2026-07-28 revision focuses on stateless protocol design.

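MCP defines three core server primitives, plus an emerging fourth that this guide focuses on:

  • Tools - Model-controlled. The AI decides when to call them. They perform actions and may have side effects (e.g., "send an email").
  • Resources - Application-controlled. The user or app decides what data to load. Read-only, no side effects (e.g., "load a file").
  • Prompts - User-controlled. Reusable templates that the user explicitly invokes (e.g., a "summarize this document" command).
  • Triggers (experimental) - Server-initiated. The MCP server proactively notifies the client when something happens. This is the emerging primitive for event-driven AI agents and is not yet part of the stable spec.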

(For a deeper look at Tools, Resources, and Prompts, see MCP tools vs resources vs prompts.) The official spec and documentation live at modelcontextprotocol.io.


Why Do AI Agents Need Real-Time Data?

Traditional request/response is too slow for live scenarios. When an agent has to ask "any updates?" on a fixed interval - that's polling - it adds an average delay of half the polling interval (and up to a full interval in the worst case), and it wastes compute on empty responses.

Consider where polling actively fails:

  • Stock trading: A price threshold is crossed between polling cycles. The agent acts too late.
  • IoT sensor alerts: A temperature spike lasts 30 seconds. With a 60-second polling interval, the agent can miss it entirely.
  • Live customer support queues: A high-priority ticket sits unassigned for minutes because the agent hasn't polled yet.
  • Fraud detection: A suspicious transaction clears before the agent's next check.
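The cost of polling can be estimated with simple arithmetic. The sketch below assumes an event begins at a uniformly random moment relative to the polling schedule and that a transient condition is only detected if a poll lands while it is active. The function names are illustrative.

```python
def average_detection_delay(poll_interval_s: float) -> float:
    """Expected wait until the next poll for an event at a uniformly random time."""
    return poll_interval_s / 2


def worst_case_detection_delay(poll_interval_s: float) -> float:
    """An event that lands just after a poll waits almost a full interval."""
    return poll_interval_s


def miss_probability(event_duration_s: float, poll_interval_s: float) -> float:
    """Chance that a transient condition starts and ends between two polls."""
    if event_duration_s >= poll_interval_s:
        return 0.0  # at least one poll must land inside the event
    return 1 - event_duration_s / poll_interval_s


print(average_detection_delay(60))  # 30.0
print(miss_probability(30, 60))     # 0.5
```

Under these assumptions, a 30-second spike polled every 60 seconds goes unnoticed about half the time, and events that are caught still wait 30 seconds on average.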

The solution is a combination of streaming (continuous data push) and triggers (event-driven notifications). Here's how the three approaches compare:

| Approach | How It Works | Latency | Resource Use | Best For |
| --- | --- | --- | --- | --- |
| Polling | Client asks repeatedly | High (interval-dependent) | Wasteful | Simple, infrequent updates |
| Streaming | Server pushes data continuously | Low | Moderate | Live feeds, progress updates |
| Triggers | Server notifies on specific events | Near-zero | Efficient | Event-driven workflows |

For real-time MCP use cases, streaming and triggers aren't optional extras - they're the architecture.


What Is MCP Streaming? How Does It Work?

MCP Streaming lets a server push multiple JSON-RPC messages to a client over a single HTTP connection, instead of waiting to bundle everything into one response. The current standard for doing this is called Streamable HTTP.

The Evolution of MCP Transports

MCP has defined three transports over time (we compare them all in our MCP transport comparison):

  • STDIO: The original transport. The client launches the MCP server as a subprocess and communicates via standard input/output. Fast and simple, but local-only - no network streaming.

  • HTTP + SSE (Server-Sent Events): The first remote transport, introduced in the 2024-11-05 spec. It used two separate endpoints (a GET endpoint for the SSE stream, a POST endpoint for messages). Deprecated when the 2025-03-26 spec replaced it with Streamable HTTP. Why? Security issues: authentication tokens ended up in URL query strings (visible in logs and browser history), and the two-connection model added unnecessary complexity.

  • Streamable HTTP: The current standard, introduced in the March 26, 2025 spec update. Single endpoint, both POST and GET, cleaner security model. This is what you should build on today.

How Streamable HTTP Works (Step-by-Step)

Here's exactly what happens when a client talks to an MCP server over Streamable HTTP:

  1. Client sends an HTTP POST to the /mcp endpoint with a JSON-RPC request body and an Accept: application/json, text/event-stream header.
  2. Server decides: return a single JSON response, or open a streaming SSE response.
  3. If streaming: server replies with Content-Type: text/event-stream and pushes multiple JSON-RPC messages down the connection.
  4. Client can also issue an HTTP GET to /mcp to open a persistent SSE stream for server-initiated messages - without sending any data first.
  5. Session management: the server may assign a session ID via the Mcp-Session-Id header during initialization. If it does, the client includes this header on every subsequent request.
  6. Stream resumption: if the connection drops, the client can send a new GET with the Last-Event-ID header. A server that supports resumption replays the events it sent after that ID.
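The steps above can be sketched from the client's side using only the standard library. This is a simplified illustration, not SDK code: the helper names are hypothetical, the sample payloads are invented, and the parser assumes LF line endings and ignores SSE fields other than id and data.

```python
import json
from typing import Optional


def build_post_headers(session_id: Optional[str] = None,
                       token: Optional[str] = None) -> dict:
    """Headers for a JSON-RPC POST to the MCP endpoint (steps 1 and 5)."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def parse_sse(stream_text: str) -> list:
    """Split an SSE body into events; each event's data is one JSON-RPC message (step 3)."""
    events = []
    for block in stream_text.strip().split("\n\n"):
        event_id, data_lines = None, []
        for line in block.splitlines():
            if line.startswith("id:"):
                event_id = line[3:].strip()
            elif line.startswith("data:"):
                data_lines.append(line[5:].strip())
        if data_lines:
            events.append({"id": event_id, "message": json.loads("\n".join(data_lines))})
    return events


# Invented sample: a progress notification followed by the final result.
sample = (
    'id: 1\n'
    'data: {"jsonrpc": "2.0", "method": "notifications/progress", '
    '"params": {"progressToken": "t1", "progress": 50}}\n'
    '\n'
    'id: 2\n'
    'data: {"jsonrpc": "2.0", "id": 7, "result": {"content": []}}\n'
    '\n'
)
events = parse_sse(sample)
print([e["id"] for e in events])  # ['1', '2']
```

The last event ID a client has seen (here "2") is the value it would send back in Last-Event-ID when resuming a dropped stream.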

Key Technical Specs

| Feature | Detail |
| --- | --- |
| Endpoint | Single /mcp endpoint (POST + GET) |
| Client → Server | HTTP POST with JSON-RPC body |
| Server → Client | SSE stream (text/event-stream) |
| Session ID | Mcp-Session-Id header |
| Resumability | Last-Event-ID header |
| Auth | Bearer token on every request |
| Security | Must validate Origin header; bind to localhost |
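Resumability implies that the server remembers recent events. One way to picture this is a bounded replay buffer. The sketch below is an assumption-laden illustration: the ReplayBuffer class and its integer event IDs are hypothetical, and a real server would scope IDs per stream and decide its own retention policy.

```python
from collections import deque
from typing import Optional


class ReplayBuffer:
    """Keeps the most recent events so a reconnecting client can catch up."""

    def __init__(self, max_events: int = 1000) -> None:
        self._events = deque(maxlen=max_events)  # oldest events fall off the front
        self._next_id = 1

    def append(self, payload: str) -> int:
        """Store a payload and return the event ID to send as the SSE id field."""
        event_id = self._next_id
        self._next_id += 1
        self._events.append((event_id, payload))
        return event_id

    def replay_after(self, last_event_id: Optional[int]) -> list:
        """Return buffered events newer than the client's Last-Event-ID."""
        if last_event_id is None:
            return list(self._events)
        return [(i, p) for i, p in self._events if i > last_event_id]


buffer = ReplayBuffer()
for payload in ("a", "b", "c"):
    buffer.append(payload)
print(buffer.replay_after(1))  # [(2, 'b'), (3, 'c')]
```

Because the buffer is bounded, a client that reconnects after too long may find its events already evicted, so clients should still be able to recover by re-fetching state.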

Performance Benefits

Streamable HTTP dramatically improves TTFB - Time to First Byte, meaning how fast the first data arrives. Instead of waiting for the server to assemble a 5MB tool response before sending anything, your agent starts processing the first chunk immediately.

Concrete advantages:

  • Lower memory footprint: the server doesn't buffer the full response before sending
  • Higher concurrency: connections are stateless-friendly and work through standard HTTP proxies and load balancers
  • Better UX for progressive outputs: users see partial results as they stream in

Rule of thumb: consider streaming for any response over 1MB. For live data feeds - stock prices, sensor readings, log tails - streaming pays off even at small payload sizes. (Once those feeds get busy, a gateway helps - see our guide to handling real-time MCP at scale.)
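The difference between buffering and streaming can be shown without any network code. In this sketch, which uses hypothetical function names and synthetic rows, the buffered version builds the whole payload first, while the generator hands over the first chunk after producing only that chunk.

```python
from typing import Iterator


def buffered_response(rows: int) -> str:
    """Builds the whole payload in memory before returning anything."""
    return "".join(f"row {i}\n" for i in range(rows))


def streamed_response(rows: int, chunk_rows: int = 100) -> Iterator[str]:
    """Yields the payload chunk by chunk; only one chunk is built at a time."""
    for start in range(0, rows, chunk_rows):
        end = min(start + chunk_rows, rows)
        yield "".join(f"row {i}\n" for i in range(start, end))


stream = streamed_response(1_000_000)
first_chunk = next(stream)  # ready after producing 100 rows, not 1,000,000
print(len(first_chunk.splitlines()))  # 100
```

The consumer sees the same bytes either way. What changes is when the first bytes become available and how much the producer holds in memory at once.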


MCP Streaming in Practice: Code Examples

Here's working code for both a TypeScript server and a Python client using Streamable HTTP. These are based on the official MCP SDKs.

TypeScript Server (Streamable HTTP)

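This sets up an MCP server with a single calculate tool, using StreamableHTTPServerTransport to handle streaming connections and Express to expose it on /mcp. The session ID comes from randomUUID(). For brevity, the example uses one transport instance, which serves a single session; a multi-client server would create one transport per session.

import express from "express";
import { randomUUID } from "node:crypto";
import { z } from "zod";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const server = new McpServer({ name: "My Streaming Server", version: "1.0.0" });

// Register a tool; the input schema is declared with zod validators
server.tool("calculate", { a: z.number(), b: z.number() }, async ({ a, b }) => {
  return { content: [{ type: "text", text: `Result: ${a + b}` }] };
});

// Set up Streamable HTTP transport (one transport instance serves one session)
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: () => randomUUID(),
  onsessioninitialized: (sessionId) => {
    console.log(`New session: ${sessionId}`);
  }
});

await server.connect(transport);

// Route every HTTP method on /mcp to the transport
const app = express();
app.use(express.json());
app.all("/mcp", (req, res) => transport.handleRequest(req, res, req.body));

// Bind to localhost only, as recommended for local servers
app.listen(3000, "127.0.0.1", () => {
  console.log("MCP server running on http://localhost:3000/mcp");
});

Install with: npm install @modelcontextprotocol/sdk express zod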

Python Client (Streamable HTTP)

This Python client connects to the server above, lists available tools, and calls the calculate tool. The streamable_http_client context manager handles the SSE connection lifecycle automatically.

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client


async def main():
    # Depending on the SDK version, the context manager yields (read, write)
    # or (read, write, get_session_id); indexing works for both.
    # Older SDK releases name this function streamablehttp_client.
    async with streamable_http_client("http://localhost:3000/mcp") as streams:
        read, write = streams[0], streams[1]
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List available tools
            tools = await session.list_tools()
            print(f"Available tools: {[t.name for t in tools.tools]}")

            # Call a tool
            result = await session.call_tool("calculate", {"a": 10, "b": 20})
            print(f"Result: {result.content[0].text}")


asyncio.run(main())

Install with: pip install mcp

The client connects, initializes a session, discovers tools via list_tools(), and calls calculate with a=10, b=20 - expecting Result: 30 back. It's a minimal starting point: add authentication, error handling, and reconnection logic before using it in production.


What Are MCP Triggers? The Next Frontier for Event-Driven AI

MCP Triggers are a server-initiated primitive that lets an MCP server proactively notify a client when something happens - without the client having to ask. Think of them as webhooks for AI agents.

Triggers vs. Tools vs. Resources

Triggers sit alongside the established primitives, and each has a distinct role:

  • Tools: The AI model decides when to call them. They perform actions and may have side effects. Example: "Send an email."
  • Resources: The application or user decides what data to load. Read-only, no side effects. Example: "Load a file."
  • Triggers: The MCP server decides when to fire. Server-initiated, no side effects on the client. Example: "New order received."

| Primitive | Who Initiates | Direction | Side Effects | Example |
| --- | --- | --- | --- | --- |
| Tool | AI model | Client → Server | Yes | "Send an email" |
| Resource | Application/User | Client → Server | No | "Load a file" |
| Trigger | MCP Server | Server → Client | No | "New order received" |

How MCP Triggers Work

The core idea is a webhook-like callback mechanism: when server-side state changes, the server pushes a notification to the client. The client doesn't need to poll or hold an open SSE connection just to wait for something to happen.

The MCP Triggers & Events Working Group is co-led by Clare Liguori (Amazon Web Services) and Peter Alexander (Anthropic). The work is incubating at github.com/modelcontextprotocol/experimental-ext-triggers-events.

What the working group is standardizing:

  • Subscription lifecycle: how clients subscribe to and unsubscribe from trigger events
  • Delivery semantics: at-least-once vs. exactly-once delivery guarantees
  • Event ordering: NOT globally guaranteed - ordering is transport-dependent, so build your systems to handle out-of-order delivery
  • Callback mechanisms: the wire format for server-to-client notifications

Status: Experimental/incubation. This is not yet a stable spec. Check the GitHub repo before building production systems on it.
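Whatever wire format the working group settles on, at-least-once delivery and unordered arrival mean the consumer has to tolerate duplicates and late events. The sketch below shows one common pattern: deduplicate by ID and keep only the newest state. It is purely hypothetical: the TriggerEvent fields (event_id, sequence, payload) are assumptions for illustration and do not come from any MCP spec.

```python
from dataclasses import dataclass, field


@dataclass
class TriggerEvent:
    event_id: str  # hypothetical unique ID used for deduplication
    sequence: int  # hypothetical per-subscription counter set by the server
    payload: dict


@dataclass
class IdempotentConsumer:
    seen_ids: set = field(default_factory=set)
    latest_sequence: int = -1
    state: dict = field(default_factory=dict)

    def handle(self, event: TriggerEvent) -> bool:
        """Apply an event at most once; return True if it changed state."""
        if event.event_id in self.seen_ids:
            return False  # duplicate from at-least-once delivery
        self.seen_ids.add(event.event_id)
        if event.sequence < self.latest_sequence:
            return False  # stale event arrived late; newer state already applied
        self.latest_sequence = event.sequence
        self.state = event.payload
        return True


consumer = IdempotentConsumer()
print(consumer.handle(TriggerEvent("e2", 2, {"price": 101})))  # True
print(consumer.handle(TriggerEvent("e1", 1, {"price": 100})))  # False (out of order)
print(consumer.handle(TriggerEvent("e2", 2, {"price": 101})))  # False (duplicate)
```

This last-writer-wins approach suits state-style events such as a latest price. Events that must each be processed, such as individual orders, would need the deduplication step but not the staleness check.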

Real-World Trigger Scenarios

Here's where event-driven AI agents built on MCP Triggers would make an immediate difference:

  1. E-commerce: New order placed → trigger fires → AI agent auto-generates fulfillment instructions
  2. Finance: Stock price crosses threshold → trigger fires → AI agent alerts portfolio manager
  3. IoT: Sensor temperature exceeds limit → trigger fires → AI agent initiates safety protocol
  4. Customer support: New high-priority ticket created → trigger fires → AI agent drafts response
  5. DevOps: CI/CD pipeline fails → trigger fires → AI agent analyzes logs and suggests fix
  6. Healthcare: Lab result available → trigger fires → AI agent flags abnormal values for review

In every case, the agent reacts with near-zero added latency, as soon as the notification is delivered - not on the next polling cycle.
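The finance scenario can be sketched as an edge-triggered check on the server side: subscribers are called once when the price crosses below the threshold, not on every tick while it stays there. The ThresholdTrigger class is a hypothetical in-process illustration, not an MCP API, and the prices are invented.

```python
from typing import Callable, List


class ThresholdTrigger:
    """Fires subscribers once each time a value crosses below a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._subscribers: List[Callable[[float], None]] = []
        self._was_below = False

    def subscribe(self, callback: Callable[[float], None]) -> Callable[[], None]:
        """Register a callback and return a function that unsubscribes it."""
        self._subscribers.append(callback)
        return lambda: self._subscribers.remove(callback)

    def update(self, value: float) -> None:
        """Feed a new reading; notify only on the downward crossing."""
        is_below = value < self.threshold
        if is_below and not self._was_below:
            for callback in list(self._subscribers):
                callback(value)
        self._was_below = is_below


alerts = []
trigger = ThresholdTrigger(100.0)
unsubscribe = trigger.subscribe(alerts.append)
for price in [105, 99, 98, 101, 97]:
    trigger.update(price)
print(alerts)  # [99, 97]
```

Firing on the crossing rather than on every reading below the threshold is what keeps triggers selective: five price updates produce two notifications.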


Streaming + Triggers Together: Building Truly Reactive AI Agents

Streaming and triggers solve different problems, and they're most powerful when combined.

Think of it this way:

  • Streaming is the firehose - a continuous flow of data your agent consumes as it arrives (live sensor readings, log streams, LLM token output).
  • Triggers are the alarm bell - a selective notification that fires only when a specific condition is met (new order, threshold crossed, pipeline failed).

An agent that only has streaming still has to decide what to watch and when to start watching. An agent that only has triggers gets notified but may miss context between events. Together, they give you an agent that:

  1. Subscribes to a trigger (e.g., "notify me when a new high-priority ticket arrives")
  2. Receives the trigger notification with minimal latency
  3. Opens a streaming connection to pull the full ticket context and conversation history
  4. Processes and responds while the data is still streaming in

This is the architecture of a truly reactive AI agent - not one that wakes up every 30 seconds and asks "anything new?"
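The four-step flow above maps naturally onto asyncio. In this sketch an asyncio.Queue stands in for the trigger subscription and an async generator stands in for a streaming tool call. Both are placeholders, and the ticket ID and chunk contents are invented.

```python
import asyncio
from typing import AsyncIterator


async def stream_ticket_context(ticket_id: str) -> AsyncIterator[str]:
    """Stand-in for a streaming call that yields context chunk by chunk."""
    for part in ("subject", "history", "customer tier"):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield f"{ticket_id}: {part}"


async def agent(triggers: asyncio.Queue, handled: list) -> None:
    """Wait for trigger notifications, then stream and process context for each."""
    while True:
        ticket_id = await triggers.get()  # step 2: notification arrives
        if ticket_id is None:             # sentinel used here to stop the demo
            break
        async for chunk in stream_ticket_context(ticket_id):  # step 3: open the stream
            handled.append(chunk)         # step 4: process each chunk as it arrives


async def main() -> list:
    triggers = asyncio.Queue()            # step 1: stands in for the subscription
    handled = []
    task = asyncio.create_task(agent(triggers, handled))
    await triggers.put("TICKET-42")
    await triggers.put(None)
    await task
    return handled


result = asyncio.run(main())
print(result)
```

The agent does no work between notifications: it is parked on the queue rather than waking up on a timer.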


MCP Streaming vs. REST, WebSockets, and GraphQL Subscriptions

MCP Streamable HTTP isn't the only real-time option - but it's the only one purpose-built for AI agent workflows.

| Protocol | Bidirectional | Streaming | AI-Native | Complexity | Best For |
| --- | --- | --- | --- | --- | --- |
| REST (polling) | No | No | No | Low | Simple CRUD |
| WebSockets | Yes | Yes | No | High | Chat, gaming |
| GraphQL Subscriptions | Partial | Yes | No | High | Data-driven apps |
| MCP Streamable HTTP | Partial | Yes | Yes | Medium | AI agent workflows |

The key differentiator: MCP carries context, tool schemas, and session state natively. A WebSocket connection is just a pipe - you have to build the AI-specific semantics on top. MCP Streamable HTTP gives you those semantics out of the box.

WebSockets are more powerful for true full-duplex, high-frequency communication (think multiplayer games or trading terminals). But they require persistent connections that don't play nicely with standard HTTP infrastructure - proxies, load balancers, CDNs. MCP's Streamable HTTP is stateless-friendly and works through all of that without special configuration.

Use WebSockets if you need true full-duplex at very high frequency. Use MCP Streamable HTTP for AI agent workflows where you want AI-native semantics without the infrastructure headache.


What Are the Current Limitations of MCP Streaming and Triggers?

MCP Streaming is production-ready. MCP Triggers are not - yet. Here's the honest picture:

  • Triggers are still experimental: The Triggers & Events Working Group is actively incubating the spec. There's no stable release date confirmed as of June 2026. Don't build production systems on it without a fallback.

  • No global ordering guarantees: Event ordering in triggers is transport-dependent. Design your systems to handle out-of-order delivery from day one.

  • SSE deprecation migration: If you're running the old HTTP+SSE transport (spec 2024-11-05), you need to migrate to Streamable HTTP. The old transport is still supported for backward compatibility but shouldn't be used in new projects.

  • Security surface: Every request (POST and GET) needs Bearer token validation. Servers must validate the Origin header to prevent DNS rebinding attacks. This isn't optional - skip it and you're exposed.

  • Stateless trade-offs: The upcoming 2026-07-28 spec pushes toward stateless design. If you're building session-heavy systems today, plan for how session state management will evolve. (We break the 2026 changes down in our MCP 2026 roadmap explainer.)

  • Tooling maturity: The SDKs are evolving fast. Expect breaking changes as the spec stabilizes - pin your SDK versions and watch the changelog.


How to Get Started with MCP Streaming Today

You can have a working MCP streaming server running in under 30 minutes. Here's the path:

  1. Read the official spec: Start at modelcontextprotocol.io - the 2025-11-25 spec is the current stable version. The transports page covers Streamable HTTP in full detail.

  2. Install the SDK:

    • TypeScript: npm install @modelcontextprotocol/sdk
    • Python: pip install mcp

  3. Build your first server and client: Use StreamableHTTPServerTransport (TypeScript) on the server side and streamable_http_client (Python) on the client side - the code examples above are your starting point.

  4. Test locally: Use STDIO transport for local development (simpler, no network config), then switch to Streamable HTTP for production deployments.

  5. Watch the Triggers WG: Follow github.com/modelcontextprotocol/experimental-ext-triggers-events for trigger spec updates. Star the repo to get notified.

  6. Join the community: MCP Discord and GitHub Discussions are active - good places to ask questions, report issues, and get early previews of spec changes.


🔑 Key Takeaways (Summary)

  • MCP is the open standard for AI-tool integration, now governed by the Linux Foundation's AAIF.
  • MCP Streaming via Streamable HTTP is the current production standard - single endpoint, SSE-based, resumable, secure.
  • The old HTTP+SSE transport was deprecated when Streamable HTTP replaced it in the 2025-03-26 spec, in part due to security issues. Migrate now if you haven't.
  • MCP Triggers are the emerging server-initiated primitive - experimental, co-led by AWS and Anthropic, not yet stable.
  • Streaming + Triggers together = a reactive AI agent that consumes live data AND responds to specific events with near-zero latency.
  • Security: Bearer tokens + Origin header validation are mandatory, not optional.

Ready to build? Explore the official MCP spec, star the MCP GitHub repo, and keep an eye on the Triggers & Events Working Group for what's coming next. If this guide helped you, share it with a fellow AI engineer - the ecosystem grows when knowledge spreads.


Frequently Asked Questions

What is the difference between MCP Streaming and Server-Sent Events (SSE)?

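The names overlap, so it helps to separate the transport from the wire format. HTTP+SSE was the original remote MCP transport, introduced in the 2024-11-05 spec and deprecated when Streamable HTTP replaced it in the 2025-03-26 spec. The problem: authentication tokens were exposed in URL query strings (visible in server logs and browser history), and the dual-endpoint design added unnecessary complexity. Streamable HTTP uses a single endpoint, with POST and GET on /mcp and the Bearer token in the Authorization header. It still uses SSE (text/event-stream) as the format for streamed responses, so SSE itself hasn't gone away; only the older two-endpoint transport has. That older transport is still supported for backward compatibility but shouldn't be used in new projects.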

Are MCP Triggers production-ready?

Not yet. Triggers are in the experimental/incubation phase, managed by the MCP Triggers & Events Working Group co-led by Clare Liguori (AWS) and Peter Alexander (Anthropic). The working group is still standardizing subscription lifecycle, delivery semantics, event ordering, and callback mechanisms. Check the GitHub repo for the latest status before building production systems on it.

How does MCP Streaming improve AI agent performance?

Streaming dramatically reduces TTFB (Time to First Byte) - the agent starts processing data as soon as the first chunk arrives, rather than waiting for the full response to be assembled. For large tool outputs (think a 10MB log file or a database query result), this means the agent can start reasoning while data is still arriving. Memory usage also drops because the server doesn't buffer the full response before sending.

Can I use MCP with WebSockets instead of Streamable HTTP?

Not as a standard transport. The spec defines two standard transports, stdio and Streamable HTTP, but it also allows custom transports, so a WebSocket-based transport is possible if both client and server implement it. Streamable HTTP remains the recommended default: it's stateless-friendly, works through standard HTTP infrastructure (proxies, load balancers, CDNs), and doesn't require persistent connections. Consider WebSockets only if you need true full-duplex communication at very high frequency - otherwise, Streamable HTTP is simpler and more infrastructure-compatible.

What security measures does MCP Streaming require?

Every request must include a Bearer token in the Authorization header for authentication. Servers must validate the Origin header on all incoming connections to prevent DNS rebinding attacks - an attacker could otherwise use a malicious website to interact with a local MCP server. In local deployments, servers should bind only to localhost (127.0.0.1), not all interfaces (0.0.0.0). Session IDs (Mcp-Session-Id) must be cryptographically secure (e.g., a UUID or JWT) and contain only visible ASCII characters.
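Two of these checks are easy to sketch in isolation. The example below is a minimal illustration under stated assumptions: the allowlist contents and function names are hypothetical, and treating a missing Origin header as acceptable (because non-browser clients usually omit it) is a policy choice made for this sketch, not something the spec dictates.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allowlist for a server running locally on port 3000.
ALLOWED_ORIGINS = {"http://localhost:3000", "http://127.0.0.1:3000"}


def is_origin_allowed(origin: Optional[str]) -> bool:
    """Reject requests whose Origin header is not on the allowlist."""
    if origin is None:
        return True  # assumption: non-browser clients send no Origin header
    parsed = urlparse(origin)
    normalized = f"{parsed.scheme}://{parsed.netloc}".lower()
    return normalized in ALLOWED_ORIGINS


def is_valid_session_id(session_id: str) -> bool:
    """Session IDs must be non-empty and use only visible ASCII (0x21 to 0x7E)."""
    return bool(session_id) and all(0x21 <= ord(ch) <= 0x7E for ch in session_id)


print(is_origin_allowed("http://localhost:3000"))  # True
print(is_origin_allowed("https://evil.example"))   # False
print(is_valid_session_id("has space"))            # False
```

A server would run the Origin check before any request handling and respond with an error status when it fails. Bearer token validation is a separate step that this sketch does not cover.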

What is the M×N problem that MCP solves?

The M×N problem is the integration explosion that happens when you have M AI models and N external tools - without a standard protocol, you'd need M×N custom connectors. MCP solves this with a single standard: each tool exposes one MCP server, each AI model implements one MCP client. The result is M+N integrations instead of M×N. With over 10,000 published MCP servers already (as of the AAIF launch in December 2025), the ecosystem effect is already compounding.


Useful Sources