MCP's Largest Revision Yet: A Stateless Core, Extensions, and a Formal Deprecation Policy
The Model Context Protocol steering group locked a release candidate for the next MCP specification, describing it as the largest revision to the protocol since its launch. The final spec is due to be published on July 28, 2026, but the RC is already a useful preview of where MCP is heading architecturally.
The most consequential change is the removal of the initialize handshake and the Mcp-Session-Id header in favor of a fully stateless protocol core. Today, an MCP session ties a client to the server instance that holds its session state, which in practice means sticky routing or a shared session store; under the new model, every request carries its own metadata, so "any MCP request can land on any server instance." That means MCP servers can sit behind a plain round-robin load balancer with neither in place, a meaningful simplification for anyone running MCP servers at scale.
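To make the load-balancing point concrete, here is a minimal sketch of the idea rather than the spec's actual wire format. The Request fields, the ServerInstance class, and the params shape are hypothetical names chosen for illustration; the only thing the sketch assumes from the RC is that each request carries the metadata a server needs to answer it.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Request:
    # Hypothetical per-request metadata; the real field names come from the spec.
    protocol_version: str
    client_name: str
    method: str
    params: dict


class ServerInstance:
    """One replica. It keeps no per-client state between calls."""

    def __init__(self, name: str) -> None:
        self.name = name

    def handle(self, request: Request) -> dict:
        # Everything needed to answer is read from the request itself.
        if request.method == "tools/call":
            total = sum(request.params["numbers"])
            return {"result": total, "protocol_version": request.protocol_version}
        return {"error": f"unknown method {request.method}"}


# Plain round-robin: no affinity between a client and a replica.
replicas = [ServerInstance("a"), ServerInstance("b"), ServerInstance("c")]
balancer = cycle(replicas)

request = Request("rc", "demo-client", "tools/call", {"numbers": [1, 2, 3]})

# Send the same request six times; each replica ends up serving it twice.
responses = [next(balancer).handle(request) for _ in range(6)]
```

Because no replica remembers anything about the caller, every entry in responses is identical regardless of which instance produced it. That property is what removes the need for sticky routing.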
The spec also formalizes an Extensions framework, where extensions get reverse-DNS identifiers and independent versioning so they can evolve without breaking the core protocol. Two official extensions ship with this revision: MCP Apps, which lets servers ship interactive HTML interfaces that hosts render in a sandboxed iframe, and Tasks, which moves from an experimental core feature into an extension restructured around the stateless model — servers return task handles, and clients drive progress polling.
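The handle-and-poll shape described for Tasks can be sketched as follows. This is an in-memory illustration under stated assumptions, not the extension's API: FakeTaskServer, start, status, and the "working"/"done" state names are all hypothetical.

```python
import uuid


class FakeTaskServer:
    """In-memory stand-in for a server offering a Tasks-style extension."""

    def __init__(self, polls_until_done: int = 3) -> None:
        self._polls_until_done = polls_until_done
        # Assumption: with several replicas, this table would need to live in
        # storage that every replica can reach.
        self._tasks = {}

    def start(self, payload: str) -> str:
        # Return an opaque handle immediately instead of blocking on the work.
        handle = uuid.uuid4().hex
        self._tasks[handle] = {"polls": 0, "payload": payload}
        return handle

    def status(self, handle: str) -> dict:
        # State is looked up by handle, not by connection or session.
        task = self._tasks[handle]
        task["polls"] += 1
        if task["polls"] >= self._polls_until_done:
            return {"state": "done", "result": task["payload"].upper()}
        return {"state": "working"}


def run_to_completion(server, payload, max_polls=10):
    """Client-driven loop: ask for status until the task is done or give up."""
    handle = server.start(payload)
    for attempt in range(1, max_polls + 1):
        status = server.status(handle)
        if status["state"] == "done":
            return status["result"], attempt
        # A real client would wait between polls here.
    raise TimeoutError(f"task {handle} did not finish in {max_polls} polls")


result, polls = run_to_completion(FakeTaskServer(polls_until_done=3), "hello")
```

The design point is that the client owns the loop: the server only answers "what is the state of this handle?", which fits a model where consecutive requests may reach different instances.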
On the security side, six proposals bring MCP's authorization model closer to standard OAuth 2.0 / OpenID Connect practice, including mandatory iss parameter validation and clearer binding of credentials to specific authorization servers. Finally, the spec introduces a formal deprecation policy — Roots, Sampling, and Logging are the first features marked deprecated under it, with a guaranteed minimum of twelve months between deprecation and removal, giving existing implementations a predictable runway.
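As an illustration of what iss validation involves on the client side, the sketch below checks the iss parameter on an authorization redirect using the exact string comparison described in RFC 9207. How the MCP proposals word the requirement is not shown in the summary above, so treat the function name, the exception type, and the example URLs as assumptions.

```python
from urllib.parse import parse_qs, urlparse


class IssuerMismatch(Exception):
    """Raised when the redirect does not come from the expected issuer."""


def validate_iss(redirect_url: str, expected_issuer: str) -> str:
    """Return the authorization code only if `iss` matches exactly."""
    query = parse_qs(urlparse(redirect_url).query)

    issuers = query.get("iss", [])
    if len(issuers) != 1:
        raise IssuerMismatch("iss parameter missing or repeated")

    # Exact string comparison: no normalization, no prefix matching.
    if issuers[0] != expected_issuer:
        raise IssuerMismatch(f"unexpected issuer {issuers[0]!r}")

    codes = query.get("code", [])
    if len(codes) != 1:
        raise ValueError("authorization code missing or repeated")
    return codes[0]


# Hypothetical redirect URLs for illustration only.
good = "https://client.example/callback?code=abc123&iss=https%3A%2F%2Fauth.example"
bad = "https://client.example/callback?code=abc123&iss=https%3A%2F%2Fevil.example"

code = validate_iss(good, "https://auth.example")
```

Rejecting the response before the code is exchanged is the point of the check: a code issued by one authorization server should never be sent to another's token endpoint.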
Read more — Model Context Protocol Blog
Hugging Face and Meta Rally the Open Source Community Around OpenEnv for Agentic RL
Hugging Face, working with Meta's PyTorch team, is building out OpenEnv — a shared specification and hub for "frontier-grade" reinforcement learning environments that agents can train and evaluate against, covering things like terminals, browsers, and other interactive tools an agent might need to operate. The goal is to give the open source community a common format for packaging and sharing these environments, the same way model weights and datasets already have a shared home on the Hub.
The OpenEnv 0.1 spec is being released as an RFC to gather feedback from the community before it solidifies, and Hugging Face has published walkthroughs showing how to train and evaluate agents against OpenEnv-compatible environments using TRL and Unsloth, among other libraries. For teams building or evaluating tool-using agents, OpenEnv is worth watching as a potential standard for benchmarking agent behavior in realistic, interactive settings rather than static question-answering datasets — the kind of evaluation gap that's increasingly cited as a blocker to trusting agentic systems in production.
Read more — Hugging Face Blog
Safe & Secure AI Agent Practices
Docker: A Practical Framework for Securing AI Coding Agents
Docker published practical guidance for development teams on securing AI agents, organized around four areas: isolation, tool access control, identity management, and runtime monitoring. The core message is that developer laptops are becoming production environments: an AI coding agent running locally with broad filesystem and network access carries production-grade risk, not the contained risk of a "toy" in a sandbox.
On isolation, the recommendation is to containerize AI agents so they can't directly affect the host system, which maps to Docker's own Sandboxes product for running coding agents in contained environments. Tool access control means explicitly restricting which tools and APIs an agent can invoke — Docker points to its MCP Catalog and Toolkit as one way to manage which MCP tools are reachable. Identity management focuses on controlling which credentials an agent can use, preventing a compromised or misbehaving agent from reaching systems and data it has no business touching. Runtime monitoring rounds it out: logging what an agent actually executes and which resources it touches, so anomalous behavior can be detected and investigated after the fact.
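Two of the four areas, tool access control and runtime monitoring, can be sketched together as a small gate that sits in front of every tool call. This is a generic illustration, not Docker's MCP Catalog or Toolkit; ToolGate and the tool names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolGate:
    """Allowlist check plus an audit trail for every attempted tool call."""

    allowed: frozenset
    audit_log: list = field(default_factory=list)

    def invoke(self, tool_name, tool_fn, *args):
        permitted = tool_name in self.allowed
        # Record the attempt before running anything, so denials are visible too.
        self.audit_log.append(
            {"tool": tool_name, "args": args, "permitted": permitted}
        )
        if not permitted:
            raise PermissionError(f"tool {tool_name!r} is not on the allowlist")
        return tool_fn(*args)


# Only read_file is reachable; anything else is refused and logged.
gate = ToolGate(allowed=frozenset({"read_file"}))

contents = gate.invoke("read_file", lambda path: f"contents of {path}", "README.md")

try:
    gate.invoke("shell", lambda cmd: cmd, "rm -rf ~/")
    blocked = False
except PermissionError:
    blocked = True
```

Logging denied attempts alongside permitted ones matters for the monitoring goal: the refused calls are often the clearest signal that an agent is misbehaving.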
The post connects to two other recent Docker pieces worth a look if this is relevant to your team: "Coding Agent Horror Stories: The rm -rf ~/ Incident," which walks through a real case of an AI-generated destructive command and how sandboxing would have contained it, and "What is AI Governance?", which frames these isolation and access-control practices as part of a broader governance program rather than one-off mitigations.
Read more — Docker Blog
Anthropic Maps a Year of AI-Enabled Cyber Threats to MITRE ATT&CK — and Finds Gaps in the Framework
Anthropic published an analysis of 832 accounts banned for malicious cyber activity between March 2025 and March 2026, mapping the techniques used against the MITRE ATT&CK framework. The findings are a useful data point for anyone trying to reason about how AI is actually changing the threat landscape today, as opposed to how it might in theory.
The majority of malicious actors (67.3%) used AI primarily for malware development, while a smaller group used it for more complex post-compromise activity. Interestingly, AI-assisted phishing usage fell 8.6% over the period while AI-assisted account discovery rose 8.9% — Anthropic reads this as attackers shifting effort from initial access toward deeper lateral movement within already-compromised networks.
Perhaps the most actionable finding is that traditional risk signals — number of techniques used, type of platform targeted — no longer reliably predict how dangerous an actor is. The least and most sophisticated actors in the dataset used a similar number of distinct techniques (16 versus 20); what distinguished the highest-risk actors was architecture — chaining discrete attack stages together with minimal human oversight, i.e., autonomous orchestration. One disrupted state-sponsored espionage campaign mapped to only 30 ATT&CK techniques (a "medium" score under the framework's normal scoring) but, in Anthropic's assessment, warranted a maximum risk score of 100 once its autonomous, multi-stage design was accounted for.
Anthropic notes it has deployed cyber-specific safeguards on its models in response, and is working with MITRE on evolving ATT&CK to better capture these AI-driven, agentic behaviors — a sign that the standard frameworks defenders rely on are still catching up to how AI is actually being used offensively.
Read more — Anthropic