Open Responses: An Open-Source Inference Standard Built for Agentic AI
Open Responses is a new open-source inference specification initiated by OpenAI, built by the broader open-source AI community, and backed by the Hugging Face ecosystem. It extends and open-sources OpenAI's Responses API (launched March 2025) to address a fundamental architectural mismatch: the Chat Completions format was designed for turn-based conversations but falls short when systems need to reason, plan, and act autonomously across extended multi-step workflows.
The specification introduces several design choices tailored to agentic patterns. Responses are stateless by default, with optional encrypted reasoning for provider-specific internal traces. Streaming is delivered as typed semantic events (not raw text deltas), enabling clients to react to discrete reasoning steps, tool call initiations, and result deliveries without parsing undifferentiated token streams. Native agent loop support is built into the request schema: developers specify max_tool_calls, tool_choice, and tool definitions in a single payload, and the provider manages the reasoning/tool-call loop server-side.
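To make the single-payload agent loop concrete, here is a minimal sketch of what such a request might look like. The `max_tool_calls`, `tool_choice`, and `tools` fields come from the description above; the model name, tool schema details, and other field values are illustrative assumptions, not confirmed spec values.

```python
import json

# Sketch of an Open Responses-style agentic request. The provider runs the
# reasoning/tool-call loop server-side, bounded by max_tool_calls.
request = {
    "model": "example-model",  # hypothetical model identifier
    "input": "Find flights from NYC to SFO and summarise the options.",
    "max_tool_calls": 5,       # cap on the server-side tool-call loop
    "tool_choice": "auto",     # provider decides when to invoke tools
    "tools": [
        {
            # An external, client-side function tool (as opposed to an
            # internal tool hosted in provider infrastructure).
            "type": "function",
            "name": "search_flights",
            "parameters": {
                "type": "object",
                "properties": {
                    "origin": {"type": "string"},
                    "destination": {"type": "string"},
                },
                "required": ["origin", "destination"],
            },
        }
    ],
    "stream": True,  # request typed semantic events rather than raw deltas
}

payload = json.dumps(request)
```

Because everything the loop needs is declared up front, a routing layer can forward this payload to any compliant provider without per-provider translation.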
Reasoning visibility is tiered through three optional fields — content (raw reasoning traces), encrypted_content (provider-specific protected content), and summary (a sanitised summary) — giving consumers control over how much internal reasoning they expose. The tool model distinguishes between external tools (MCP servers, client-side functions) and internal tools (hosted within provider infrastructure like file search or API integrations). A compatible implementation is available on Hugging Face Inference Providers today. For developers building routing layers, inference proxies, or multi-provider agents, Open Responses provides a shared interface that avoids hard-coding provider-specific API shapes.
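On the consumer side, the three-tier reasoning model lends itself to a small selection helper. This is an illustrative sketch assuming the response carries a reasoning item as a dict with the three optional fields named above; the surrounding item shape and the `policy` parameter are assumptions, not part of the spec.

```python
from typing import Optional

def visible_reasoning(reasoning_item: dict, policy: str = "summary") -> Optional[str]:
    """Choose which reasoning tier to expose downstream.

    Field names content / encrypted_content / summary follow the spec as
    described; the dict shape and policy values are assumptions.
    """
    if policy == "full" and reasoning_item.get("content"):
        return reasoning_item["content"]   # raw reasoning trace
    if reasoning_item.get("summary"):
        return reasoning_item["summary"]   # sanitised summary
    # encrypted_content is provider-opaque: echo it back to the provider on
    # later turns if needed, but never surface it to end users.
    return None

item = {"summary": "Compared two flight options.", "encrypted_content": "opaque-blob"}
print(visible_reasoning(item))  # prints the sanitised summary
```

A proxy in front of multiple providers could apply this once, regardless of which backend produced the response.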
Read more — Hugging Face Blog
Anthropic Research: AI Agent Sessions Are Getting Longer and More Autonomous
Anthropic published "Measuring AI Agent Autonomy in Practice", a research analysis examining millions of human-agent interactions across Claude Code and the public API from late 2025 through early 2026. The findings quantify how developers are actually granting autonomy to AI agents — and the trend is accelerating faster than model capability gains alone can explain.
The most striking data point: between October 2025 and January 2026, the 99.9th percentile turn duration in Claude Code sessions nearly doubled, from under 25 minutes to over 45 minutes. Crucially, this growth was smooth across model releases, suggesting that user behaviour — not just model capability — is driving longer autonomous runs. As users gain experience with Claude Code, they transition from reviewing individual actions to granting full auto-approval in over 40% of sessions (up from ~20% for new users), while also increasing interruption rates from 5% to 9%. The pattern reflects a shift from action-by-action supervision to exception-based intervention.
The research also found that Claude Code itself pauses for clarification more than twice as often as humans interrupt it on complex tasks: an emergent self-limiting behaviour where the model recognises uncertainty and proactively requests human guidance. Software engineering accounts for nearly 50% of all API tool calls, with healthcare and finance showing emerging but limited adoption. Reassuringly, only 0.8% of agent actions in the dataset appear irreversible, and 80% of actions involve safety checkpoints.
For developers building or deploying agent systems, the research's practical recommendations are clear: invest in post-deployment monitoring infrastructure, train models to surface uncertainty rather than proceeding blindly, and design product interfaces that emphasise user visibility and intervention capabilities rather than mandating rigid approval workflows. The full paper is available on Anthropic's research page.
Read more — Anthropic Research
Hugging Face State of Open Source AI: Spring 2026
Hugging Face published its State of Open Source on Hugging Face: Spring 2026 report, covering ecosystem growth, geographic shifts, model adoption patterns, and emerging use cases. The report draws on data across 13 million users, 2 million public models, and 500,000 public datasets — with the user base having nearly doubled year-over-year.
The geographic picture has shifted significantly: China surpassed the United States in monthly downloads, with Chinese organizations accounting for 41% of downloads in 2025. Baidu grew from zero to over 100 releases; ByteDance and Tencent each increased releases by 8–9×. The Qwen family (Alibaba) accumulated more than 113,000 direct derivative models — more derivatives than Google and Meta combined — and over 200,000 when all Qwen-tagged models are counted. At the same time, the independent developer share of downloads rose from 17% to 39%, while corporate/industry share dropped from ~70% to 37%, reflecting a democratization of model development.
The robotics category stands out as the fastest-growing use case: robotics datasets grew from 1,145 to 26,991 in a single year (a 23× increase), climbing from rank 44 to become the largest dataset category on the Hub. LeRobot GitHub stars nearly tripled. This reflects a broader trend identified in the report: open-source coordination is proving especially valuable where closed, proprietary models struggle, notably in robotics, scientific computing, protein folding, and drug discovery. The report also finds that more than 30% of Fortune 500 companies now maintain verified Hugging Face accounts, and that small models (hundreds of millions to a few billion parameters) continue to dominate production deployments even as flagship sizes grow: the mean model size rose from 827M to 20.8B parameters, but the median moved only from 326M to 406M.
Read more — Hugging Face Blog
Links & Sources
- Open Responses: What you need to know — Hugging Face Blog
- Measuring AI agent autonomy in practice — Anthropic Research
- State of Open Source on Hugging Face: Spring 2026 — Hugging Face Blog