AWS Bedrock AgentCore Harness GA with Managed Knowledge Bases and Web Search
Amazon Web Services announced general availability of the Bedrock AgentCore Harness at AWS Summit New York. The Harness lets developers define agents through configuration instead of writing orchestration loops, which, according to the announcement, gets production-grade AI agents running in minutes. It connects agents to organizational knowledge, web sources, and paid data feeds, and includes built-in production monitoring for finding and fixing issues in deployed agents.
The accompanying Fully Managed Knowledge Bases simplify enterprise RAG pipeline construction with three key capabilities: native data connectors, which provide pre-built integrations with common data sources; Smart Parsing, which automatically prepares multi-format data (documents, tables, and images) without manual preprocessing; and an Agentic Retriever, which handles complex multi-step queries by decomposing them and retrieving across multiple knowledge sources autonomously.
Web Search on Bedrock AgentCore arrives as a fully managed tool that gives agents access to current, cited web knowledge with zero data egress from the customer's secured AWS environment. This is significant for enterprise deployments where data residency requirements previously made web-grounded agent responses impractical. The feature requires no manual implementation or infrastructure management beyond enabling it in the agent configuration.
Read more — AWS News Blog
AWS WAF Launches AI Traffic Monetization for Content Providers
AWS WAF introduced AI traffic monetization as a new Bot Control capability, enabling content providers and publishers to price, meter, and collect payment from AI bots and agents accessing their content and APIs. The feature provides edge-based access control with third-party payment integration, operating at the WAF layer before requests reach origin servers.
The timing aligns with a broader industry reckoning around AI training data economics. As AI agents increasingly crawl web content to ground their responses in real-time information, content providers have faced a choice between blocking AI traffic entirely or allowing unrestricted access with no compensation. WAF AI traffic monetization introduces a middle path where providers can set per-request or volume-based pricing, meter consumption at the edge, and enforce payment before granting access.
For developers building AI agents that consume third-party web content, this signals a shift in the cost model. Agents that previously accessed content freely may now need to handle payment negotiation or authentication flows. Applications should treat an HTTP 402 Payment Required response from a WAF-protected endpoint as a distinct, expected outcome rather than a generic failure.
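As a rough illustration of what handling a 402 could look like on the agent side, here is a minimal Python sketch using only the standard library. The `FetchResult` type, the `interpret_status` and `fetch` helpers, and the `example-agent/0.1` User-Agent string are all hypothetical names chosen for this example. The announcement does not describe what a 402 response body contains or how payment is negotiated, so the sketch only detects the status and surfaces it to the caller.

```python
from dataclasses import dataclass
from typing import Optional
import urllib.error
import urllib.request


@dataclass
class FetchResult:
    """Outcome of one fetch attempt (hypothetical type for this sketch)."""
    status: int
    body: Optional[bytes]
    payment_required: bool
    detail: str = ""


def interpret_status(status: int, body: bytes = b"") -> FetchResult:
    """Map an HTTP status to an outcome the agent can act on."""
    if status == 402:
        # Payment Required: do not retry blindly. Keep a short excerpt of the
        # body for logging; its format is an unknown here, so it is not parsed.
        return FetchResult(status, None, True, body.decode("utf-8", "replace")[:200])
    if 200 <= status < 300:
        return FetchResult(status, body, False)
    return FetchResult(status, None, False, f"unexpected status {status}")


def fetch(url: str, timeout: float = 10.0) -> FetchResult:
    """Fetch a URL and report 402 as a normal result instead of raising."""
    request = urllib.request.Request(url, headers={"User-Agent": "example-agent/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status, response.read())
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses, including 402.
        return interpret_status(err.code, err.read())
```

A caller can then branch on `result.payment_required`, for example by skipping the source, falling back to a cached copy, or handing off to whatever payment flow the provider documents.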
Read more — AWS News Blog
Google Cloud Capacity Advisor for Spot Enters Public Preview
Google Cloud launched Capacity Advisor for Spot in public preview, providing data-driven recommendations for optimizing Spot VM deployments. The tool offers an API for querying obtainability scores and minimum estimated uptimes across regions and machine types, alongside a new Console UI featuring a global availability map with Spot price lookups and historical preemption rate trends.
Spot VMs offer significant cost savings over on-demand instances but carry the risk of preemption when Google Cloud needs the capacity back. Until now, choosing the right region, zone, and machine type for Spot workloads required trial and error or manual analysis of preemption patterns. Capacity Advisor surfaces this data programmatically, letting teams build automated deployment strategies that maximize obtainability while minimizing disruption from preemptions.
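To make the idea of an automated placement strategy concrete, the following Python sketch ranks candidate placements by obtainability after filtering on a minimum uptime requirement. It does not call the Capacity Advisor API: the `SpotCandidate` type, its field names, the score scale, and the hour unit are assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpotCandidate:
    """One region/machine-type option (hypothetical shape, not the real API)."""
    region: str
    machine_type: str
    obtainability: float      # assumed: higher means easier to obtain
    min_uptime_hours: float   # assumed unit for minimum estimated uptime


def rank_candidates(
    candidates: List[SpotCandidate], required_uptime_hours: float
) -> List[SpotCandidate]:
    """Drop options whose estimated uptime is too short, then sort best-first."""
    eligible = [c for c in candidates if c.min_uptime_hours >= required_uptime_hours]
    # Prefer higher obtainability; break ties with longer estimated uptime.
    return sorted(eligible, key=lambda c: (-c.obtainability, -c.min_uptime_hours))


# Invented sample data, for illustration only.
options = [
    SpotCandidate("us-central1", "n2-standard-8", 0.90, 2.0),
    SpotCandidate("europe-west4", "n2-standard-8", 0.75, 6.0),
    SpotCandidate("us-east1", "n2-standard-8", 0.75, 8.0),
]
ranked = rank_candidates(options, required_uptime_hours=4.0)
```

Filtering before sorting keeps a highly obtainable but short-lived option from winning when the workload cannot tolerate early preemption. A deployment script could try the ranked options in order and fall back to the next one when a request fails.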
Separately, Google Cloud published a new reference architecture for designing and deploying multi-tenant agentic AI systems. The architecture addresses three common pitfalls of scaling generative AI across business units: fragmented agent silos that duplicate effort, data exposure risks from shared infrastructure, and compliance drift from inconsistent governance. The reference design prescribes tenant isolation boundaries, shared model serving layers, and unified compliance controls for organizations running multiple AI agent deployments.
Read more — Google Cloud Blog