Cloud & Infrastructure News: Google Cloud Run GPU Inference, Bedrock AgentCore CLI, and AWS Lambda Ruby 4.0, 2026-05-06

Google Cloud Next 2026: Cloud Run GPU Inference, GKE Agent Sandbox, and BigQuery Graph

Google Cloud Next '26 (April 22–24, 2026) produced 260 announcements across Google's infrastructure and developer platform. Several are directly relevant to teams building AI-native applications on Google Cloud. Cloud Run now supports GPU inference on NVIDIA RTX 6000 Blackwell hardware, allowing teams to deploy containerized inference workloads on managed serverless infrastructure without provisioning dedicated VM clusters. That significantly lowers the operational overhead of running custom model endpoints or self-hosted LLMs at production scale.
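
A deployment would presumably follow the existing `gcloud run deploy` flow. As a minimal sketch, assuming the current Cloud Run GPU flags (`--gpu`, `--gpu-type`) carry over to the new hardware — the GPU type value and image path below are illustrative assumptions, not confirmed identifiers:

```shell
# Deploy a containerized inference server to Cloud Run with one GPU attached.
# The --gpu-type value is a hypothetical identifier for the Blackwell hardware;
# check `gcloud run deploy --help` for the supported values in your region.
gcloud run deploy llm-endpoint \
  --image=us-docker.pkg.dev/my-project/inference/vllm-server:latest \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-rtx-6000-blackwell \
  --cpu=8 \
  --memory=32Gi \
  --no-cpu-throttling
```

Everything else — autoscaling, revision management, request-based billing — works the same as for CPU-only Cloud Run services.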

GKE Agent Sandbox introduces an isolated execution environment for AI coding agents operating within Kubernetes clusters. The sandbox enforces resource constraints and security boundaries so that agentic workloads — including code execution, shell commands, and file system access — cannot escape their intended scope. Combined with hypercluster support for scheduling across millions of accelerators, GKE now provides a credible path for large-scale multi-agent orchestration in regulated enterprise environments.

BigQuery gained native Graph capabilities, enabling developers to run graph traversal queries directly on BigQuery datasets without exporting to a separate graph database. The feature supports common graph analytics patterns — shortest path, connected components, neighbor enumeration — and integrates with the new AI.PARSE_DOCUMENT function that extracts structured data from unstructured enterprise documents at query time. These additions make BigQuery a more complete substrate for agentic applications that need to reason over connected enterprise data.
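To make the shape of this concrete: a traversal over a BigQuery property graph is expressed in GoogleSQL with `GRAPH_TABLE`. The sketch below assumes a hypothetical graph named `purchases` over customer and product tables — the graph name, labels, and column names are illustrative, not confirmed schema:

```sql
-- Illustrative traversal over a hypothetical property graph `purchases`:
-- find products reachable from one customer within two purchase hops.
SELECT gt.customer_id, gt.product_name
FROM GRAPH_TABLE(
  purchases
  MATCH (c:Customer)-[:BOUGHT]->{1,2}(p:Product)
  WHERE c.customer_id = 'C-1042'
  RETURN c.customer_id, p.name AS product_name
) AS gt;
```

The point of running this in BigQuery rather than a dedicated graph store is that `gt` is an ordinary table expression, so the result can be joined against the rest of the warehouse in the same query.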

Gemini 3.1 Pro is now available in preview in Vertex AI, bringing improved long-context reasoning and coding performance to teams already using Vertex for managed model access. Remote MCP server management was also added to Cloud Run, allowing developers to expose and consume MCP tools directly through Cloud Run's managed infrastructure without standing up separate server processes.

Read more — Google Cloud Blog


Amazon Bedrock AgentCore: Managed Harness and Dedicated CLI for Agent Prototyping

Amazon Bedrock AgentCore entered preview in the week of April 27, 2026 with a managed agent harness, a dedicated CLI, and new developer tools aimed at reducing the time from prototype to deployed agent. The AgentCore CLI provides typed commands for creating, deploying, and iterating on Bedrock-hosted agents without writing orchestration code manually — developers describe agent capabilities, tools, and memory requirements declaratively, and AgentCore handles the runtime wiring.
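
The announcement does not publish a schema, but a declarative agent description of the kind the CLI consumes might look roughly like the following — every field name here is a hypothetical illustration of "capabilities, tools, and memory requirements," not documented AgentCore syntax:

```yaml
# Hypothetical AgentCore agent definition -- all field names are illustrative.
name: order-support-agent
model: anthropic.claude-sonnet        # Bedrock model identifier (illustrative)
capabilities:
  - answer-order-questions
tools:
  - name: lookup_order
    description: Fetch an order record by ID
    auth: iam                         # harness wires IAM-based tool authorization
memory:
  type: session
  retention: 24h
```

Given a definition like this, the CLI would generate and deploy the runtime wiring, so iteration happens on the declaration rather than on orchestration code.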

The managed harness addresses one of the most common friction points in agent development: the gap between a working local prototype and a production-grade deployment with proper tool authentication, execution boundaries, and observability. AgentCore's harness provides these out of the box, including integration with AWS IAM for tool authorization and CloudWatch for tracing agent decisions and tool call outcomes.

Alongside AgentCore, the April 27 roundup confirmed that Anthropic is training its next generation of Claude models on AWS Trainium and Graviton infrastructure, deepening the AWS–Anthropic partnership beyond model availability in Bedrock to the actual compute layer. Claude Cowork also became available in Bedrock, adding collaborative team workflows where multiple users can share and continue the same Claude Code sessions.

Read more — AWS News Blog


AWS Lambda Gains Ruby 4.0 Runtime and New EC2 Instances for High-Throughput Workloads

The May 4, 2026 AWS weekly roundup confirmed Lambda Ruby 4.0 runtime support across all AWS regions, with advanced logging controls for structured JSON output — a useful combination for serverless applications that feed into log aggregation pipelines. Ruby 4.0 brings significant performance improvements over 3.x for CPU-bound workloads and is now a first-class Lambda runtime alongside Python, Node.js, and Java.
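
As a sketch of the structured-logging pattern: with Lambda's JSON log format enabled, each JSON object a function writes to stdout arrives in CloudWatch as one structured record. The `event:`/`context:` keyword signature is the standard Lambda Ruby handler shape; the handler name and event fields below are illustrative:

```ruby
require 'json'

# Minimal Lambda-style handler sketch. With structured JSON logging enabled,
# each JSON object written to stdout becomes one queryable log record.
def handler(event:, context:)
  order_id = event['order_id']

  # Structured log line: downstream aggregation pipelines can filter on keys
  # like `level` and `order_id` instead of regex-parsing free-form text.
  puts JSON.generate(level: 'INFO', message: 'order received', order_id: order_id)

  { statusCode: 200, body: JSON.generate(received: order_id) }
end
```

Calling `handler(event: { 'order_id' => 'A-17' }, context: nil)` returns a 200 response and emits one JSON log line.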

On the EC2 side, C8ine and M8ine network-optimized instances are now generally available, offering up to 600 Gbps of network bandwidth. These complement the previously announced M8in/M8ib (general-purpose, network-optimized) and R8in/R8ib (memory-optimized, network-optimized) instances, filling out the 8th-generation instance family. For teams running distributed inference, high-throughput data pipelines, or network-intensive agent coordination, the C8ine/M8ine lineup removes network bandwidth as the bottleneck in multi-node deployments.

Amazon Connect evolved into a suite of four specialized agentic AI products at the What's Next event: Connect Decisions for supply chain planning, Connect Talent for hiring automation, Connect Customer for omnichannel experiences, and Connect Health for patient management workflows. The productization signals AWS's shift from selling raw AI infrastructure to selling vertical agent solutions built on top of it.

Read more — AWS News Blog


Written by Stanislav Lentsov, Software Architect