Cloud & Infrastructure News: Google Cloud Next 2026 — TPU 8th Gen, Virgo Network, Agentic Data Cloud, 2026-04-24

Google Cloud Next 2026: Eighth-Generation TPUs and Virgo Network

Google Cloud Next 2026, held April 22–24 in Las Vegas, concluded with a series of infrastructure announcements centred on AI-scale compute. The centrepiece hardware reveal is the eighth generation of Google's Tensor Processing Units, split into two purpose-built variants: the TPU 8t, optimised for training, and the TPU 8i, optimised for inference.

The TPU 8t is designed for large-scale pre-training and fine-tuning workloads. It can scale to 9,600 TPUs in a single superpod configuration, backed by 2 petabytes of shared high-bandwidth memory. The unified memory model across the superpod allows large model weights and optimiser states to be distributed without the data-movement overhead that typically plagues distributed training at that scale. The TPU 8i takes the opposite design approach, prioritising inference density and per-token latency over maximum throughput. It was developed in collaboration with Marvell Technology, which co-designed a memory processing unit (MPU) to improve inference memory bandwidth alongside the TPU stack.
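The published superpod figures invite a quick back-of-the-envelope check. Dividing the 2 PB of shared HBM across 9,600 TPUs gives the implied per-chip memory, and comparing it against the footprint of a large model shows why a unified memory model matters. The model sizing below (parameter count, bytes per parameter) is purely illustrative, not from the announcement:

```python
# Back-of-envelope sizing from the announced TPU 8t superpod figures:
# 9,600 TPUs sharing 2 PB of high-bandwidth memory.
SUPERPOD_TPUS = 9_600
SHARED_HBM_BYTES = 2 * 10**15  # 2 PB (decimal petabytes assumed)

hbm_per_tpu_gb = SHARED_HBM_BYTES / SUPERPOD_TPUS / 10**9
print(f"Implied HBM per TPU: {hbm_per_tpu_gb:.0f} GB")  # ~208 GB

# Illustrative only: a hypothetical 2-trillion-parameter model in bf16
# (2 bytes/param) with fp32 Adam optimiser state (~12 bytes/param extra).
params = 2 * 10**12
bytes_needed = params * (2 + 12)
print(f"Weights + optimiser state: {bytes_needed / 10**15:.3f} PB")  # 0.028 PB
```

Even under these rough assumptions, weights plus optimiser state occupy only a small fraction of the pool, leaving the bulk of shared HBM for activations and parallelism overhead.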

Supporting both variants is the Virgo Network, a new AI-native data centre fabric that Google is positioning as the interconnect layer for its AI Hypercomputer infrastructure. Virgo replaces the traditional Ethernet-based fabric with a topology designed specifically for the all-to-all communication patterns that dominate large-scale model training and inference. While Google has not published detailed specifications for Virgo's bandwidth or topology, the announcement frames it as a prerequisite for multi-superpod coordination at the scale required by frontier model training runs.
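The pressure that all-to-all traffic puts on a fabric is visible from the flow count alone: every worker exchanges a shard with every other worker, so the number of point-to-point flows grows quadratically with pod size. A minimal sketch of that scaling, using the announced superpod size as the upper data point:

```python
def all_to_all_flows(n: int) -> int:
    """Directed point-to-point flows when n workers each send a
    distinct shard to every other worker (n * (n - 1))."""
    return n * (n - 1)

# Flows grow quadratically: doubling the pod roughly quadruples traffic.
for n in (256, 9_600):
    print(f"{n:>6} workers -> {all_to_all_flows(n):,} flows")
```

At 9,600 workers that is over 92 million concurrent flows, which is the kind of load a conventional Ethernet fabric was never shaped for.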

For developers using Google Cloud for inference and fine-tuning workloads, the TPU 8i availability roadmap and pricing will be the most immediately actionable details. Google has indicated that access will roll out progressively through 2026 via the standard TPU quota request process.

Read more — Google Cloud Blog


Google Cloud Next 2026: Cross-Cloud Lakehouse and 16 Billion Tokens Per Minute

Two data and AI capacity announcements from Google Cloud Next 2026 reflect the scale at which cloud AI workloads are now operating. The Cross-Cloud Lakehouse standardises on Apache Iceberg as the table format and allows BigQuery to query data residing in AWS S3 and Azure Blob Storage without requiring data movement or replication. Developers working in organisations with data spread across multiple cloud providers can run BigQuery SQL against their Iceberg tables wherever they live, reducing the latency and cost of centralising data before analysis.
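In practice, a cross-cloud query should look like ordinary BigQuery SQL once the external Iceberg tables are registered. The sketch below only composes the query text; every project, dataset, and table name is invented for illustration, and the exact registration DDL Google ships may differ:

```python
# Hypothetical example: joining an Iceberg table that lives in AWS S3
# with a native BigQuery table, using ordinary BigQuery SQL.
# All project/dataset/table names here are placeholders.

def cross_cloud_join_sql(iceberg_table: str, native_table: str) -> str:
    """Compose a BigQuery SQL string joining an externally registered
    Iceberg table with a native table. Purely string assembly."""
    return (
        "SELECT o.order_id, o.total, c.region\n"
        f"FROM `{iceberg_table}` AS o\n"
        f"JOIN `{native_table}` AS c USING (customer_id)\n"
        "WHERE o.order_date >= '2026-01-01'"
    )

sql = cross_cloud_join_sql(
    "my_project.lakehouse.orders_iceberg_on_s3",
    "my_project.analytics.customers",
)
print(sql)
```

With the `google-cloud-bigquery` client, the string would be submitted via `client.query(sql)` like any other query; the data in S3 is read in place rather than copied into BigQuery first, which is the point of the federation model.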

The Apache Iceberg choice is deliberate. Iceberg's open table format has become the de facto standard for cloud-agnostic data lake interoperability, with support from Snowflake, Databricks, AWS Athena, and Azure Synapse. By anchoring Google's cross-cloud story on Iceberg, Google Cloud is betting on format-level interoperability rather than proprietary connectors, which lowers the cost of switching analytical workloads between providers. Teams already using Iceberg in a hybrid or multicloud strategy will find the Cross-Cloud Lakehouse integration relatively straightforward to adopt.

On the AI capacity front, Google reported that direct Gemini API calls now process 16 billion tokens per minute—up from 10 billion the previous quarter. This 60% quarter-over-quarter throughput increase reflects both hardware investments and infrastructure optimisations across Google's serving stack. The metric matters to developers building latency-sensitive applications on the Gemini API, since higher aggregate throughput generally correlates with more consistent per-request latency at peak load. Google also noted that 75% of Google Cloud customers are now using AI products, and the number of documented generative AI production use cases across enterprises, governments, and startups has reached 1,302.
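The reported figures are straightforward to sanity-check:

```python
# Sanity-checking the reported Gemini API throughput figures.
current_tpm = 16 * 10**9   # tokens per minute, this quarter
previous_tpm = 10 * 10**9  # tokens per minute, previous quarter

qoq_growth = (current_tpm - previous_tpm) / previous_tpm
tokens_per_second = current_tpm / 60

print(f"QoQ growth: {qoq_growth:.0%}")  # 60%
print(f"Aggregate rate: {tokens_per_second / 10**6:,.1f}M tokens/s")  # 266.7M tokens/s
```

The 60% figure checks out, and the per-second view (roughly 267 million tokens every second, aggregated across all callers) gives a more intuitive sense of the serving scale involved.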

Read more — Google Cloud Blog


Google Cloud Next 2026: Agentic Data Cloud

The third major theme of Google Cloud Next 2026 is what Google is calling the Agentic Data Cloud—a set of capabilities designed to let AI agents operate directly on business data rather than working exclusively from static context passed into a prompt. The core idea is that agents need to read, write, and reason over live operational data, not just summarise documents or answer questions from a knowledge base.

In practice, the Agentic Data Cloud encompasses several existing and new services. Google Cloud's databases, data warehouses, and stream processing infrastructure are being instrumented so that agents can call into them as tools via the Gemini API and the Model Context Protocol. The Cross-Cloud Lakehouse described above is part of this picture, as are the Gemini-integrated Cloud SQL and AlloyDB features announced earlier in April. Agents can read from Iceberg tables, query Cloud SQL via the SQL MCP server that reached GA earlier this month, and write enriched results back to BigQuery or Firestore.
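The shape of such an integration follows the Model Context Protocol's tool model: the data store exposes a named tool with a JSON Schema describing its inputs, and the agent invokes it like any other tool call. The sketch below hand-rolls a tool descriptor and a stub handler; the tool name, schema, and guardrails are invented for illustration and are not the GA Cloud SQL MCP server's actual interface:

```python
# Hypothetical MCP-style tool descriptor for a read-only SQL tool.
# The name, schema, and handler are illustrative; the real Cloud SQL
# MCP server defines its own tool surface.
import json

QUERY_TOOL = {
    "name": "cloudsql_query",  # invented name
    "description": "Run a read-only SQL query against an operational database.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "sql": {"type": "string", "description": "A single SELECT statement."},
            "row_limit": {"type": "integer", "default": 100},
        },
        "required": ["sql"],
    },
}

def handle_tool_call(name: str, arguments: dict) -> dict:
    """Stub dispatcher: validate the call shape and return a canned result."""
    if name != QUERY_TOOL["name"]:
        raise ValueError(f"unknown tool: {name}")
    if not arguments.get("sql", "").lstrip().upper().startswith("SELECT"):
        raise ValueError("only read-only SELECT statements are allowed")
    # A real server would execute against Cloud SQL here, subject to IAM
    # controls and audit logging.
    return {"rows": [], "row_count": 0}

print(json.dumps(QUERY_TOOL["inputSchema"]["required"]))
print(handle_tool_call("cloudsql_query", {"sql": "SELECT 1"}))
```

The read-only guard in the stub stands in for the IAM and audit controls that would gate a production deployment; the descriptor itself is what an agent framework would surface to the model as an available tool.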

Google's framing for this layer is that the bottleneck in most agentic workflows is not the model's reasoning capability but its access to timely, accurate data. By making cloud data stores natively agent-accessible—with appropriate IAM controls and audit logging—Google is positioning its data infrastructure as the connective tissue between model intelligence and business context.

The 16-billion-tokens-per-minute processing figure is the throughput backbone that makes this vision plausible at enterprise scale. Developers building agentic applications on Google Cloud should evaluate the new MCP-based data tool integrations as they continue to reach GA over the coming months.

Read more — Oplexa


Written by

Stanislav Lentsov

Software Architect
