Google I/O — Gemini
Google I/O 2026: Gemini 3.5 Flash GA, Gemini Spark 24/7 Agent, AI Ultra Cut to $99.99
Sundar Pichai’s keynote in Mountain View unveils Gemini 3.5 Flash, generally available from day one at 289 tokens per second and four times faster than comparable frontier models, alongside Spark, a persistent cloud agent with native Workspace integration, and a restructured AI Ultra tier cut to $99.99 per month.
Google I/O 2026 opened at Shoreline Amphitheatre Tuesday morning with the keynote Sundar Pichai had clearly been building toward for the better part of a year. Gemini 3.5 Flash, the headline model release, ships generally available from day one, an unusually aggressive launch posture for a frontier-class checkpoint. The model serves at 289 tokens per second, roughly four times the throughput of comparable frontier models at its price point, and pairs that speed with a one-million-token context window and full multimodal input across text, images, audio, and video. Its benchmark results break through several previously sticky ceilings: 83.6 percent on MCP Atlas (8.3 points ahead of GPT-5.5), 76.2 percent on Terminal-Bench 2.1, and the leading score on a clutch of multimodal reasoning suites. Pricing comes in at $1.50 per million input tokens and $9.00 per million output tokens, low enough to make Flash credible not just for chat-style workloads but for the kind of high-volume agentic loops that have driven the recent pricing pressure from Anthropic and OpenAI.
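To put the quoted rates in concrete terms, the short Python sketch below estimates the token cost and raw generation time of an agentic loop. It uses only the figures reported above ($1.50 and $9.00 per million input and output tokens, 289 tokens per second). The workload shape (50 steps, 20,000 input tokens and 1,000 output tokens per step) is a hypothetical chosen purely for illustration, and the sketch assumes the 289 tokens-per-second figure describes output generation. It ignores time to first token, network latency, and any caching or batch discounts, none of which the keynote coverage specifies.

```python
# Figures as quoted in the article for Gemini 3.5 Flash.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00
OUTPUT_TOKENS_PER_SECOND = 289  # assumed here to be the output generation rate


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Token cost in USD at the quoted per-million-token rates."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Seconds spent generating output at the quoted throughput."""
    return output_tokens / OUTPUT_TOKENS_PER_SECOND


# Hypothetical agentic loop (illustrative numbers, not from the keynote).
steps = 50
input_tokens_per_step = 20_000
output_tokens_per_step = 1_000

total_input = steps * input_tokens_per_step
total_output = steps * output_tokens_per_step

print(f"Estimated cost: ${estimate_cost_usd(total_input, total_output):.2f}")
print(f"Estimated generation time: {estimate_generation_seconds(total_output):.0f} s")
```

Under these assumptions the loop costs about $1.95 in tokens and spends just under three minutes generating output. Input tokens account for most of the bill in this example, which is typical of agentic workloads that resend long contexts at every step.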
The model is, however, only the on-ramp to what Google framed as the real story of the keynote: Gemini Spark, a 24/7 persistent agent that lives on dedicated cloud VMs and operates continuously on a user’s behalf even when no browser tab is open. Spark plugs natively into Gmail, Docs, Sheets, Slides, and Calendar — the parts of the Workspace stack where Google’s data-access moat is deepest — and exposes MCP connectors to a launch roster of third-party services including Canva, OpenTable, and Instacart. The agent can be assigned ongoing tasks (“watch for any email about the Anderson contract and draft me a reply within an hour”) or one-shot work that requires cross-application orchestration (“book the team a dinner Thursday near the conference venue, send the calendar invite, and update the project doc”). It is the most ambitious productization to date of the persistent-agent thesis — the bet that the next consumer-AI category is not a chatbot but a continuously running digital coworker.
The pricing reshuffle underneath the Spark announcement is the more surprising business move. The AI Ultra tier, which Google launched last year at $250 per month as its flagship consumer subscription, has been cut to $99.99 per month, a reduction of roughly 60 percent. The repriced tier bundles Spark beta access, usage limits five times higher across the Gemini family, twenty terabytes of Google One storage, and YouTube Premium at no additional cost. The cut is steep enough that several analysts on the morning circuit read it as a direct competitive response to ChatGPT Pro’s pricing posture; Google’s framing, in keeping with the keynote’s overall tone, was that Ultra had to be repriced because the value of the bundle had grown faster than the original price point reflected. Either way, the result is the cheapest frontier-class consumer AI subscription on the market that includes a persistent agent.
The third Gemini-side announcement of the keynote was Gemini Omni — a unified multimodal world model that Google described as the next step beyond Gemini’s existing multimodal capabilities. Omni rolls out first to Plus, Pro, and Ultra subscribers in the Gemini app, with API access to follow through the second half of the year. Google’s framing positions Omni as the substrate for the immersive-XR and live-video-understanding features that other parts of the keynote announced separately — Project Astra in Search, the Android XR glasses, Veo 3 with native audio — which together describe a stack-wide push to make multimodal understanding the default operating mode rather than a specialty feature accessed through dedicated entry points.
The overall picture from the morning is that Google has decided to compete on the frontier not by lobbing a single hero model into the market but by reorganizing its entire consumer surface around persistent multimodal agents and pricing them more aggressively than any frontier lab has previously been willing to do. Whether that strategy survives the usage patterns that emerge over the next quarter remains to be seen, and Spark’s reliability at 24/7 scale is the single biggest open question, but the strategic intent is unmistakable. Google is using its Workspace data moat, its YouTube bundle, and its willingness to subsidize at scale to try to establish the persistent-agent category as Google-coded before competitors can field equivalent offerings. The next several months will reveal whether the market accepts that framing.