Google Just Reset the AI Race: Inside Gemini 3.5 Flash and Omni

Vexlint Team · 6 min read

If you blinked, you may have missed it: Google walked onto the Shoreline Amphitheatre stage and quietly turned the page on what “fast AI” and “creative AI” are supposed to mean. Gemini 3.5 Flash and Gemini Omni aren’t just version bumps — they’re a statement about where the next 12 months of AI are headed. Here’s the full breakdown, in plain English.


The headline number that says it all: 900 million

Before we get into the models, one stat frames everything. Google reported that Gemini grew from 400 million users at last year’s I/O to more than 900 million monthly users, spanning over 230 countries and 70 languages. More than doubling in a single year isn’t a footnote; it’s the pressure that made these announcements feel less like experiments and more like deployments at planetary scale.


Gemini 3.5 Flash: the “fast” model that stopped being a compromise

For years, the trade-off in AI was simple and annoying: you could have a smart model or a fast model, rarely both. Gemini 3.5 Flash is Google’s attempt to delete that trade-off.

What it actually does

The pitch is that Gemini 3.5 Flash combines frontier intelligence with the ability to perform agentic tasks — and the benchmarks back the ambition. According to Google, it surpasses Gemini 3.1 Pro on coding, agentic, and multimodal benchmarks. Read that again: the new “fast” model beats last quarter’s flagship-tier “pro” model on the hard stuff.

The speed is genuinely absurd

In terms of output tokens per second, Gemini 3.5 Flash is reportedly 4x faster than other frontier models. And it does this while being cheaper — roughly a third to a half cheaper than before. Faster and smarter and cheaper is the rare trifecta the industry keeps promising and rarely delivers all at once.
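To make those ratios concrete, here is a rough back-of-the-envelope sketch. Google's figures are relative ("4x faster" than other frontier models, "a third to a half cheaper" than before), so the absolute baselines below are placeholders invented for illustration, not published numbers. The speed and price comparisons also use different reference points, so treat them as two separate calculations.

```python
# Hypothetical baselines: the article gives only relative figures, so these
# absolute numbers are placeholders chosen to make the arithmetic easy to read.
BASELINE_TOKENS_PER_SECOND = 100.0   # assumed speed of a comparison model
PREVIOUS_PRICE = 10.0                # assumed earlier price, arbitrary units

SPEEDUP = 4.0                        # "4x faster" as reported
PRICE_CUT_RANGE = (1 / 3, 1 / 2)     # "a third to a half cheaper"


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response of the given length at a steady rate."""
    return tokens / tokens_per_second


def price_after_cut(price: float, cut: float) -> float:
    """Price that remains after removing the given fraction."""
    return price * (1.0 - cut)


response_tokens = 20_000
baseline_seconds = seconds_to_generate(response_tokens, BASELINE_TOKENS_PER_SECOND)
flash_seconds = seconds_to_generate(
    response_tokens, BASELINE_TOKENS_PER_SECOND * SPEEDUP
)
# Tuple order follows PRICE_CUT_RANGE: (after a one-third cut, after a one-half cut).
new_price_range = tuple(price_after_cut(PREVIOUS_PRICE, cut) for cut in PRICE_CUT_RANGE)

print(f"{baseline_seconds:.0f}s -> {flash_seconds:.0f}s for {response_tokens} tokens")
print(f"price: {new_price_range[1]:.2f} to {new_price_range[0]:.2f}")
```

Whatever the real baselines turn out to be, the shape of the result holds: a 4x throughput gain cuts generation time to a quarter, and a cut of a third to a half leaves between two thirds and one half of the old price.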

Why “agentic” is the word that matters

The most important shift here isn’t speed — it’s autonomy. Google frames the entire Gemini 3.5 family as combining frontier intelligence with action. The model is built for long-horizon agentic tasks: it can plan, build, and iterate to solve real problems like developing new applications, maintaining codebases, or preparing financial documents. It can even act as a platform for deploying teams of subagents — meaning one model orchestrating many smaller workers — and it generates richer, more interactive web UIs and graphics along the way.

In short: this isn’t a chatbot that answers a question. It’s a worker that finishes a project.
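For readers who want a picture of what "one model orchestrating many smaller workers" looks like in code, here is a minimal sketch of the general orchestrator-and-workers pattern. It is not Google's implementation and it calls no Gemini API: `plan`, `run_subagent`, and `orchestrate` are hypothetical names, and the workers are plain functions standing in for model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    payload: str


def plan(goal: str) -> list[SubTask]:
    """Hypothetical planner: split a goal into independent subtasks.

    A real orchestrator would ask a model to produce this plan.
    """
    return [
        SubTask("outline", goal),
        SubTask("draft", goal),
        SubTask("review", goal),
    ]


def run_subagent(task: SubTask) -> str:
    """Hypothetical worker: stands in for a call to a smaller model."""
    return f"[{task.name}] done for: {task.payload}"


def orchestrate(goal: str) -> list[str]:
    """Plan, fan the subtasks out to workers, and gather results in plan order."""
    tasks = plan(goal)
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # map() preserves input order even though workers run concurrently.
        return list(pool.map(run_subagent, tasks))


results = orchestrate("quarterly report")
for line in results:
    print(line)
```

The design point is the split of responsibilities: the orchestrator owns the plan and the final assembly, while each worker only sees its own subtask. A real system would add retries, shared context, and a step where the orchestrator checks the workers' output before moving on.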

Where you can use it today

Gemini 3.5 Flash is rolling out across the Gemini app, Google Search, Google Antigravity 2.0, and the Gemini API. The bigger sibling, Gemini 3.5 Pro, is currently in testing and expected next month, with the same focus on agentic work.


A quick timeline (so the version numbers make sense)

Model               Released
Gemini 3            November 2025
Gemini 3.1          February 2026
Gemini 3.5 Flash    May 2026
Gemini 3.5 Pro      Expected next month

The cadence tells its own story: roughly a major release every three months. The race isn’t slowing down — it’s compounding.
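The three-month figure is easy to check from the timeline. A small sketch follows; the timeline gives only months, so the day of the month used here (the 1st) is an assumption that does not affect the month gaps, and Gemini 3.5 Pro is left out because it has no release date yet.

```python
from datetime import date

# Release months from the timeline; the day (1st) is a placeholder.
releases = [
    ("Gemini 3", date(2025, 11, 1)),
    ("Gemini 3.1", date(2026, 2, 1)),
    ("Gemini 3.5 Flash", date(2026, 5, 1)),
]


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from one date to another, ignoring the day."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


# Pair each release with the next one and measure the gap in months.
gaps = [
    months_between(prev_date, next_date)
    for (_, prev_date), (_, next_date) in zip(releases, releases[1:])
]

for (prev_name, _), (next_name, _), gap in zip(releases, releases[1:], gaps):
    print(f"{prev_name} -> {next_name}: {gap} months")
```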


Gemini Omni: the model that turns “any input” into “any output”

If Gemini 3.5 Flash is the workhorse, Gemini Omni is the showstopper — and arguably the more conceptually radical of the two.

What makes it different

Most AI video tools generate video from scratch. Omni does something stranger and more powerful: it belongs to a new class of multimodal models that can reshape actual real-life footage into something that, frankly, would otherwise exist only inside your head. You feed it images, audio, video, or just text describing your vision, and it builds the result. Crucially, it thinks the story through by analyzing multiple aspects together rather than just rendering frames.

Demis Hassabis, CEO of Google DeepMind, framed it directly on stage: the long-term goal for Omni is to generate any type of output from any kind of input. The first version, Gemini Omni Flash, is set to launch this summer.

The features that stood out

  • Conversational editing. You can refine videos using natural language until the result matches your vision — editing characters, backgrounds, and elements, even via voice commands.
  • Avatars that are you. Upload a digital version of yourself and create videos featuring characters that look and sound like you, dropped into action scenes.
  • Real-world physics. Google emphasized that generated videos follow real-life physics — the model understands gravity, kinetic energy, and even fluid dynamics. This is the difference between “AI video” and “video that feels real.”

How you’ll actually get it

Omni rolls out alongside the new Flash model in the Gemini app for paying subscribers across the Google AI Plus, Pro, and Ultra tiers. It will also be available through Flow, Google’s AI film-making tool. And — the clever distribution play — you’ll be able to use it for free to create Remixes of existing YouTube Shorts, including inside YouTube Create.

One open question Google has not yet answered: whether creators will be able to restrict AI remixing of their own content. That’s a story worth watching.


The supporting cast: Spark and a brand-new look

This wasn’t only about two models.

Gemini Spark was introduced as a 24/7 agent — a persistent assistant that works in the background rather than only when you open the app.

The Gemini app itself also got a full redesign called Neural Expressive: a new design language with fluid animations, vibrant colors, haptic feedback, and new typography, built around a pill-shaped prompt box. It’s rolling out now across Android, iOS, and the web.


So what does this actually mean?

Strip away the keynote polish and three things stand out:

  1. The speed-vs-intelligence trade-off is dying. When the “fast” model beats last quarter’s “pro” model, the old mental model of picking one or the other no longer holds.
  2. AI is shifting from answering to doing. “Agentic,” “long-horizon,” “teams of subagents” — the vocabulary has changed because the product has. The bar is now task completion, not response quality.
  3. Creative AI is becoming unified. Omni’s bet is that one model handling text, image, and video natively beats stitching specialized tools together. If that bet pays off, the era of juggling four different creative apps may be ending.

Is it all proven? No. Many of these are Google’s own benchmark claims, and real-world testing always tells a messier story than a keynote slide. Independent evaluations over the coming weeks will be the real verdict. But the direction of travel is unmistakable.

The models keep getting faster. The tasks keep getting bigger. And the gap between “describe what you want” and “here it is, finished” keeps shrinking.


Want a deeper dive into any single piece — the agentic coding benchmarks, the Omni physics engine, or how 3.5 Flash compares to rivals? That’s a whole post of its own.