DeepMind seems to me to be six to nine months behind the AI frontier, at least with respect to LLMs. At Google I/O on May 19, Sundar Pichai opened on what he called the agentic Gemini era, shipped the smaller Gemini 3.5 Flash, and said the flagship, Gemini 3.5 Pro, would arrive next month. On June 18 it still has not. I forecast it reaches the public in early July, a step behind the top of the field.
The reasoning and agentic coding Pichai showed on stage are roughly what Anthropic and OpenAI gave paying users earlier this year. DeepMind's best shipping model, Gemini 3.1 Pro, came out in February and trails today's leaders, and Deep Think, the mode behind its strongest results, sits behind an Ultra subscription that runs up to $250 a month. Gemini 3.5 Pro is the model meant to close that gap, and it is late.
How good will it be?
I forecast Gemini 3.5 Pro at 160 on Epoch's Capabilities Index, an aggregate of about forty benchmarks, a point behind Claude Fable 5 at the top. Personally, I think Fable is significantly better than this eval shows, but the index is still a useful target for forecasting. And in any case the strongest models anyone can actually run are GPT-5.5 Pro and Opus 4.8, which sit about where I put 3.5 Pro, or maybe slightly ahead of it. This forecast depends on Deep Think, DeepMind's inference-time reasoning mode, which already posts 84.6% on ARC-AGI-2 and a gold medal at the 2025 IMO. Switch Deep Think off and the model drops a tier.
I put it behind the frontier rather than on it because of some FutureSearch forecasts of where it will land on specific benchmarks, shown below:
| Gemini 3.5 Pro | FutureSearch forecast |
|---|---|
| Public release, app and API | July 1, 2026 (June 23 to Aug 6) |
| Deep Think mode, public | July 11, 2026 (June 24 to Sept 7) |
| Epoch Capabilities Index, with Deep Think | 160 (156 to 162) |
| Artificial Analysis Intelligence Index | 61 (55 to 67) |
| Humanity's Last Exam, no tools | 49.9% (46 to 55) |
| Terminal-Bench, agentic coding | 83.8% (77 to 88) |
| Input price per million tokens | $4.33 ($2 to $13.50) |
| Output price per million tokens | $24.33 ($12 to $60) |
| Ships a 2 million token context window | 44% (1 million more likely, at 52%) |
| Keeps four named effort levels | 73% |
| No separate Ultra model, Deep Think stays a Pro mode | 85% |
| Share of OpenRouter coding traffic, first week | 2.5% (0.6 to 7) |
Ranges are 80% intervals. Percentages are the probability of the stated outcome.
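For readers who want to score these numbers once the model ships, here is a minimal sketch of one common approach: check whether each realized value lands inside its 80% interval, and score each probability with a Brier score. The class and function names are my own for illustration, not FutureSearch's grading method, and the outcomes in the usage section are hypothetical placeholders, not results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalForecast:
    """A point forecast with its 80% interval (names are illustrative)."""
    name: str
    median: float
    low: float   # lower bound of the 80% interval
    high: float  # upper bound of the 80% interval

    def covers(self, outcome: float) -> bool:
        """True if the realized value falls inside the 80% interval."""
        return self.low <= outcome <= self.high


def brier_score(probability: float, happened: bool) -> float:
    """Squared error of a probability forecast: 0 is perfect, 1 is worst."""
    return (probability - (1.0 if happened else 0.0)) ** 2


def coverage(forecasts, outcomes) -> float:
    """Fraction of forecasts whose interval contains the realized value."""
    hits = [f.covers(outcomes[f.name]) for f in forecasts]
    return sum(hits) / len(hits)


# Benchmark rows copied from the table above.
INTERVALS = [
    IntervalForecast("Epoch Capabilities Index", 160, 156, 162),
    IntervalForecast("Artificial Analysis Intelligence Index", 61, 55, 67),
    IntervalForecast("Humanity's Last Exam", 49.9, 46, 55),
    IntervalForecast("Terminal-Bench", 83.8, 77, 88),
]

if __name__ == "__main__":
    # HYPOTHETICAL outcomes, invented only to show the mechanics.
    made_up_outcomes = {
        "Epoch Capabilities Index": 158,
        "Artificial Analysis Intelligence Index": 70,
        "Humanity's Last Exam": 50.0,
        "Terminal-Bench": 80.0,
    }
    print("interval coverage:", coverage(INTERVALS, made_up_outcomes))
    # Scoring the 73% "four effort levels" call if it turned out true.
    print("Brier score:", brier_score(0.73, True))
```

Over many forecasts, well-calibrated 80% intervals should cover the outcome about four times in five. A single launch is too small a sample to say much either way.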
What will Gemini 3.5 Pro be, and when will it ship?
Pichai said next month. I have it a few weeks later, in early July. The tell is what DeepMind did at I/O, shipping Flash while holding Pro back for more testing. (Google is notorious for vaporware announcements at I/O, frequently not shipping until the next calendar year, or sometimes not at all. Disclaimer: I worked at Google from 2014 to 2022.) The bar I am forecasting is the full public release, open to anyone in the app and through the API on standard Vertex and AI Studio plans, and on past launches that has trailed the on-stage teaser by a couple of weeks. Deep Think arrives later still, because DeepMind has kept its top reasoning mode in extra safety review before letting it out.
Figure: Forecast release window for Gemini 3.5 Pro against Google's I/O promise of next month.
What will it cost?
Google's models try to lead on price, and I expect 3.5 Pro to bend that rule without breaking it. Gemini 3.1 Pro runs $2 and $12 per million tokens for input and output, and even a doubled 3.5 Pro lands at half the price of GPT-5.5 and Claude Opus, which keeps DeepMind the cheap seat at the frontier. The one path that breaks it is the $15 and $60 rumor going around, which would put DeepMind at parity and mark a real change of strategy. I give it weight in the tail, not the median.
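To make the price scenarios concrete, here is a small sketch that bills the same job under each one. The per-million prices come from the text and the table. The workload size (10 million input tokens, 2 million output tokens) is a hypothetical I picked for illustration, and the function name is my own.

```python
def job_cost(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    """Dollar cost of a job, given prices quoted per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# (input $/M tokens, output $/M tokens) for each scenario discussed above.
SCENARIOS = {
    "Gemini 3.1 Pro today": (2.00, 12.00),
    "3.5 Pro at double": (4.00, 24.00),
    "3.5 Pro forecast median": (4.33, 24.33),
    "3.5 Pro at rumored price": (15.00, 60.00),
}

# HYPOTHETICAL workload, chosen only to show the arithmetic.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

if __name__ == "__main__":
    for label, (in_price, out_price) in SCENARIOS.items():
        cost = job_cost(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
        print(f"{label}: ${cost:,.2f}")
```

The gap between scenarios depends on the mix of input and output tokens, so an output-heavy agentic workload would show a different spread than this one.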
The 2-million-token context window DeepMind has teased is closer to a coin flip than a promise, because it said 2 million for Gemini 2.5 Pro and shipped 1 million. Two product calls look settled. The thinking control stays at four levels, since DeepMind's own developer docs already rule out the extra-high tier its rivals added. And there is no larger Ultra model coming. Ultra has become a subscription, and peak capability is Deep Think running on Pro.
A frontier-class model does not hand DeepMind back the developers it lost. Its share of usage on OpenRouter runs at about half of Anthropic's, and the coding category there belongs to Claude. I expect 3.5 Pro to take a sliver of that coding traffic in its first week, because the metric rewards cheap high-volume models and because most of DeepMind's coding usage never touches OpenRouter, running through its own tools instead. Underneath both, the people building agents have standardized on Claude Code. Winning them back takes a better model and time, and 3.5 Pro supplies only the first.
Where I might be wrong
Capabilities are hard to forecast even though they seem continuous, and it is possible that Gemini 3.5 Pro is a Fable-class model. If Deep Think clears the frontier on math and science the way it cleared ARC-AGI, I may have undercalled it. I also can't rule out that it ships much later than I (or Sundar) think, because all the Fable drama could lead to a safety hold that pushes the release into August. Price is the third risk. If the $15 and $60 numbers hold, DeepMind is fighting on capability instead of cost, a different company than the one these forecasts describe.
If 3.5 Pro ships in the next few weeks and the benchmarks land where I expect, DeepMind is back in the race, close behind the frontier. I argued in January that Anthropic was my pick for the top lab of 2026, and a 3.5 Pro this close to the frontier makes that race closer without settling it. For the financial side of the same three companies, see our OpenAI and Anthropic forecasts. I will grade these against the release.
Run this forecast yourself by connecting FutureSearch to Claude and asking it to refresh the numbers the moment DeepMind ships.