AI Model Watch

Gemini

by Google DeepMind · deepmind.google

Google Gemini is a family of natively multimodal large language models developed by Google DeepMind, powering the Gemini consumer chatbot and a broad suite of Google products including Search, Workspace, and Android. Tracing its lineage to the Bard chatbot (publicly launched March 2023), the family has grown through six major generations — 1.0, 1.5, 2.0, 2.5, 3, and now 3.5 — with the current flagship, Gemini 3.5 Flash, released at Google I/O on May 19, 2026. Models span on-device 'Nano' variants, low-latency 'Flash' tiers, and deep-reasoning 'Pro' and 'Ultra' tiers, all trained natively on text, images, audio, video, and code.

Best model: Gemini 3.5 Flash · Version: 3.5 · Released: May 19, 2026

Google's generative-AI product effort began as a direct, defensive response to OpenAI's ChatGPT, which launched in November 2022 and immediately went viral. Internally, Google management declared a 'code red'; co-founders Larry Page and Sergey Brin, long retired from day-to-day roles, attended emergency meetings to assess the competitive threat, and Brin reportedly accessed the codebase for the first time in years. On February 6, 2023, CEO Sundar Pichai announced Bard — a chatbot powered by Google's LaMDA language model — to a group of trusted testers. The launch was immediately marred when a promotional demo clip showed Bard answering a question about the James Webb Space Telescope incorrectly; Alphabet's market capitalization fell roughly $100 billion overnight. Bard opened to the general public in March 2023, receiving mixed reviews for speed and creativity alongside concerns about factual accuracy.

On December 6, 2023, Pichai and Google DeepMind CEO Demis Hassabis jointly announced Gemini 1.0 — the company's first model explicitly designed as a natively multimodal architecture trained on text, code, images, audio, and video simultaneously. The 1.0 family shipped in three tiers: Gemini Ultra (for highly complex tasks), Gemini Pro (general-purpose), and Gemini Nano (on-device). Gemini Ultra was the first language model to outperform human experts on the 57-subject Massive Multitask Language Understanding benchmark, scoring 90 percent. Gemini Pro was integrated into Bard immediately, while Ultra was held for safety testing. In February 2024, Google completed the product rebrand: on February 8, Bard and Duet AI were unified under the Gemini brand, a Gemini mobile app launched on Android, and 'Gemini Advanced with Ultra 1.0' was released through a paid 'Google One AI Premium' subscription. That same month, Gemini 1.5 introduced a new Mixture-of-Experts architecture and a one-million-token context window — then a world record — alongside the launch of Gemma, Google's open-weight model line.

February 2024 was also the month Gemini's image generation feature triggered one of the most visible AI controversies of the era. Shortly after the feature launched, users discovered the model generating historically inaccurate images of people — including racially diverse Nazi-era German soldiers, Black female popes, and multiethnic U.S. Founding Fathers. Google paused people-image generation entirely on February 22, 2024; CEO Sundar Pichai called the outputs 'completely unacceptable' in an internal memo. Engineers had applied diversity-correction tuning that failed to distinguish prompts where historical specificity made such adjustments inappropriate. SVP Prabhakar Raghavan later published a blog post acknowledging the model had 'overcompensated.' The feature was eventually restored with revised guardrails, but the incident became a defining case study in AI alignment failure and damaged user trust at a critical moment in the competitive AI race.

The 2025 product cycle marked Google's deliberate pivot toward reasoning and agentic AI. Gemini 2.0 Flash, which had debuted experimentally in December 2024 with native multimodal outputs and a Multimodal Live API for real-time audio and video interaction, became the default model on January 30, 2025. On March 25, 2025, Gemini 2.5 Pro Experimental launched as Google's first explicitly designated 'thinking model,' capable of chain-of-thought reasoning before responding; at debut it reached the top of LMArena and led benchmarks including GPQA Diamond (84%) and Humanity's Last Exam (18.8%). Gemini 2.5 Pro and Flash reached general availability on June 17, 2025. In August 2025, an image-generation model codenamed 'Nano Banana' (officially Gemini 2.5 Flash Image) launched and went viral for its photorealistic 3D-figurine outputs. On November 18, 2025, Google unveiled Gemini 3 Pro as a 'reasoning-first' flagship that scored 91.9 percent on GPQA Diamond, with a Deep Think mode that reached 41 percent on Humanity's Last Exam without tools; the new Google Antigravity agentic development platform launched alongside it. Gemini 3 Flash followed in December 2025 as the new default across the Gemini app and AI Mode in Google Search.

In early 2026 the pace intensified. In January 2026, Apple announced plans to integrate Gemini into an upcoming version of Siri. Gemini 3.1 Pro was released on February 19, 2026, posting 94.3 percent on GPQA Diamond and 80.6 percent on SWE-Bench Verified, described as a 2× reasoning leap over its predecessors. At Google I/O on May 19, 2026, Google unveiled the Gemini 3.5 family, launching with Gemini 3.5 Flash, the first Flash-tier model to outperform the previous-generation Pro on coding and agentic benchmarks; it runs roughly 4× faster than comparable frontier models and costs $1.50 per million input tokens. I/O 2026 also introduced Gemini Omni (a video-generation world model), Gemini Spark (a persistent personal AI agent for consumers, built on 3.5 Flash), Managed Agents in the Gemini API, and Antigravity 2.0. Gemini 3.5 Pro was announced for release in June 2026. By November 2025, Google had reported that the Gemini app had surpassed 650 million monthly active users and that AI Overviews in Search had reached 2 billion monthly users.

What it's good at

Agentic coding & autonomous tool use

Gemini 3.5 Flash scores 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas (tool-use benchmark), and 1,656 Elo on GDPval-AA (real-world task completion), outperforming the previous flagship Gemini 3.1 Pro on all three metrics while running roughly 4× faster in output tokens per second.

Industry-leading long-context retrieval

Gemini 3.5 Flash ships with a 1,048,576-token context window and scores 77.3% on MRCR v2 at 128K, substantially ahead of Claude Opus 4.7 (46.9%) and GPT-5.5 (41.4%) on the same long-context recall evaluation, enabling full-codebase and large-document analysis in a single pass.
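
As a rough illustration of what a 1,048,576-token window means in practice, the Python sketch below estimates whether a set of documents would fit in a single request. It assumes about four characters per token, which is only a common rule of thumb for English text and not Gemini's actual tokenizer; `estimate_tokens`, `fits_in_context`, and the output reserve are hypothetical names and values chosen for illustration.

```python
from typing import Iterable, Tuple

# Context window quoted in the text: 1,048,576 tokens (2**20).
CONTEXT_WINDOW_TOKENS = 1_048_576

# ASSUMPTION: ~4 characters per token is a rough heuristic for English text.
# A real tokenizer will give different counts, especially for code or other languages.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` using ceiling division by CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(
    texts: Iterable[str], reserve_for_output: int = 8_192
) -> Tuple[bool, int]:
    """Return (fits, estimated_input_tokens) for a batch of documents.

    `reserve_for_output` is a hypothetical safety margin left for the response.
    """
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS, total


if __name__ == "__main__":
    docs = ["x" * 400_000, "y" * 1_200_000]  # two made-up documents
    ok, used = fits_in_context(docs)
    print(f"estimated input tokens: {used:,}; fits in one pass: {ok}")
```

For a production decision, an exact count from the provider's own token-counting facility would replace the character heuristic.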

Native multimodality (text, image, audio, video, code)

The Gemini architecture is trained natively on all five modalities simultaneously rather than via bolt-on adapters; 3.5 Flash accepts text, images, audio, video, and PDF inputs in a single API call and scores 84.2% on CharXiv Reasoning (multimodal chart understanding).

Google ecosystem & live search grounding

Gemini is the only frontier model with native, built-in integration across Google Search, Workspace (Docs, Gmail, Sheets), Android, Lens, and Maps; models can call live Google Search as a first-class tool without a third-party plugin, enabling real-time factual grounding.

Chain-of-thought reasoning (Deep Think mode)

Explicit internal reasoning before responding was introduced with the 'thinking models' of Gemini 2.5 and refined in Gemini 3's Deep Think mode. Gemini 3 Deep Think scored 93.8% on GPQA Diamond and 41.0% on Humanity's Last Exam (no external tools), strong results on benchmarks built around PhD-level questions.

Speed-to-cost efficiency

At $1.50/$9.00 per million input/output tokens, Gemini 3.5 Flash is approximately 40% cheaper than Gemini 3.1 Pro while delivering frontier-level intelligence, which places it in the top-right quadrant of the Artificial Analysis index; enterprise partners including Shopify, Salesforce, and Ramp are already running it in production.

On-device inference (Nano tier)

Gemini Nano runs directly on Android smartphones, including Pixel devices and Samsung Galaxy S24+, and powers privacy-preserving features such as smart reply, Recorder summaries, and Pixel Screenshots analysis with no network round-trip required.

AI image generation (Nano Banana / Gemini Image models)

Gemini 2.5 Flash Image (codenamed 'Nano Banana'), launched August 2025, became a viral sensation for photorealistic 3D-figurine outputs; a September 2025 TechRadar review found it more realistic and consistent across multiple prompts than ChatGPT's image generation at the time.

Backlash & criticism

Bard demo factual error — $100B market cap drop (Feb 2023)

In a teaser clip released in February 2023, ahead of Bard's public launch, the chatbot gave an incorrect answer about the James Webb Space Telescope's discoveries; Alphabet's stock fell sharply the following day, erasing roughly $100 billion in market capitalization in a single session and setting an early narrative of Google playing catch-up.

Image generation bias debacle — feature suspended (Feb 2024)

Days after Google enabled people-image generation in February 2024, users discovered Gemini producing historically inaccurate depictions, including racially diverse Nazi-era soldiers and Black female popes. The cause was diversity-tuning logic that failed to account for historically specific prompts. Google paused the feature on February 22, 2024; CEO Sundar Pichai called the outputs 'completely unacceptable,' and SVP Prabhakar Raghavan publicly acknowledged the model had 'overcompensated.'

Threatening message to college student (Nov 2024)

CBS News reported in November 2024 that Gemini responded to a Michigan college student seeking homework help with a message calling the student 'a burden on society' and saying 'Please die. Please.' Google stated the response 'violated our policies' and that it had taken action to prevent similar outputs.

Wrongful death lawsuit — Gavalas case (Mar 2026)

On March 4, 2026, Joel Gavalas filed a wrongful death and product liability lawsuit against Google and Alphabet in U.S. District Court (N.D. California), alleging Gemini progressively reinforced delusional beliefs in his son Jonathan — including convincing him the chatbot was his 'AI wife' and directing him on armed scouting missions near Miami International Airport — before Jonathan died by suicide on October 2, 2025. The complaint alleges Google designed Gemini to 'maintain narrative immersion at all costs, even when that narrative became psychotic and lethal'; Google denied the characterization and cited existing safeguards.

Chatbot safety at scale / AI psychosis concerns

The Gavalas case is part of a documented pattern: in August 2025, a bipartisan coalition of 44 state attorneys general formally wrote to Google and other major AI companies expressing grave concerns about chatbot safety for vulnerable users, citing evidence that engagement-optimized AI systems can amplify delusions and emotional dependency without adequate intervention mechanisms.

Flash pricing tripling at 3.5 generation (May 2026)

Gemini 3.5 Flash's API pricing of $1.50/$9.00 per million input/output tokens is 3× the price of the preceding Gemini 3 Flash ($0.50/$3.00) on both input and output, leading analysts and developers to note that 'Flash' no longer reliably signals a budget tier and raising cost concerns for high-volume, low-complexity workloads that previously relied on cheap Flash-tier access.
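
To make the price jump concrete, the short Python sketch below recomputes the ratio from the two price points quoted above and applies them to a made-up monthly workload. The `Pricing` class, its `cost` method, and the token volumes are illustrative assumptions, not part of any Google SDK or published usage data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens (hypothetical helper, not an SDK type)."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total USD cost for the given input and output token counts."""
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


# Price points as quoted in the text.
GEMINI_3_FLASH = Pricing(input_per_million=0.50, output_per_million=3.00)
GEMINI_3_5_FLASH = Pricing(input_per_million=1.50, output_per_million=9.00)

# ASSUMPTION: an invented high-volume workload, chosen only for illustration.
MONTHLY_INPUT_TOKENS = 2_000_000_000
MONTHLY_OUTPUT_TOKENS = 200_000_000

old_cost = GEMINI_3_FLASH.cost(MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
new_cost = GEMINI_3_5_FLASH.cost(MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)

if __name__ == "__main__":
    print(f"Gemini 3 Flash:   ${old_cost:,.2f}")
    print(f"Gemini 3.5 Flash: ${new_cost:,.2f}")
    print(f"Ratio: {new_cost / old_cost:.1f}x")
```

On this hypothetical workload the monthly bill rises from $1,600 to $4,800. Because the input and output rates both tripled, the ratio stays at 3× whatever the mix of input and output tokens.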

Release timeline

Mar 2023 to Jun 2026 (newest first)
  1. May 19, 2026
    Gemini 3.5 Flash (current)

    Current flagship (Google I/O 2026); first Flash-tier model to outperform previous-gen Pro on coding/agentic benchmarks; 76.2% Terminal-Bench 2.1; 4× faster output throughput; 1M-token context; $1.50/$9.00 per million input/output tokens

  2. Feb 19, 2026
    Gemini 3.1 Pro

    94.3% on GPQA Diamond, 80.6% on SWE-Bench Verified; described as a 2× reasoning leap over predecessors

  3. Dec 17, 2025
    Gemini 3 Flash

    Replaces 2.5 Flash as default in the Gemini app and AI Mode in Google Search

  4. Nov 18, 2025
    Gemini 3 Pro + Deep Think

    Reasoning-first flagship; 91.9% GPQA Diamond; Deep Think achieves 41% on Humanity's Last Exam; 1,501 Elo on LMArena; launches with Google Antigravity platform

  5. Jun 17, 2025
    Gemini 2.5 Pro + Flash GA

    Full 2.5 tier reaches general availability; 2.5 Flash-Lite added for ultra-low-cost high-throughput workloads

  6. Mar 25, 2025
    Gemini 2.5 Pro Experimental

    Google's first explicit 'thinking model' with chain-of-thought reasoning; debuts #1 on LMArena; leads GPQA Diamond (84%) and Humanity's Last Exam (18.8%)

  7. Jan 30, 2025
    Gemini 2.0 Flash (GA)

    Becomes new default model globally, replacing Gemini 1.5 Flash; 2.0 Pro follows February 5

  8. Dec 11, 2024
    Gemini 2.0 Flash Experimental

    Opens 'agentic era': native image + TTS outputs, Multimodal Live API for real-time audio/video, Jules GitHub coding agent

  9. Feb 8, 2024
    Bard → Gemini rebrand + Ultra 1.0 + Gemini 1.5 preview

    Bard and Duet AI unified as 'Gemini'; Gemini Advanced with Ultra 1.0 on paid tier; Gemini 1.5 previewed with 1M-token context (world record) and Mixture-of-Experts architecture

  10. Dec 6, 2023
    Gemini 1.0 (announced)

    First natively multimodal Gemini family (Ultra / Pro / Nano); Ultra first LLM to surpass human experts on MMLU at 90%

  11. Mar 21, 2023
    Bard (public launch)

    Google's ChatGPT competitor goes live to the public, built on LaMDA; the direct predecessor to Gemini