Multimodal AI4 min read

Gemini 3 by DeepMind launches: specs, benchmarks, pricing

DeepMind released Gemini 3, a multimodal model family with new deployment and API options plus updated safety and benchmark claims.

The Brieftide

TL;DR

  • 01DeepMind released Gemini 3, a multimodal model family with new deployment and API options plus updated safety and benchmark claims.
  • 02DeepMind released Gemini 3, the next generation in its Gemini model family, making the models available through cloud APIs and expanded on-device runtimes.
  • 03The rollout presents multiple model sizes and deployment formats, plus updated safety filters and benchmark results described by the company.

DeepMind released Gemini 3, the next generation in its Gemini model family, making the models available through cloud APIs and expanded on-device runtimes. The rollout presents multiple model sizes and deployment formats, plus updated safety filters and benchmark results described by the company.

Gemini 3 is positioned as a multimodal foundation model family that DeepMind says improves reasoning, code generation and visual understanding compared with prior Gemini releases. The company described a set of technical and product changes intended to support interactive assistants, tool use, and lower-latency on-device applications.

What changed in Gemini 3

Gemini 3 introduces a redesigned model stack with separate components for perception, reasoning and tool orchestration. DeepMind highlighted several areas of focus: multimodal input handling, expanded tool integrations, and layered safety mechanisms. The perception component accepts image and video frames alongside text, the core reasoning stack performs chain-of-thought style planning and execution, and a tooling layer mediates calls to external APIs and user-provided functions.

Model sizes are presented as a family, with smaller configurations tuned for on-device use and larger configurations hosted in the cloud for high-throughput workloads. DeepMind emphasized latency improvements for local inference and new optimizations for memory and compute efficiency that aim to make richer context windows practical for real-time assistants and creative workflows. The release also includes developer tooling such as a deployable runtime, SDKs, and a tools API intended to let models call external services in a structured way.

Safety and alignment work is layered into the runtime. DeepMind described a multi-tiered safety filter that screens inputs and generated outputs, plus guardrails for tool calls and data handling. The company said it applied red-team testing and iterative prompt-safety tuning to reduce harmful outputs, and it detailed plans for ongoing model audits and third-party evaluations.

Deployment, benchmarks and pricing

DeepMind is offering Gemini 3 through its cloud API, with specific endpoints for the different model sizes and runtimes. The company also announced expanded on-device runtimes, enabling OEMs and app developers to run smaller configurations locally. That combination is designed to serve both latency-sensitive consumer use cases and high-capacity enterprise workloads.

On benchmarks, DeepMind presented comparative results showing gains on reasoning, coding and multimodal understanding relative to earlier Gemini models. The company provided aggregated scores and examples highlighting fewer hallucinations in factual tasks and more reliable code synthesis in developer scenarios. Independent third-party benchmark results were not included in the announcement, DeepMind said it will publish additional evaluation data and partner studies over time.

Pricing is moving to a tiered model that differentiates cloud endpoint access from on-device licensing. DeepMind outlined per-request and subscription options for cloud usage, plus developer licenses for local deployment. The announcement emphasized flexible commercial terms for partners building consumer products and enterprises embedding models into internal workflows.

Why it matters

Gemini 3 signals DeepMind's push to position multimodal models across both cloud and local devices, prioritizing latency and tool integration as product requirements. The release increases options for developers who need on-device performance as well as enterprises that require high-capacity cloud inference, and it raises the bar on safety and tool orchestration expectations for foundation models.

Gemini 3 system components
User Input (text, image, video)Perception Encoder (vision + multimodalCore Reasoning Stack (LLM backbone)Tool Orchestration Layer (API calls, plugins)Safety & Moderation (filters, guardrails)Deployment Targets (cloud API, on-device runtime)Developer SDK & Monitoring (telemetry, logs)
Advertisement

Written by The Brieftide · Source: Google DeepMind (deepmind.google)

The Brieftide Daily · 06:00

Briefs like this one, in your inbox every morning.

 

FreeOne email a dayEvery claim sourcedUnsubscribe in one click
Advertisement