Open Source AIMarch 13, 20263 min readvia Last Week in AI

OpenAI GPT-5.4 vs Google Gemini 3.1 Flash Lite pricing

OpenAI released GPT-5.4 in Pro and Thinking variants; Google unveiled Gemini 3.1 Flash Lite at about one-eighth the Pro cost.

The Brieftide

March 13, 2026

TL;DR

01OpenAI released GPT-5.4 in Pro and Thinking variants; Google unveiled Gemini 3.1 Flash Lite at about one-eighth the Pro cost.
02OpenAI launched GPT-5.4 this week, offering Pro and Thinking versions to customers.
03Google followed with Gemini 3.1 Flash Lite, a lower-cost inference tier priced at roughly one eighth of its Pro offering.

OpenAI launched GPT-5.4 this week, offering Pro and Thinking versions to customers. Google followed with Gemini 3.1 Flash Lite, a lower-cost inference tier priced at roughly one eighth of its Pro offering.

GPT-5.4 and Gemini 3.1 represent the latest moves by the two largest model providers to segment offerings by capability and cost. OpenAI split GPT-5.4 into two named variants, Pro and Thinking, aimed at different usage patterns, while Google introduced a Flash Lite variant of Gemini 3.1 marketed around lower price per call.

What OpenAI released

GPT-5.4 arrives as the next model in OpenAI's GPT-5 line, released in at least two variants. The Pro edition is positioned for higher-capacity or higher-throughput users, and the Thinking edition is presented as an alternative tier. OpenAI has not published exhaustive specification tables for each variant in a single document. The new release continues OpenAI's approach of offering tiered access to the same core model family, with pricing and latency choices driving customer selection.

OpenAI also continues to provide developer tooling and API hooks that let customers select the variant that best matches their cost, latency, and throughput needs. Pricing details vary by region and contract, and enterprise deals can alter published per-call rates.

Gemini 3.1 Flash Lite and pricing

Google's Gemini 3.1 Flash Lite is explicitly presented as a low-cost option within the Gemini 3.1 family. The Flash Lite tier is priced at about one eighth of the cost of Gemini Pro for the same model generation, making it attractive for cost-sensitive inference workloads and high-volume applications.

Gemini Pro remains Google Cloud's higher-cost, higher-throughput option. Google has emphasized that Flash Lite is a tradeoff between cost and some performance characteristics, offering a budget-friendly path for developers who prioritize price over top-tier throughput. As with other tiered offerings, actual performance and pricing will vary by workload and integration.

The discussion around these releases also touched on supply chain risk and government attention to AI procurement. Those topics remain active in regulatory and industry conversations, particularly where third-party components, hardware supply chains, and vendor contracts influence availability and security of deployed models.

Why it matters

Tiered releases from OpenAI and Google illustrate a clearer split between high-performance and low-cost inference options, changing how developers budget model usage. Enterprises and startups that run large volumes of inference can reoptimize architecture and cost by choosing lower-cost tiers, while customers needing the highest performance will continue to pay premium rates. The shift also keeps supply chain and procurement issues in view, since more tiers and variants complicate vendor evaluation and risk assessments.

GPT-5.4 and Gemini 3.1 Flash Lite at a glance

Item
GPT-5.4	Pro, Thinking	Pro = 1x	Enterprise apps, research	OpenAI release with tiered variants
Gemini 3.1 Flash Lite	Flash Lite	≈0.125x Pro	Cost-sensitive inference, high-volume calls	Google positions as low-cost alternative
Gemini 3.1 Pro	Pro	1x	High-performance production workloads	Higher throughput and price

Primary source

Last Week in AI

lastweekin.ai

Read the original

The Brieftide Daily · 06:00

Briefs like this one, in your inbox every morning.

FreeNo adsNo trackingUnsubscribe in one click