OpenAI GPT-5.4 vs Google Gemini 3.1 Flash Lite pricing
OpenAI released GPT-5.4 in Pro and Thinking variants; Google unveiled Gemini 3.1 Flash Lite at about one-eighth the Pro cost.
TL;DR
- OpenAI released GPT-5.4 in Pro and Thinking variants; Google unveiled Gemini 3.1 Flash Lite at about one-eighth the Pro cost.
- OpenAI launched GPT-5.4 this week, offering Pro and Thinking versions to customers.
- Google followed with Gemini 3.1 Flash Lite, a lower-cost inference tier priced at roughly one eighth of its Pro offering.
GPT-5.4 and Gemini 3.1 represent the latest moves by the two largest model providers to segment offerings by capability and cost. OpenAI split GPT-5.4 into two named variants, Pro and Thinking, aimed at different usage patterns, while Google introduced a Flash Lite variant of Gemini 3.1 marketed around lower price per call.
What OpenAI released
GPT-5.4 arrives as the next model in OpenAI's GPT-5 line, released in at least two variants. The Pro edition is positioned for higher-capacity or higher-throughput users, and the Thinking edition is presented as an alternative tier. OpenAI has not published exhaustive specification tables for each variant in a single document. The new release continues OpenAI's approach of offering tiered access to the same core model family, with pricing and latency choices driving customer selection.
OpenAI also continues to provide developer tooling and API hooks that let customers select the variant that best matches their cost, latency, and throughput needs. Pricing details vary by region and contract, and enterprise deals can alter published per-call rates.
Gemini 3.1 Flash Lite and pricing
Google presents Gemini 3.1 Flash Lite as the low-cost option within the Gemini 3.1 family. The tier is priced at about one-eighth of Gemini Pro in the same model generation, which makes it attractive for cost-sensitive inference workloads and high-volume applications.
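To make the one-eighth figure concrete, here is a minimal Python sketch of the arithmetic. The 1/8 ratio comes from the reporting above; the function name `flash_lite_cost` and the monthly Pro spend of 8,000 are hypothetical values chosen only for illustration, not published prices.

```python
# Approximate price ratio reported for Flash Lite relative to Pro (about one-eighth).
FLASH_LITE_RATIO = 1 / 8


def flash_lite_cost(pro_cost: float, ratio: float = FLASH_LITE_RATIO) -> float:
    """Estimate Flash Lite spend for a workload that costs `pro_cost` on Pro.

    Assumes the same request volume and that the reported ratio applies
    uniformly; real bills depend on workload and contract terms.
    """
    return pro_cost * ratio


# Hypothetical example: a workload costing 8,000 per month on Pro.
pro_monthly = 8000.0
print(flash_lite_cost(pro_monthly))  # 1000.0
```

The estimate covers price only. It does not account for any quality or throughput differences between the tiers.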
Gemini Pro remains Google Cloud's higher-cost, higher-throughput option. Google has emphasized that Flash Lite is a tradeoff between cost and some performance characteristics, offering a budget-friendly path for developers who prioritize price over top-tier throughput. As with other tiered offerings, actual performance and pricing will vary by workload and integration.
The discussion around these releases also touched on supply chain risk and government attention to AI procurement. Those topics remain active in regulatory and industry conversations, particularly where third-party components, hardware supply chains, and vendor contracts influence availability and security of deployed models.
Why it matters
Tiered releases from OpenAI and Google draw a clearer line between high-performance and low-cost inference options, which changes how developers budget model usage. Enterprises and startups that run large volumes of inference can rework their architecture to cut costs by choosing lower-cost tiers, while customers needing the highest performance will continue to pay premium rates. The shift also keeps supply chain and procurement issues in view, since more tiers and variants complicate vendor evaluation and risk assessments.
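As a rough illustration of that budgeting effect, the sketch below estimates blended spend when only part of a workload moves to the cheaper tier. It assumes the reported one-eighth ratio and equal cost per request within each tier; the function name `blended_relative_cost` and the 80% share are hypothetical.

```python
def blended_relative_cost(lite_share: float, lite_ratio: float = 0.125) -> float:
    """Return total cost relative to running everything on the Pro tier (1.0).

    lite_share: fraction of requests routed to the low-cost tier, 0.0 to 1.0.
    lite_ratio: low-cost tier price as a fraction of Pro (assumed about 1/8).
    """
    if not 0.0 <= lite_share <= 1.0:
        raise ValueError("lite_share must be between 0 and 1")
    # Requests staying on Pro cost 1x; requests moved to the cheap tier cost lite_ratio.
    return (1.0 - lite_share) + lite_share * lite_ratio


# Hypothetical example: 80% of requests moved to the low-cost tier.
print(round(blended_relative_cost(0.8), 3))  # 0.3
```

Under these assumptions, routing 80% of traffic to the low-cost tier brings spend to roughly 30% of the all-Pro baseline. The remaining Pro traffic quickly dominates the bill.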
| Item | Variants | Relative price | Typical use | Notes |
|---|---|---|---|---|
| GPT-5.4 | Pro, Thinking | Pro = 1x | Enterprise apps, research | OpenAI release with tiered variants |
| Gemini 3.1 Flash Lite | Flash Lite | ≈0.125x Pro | Cost-sensitive inference, high-volume calls | Google positions as low-cost alternative |
| Gemini 3.1 Pro | Pro | 1x | High-performance production workloads | Higher throughput and price |
Primary source
Last Week in AI
lastweekin.ai