GPT-4.1 release, OpenAI API models, pricing and benchmarks
OpenAI added GPT-4.1, mini and nano to the API with up to a 1 million-token context, new SWE-bench scores and nano pricing.
TL;DR
- OpenAI added GPT-4.1, mini and nano to the API with up to a 1 million-token context, new SWE-bench scores and nano pricing.
- OpenAI announced that GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano are now available in the API, per @sama.
- The new family is highlighted for coding, instruction following, and handling long contexts, including support for contexts up to 1 million tokens.
OpenAI announced that GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano are now available in the API, per @sama. The new family is highlighted for coding, instruction following, and handling long contexts, including support for contexts up to 1 million tokens.
Capabilities, pricing and early numbers
Users on social channels pointed to several concrete claims about capabilities and cost. @kevinweil noted a score of 54 for GPT-4.1 on SWE-bench Verified. @polynoamial shared a similar figure, stating that GPT-4.1 achieves 55% on SWE-bench Verified and noting that the model is not a reasoning model. Windsurf AI, cited by @omarsar0, reported that GPT-4.1 improved 60% over GPT-4o on internal benchmarks like the SWE-benchmark, and described behavior changes including a 40% reduction in reading unnecessary files and a 70% reduction in modifying unnecessary files.
Instruction following is another emphasized area. @OpenAIDevs stated GPT-4.1 follows instructions more reliably than GPT-4o, specifically citing format adherence, complying with negative instructions, and ordering. @OpenAIDevs also said GPT-4.1 is significantly more skilled at frontend coding and shows reliable tool use.
On pricing, @stevenheidel provided concrete per-token numbers for the smallest variant: GPT-4.1-nano costs $0.10 per 1 million input tokens, $0.03 per 1 million input tokens when cached, and $0.40 per 1 million output tokens. Integrations were already being announced: @llama_index said Llama Index has day 0 support for GPT-4.1.
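To make the per-token numbers concrete, here is a minimal cost-estimation sketch in Python. The three rates are the GPT-4.1-nano figures quoted above. The function name `estimate_nano_cost` and the billing model are assumptions for illustration, in particular that cached tokens are a subset of input tokens and that only the uncached remainder is billed at the full input rate. Check OpenAI's pricing page before relying on it.

```python
# GPT-4.1-nano prices in USD per 1 million tokens, as quoted by @stevenheidel.
NANO_INPUT_PER_M = 0.10
NANO_CACHED_INPUT_PER_M = 0.03
NANO_OUTPUT_PER_M = 0.40


def estimate_nano_cost(input_tokens: int, output_tokens: int,
                       cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of a GPT-4.1-nano workload.

    Hypothetical helper. Assumes cached_input_tokens is the portion of
    input_tokens billed at the cached rate, with the rest billed at the
    full input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    uncached_input_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        uncached_input_tokens * NANO_INPUT_PER_M
        + cached_input_tokens * NANO_CACHED_INPUT_PER_M
        + output_tokens * NANO_OUTPUT_PER_M
    )
    return total_per_million / 1_000_000


if __name__ == "__main__":
    # 800k input tokens (600k of them cached) plus 50k output tokens.
    cost = estimate_nano_cost(800_000, 50_000, cached_input_tokens=600_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, the example workload costs roughly six cents. Filling the full 1 million-token context once with uncached input costs about $0.10 before any output.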
Reception and comparisons on social channels
Initial impressions varied. @aidan_mclau wrote that startup engineers were amazed by GPT-4.1 mini and nano, calling them comparable to GPT-4o but much cheaper, and describing the pair as a "Pareto optimal, Swiss Army knife API model" and an upgrade for agent stacks over newssonnet. By contrast, @scaling01 advised against using GPT-4.1-nano, calling it a "terrible model" and saying the GPT-4.1 API version is worse than Optimus Alpha.
Users also compared GPT-4.1 to other models and leaderboards. @omarsar0 highlighted that GPT-4.1 shows strong improvements over GPT-4o on internal Windsurf AI benchmarks. Some commenters juxtaposed GPT-4.1 with models such as DeepSeek, Gemini and others in ongoing benchmark conversations, though those comparisons were expressed as user commentary rather than official benchmark releases.
OpenAI is also consolidating its model lineup. @OpenAIDevs announced that GPT-4.5 Preview in the API will be deprecated starting today and fully turned off on July 14, stating GPT-4.1 offers improved or similar performance. Separately, @DanHendrycks suggested that free access to GPT-4.1 mini on ChatGPT might be limited to encourage ChatGPT Plus subscriptions among students.
Why it matters
The combination of a 1 million-token context window and claimed improvements in instruction following and coding behavior could change how teams build code-aware assistants, debuggers and long-document workflows, because those applications rely on large context windows and precision in formatting and instruction compliance. The availability of much cheaper mini and nano variants introduces a direct cost-performance decision for engineers building agent stacks or high-volume APIs, while the planned deprecation of GPT-4.5 signals OpenAI is consolidating traffic onto GPT-4.1.
What to watch
Watch July 14, when GPT-4.5 Preview is scheduled to be turned off, as that will show how smoothly teams migrate to GPT-4.1. Also look for independent confirmation of the SWE-bench numbers and the Windsurf AI internal claims, and monitor adoption signals such as third-party integrations beyond Llama Index and uptake of the nano tier at its quoted prices.
Links cited in social posts include OpenAI's GPT-4.1 page, available at https://openai.com/index/gpt-4-1/.
| Item | Value | Source |
|---|---|---|
| Availability | GPT-4.1, GPT-4.1 mini, GPT-4.1 nano available in the API | @sama announced availability in the API |
| Context window | Up to 1 million tokens | @sama |
| SWE-bench Verified score | 54 | @kevinweil noted a 54 score on SWE-bench Verified |
| SWE-bench Verified alternate | 55% | @polynoamial stated GPT-4.1 achieves 55% on SWE-bench Verified |
| Windsurf internal improvement vs GPT-4o | 60% improvement | Cited by @omarsar0 from Windsurf AI on internal benchmarks |
| Reduces reading unnecessary files | 40% less need to read unnecessary files | Windsurf AI claim cited by @omarsar0 |
| Modifies unnecessary files less | 70% less | Windsurf AI claim cited by @omarsar0 |
| GPT-4.1-nano input price | $0.10 per 1M input tokens ($0.03 cached) | @stevenheidel |
| GPT-4.1-nano output price | $0.40 per 1M output tokens | @stevenheidel |
| GPT-4.5 Preview deprecation | Deprecated starting today; fully turned off on July 14 | @OpenAIDevs announced deprecation and turn-off date |
Written by The Brieftide · Source: Smol AI News