GPT-4.1 release, OpenAI API models, pricing and benchmarks

OpenAI added GPT-4.1, mini and nano to the API with up to a 1 million-token context, new SWE-bench scores and nano pricing.

The Brieftide

TL;DR

  • OpenAI added GPT-4.1, mini and nano to the API with up to a 1 million-token context, new SWE-bench scores and nano pricing.
  • OpenAI announced that GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano are now available in the API, per @sama.
  • The new family is highlighted for coding, instruction following, and handling long contexts, including support for contexts up to 1 million tokens.

OpenAI announced that GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano are now available in the API, per @sama. The new family is highlighted for coding, instruction following, and handling long contexts, including support for contexts up to 1 million tokens.

Capabilities, pricing and early numbers

Posts on social channels pointed to several concrete claims about capabilities and cost. @kevinweil cited a score of 54 for GPT-4.1 on SWE-bench Verified. @polynoamial gave a slightly different figure, 55% on SWE-bench Verified, and pointed out that GPT-4.1 is not a reasoning model. Windsurf AI, cited by @omarsar0, reported a 60% improvement for GPT-4.1 over GPT-4o on internal benchmarks like the SWE-benchmark, along with two behavior changes: a 40% reduction in the need to read unnecessary files and a 70% reduction in modifying unnecessary files.

Instruction following is another emphasized area. @OpenAIDevs stated GPT-4.1 follows instructions more reliably than GPT-4o, specifically citing format adherence, complying with negative instructions, and ordering. @OpenAIDevs also said GPT-4.1 is significantly more skilled at frontend coding and shows reliable tool use.

On pricing, @stevenheidel provided concrete per-token numbers for the smallest variant: GPT-4.1-nano costs $0.10 per 1 million input tokens, $0.03 per 1 million input tokens when cached, and $0.40 per 1 million output tokens. Integrations were announced right away: @llama_index said LlamaIndex has day-0 support for GPT-4.1.
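To make the nano prices concrete, here is a minimal sketch of a per-request cost estimate. The three rates come from the figures quoted by @stevenheidel. The function name, the `NanoPricing` class, and the billing model (cached tokens are a subset of input tokens and are charged the cached rate instead of the regular input rate) are assumptions made for illustration, not a statement of how OpenAI computes invoices.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class NanoPricing:
    """USD per 1 million tokens for GPT-4.1-nano, as quoted in the article."""
    input_per_million: float = 0.10
    cached_input_per_million: float = 0.03
    output_per_million: float = 0.40


def estimate_cost_usd(input_tokens: int,
                      cached_input_tokens: int,
                      output_tokens: int,
                      pricing: NanoPricing = NanoPricing()) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens that hit
    the cache, and it is billed at the cached rate only.
    """
    if min(input_tokens, cached_input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        uncached_tokens * pricing.input_per_million
        + cached_input_tokens * pricing.cached_input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total_per_million / TOKENS_PER_MILLION


if __name__ == "__main__":
    # A full 1M-token prompt with 100k output tokens, with and without caching.
    print(f"no cache:   ${estimate_cost_usd(1_000_000, 0, 100_000):.4f}")
    print(f"80% cached: ${estimate_cost_usd(1_000_000, 800_000, 100_000):.4f}")
```

Under these assumptions, filling the entire 1 million-token context once and generating 100,000 output tokens comes to about $0.14, and roughly $0.084 if 80% of the prompt is served from cache.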

Reception and comparisons on social channels

Initial impressions varied. @aidan_mclau wrote that startup engineers were amazed by GPT-4.1 mini and nano, calling them comparable to GPT-4o but much cheaper, and describing GPT-4.1 mini and nano as a "Pareto optimal, Swiss Army knife API model" and an upgrade for agent stacks over the new Sonnet. By contrast, @scaling01 advised against using GPT-4.1-nano, calling it a "terrible model" and saying the GPT-4.1 API version is worse than Optimus Alpha.

Users also compared GPT-4.1 to other models and leaderboards. @omarsar0 highlighted that GPT-4.1 shows strong improvements over GPT-4o on internal Windsurf AI benchmarks. Some commenters juxtaposed GPT-4.1 with models such as DeepSeek, Gemini and others in ongoing benchmark conversations, though those comparisons were expressed as user commentary rather than official benchmark releases.

OpenAI is also consolidating its model lineup. @OpenAIDevs announced that GPT-4.5 Preview in the API will be deprecated starting today and fully turned off on July 14, stating GPT-4.1 offers improved or similar performance. Separately, @DanHendrycks suggested that free access to GPT-4.1 mini on ChatGPT might be limited to encourage ChatGPT Plus subscriptions among students.
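Teams preparing for the turn-off may want to find and rewrite any request configuration that still names the deprecated model. The sketch below is one hypothetical way to do that. The API identifiers `gpt-4.5-preview` and `gpt-4.1` are assumptions (the posts above give only display names), and `migrate_model` is an illustrative helper, not part of any SDK.

```python
from __future__ import annotations

# Assumed API identifiers: deprecated model -> suggested replacement.
DEPRECATED_MODELS = {
    "gpt-4.5-preview": "gpt-4.1",
}


def migrate_model(config: dict) -> tuple[dict, bool]:
    """Return (config, changed).

    If config["model"] names a deprecated model, return a copy with the
    replacement model and changed=True. Otherwise return the config
    unchanged with changed=False. The input dict is never mutated.
    """
    replacement = DEPRECATED_MODELS.get(config.get("model"))
    if replacement is None:
        return config, False
    return {**config, "model": replacement}, True


if __name__ == "__main__":
    old = {"model": "gpt-4.5-preview", "temperature": 0.2}
    new, changed = migrate_model(old)
    print(changed, new)
```

Swapping the name is only the mechanical step. "Improved or similar performance" is OpenAI's claim, so rerunning an existing evaluation set against the replacement model before the cutoff is a sensible precaution.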

Why it matters

A 1 million-token context window, combined with the claimed gains in instruction following and coding, could change how teams build code-aware assistants, debuggers and long-document workflows, since those applications depend on large contexts and precise compliance with formatting and instructions. The much cheaper mini and nano variants give engineers building agent stacks or high-volume APIs a direct cost-performance trade-off to weigh, and the planned deprecation of GPT-4.5 Preview signals that OpenAI is consolidating traffic onto GPT-4.1.

What to watch

Watch July 14, when GPT-4.5 Preview is scheduled to be turned off; it will show how smoothly teams migrate to GPT-4.1. Also look for independent confirmation of the SWE-bench numbers and of Windsurf AI's internal claims, and for adoption signals such as third-party integrations beyond LlamaIndex and uptake of the nano tier at its quoted prices.

Links cited in social posts include OpenAI's GPT-4.1 page, available at https://openai.com/index/gpt-4-1/.

Selected GPT-4.1 claims and figures
| Item | Value | Source |
| --- | --- | --- |
| Availability | GPT-4.1, GPT-4.1 mini, GPT-4.1 nano available in the API | @sama announced availability in the API |
| Context window | Up to 1 million tokens | @sama |
| SWE-Bench Verified score | 54 | @kevinweil noted a 54 score on SWE-bench verified |
| SWE-Bench Verified alternate | 55% | @polynoamial stated GPT-4.1 achieves 55% on SWE-Bench Verified |
| Windsurf internal improvement vs GPT-4o | 60% improvement | Cited by @omarsar0 from Windsurf AI on internal benchmarks |
| Reduces reading unnecessary files | 40% less need to read unnecessary files | Windsurf AI claim cited by @omarsar0 |
| Modifies unnecessary files less | 70% less | Windsurf AI claim cited by @omarsar0 |
| GPT-4.1-nano input price | $0.10 per 1M input tokens ($0.03 cached) | @stevenheidel |
| GPT-4.1-nano output price | $0.40 per 1M output tokens | @stevenheidel |
| GPT-4.5 preview deprecation | Deprecated starting today; fully turned off on July 14 | @OpenAIDevs announced deprecation and turn-off date |

Written by The Brieftide · Source: Smol AI News
