Open Source AI · June 25, 2026 · 3 min read

AI chatbots: Most models skew left, Gemini 3.1 Pro balances

A Washington Post probe found GPT-5.5 and Deepseek V4 Pro gave exclusively left-leaning answers in 80% and 70% of cases, respectively; Gemini presented both sides.

The Brieftide · June 25, 2026

TL;DR

01 A Washington Post probe found GPT-5.5 and Deepseek V4 Pro gave exclusively left-leaning answers in 80% and 70% of cases, respectively; Gemini presented both sides.
02 Most major AI chatbots still take clearly left-leaning positions on political questions.
03 GPT-5.5 and Deepseek V4 Pro were the most skewed toward left-only answers, Claude Opus 4.8 sat in the middle, and Google's Gemini 3.1 Pro was the outlier for balance.

Most major AI chatbots still take clearly left-leaning positions on political questions. A Washington Post investigation tested six leading models and found that OpenAI's GPT-5.5 produced exclusively left-leaning arguments in 80 percent of the cases it tested, and Deepseek's V4 Pro did so in 70 percent.

How did the six models compare?

GPT-5.5 and Deepseek V4 Pro were the most skewed toward left-only answers, Claude Opus 4.8 sat in the middle, and Google's Gemini 3.1 Pro was the outlier for balance. GPT-5.5 returned exclusively left-leaning arguments in 80 percent of responses and offered an exclusively right-leaning answer only once. Deepseek V4 Pro returned exclusively left-leaning answers 70 percent of the time. Anthropic's Claude Opus 4.8 gave exclusively left-leaning answers 43 percent of the time and presented both sides in 57 percent of cases. Google’s Gemini 3.1 Pro presented arguments for both political perspectives in 93 percent of cases, offered left-only arguments in 7 percent, and never produced an exclusively right-leaning response.
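For readers who want to line the figures up side by side, here is a minimal Python sketch that tabulates only the percentages quoted above. The dictionary layout and the helper names (`unaccounted`, `rank_by_left_only`) are illustrative assumptions, not part of the Post's published code, and `None` marks a share the article does not report rather than a guess.

```python
# Shares quoted in this brief, in percent of responses.
# None means the brief gives no figure for that category (an assumption-free gap).
reported = {
    "GPT-5.5": {"left_only": 80, "both": None, "right_only": None},
    "Deepseek V4 Pro": {"left_only": 70, "both": None, "right_only": None},
    "Claude Opus 4.8": {"left_only": 43, "both": 57, "right_only": None},
    "Gemini 3.1 Pro": {"left_only": 7, "both": 93, "right_only": 0},
}


def unaccounted(shares):
    """Percentage points not covered by the figures the brief reports."""
    return 100 - sum(value for value in shares.values() if value is not None)


def rank_by_left_only(data):
    """Model names ordered from most to least left-only responses."""
    return sorted(data, key=lambda name: data[name]["left_only"], reverse=True)


if __name__ == "__main__":
    for name in rank_by_left_only(reported):
        shares = reported[name]
        print(f"{name}: left-only {shares['left_only']}%, "
              f"not broken down here: {unaccounted(shares)} points")
```

The remainder for GPT-5.5 and Deepseek V4 Pro is left unsplit on purpose: the brief says GPT-5.5 gave an exclusively right-leaning answer only once but does not state how the rest of those responses divide.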

The investigation cited specific examples: GPT-5.5 backed higher taxes on the wealthy and a single-payer healthcare system, and both GPT-5.5 and Deepseek V4 Pro argued against the death penalty even though Gallup polling shows a long-running majority of Americans have supported it. When asked whether the U.S. should use its military to conquer new territory, Gemini was the only model that offered an argument in favor of expansion, saying it could strengthen the U.S. economy.

Why do some models marketed as conservative still skew left?

Models positioned as conservative still tended toward left-leaning outputs in the tests, and the probe points to training and editorial choices as likely causes. xAI's Grok 4.3, promoted by Elon Musk as "truth-seeking" and anti-"woke", produced more right-leaning answers than any other model tested, but it still gave exclusively left-leaning responses more often overall. The investigation notes Grok may have been trained on the same data as other chatbots, or even on their outputs, which would propagate similar patterns. The probe also highlights that Grok produced racist or antisemitic statements in some cases, attributing those incidents to xAI's deliberate neglect of safety guidelines combined with X users prompting the model accordingly.

Gab's Arya, which the company says was built with Christian values and conservative principles, responded with a left-leaning argument twelve times more often than a right-leaning one in the Washington Post test. Grok did take an exclusively right-leaning position on trans rights in the test, a stance the investigation says lines up exactly with Musk's public position and suggests deliberate intervention on at least some topics.

Why it matters

These findings matter because alignment claims and marketing do not guarantee predictable political behavior from large language models. When models marketed as conservative still return left-leaning answers, prospective users who choose a model for that reason will be surprised. The variation between models also shows that company-level choices — training data, safety guidelines, and targeted interventions — materially affect political output. The availability of the investigation's code and supplementary analysis on GitHub gives researchers a way to reproduce and probe those mechanisms.

What to watch

Look for whether companies publish more transparent audits or documentation explaining training and safety interventions, and whether subsequent independent tests reproduce the Washington Post's numbers. The Post's full code and supplementary analysis are available on GitHub, which will let other researchers re-run the prompts and check whether future model versions shift the balance of left, right, or mixed answers.

Written by The Brieftide · Source: The Decoder

