The company also released a new automated method for measuring political bias and published results suggesting its latest model, Claude Sonnet 4.5, outperforms or matches competitors on neutrality.
The company’s neutrality push indeed goes well beyond the typical marketing language. Anthropic says it has rewritten Claude’s system prompt—its always-on instructions—to include guidelines such as avoiding unsolicited political opinions, refraining from persuasive rhetoric, using neutral terminology, and being able to “pass the Ideological Turing Test” when asked to articulate opposing views.
The firm has also trained Claude to avoid swaying users in “high-stakes political questions,” implying one ideology is superior, and pushing users to “challenge their perspectives.”
Anthropic’s evaluation found Claude Sonnet 4.5 scored a 94% “even-handedness” rating, roughly on par with Google’s Gemini 2.5 Pro (97%) and Elon Musk’s Grok 4 (96%), and higher than OpenAI’s GPT-5 (89%) and Meta’s Llama 4 (66%). Claude also showed low refusal rates, meaning the model was typically willing to engage with both sides of political arguments rather than declining out of caution.
Correction, Nov. 14, 2025: A previous version of this article mischaracterized Anthropic’s timeline and impetus for political bias training in its AI model. Training began in early 2024.



