This is based on an analysis published on May 21.

"67% of claims (672 / 1,000; 95% CI: 64–70%) have at least one frontier model dissenting from the panel majority — or no majority forms at all."
Lenz: Beyond Benchmarks: Disagreement Among Frontier LLMs on Real-World Fact-Checks
https://lenz.io/research/llm-disagreement

"Frontier LLMs disagree on 67% of real-world fact-checks, raising concerns for AI systems in legal, financial, and high-stakes production environments."
The New Stack: Why GPT-5.4, Claude, and Gemini can’t agree on basic, real-world facts
https://thenewstack.io/frontier-llm-factcheck-disagreement/

@TheNewStack #LLM