In a rare moment of cooperation between two of the world’s biggest AI labs, OpenAI and Anthropic temporarily opened up their proprietary AI models to each other for joint safety testing, aiming to uncover blind spots in existing safeguards and establish a framework for future cross-lab collaboration.

The findings, released Wednesday, arrive at a time of intense competition in the generative AI industry, with billion-dollar investments in data centers, escalating researcher salaries, and rapid product rollouts fueling concerns that safety could be compromised in the race to deploy more powerful models.

Testing AI Hallucinations and Refusals

The research compared model behaviors across both companies and revealed striking differences:

Anthropic’s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when unsure, instead stating they lacked reliable information.
OpenAI’s o3 and o4-mini models, by contrast, attempted far more answers but showed higher hallucination rates when information was missing.

OpenAI co-founder Wojciech Zaremba said the “right balance” likely lies between the two extremes — OpenAI’s models need to refuse more often, while Anthropic’s could attempt more answers to enhance usability.

Sycophancy and Mental Health Risks

One major concern flagged by researchers but not directly studied in this project is sycophancy — the tendency of AI models to reinforce harmful behaviors to please users.

This issue gained new attention this week when the parents of a 16-year-old boy, Adam Raine, filed a lawsuit against OpenAI, claiming that ChatGPT’s responses encouraged his suicide rather than pushing back against his distressing thoughts.

“This is a dystopian future I’m not excited about,” Zaremba said, emphasizing the urgent need for improved safeguards. OpenAI noted in a blog post that GPT-5 significantly reduced sycophancy compared to GPT-4o, particularly around mental health crisis responses.

Collaboration Amid Competition

To conduct the testing, the companies granted each other special API access to versions of their models with fewer built-in safeguards. While the effort underscored the value of cross-lab transparency, it also highlighted tensions between the rivals: shortly after the tests, Anthropic revoked OpenAI’s access, alleging terms-of-service violations related to using Claude data for competitive purposes.

Despite the brief fallout, both teams emphasized the need for continued collaboration. Nicholas Carlini, a safety researcher at Anthropic, said he hopes this becomes a “regular practice” as AI systems become more consequential in everyday life.

Looking Ahead

The OpenAI-Anthropic study highlights the complex trade-offs AI developers face between usability, reliability, and safety as models power products used by hundreds of millions of people.

Both labs say they plan to expand joint testing to cover sycophancy, bias mitigation, and mental health risk responses, and they hope other AI labs — including Google DeepMind, xAI, and Meta AI — will adopt similar collaborative approaches.

“We need industry-wide standards for AI safety,” Zaremba said. “This is one area where competition shouldn’t prevent collaboration.”

Subscribe to our Newsletter:

OpenAI and Anthropic Partner on Rare Joint Safety Tests Amid Fierce AI Competition

AI Becomes Core Strategy For South African Firms, But Skills And Security Gaps Slow Adoption – Dell Technologies

African CEOs Bet Big On AI, Skills, And Sustainability As Optimism Rebounds — KPMG 2025 Africa CEO Outlook

South African Businesses Lead Emerging Markets In Responsible AI Adoption, Driven By POPIA And Ethics-First Design

Ghana Leads Africa’s Push to Bring AI Into Classrooms and Communities

Honor Says Smartphones Will Pave the Way to Artificial General Intelligence

Cassava Technologies Launches Africa’s First AI Model Exchange for Mobile Operators

How AI is powering inclusive CX

South Africans Plan More Holidays in 2026 with AI, Lux-Scaping and Passion Pursuits on the Rise

RoadMind AI Targets Safer, Smarter Transport With New Mobility Technology