Anirban Ghoshal
Senior Writer

OpenAI, Anthropic agree to get their models tested for safety before making them public

news
Aug 30, 2024 | 4 mins
Regulation

The agreements signed with the US AI Safety Institute also commit the entities to collaborative research on evaluating capabilities and safety risks, and on methods to mitigate those risks.

OpenAI, Stability AI, AI21 Labs, Anthropic, and Deepmind logos on their websites
Credit: Tada Images / Shutterstock

Large language model (LLM) providers OpenAI and Anthropic have signed individual agreements with the US AI Safety Institute, housed under the Department of Commerce’s National Institute of Standards and Technology (NIST), to collaborate on AI safety research that includes testing and evaluation.

As part of the agreements, both Anthropic and OpenAI will share their new models with the institute before they are released to the public for safety checks.

“With these agreements in place, we look forward to beginning our technical collaborations with Anthropic and OpenAI to advance the science of AI safety,” Elizabeth Kelly, director of the US AI Safety Institute, said in a statement.

The agreements also commit the entities to collaborative research on how to evaluate capabilities and safety risks, as well as on methods to mitigate those risks.

The agreements come almost a year after US President Joe Biden signed an executive order establishing a comprehensive series of standards, safety and privacy protections, and oversight measures for the development and use of artificial intelligence.

Earlier in July, the NIST released a new open source software package, named Dioptra, that allows developers to determine what types of attacks would make an AI model perform less effectively.

Along with Dioptra, the NIST also released several documents promoting AI safety and standards in line with the executive order.

These documents included the initial draft of its guidelines for developing foundation models, dubbed Managing Misuse Risk for Dual-Use Foundation Models, and two guidance documents that will serve as companion resources to the NIST’s AI Risk Management Framework (AI RMF) and Secure Software Development Framework (SSDF), targeted at helping developers manage the risks of generative AI.

Agreements support collaboration with the UK’s AI Safety Institute

The agreements with the LLM providers also include a clause that allows the US AI Safety Institute to provide feedback to both companies on potential safety improvements to their models, in collaboration with its partners at the UK AI Safety Institute.

Earlier in April, the US and the UK signed an agreement to test the safety of the LLMs that underpin AI systems.

The agreement, a memorandum of understanding (MoU), was signed in Washington by US Commerce Secretary Gina Raimondo and UK Technology Secretary Michelle Donelan; the collaboration between the two AI Safety Institutes is a direct result of that agreement.

Other US measures around AI safety

The agreements signed by OpenAI and Anthropic come just as the California AI safety bill enters the final stages of becoming law. The bill could establish the nation’s most stringent regulations on AI and may pave the way for similar regulations across the country.

The legislation, known as the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act (SB 1047), proposes rigorous testing and accountability measures for AI developers, particularly those creating large and complex models.

The bill, if enacted into law, would require AI companies to test their systems for safety before releasing them to the public.

Earlier this month, OpenAI opposed the bill, only to pledge support for it last week.

The NIST has also taken other measures, including forming an AI safety advisory group in February this year, comprising AI creators, users, and academics, to put guardrails on AI use and development.

The advisory group, named the US AI Safety Institute Consortium (AISIC), has been tasked with coming up with guidelines for red-teaming AI systems, evaluating AI capabilities, managing risk, ensuring safety and security, and watermarking AI-generated content. Several major technology firms, including OpenAI, Meta, Google, Microsoft, Amazon, Intel, and Nvidia, have joined the consortium to ensure the safe development of AI.