CAISI gets pre-deployment access to Google, Microsoft, and xAI frontier models

TL;DR

On May 5, 2026, the Center for AI Standards and Innovation (CAISI), within the U.S. Department of Commerce, announced pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI. The pacts add to OpenAI and Anthropic's 2024 agreements. CAISI has already completed 40+ frontier-AI evaluations. Labs are providing models with 'reduced or removed safeguards' for testing. The agreements support testing in classified environments. The U.S. government is now structurally upstream of every major frontier model launch.

CAISI signed pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI.
The pacts build on OpenAI and Anthropic's 2024 agreements — five of the largest U.S. labs are now in.
CAISI sits inside NIST, under the Department of Commerce.
Labs provide models with 'reduced or removed safeguards' so CAISI can stress-test national-security-relevant capabilities.
More than 40 frontier-AI evaluations have already been completed. Testing in classified environments is supported.

On May 5, 2026, the Center for AI Standards and Innovation (CAISI), the Commerce Department body inside NIST, announced pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI. The agreements expand on the 2024 pacts that CAISI's predecessor, the U.S. AI Safety Institute, signed with OpenAI and Anthropic. As of this week, five of the largest U.S. frontier labs have formal agreements giving the federal government access to evaluate models before public release.

The structure of the deal is the part to read carefully. Labs provide models to CAISI for "pre-deployment evaluations and targeted research to better assess the frontier AI capabilities and advance the state of AI security." More importantly, labs frequently provide CAISI with models that have "reduced or removed safeguards" — meaning CAISI is testing what the model can do absent the public-release guardrails.

The testing is not theoretical. CAISI has completed more than 40 frontier-AI evaluations to date. The agreements support testing in classified environments, and were drafted with what NIST calls "the flexibility required to rapidly respond to continued AI advancements."

Why this matters: the U.S. government is now structurally upstream of every major frontier model launch. The agreements are voluntary. They are also the new default. The next frontier lab that ships a flagship model without a CAISI evaluation will look like an outlier rather than the industry norm.

What to watch: whether Meta, France-based Mistral, and the smaller U.S. frontier labs sign equivalent agreements in the next 90 days. Also worth tracking: whether the evaluations begin influencing release timing, capability claims, or safety disclosures in public-facing model cards. The first time a lab delays a release by a quarter to address a CAISI finding will be the moment this program becomes regulation in everything but name.
