OpenAI & Anthropic Deepen Safeguards Through Collaboration with US CAISI & UK AISI

AI safety, man using Ipad with enhanced safety

As AI models become more pervasive in critical systems, safety and security are no longer “nice to have”- they are fundamental for trust, compliance, and business resilience.

Recently, OpenAI and Anthropic have published updates on their ongoing, voluntary collaborations with two government standards and security bodies; the US Centre for AI Standards & Innovation (CAISI) and the UK AI Security Institute (AISI), with concrete outcomes that matter for enterprises.

This post summarises what’s new, what has been achieved, and what these developments mean for companies building or deploying AI-driven products in 2025.

Key Highlights of the Collaboration


Joint Red-Teaming of Agentic and Frontier Systems
CAISI conducted red-teaming on OpenAI’s agentic systems, including “ChatGPT Agent”, discovering novel vulnerabilities that could allow attackers, under certain conditions, to bypass security protections, impersonate users, or gain unauthorised remote access.

These vulnerabilities were addressed rapidly; fixes were deployed within one business day of being reported. Biosecurity and Biological Misuse Safeguards

UK AISI has been working with OpenAI to test safeguards against biological misuse, including in both ChatGPT Agent and GPT-5. The testing involved the grant of early access to internal prototypes, variants with certain guardrails removed, access to chain-of-thought monitoring from internal safety models, and somewhat aggressive “probe” testing (removing or disabling some mitigations during testing) to expose vulnerabilities.

These efforts are ongoing and iterative, not tied to a single release; they include frequent feedback loops between OpenAI and UK AISI. Strengthened Safeguard Architectures and Feedback Processes

Both companies have allowed CAISI and AISI deeper access to system designs, to guardrail architectures, and to classifier or safeguard prototypes. This includes versions with weaker protection so that red-teamers can identify weak points.

Rapid feedback loops have been central: discoveries of security issues or vulnerabilities are reported and fixed quickly, and improvements are integrated in subsequent versions.

Detection &Mitigation of Specific Attack Types
Examples of vulnerabilities found include prompt injection, obfuscation/encoding (cypher-based or character substitution) to bypass filters or classifiers, universal jailbreaks that exploit combinations of weak points, etc.

Improvements have included enhancing classifier robustness and restructuring safeguard architecture to address entire classes of weaknesses rather than just patches for individual exploits.

Office meeting on AI Safety, planning for the future.

Enterprise Implications

For large companies, product owners, solution architects, and compliance leads, here are the key takeaways:

Higher Vendor Evaluation Expectations
When procuring AI systems, require evidence of external red-teaming, security testing, and collaboration with recognised safety bodies.

Pre-Deployment Security Testing
Build security testing into your AI product lifecycle before launch, including threat modelling, guardrail and classifier testing.

Rapid Response Infrastructure
Establish processes to patch vulnerabilities quickly, ideally within hours or days, to protect trust and compliance.

Biosecurity and Domain-Specific Safeguards
If operating in sensitive sectors such as biotech or pharmaceuticals, ensure safeguards against biological misuse are in place and thoroughly tested.

Transparency Around Safeguards
Choose vendors who can provide clarity on how their safety systems and mitigations work.

Governance and Risk Management
Involve leadership, legal, security, and ethics teams in overseeing AI deployments. Include internal audits, external reviews, and compliance monitoring.

How Accelerai Supports Enterprises in This Environment

At Accelerai, we understand that the safety, security, and robustness of AI systems are now central to enterprise AI adoption. Here are ways we help:

Vendor assessment and due diligence: Evaluate third-party providers’ security practices; ensure their models and products have been through rigorous red-teaming and external testing where possible.

Security architecture consultation: Help design safeguard systems (classifier design, guardrails, mitigation layers) from the outset, not as bolt-ons.

Testing & adversarial simulation: Assist in running structured red-teaming, misuse case modelling, and domain-specific “what if” scenarios.
Governance, compliance & risk management advisory: Help you build internal oversight, workflows, and policies that align with emerging standards.

Continuous monitoring & feedback loops: Ensure your deployed AI systems have monitoring, logging, incident response plans, and periodic reassessments to adapt to new threats.

The recent updates from OpenAI and Anthropic in September 2025 show that collaboration with national AI security bodies is maturing.

The work with US CAISI and UK AISI is not just theoretical, it has already surfaced real vulnerabilities, led to rapid fixes, strengthened safeguard architectures, and raised expectations across the industry.

For enterprises seeking to build, procure, or embed AI into core operations, the bar for safety, security, and transparency is rising.

Organisations that anticipate these expectations and build them in from product strategy through deployment will not only reduce risk but also gain a competitive advantage in trust, regulatory compliance, and brand reputation.

Related articles

Contact us

Talk to us about your AI development project

We’re happy to answer any questions you may have and help you determine which of our AI services best fit your needs.

Our Services:
What happens next?
1

We look over your enquiry

2

We do a discovery and consulting call if relevant 

3

We prepare a proposal 

Talk to us about an AI Project (Suggested)

Use Streamline to define your AI project faster, clearer, and smarter than any form. Intelligent data gathering.

Use Traditional Form
By sending this message, you agree that we may store and process your data as described in our Privacy Policy.