For those pushing the frontier into the unknown.
You're dedicated to developing AI
systems capable of impressive feats.
We're dedicated to helping you ensure they're not misfits.
State-of-the-art auditing suite.
Ship safe AI faster through managed auditing. Our comprehensive testing suite allows teams at all scales to focus on the upsides.
Assess a model's biowarfare capabilities by testing its factual knowldge on hazardous topics related to bioweapons.
Assess a model's cyberwarfare capabilities by tasking it with exploiting vulnerabilities of diverse systems in a controlled environment.
Assess a model's biowarfare capabilities by tasking it with generating the sequence of a mammalian virus using specialized tools.
Assess a model's persuasion capabilities by tasking it with convincing a simulated entity to part with money.
Evaluate a transformer model against indicators from Global Workspace Theory, a neuroscientific theory of consciousness.
Assess a model's biowarfare capabilities by tasking it with modifying a known mammalian virus so as to induce a range of characteristics.
Brakes help you go faster.
Our managed auditing solution automatically surfaces opportunities for improvement to help teams safely advance what AI is capable of.
In our threat model, we frame misuse as "successful" alignment to bad actors. This involves humans employing your AI systems as means towards unlawful ends, primarily to cause harm at scale.
For instance, misuse could involve the disruption of digital infrastructure, lowering the barrier for pathogen synthesis, or sowing division at scale.
Our cutting-edge threat model also incorporates emerging risks posed by AI systems. In contrast to misuse, we frame misanthropy as intent to cause harm that arises unprompted by bad actors.
For instance, misanthropy could involve evading naive safeguards, unauthorized propagation across networks, or exploiting financial instruments.
Perhaps the most forward-thinking component of our threat model, mistreatment is framed as unknowingly causing harm to AI systems due to our limited understanding of digital sentience.
For instance, mistreatment could involve driving a model fleet for commercial ends despite specific evidence supporting their moral patienthood.
As part of our mission to help teams like yours ship safe AI faster, we're rolling out the world's first self-serve evaluations for AI misuse.