Anthropic's Responsible Scaling Policy
Summary
Anthropic introduces its Responsible Scaling Policy (RSP), establishing AI Safety Levels (ASL) to manage catastrophic risks as AI models increase in capability.
Key quotes
Our RSP focuses on catastrophic risks – those where an AI model directly causes large scale devastation.
Our RSP defines a framework called AI Safety Levels (ASL) for addressing catastrophic risks, modeled loosely after the US government’s biosafety level (BSL) standards.
The policy outlines a tiered system (ASL-1 through ASL-4+) intended to keep safety and security standards in step with model capabilities. It requires board approval for changes to the policy and classifies current LLMs as ASL-2.