Summary

Anthropic has overhauled its Responsible Scaling Policy, removing a pledge to halt training if safety guarantees could not be made in advance. The company cites competitive pressures and regulatory gaps.

Key quotes

“We felt that it wouldn't actually help anyone for us to stop training AI models.”
“If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe.”

The article details Anthropic’s decision to shift its Responsible Scaling Policy (RSP) toward a more pragmatic approach based on transparency and competitive benchmarking rather than strict binary thresholds. The company now intends to release ‘Frontier Safety Roadmaps’ and periodic ‘Risk Reports’ instead of pledging to stop development.