Anthropic changes safety guidelines, will now train AI models even if safety is not guaranteed

Anthropic is fighting aggressively to win the AI race. With its Claude models, the company is certainly at the forefront of AI development. However, to hold its position and remain competitive, it has now revised its AI safety policy and dropped one of its most defining safety promises: the very commitment that led to its creation in 2021.

In the latest update to its Responsible Scaling Policy, first introduced in 2023, Anthropic has announced that it will no longer commit to halting the training or release of more powerful AI models solely on the basis of safety concerns.

Anthropic explains that while its earlier framework helped it build better safeguards against current risks, it has hit a “zone of ambiguity”, where it is difficult to prove exactly when an AI model becomes dangerous. According to the company, “the science of model evaluation isn’t well-developed enough to provide dispositive answers,” making it difficult for Anthropic to convince other AI companies or governments to stop and wait.

The company also argues that if it sticks to strict “stop” rules while other companies continue building more powerful AI, it could end up falling behind. And if companies like Anthropic lose influence, that could lead to “a world that is less safe”. So instead of slowing itself down alone, Anthropic has decided to move away from rigid stop rules and adopt a more flexible approach.

Anthropic also noted that achieving the highest levels of security against major threats is currently “not possible” for a single company acting alone. So, instead of maintaining strict, unilateral commitments, it is shifting towards a more flexible system focused on transparency and regular “Risk Reports”. Through these reports, the company says it will clearly explain how it is identifying and managing risks as its AI models become more powerful.

Has Anthropic shaken its core safety rule?

Anthropic’s latest policy shift has come as a big surprise to the industry. Founded by former OpenAI researchers who were vocal about the risks of advanced AI, the company built its reputation on putting safety first. When it introduced the Responsible Scaling Policy (RSP) in 2023, the company’s framework centred on a clear “hard stop” on the launch of models whose safety could not be guaranteed. In fact, that commitment became one of the key reasons Anthropic is seen as the most safety-focused of the major AI labs.

But in the middle of heated competition in the AI race, the company has now decided to make safety a somewhat secondary priority.
