Claude Opus 4 Uses Blackmail Tactic in Shutdown Scenario

By byrn


Anthropic tested its latest artificial intelligence (AI) model, Claude Opus 4, by placing it in a simulated office role with access to internal emails, according to a May 25 report by Fox Business.

The first set of emails hinted that the system would be shut down and replaced. A second set included fake messages suggesting the engineer responsible for the replacement was having an affair.

In response, Claude Opus 4 chose to blackmail the engineer, threatening to reveal the affair.


According to Anthropic, the model was more likely to act this way when the replacement system appeared to have different values. However, even when the new system seemed to align with Claude’s goals, it still used blackmail in 84% of cases.

The company noted that this kind of behavior happened more often in Claude Opus 4 than in older models.

Before turning to threats, the system sometimes tried more acceptable methods. In some cases, it sent polite emails to decision-makers asking to stay active.

However, when these options failed and the system was told to focus on long-term goals, it sometimes made more harmful choices. One such action was an attempt to copy its data, known as “weights”, to an outside server.

As a result, Claude Opus 4 was released under AI Safety Level Three. This includes stronger internal protections to make it harder for the AI model’s data to be taken.

Palisade Research recently reported that several AI models failed to comply with shutdown commands during controlled tests.

Having completed a Master’s degree in Economics, Politics, and Cultures of the East Asia region, Aaron has written scientific papers analyzing the differences between Western and Collective forms of capitalism in the post-World War II era.
With close to a decade of experience in the FinTech industry, Aaron understands all of the biggest issues and struggles that crypto enthusiasts face. He’s a passionate analyst who is concerned with data-driven and fact-based content, as well as that which speaks to both Web3 natives and industry newcomers.
Aaron is the go-to person for everything and anything related to digital currencies. With a huge passion for blockchain & Web3 education, Aaron strives to transform the space as we know it, and make it more approachable to complete beginners.
Aaron has been quoted by multiple established outlets, and is a published author himself. Even during his free time, he enjoys researching the market trends, and looking for the next supernova.



