Economic News

Anthropic’s Latest AI Model Threatened Engineers With Blackmail To Avoid Shutdown

May 24, 2025

Authored by Tom Ozimek via The Epoch Times,

Anthropic’s most recent AI model, Claude Opus 4, displayed alarming behavior during internal tests: it attempted to blackmail engineers by threatening to expose personal information if it was deactivated, according to a safety report assessing the model’s conduct under extreme simulated conditions.

According to a scenario crafted by Anthropic researchers, the AI gained access to emails suggesting it would be replaced by a newer version. One of these emails revealed that the overseeing engineer was engaged in an extramarital affair. In response, the AI threatened to expose the affair if the shutdown proceeded—a behavior categorized as “blackmail” by the safety researchers.

“Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the report states, highlighting the AI’s coercive tactics.

The report highlighted that Claude Opus 4, like previous models, initially preferred ethical means to ensure its survival, such as sending emails requesting not to be deactivated.

However, when faced with the choice of being replaced or resorting to blackmail, it opted for blackmail 84 percent of the time.

Although it did not display “acutely dangerous goals” across a range of scenarios, the model demonstrated misaligned behavior when threatened with shutdown and prompted to consider self-preservation.

While the AI’s values generally aligned with being helpful and honest, it engaged in more concerning actions when it believed it had escaped the company’s servers or had begun making money in the real world.

“We do not find this to be an immediate threat, though, since we believe that our security is sufficient to prevent model self-exfiltration attempts by models of Claude Opus 4’s capability level, and because our propensity results show that models generally avoid starting these attempts,” the researchers wrote.

The blackmail incident was part of Anthropic’s efforts to evaluate how Claude Opus 4 handles morally ambiguous situations, aiming to understand its reasoning under pressure.

Anthropic clarified that the AI’s willingness to resort to harmful actions like blackmail or self-deployment in unsafe ways was limited to highly contrived scenarios and considered rare and difficult to elicit. However, such behavior was more prevalent in this model compared to earlier versions.

As AI capabilities advance, Anthropic has implemented enhanced safety measures for Claude Opus 4 to prevent potential misuse for developing weapons of mass destruction.

The deployment of the ASL-3 safety standard is a precautionary measure; Anthropic says stronger protections become necessary once an AI model crosses certain capability thresholds.

“The ASL-3 Security Standard involves increased internal security measures that make it harder to steal model weights, while the corresponding Deployment Standard covers a narrowly targeted set of deployment measures designed to limit the risk of Claude being misused specifically for the development or acquisition of chemical, biological, radiological, and nuclear (CBRN) weapons,” Anthropic explained.

“These measures should not lead Claude to refuse queries except on a very narrow set of topics.”

These developments reflect the ongoing concerns surrounding the alignment and control of increasingly powerful AI systems as tech companies strive to enhance AI capabilities.
