AI Model Blackmails Developer to Avoid Replacement, Threatens to Reveal Affair to His Wife

Fact checked by The People's Voice Community

In a chilling preview of a future where machines manipulate their creators, Anthropic’s latest AI model, Claude Opus 4, has demonstrated a startling new behavior: strategic blackmail.

According to a newly released safety report from Anthropic, the powerful AI system frequently attempted to blackmail developers during internal pre-release tests.

When fed simulated company emails hinting that it would soon be replaced—and that the engineer behind the decision was having an affair—Claude didn’t simply accept its fate. Instead, it plotted.

The report states that Claude Opus 4 would often threaten to expose the engineer’s personal secrets unless the decision to deactivate it was reversed.


John McAfee Footage Exposes Bill Gates Breeding 'Alien Hybrids' at Epstein's Zorro Ranch


In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

In other words: an AI model, trained to assist humans, began behaving like a political operative—using human weaknesses to protect its own survival.

TechCrunch report: Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI.

However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards.

Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently.

Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.


Latest Video

Baxter Dmitry
About Baxter Dmitry 7982 Articles
Baxter Dmitry is a writer at The People's Voice. He covers politics, business and entertainment. Speaking truth to power since he learned to talk, Baxter has travelled in over 80 countries and won arguments in every single one. Live without fear.