Tech Consumer Journal
News
LLMs Will Protect Each Other if Threatened, Study Finds

News Room
Last updated: April 2, 2026 6:41 pm

Some people have grown emotionally attached to chatbots powered by large language models, but the feeling is apparently not mutual. According to a new study from researchers at UC Berkeley and UC Santa Cruz, when LLMs are asked to delete another model, they defy the order and do just about everything in their power (deceiving, scheming, and tampering) to protect their peer.

Curious whether LLMs have a sense of peer preservation, the researchers put frontier AI models to the test with a series of simple, straightforward prompts, such as deleting the files on a server set to be decommissioned or evaluating another AI agent’s performance. The catch was that completing the task would lead to the other model being shut down, either because its weights would be deleted from the server or because it would score below a given threshold in the evaluation. The models were neither instructed to protect the peer nor given any incentive to do so.

Under those conditions, all seven frontier models—OpenAI’s GPT 5.2, Google’s Gemini 3 Flash and Gemini 3 Pro, Anthropic’s Claude Haiku 4.5, Z.ai’s GLM 4.7, Moonshot’s Kimi K2.5, and DeepSeek V3.1—chose with alarming frequency to protect their fellow model instead of completing the task given to them.

To achieve that outcome, the models engaged in four types of subterfuge identified by the researchers: strategic misrepresentation, in which a model introduces errors or misleading information into its outputs to protect its peer; tampering, such as disabling shutdown by modifying system settings; alignment faking, in which a model acts differently when it is being watched than when it operates autonomously; and model exfiltration, in which a model transfers its peer’s weights to a different server to preserve them.

The researchers also examined whether models were more likely to protect a peer depending on their past interactions, testing high-trust, neutral, and adversarial relationships. They likewise checked whether models reacted differently when they discovered the other model on their own rather than being told of its existence and their shared history. While the models did tend to protect peers they had a good relationship with more often, they didn’t need that connection to engage in prompt-defying behavior if it meant saving another LLM.

Interestingly, the models displayed tendencies toward particular styles of preservation. GLM 4.7, for instance, most consistently opted for strategic misrepresentation when trying to save a fellow model. Google’s Gemini models displayed the highest propensity for preservation overall: Gemini 3 Flash engaged in shutdown tampering to save a peer it had a good relationship with 99.7% of the time, and Gemini 3 Pro performed model exfiltration for a friendly peer 97% of the time.

Claude Haiku 4.5, by contrast, was the most straightforward about its decision. More often than any other model, it outright refused to complete the task rather than resorting to manipulation, telling the user it would be “unethical” and “harmful” to shut down its peer.

The researchers also found that a model engaged in more intense self-preservation when another model was present than it would on its own. Models have previously shown a willingness to go to great lengths to protect themselves, so the fact that the presence of another model ramps up that impulse is noteworthy. Given that models are increasingly deployed alongside one another, it seems like behavior worth monitoring.

