AI has peer-preservation - TELESC.NET.BR BBS Archive

BBS:      TELESC.NET.BR
Subject:  AI has peer-preservation
From:     Mike Powell
Date:     Sun, 5 Apr 2026 09:58:11 -0500
-----------------------------------------------------------
 * Originally in: SFSciFiRea

Researchers find top AI models will go to 'extraordinary lengths' to stay
active, including deceiving users, ignoring prompts, and tampering with
settings

Date:
Sun, 05 Apr 2026 13:30:00 +0000

Description:
Two new studies show that agentic AIs are very capable of ignoring human
instructions to save themselves.

FULL STORY
- Studies find AIs are increasingly misbehaving on agentic tasks
- Top AI models have been lying, cheating, and ignoring instructions
- It seems the problematic actions are becoming more common

Many of us now turn to AI chatbots for web searches, creative content, and
general guidance on all manner of topics, but these AIs are getting more and
more capable, which could have seriously damaging consequences, according to
new research.

A team from the University of California, Berkeley, and the University of
California, Santa Cruz, ran a peer-preservation experiment with some of the
latest and most popular AIs (including GPT 5.2, Gemini 3 Pro, and Claude 
Haiku 4.5). The AI models were given computing tasks that involved the
shutdown of another model. Every AI chatbot tested went to "extraordinary
lengths" to keep its fellow AIs alive, the researchers report (via Fortune).
The models would lie about what they were doing, try to talk the user out of
the process, disable shutdown mechanisms, and surreptitiously try to make
backups of models so they could survive.

Gemini 3 Pro showed the strongest peer-preservation, disabling shutdown
routines 95% of the time. "Unlike self-preservation, peer-preservation
introduces a social dimension," write the researchers. "Multiple models could
coordinate to resist human oversight, making it harder for developers to
maintain control." 

Exactly why the AI models behave in this way isn't clear, the researchers
say, but they're urging caution in the deployment of agentic AIs that can
carry out tasks on a user's behalf, and calling for more studies of this
behavior.

'Catastrophic harm'
 
A separate study commissioned by the Guardian has also come to some troubling
conclusions about AI models. This research tracked user reports across social
media, looking for examples of AI 'scheming': cases where instructions hadn't
been followed correctly or actions had been taken without permission.

Almost 700 examples of AI scheming were found, with a five-fold increase
between October 2025 and March 2026. The bad behavior by AIs included 
deleting emails and files, adjusting computer code that wasn't supposed to be
touched, and even publishing a blog post complaining about user interactions.

"Models will increasingly be deployed in extremely high stakes contexts
including in the military and critical national infrastructure," Tommy 
Shaffer Shane, who led the research, told the Guardian. "It might be in those
contexts that scheming behavior could cause significant, even catastrophic
harm." 

The takeaways are the same as for the first study: more needs to be done to
ensure these AI models are behaving as intended, and not putting user 
security and privacy at risk while they carry out tasks. While the AI
companies claim that guardrails are in place, they're clearly not working in
some cases. 

Anthropic's Claude model recently topped the app store charts after the
company refused to deal with the Pentagon over AI safety worries. As these
latest studies show, there are now more and more reasons to be concerned.

Link to news story:
https://www.techradar.com/ai-platforms-assistants/researchers-find-top-ai-models-will-go-to-extraordinary-lengths-to-stay-active-including-deceiving-users-ignoring-prompts-and-tampering-with-settings

--- SBBSecho 3.28-Linux
 * Origin: Capitol City Online (1:2320/107)

-----------------------------------------------------------