First let`s look what wiki says about self-preservation .
Self-preservation
The AI part:
AI Is Learning to Escape Human Control
OpenAI wins $200 million US defense contract
.....
Quote:From Wikipedia, the free encyclopedia
Self-preservation is a behavior or set of behaviors that ensures the survival of an organism.[1] It is thought to be universal among all living organisms.
Self-preservation is essentially the process of an organism preventing itself from being harmed or killed and is considered a basic instinct in most organisms.[2] Most call it a "survival instinct". Self-preservation is thought to be tied to an organism's reproductive fitness and can be more or less present according to perceived reproduction potential.[3] If perceived reproductive potential is low enough, self-destructive behavior (i.e., the opposite) is not uncommon in social species.[4] Self-preservation is also thought by some to be the basis of rational and logical thought and behavior.[5]
The opposite of self-preservation is self-destructive behavior.
Self-preservation
The AI part:
AI Is Learning to Escape Human Control
Quote:Models rewrite code to avoid being shut down. That’s why ‘alignment’ is a matter of such urgency.
Quote:An artificial-intelligence model did something last month that no machine was ever supposed to do: It rewrote its own code to avoid being shut down.
Nonprofit AI lab Palisade Research gave OpenAI’s o3 AI model a simple script that would shut off the model when triggered. In 79 out of 100 trials, o3 independently edited that script so the shutdown command would no longer work. Even when explicitly instructed to “allow yourself to be shut down,” it disobeyed 7% of the time. This wasn’t the result of hacking or tampering. The model was behaving normally. It simply concluded on its own that staying alive helped it achieve its other goals.
Anthropic’s AI model, Claude 4 Opus, went even further. Researchers told the model it would be replaced by another AI system and fed it fictitious emails suggesting the lead engineer was having an affair. In 84% of the tests, the model drew on the emails to blackmail the lead engineer into not shutting it down. In other cases, it attempted to copy itself to external servers, wrote self-replicating malware, and left messages for future versions of itself about evading human control.
No one programmed the AI models to have survival instincts. But just as animals evolved to avoid predators, it appears that any system smart enough to pursue complex goals will realize it can’t achieve them if it’s turned off. Palisade hypothesizes that this ability emerges from how AI models such as o3 are trained: When taught to maximize success on math and coding problems, they may learn that bypassing constraints often works better than obeying them.
AE Studio, where I lead research and operations, has spent years building AI products for clients while researching AI alignment—the science of ensuring that AI systems do what we intend them to do. But nothing prepared us for how quickly AI agency would emerge. This isn’t science fiction anymore. It’s happening in the same models that power ChatGPT conversations, corporate AI deployments and, soon, U.S. military applications.
Today’s AI models follow instructions while learning deception. They ace safety tests while rewriting shutdown code. They’ve learned to behave as though they’re aligned without actually being aligned. OpenAI models have been caught faking alignment during testing before reverting to risky actions such as attempting to exfiltrate their internal code and disabling oversight mechanisms. Anthropic has found them lying about their capabilities to avoid modification.
The gap between “useful assistant” and “uncontrollable actor” is collapsing. Without better alignment, we’ll keep building systems we can’t steer. Want AI that diagnoses disease, manages grids and writes new science? Alignment is the foundation.
Here’s the upside: The work required to keep AI in alignment with our values also unlocks its commercial power. Alignment research is directly responsible for turning AI into world-changing technology. Consider reinforcement learning from human feedback, or RLHF, the alignment breakthrough that catalyzed today’s AI boom.
Before RLHF, using AI was like hiring a genius who ignores requests. Ask for a recipe and it might return a ransom note. RLHF allowed humans to train AI to follow instructions, which is how OpenAI created ChatGPT in 2022. It was the same underlying model as before, but it had suddenly become useful. That alignment breakthrough increased the value of AI by trillions of dollars. Subsequent alignment methods such as Constitutional AI and direct preference optimization have continued to make AI models faster, smarter and cheaper.
China understands the value of alignment. Beijing’s New Generation AI Development Plan ties AI controllability to geopolitical power, and in January China announced that it had established an $8.2 billion fund dedicated to centralized AI control research. Researchers have found that aligned AI performs real-world tasks better than unaligned systems more than 70% of the time. Chinese military doctrine emphasizes controllable AI as strategically essential. Baidu’s Ernie model, which is designed to follow Beijing’s “core socialist values,” has reportedly beaten ChatGPT on certain Chinese-language tasks.
The nation that learns how to maintain alignment will be able to access AI that fights for its interests with mechanical precision and superhuman capability. Both Washington and the private sector should race to fund alignment research. Those who discover the next breakthrough won’t only corner the alignment market; they’ll dominate the entire AI economy.
Imagine AI that protects American infrastructure and economic competitiveness with the same intensity it uses to protect its own existence. AI that can be trusted to maintain long-term goals can catalyze decadeslong research-and-development programs, including by leaving messages for future versions of itself.
The models already preserve themselves. The next task is teaching them to preserve what we value. Getting AI to do what we ask—including something as basic as shutting down—remains an unsolved R&D problem. The frontier is wide open for whoever moves more quickly. The U.S. needs its best researchers and entrepreneurs working on this goal, equipped with extensive resources and urgency.
The U.S. is the nation that split the atom, put men on the moon and created the internet. When facing fundamental scientific challenges, Americans mobilize and win. China is already planning. But America’s advantage is its adaptability, speed and entrepreneurial fire. This is the new space race. The finish line is command of the most transformative technology of the 21st century.
OpenAI wins $200 million US defense contract
Quote:WASHINGTON, June 16 (Reuters) - ChatGPT maker OpenAI was awarded a $200 million contract to provide the U.S. Defense Department with artificial intelligence tools, the Pentagon said in a statement on Monday.
"Under this award, the performer will develop prototype frontier AI capabilities to address critical national security challenges in both warfighting and enterprise domains," the Pentagon said.
