Safe, Defensive Ais To Counter Rogue Ais. How Might Humanity Defend Itself Against

You might also like

Download as rtf, pdf, or txt
Download as rtf, pdf, or txt
You are on page 1of 2

Safe, defensive AIs to counter rogue AIs.

How might humanity defend itself against


rogue AIs that surpass human intelligence in many critical ways? Let us imagine
armies of AI “trolls” on social media. Individual trolls could learn from the online
presence of the people they are aiming to influence and engage in dialogue with those
targets to sway their political opinions or entice them to take certain actions.
Alternatively, consider cyberattacks involving millions of computer viruses,
coordinated in a manner impossible for a human team to devise. Such sophisticated
attacks would exceed our usual cybersecurity-defense capabilities, which have been
designed to counter human-driven attacks.

And there is rapidly growing concern that bioweapons could be developed with the
assistance of question-answering AI systems (as per Dario Amodei’s congressional
testimony), or even constructed directly by more autonomous AI systems in the
future. What types of policies and cybersecurity defenses could be established to
12

avert or counteract such dangers? When confronted with an enemy smarter than
humans, it seems logical to employ assistance that also surpasses human intelligence.

Such an approach would require careful implementation, however, as we do not want


these AI assistants to transform into rogue AIs, either intentionally because an
operator takes unwarranted control of the AI or because the operators lose control of
it. It is important to note that we currently lack the knowledge to construct AI systems
that are guaranteed to be safe—that is, systems that will not unintentionally become
misaligned and adopt goals that conflict with our own. Unfortunately, even slight
misalignment can result in the emergence of unintended goals, such as power-seeking
and self-preservation, which can be advantageous for accomplishing virtually any
other goal. For instance, if an AI acquires a self-preservation goal, it will resist
attempts to shut it down, creating immediate conflict and potentially leading to a loss
of control if we fail to deactivate it. From that point on, it would be akin to having
13

created a new species, one that is potentially smarter than humans. An AI’s self-
preservation objective could compel it to replicate itself across various computers,
similar to a computer virus, and to seek necessary resources for its preservation, such
as electrical power. It is conceivable that this rogue AI might even attempt to control
or eliminate humans to ensure its survival, especially if it can command robots.
It is therefore critical that we conduct research into methods that can reduce the
possibility of misalignment and prevent loss of control. Spurred by scientific
breakthroughs and the anticipated benefits of more powerful AI systems, both the
scientific community and industry are making rapid progress developing more
powerful AI systems—making research on countermeasures a matter of urgency. The
risk of dangerous power concentration (either at the command of a human or not)
escalates with the expanding abilities of these systems. As they inch closer to
surpassing human capabilities, the potential for significant harm grows

You might also like