Further notes on the "unaligned A.I." problem
A lot of dust is now being raised by media hype and corporate positioning about A.I., similar to what we saw in the early days of the Internet. Behind all the dust clouds, though, there's an active debate among techies and tech-adjacent types about the "A.I. apocalypse" that may lie in our future.
My previous post has more details, but in short I'm referring to a future in which A.I. systems will be significantly more powerful than they are today: maybe capable of running entire industries, maybe capable of running everything. While these systems could displace most or all humans from the production side of the economy, they could also drive the costs of goods and services so low that anyone, on the strength of savings or a state subsidy, could live a comfortable life. (In other words, the "paradise" depicted in films like WALL-E.) One catch is that these A.I. systems, if built with the same machine-learning design approaches used in modern ChatGPT-type systems, will effectively be advanced non-human intelligences with opaque cognitive processes. It might be as hard to train them to "align" their values with human values as it is now with much more primitive systems, or even harder. That's a problem, because an unaligned A.I. is one that plausibly would have no compunction about doing away with humans just as soon as it could survive without them.
We already know that current A.I.s are capable of pretty weird and unfriendly behavior. We know their mindset is inhuman, and that it is inherently difficult to train them to do useful things while also obeying moral rules. We know we have no robust, foolproof way to instill a "do not harm people" principle in them. It really is believable that one or more of them, once cognitively scaled up and given the opportunity, would try to exterminate some or all of us, as casually as you or I would spray Raid on some ants we had found in the kitchen.
Many A.I. and "A.I. ethics" experts are thinking about this problem now. At least one prominent researcher, Eliezer Yudkowsky, has rather emotionally thrown up his hands in despair (see video above). He will keep thinking about the alignment problem, he says, but for now he has no good solution, and, worse, no confidence in the people who currently control A.I. research.
My own view, fwiw (I'm not an A.I. expert, though I have a technical background), is that the A.I. alignment problem isn't the main problem here.
Alignment should be a soluble technical problem for an A.I. system if its architecture is designed with the need for alignment in mind. A key goal of this design approach would be to ensure that the A.I.'s motives and specific plans are always transparent. It's like putting a speed governor on a car's drive system: a relatively straightforward task, if you have a real-time readout from an accurate speedometer.
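To make the governor analogy slightly more concrete, here is a minimal, purely hypothetical sketch (in Python) of the enforcement half of that design: the system must emit a transparent readout of each plan's expected effects, and a simple hard-coded check refuses to execute anything whose readout is missing or violates policy. The names (PlannedAction, policy_allows, governed_execute) are illustrative assumptions, not any real framework's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlannedAction:
    """A proposed action plus the system's own transparent readout of expected effects."""
    description: str
    declared_effects: List[str]


def policy_allows(action: PlannedAction) -> bool:
    """Stand-in for a hard-coded rule set, e.g. 'no irreversible harm to humans'."""
    forbidden = {"irreversible", "harm_to_humans"}
    return not forbidden.intersection(action.declared_effects)


def governed_execute(action: PlannedAction,
                     execute: Callable[[PlannedAction], None]) -> None:
    """Like a speed governor: run the action only if a readout exists and passes policy."""
    if not action.declared_effects:
        raise RuntimeError("Opaque plan: no readout of expected effects; refusing to act.")
    if not policy_allows(action):
        raise RuntimeError(f"Blocked by governor: {action.description}")
    execute(action)
```

The point of the sketch is only that the enforcement step is trivial once an accurate "speedometer" exists; the genuinely hard engineering problem is making the readout truthful and complete.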
There is a deeper problem, though, one that is general to societies that believe their cultures and technologies should be free to evolve where they will. Put simply: although many technologies have potentially hazardous side effects, in Western societies hardly any of them are regulated so strongly that their hazards are effectively mitigated in every instance of the technology.
In the case of A.I., it should be technically possible, maybe even easy, to align a given system via training or hard-coding, assuming it has the right architecture. The real challenge would be enforcing the alignment of every A.I. system that presents a potential hazard, in order to cut the risk to zero. Even domestic enforcement would be tough, but international enforcement, against bad-actor states like Russia, China, and North Korea, could be impossible without war-like cross-border interventions. And, again, we're not talking about a technical issue of A.I. design. We're talking about the geopolitical issue of being able to control, regulate, and, if needed, destroy other countries' A.I.s.
It's easy to imagine that as A.I. develops in Western countries, domestic regulatory regimes will develop around it, perhaps modeled on the existing regulatory systems covering nuclear reactors and the plutonium and other radioactive byproducts they generate. (The antiterrorism model is probably also applicable.) For the regulation of "foreign A.I.s," the system will probably resemble the modern arms-control and antiproliferation setup.
Modern arms-control and antiproliferation efforts have so far been moderately successful in keeping nukes out of the hands of crazy states. Obviously, they have not been entirely successful: see Iran, Pakistan, and North Korea. Moreover, A.I. could be a lot harder to regulate than nuclear weapons. Nukes require very special materials and engineering knowledge. By contrast, even a future superintelligent A.I. might, in principle, be able to run on consumer-grade hardware that any moderately wealthy Dr. No type could obtain from Amazon.com and assemble undetectably on private property. Most importantly, the hazard from any instance of an advanced A.I. is potentially infinite from the human perspective, whereas the hazard from any single nuclear weapon (or even all of them) is much more limited.
So a plausible scenario is that Western and Western-allied governments will set up A.I. regulatory systems domestically and, to the extent they can, a regulatory/antiproliferation system abroad. Presumably they will also take steps to counter, or survive, specific WMD threats from A.I.s gone bad: threats that could run the full gamut of nightmares, including totally novel pathogens with human-exterminating potential. Despite all this effort, though, it seems unlikely that "the good guys" will be able to mitigate the risk sufficiently within the system of nations that now exists.
On the other hand, as awareness of the risk grows (possibly driven by actual disasters), it should push Western governments to work together more and more tightly, doing whatever they can to extend A.I. regulation, coercively if necessary, to non-compliant individuals and organizations in the West, and to entire non-compliant countries outside it. If the risk is as big, and as hard to mitigate, as I suspect, then the end result could effectively be a single, highly intrusive, all-surveilling World Government. Obviously, the risks from other hazardous techs will tend to drive things in the same direction. Even if the geopolitical changes don't run all the way to that drastic outcome, people ultimately will be forced to recognize that the West's naïve belief in "freedom" was always going to lead it toward a Leviathan-like unfree state.
***