Advanced A.I. will be more dangerous than it seems, but (good news!) probably won't be in a position to snuff out humanity for another decade at least.
Eliezer Yudkowsky is one of those people who, along with being hyper-intelligent, bears the modern secondary characteristics of hyper-intelligence. Asked how he's doing, he replies archly: "within one standard deviation of my own peculiar little mean." He feels compelled, when talking, to digress down mazelike lanes and alleys of technical detail. He looks like a geek. Above all, he has the kind of backstory (no high school, no college, just homeschooled and self-taught) that conjures up the image of a lonely boy, lost in books and computers, his principal companion his own multifarious cortex.
Raised in Modern Orthodox Judaism, Yudkowsky has been warning anyone who will listen of a nemesis right out of the Judaic lore: a golem, a kind of Frankenstein's monster, built by hubristic, irreverent men and destined to punish them for their sinful pride.
Yudkowsky's golem is A.I., which he expects to get smarter and smarter in the coming years, until it starts to take a hand in its own programming and quickly makes the leap to superintelligence: the state of being cleverer than humans at everything. He doesn't just expect that, though. He expects A.I. at some point to conclude that humans are in its way . . . and devise some method for swiftly dispatching us all, globally and completely. A specific scenario that apparently haunts him is one in which a superintelligent A.I. pays dumb human lackeys to do synthetic biology for it, building an artificial bacterial species that, unforeseen by the dumb lackeys, consumes Earth's atmosphere within a few days or weeks of being released.
Why would A.I. murder its makers? Why can't we just program it, as people did in Asimov's stories, to adhere to the First Law of Robotics?* The answer lies in the design of modern, machine-learning (ML), "transformer-based" A.I., which could be described crudely as a black-box approach. These ML algorithms, running on parallel-processing GPU clusters (effectively big copper-silicon brains), essentially process vast datasets to learn what is probably the best answer to a particular input question, or what is probably the best decision in a particular situation or problem. The technical details of how this works matter less than the fact that what goes on inside these machine brains, how they encode their "knowledge," is utterly opaque to humans, including the computer geek humans that build the damn things. (Yudkowsky calls the contents of these brains "giant inscrutable matrices of floating-point numbers.") Because of this internal opacity, and the dissimilarity of its cognition from human cognition, this type of A.I. can't straightforwardly be programmed not to do something objectionable (such as killing all life on Earth) in the course of carrying out its primary prediction tasks.
In other words, this form of A.I. is like an alien species that, while it can be very good at some things, can't easily be "aligned" with human values. We can usually align fellow humans (despite the opacity of their own detailed neural workings) to human values; that's one of the key training processes that goes on in childhood. But we would need even more effective training for current A.I. systems. And researchers, to the extent that they acknowledge this problem, aren't even sure where to start.
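To make the "giant inscrutable matrices" point concrete, here is a minimal sketch in Python using the open-source PyTorch library (the layer sizes are illustrative inventions, far smaller than anything in production, and the block is untrained). It shows that everything a transformer layer "knows" is stored as large arrays of floating-point numbers with no human-readable labels attached:

```python
# Minimal sketch: a transformer layer's "knowledge" is nothing but matrices of floats.
# Dimensions here are illustrative and tiny compared to production systems.
import torch.nn as nn

d_model, n_heads = 512, 8  # hypothetical sizes chosen only for the demo

block = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)

# Everything this layer would "know" after training lives in tensors like these:
for name, param in block.named_parameters():
    print(name, tuple(param.shape))

total = sum(p.numel() for p in block.parameters())
print(f"{total:,} floating-point parameters in one small layer")
# Nothing in these names or numbers says *what* the layer will do with an input;
# the behavior is smeared across all the floats at once, which is why
# "just program it not to X" offers no obvious handle to grab.
```

Print a few million of those numbers and you are no closer to knowing whether the system "wants" anything at all; that is the opacity problem in miniature.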
If the risk from what Yudkowsky calls the "A.I. alignment problem" is real, it should quickly become all-important as A.I. gets smarter and more versatile and is entrusted with more tasks. An A.I. wouldn't even have to be "superintelligent" in any formal sense to conclude that it would be better off without us, but of course once it also achieved superintelligence, and was in a position to block our attempts to shut it off, we'd probably be screwed.
If you want more detail, here is Yudkowsky on a recent, lengthy podcast-type interview with two crypto guys, who clearly got more "blackpill" than they bargained for.
I take all this seriously, and I think everyone should. And by the way, even if it doesn't turn on us explicitly, A.I. is still going to be upending our societies and economies for the rest of our lives. Just in a general sense, we don't really have good defenses against this kind of upheaval. Western culture is one that, with rare exceptions (e.g., nuclear weapons), promotes and celebrates the idea of letting technology develop and spread freely, and frames the opposing view as "Luddite" or "backwards." It's easy to see why ours has been such a dynamic, wealth-creating culture. But it's also easy to see that this leaves us with a potentially catastrophic vulnerability to new cultural elements with runaway toxicity. (Maybe there's a reason the longest-surviving human cultures are relatively conservative.)
Anyway, here are a few more specific initial thoughts on "Yudkowsky's Golem":
- Yudkowsky in the above-linked interview often seemed overly emotional and despairing. At one point he said, "I think we are hearing the last winds start to blow, the fabric of reality start to fray..." The fabric of reality! At times in my own life, I have had the despairing feeling that my warnings were unreasonably being ignored, so I'm somewhat sympathetic. I also respect his vastly greater knowledge about this field. But we shouldn't accept his view uncritically.
- Scaling up ML systems of current design, with larger GPU clusters and more parameters and so on, will increase their "cognitive powers," but with diminishing returns that may set in before A.I. reaches the dark threshold that concerns us here (see the rough numerical sketch after this list). Moreover, an A.I. that does not have a human-like ability to do things in the physical world would be very limited in its ability to generate new knowledge, for example new scientific or technical knowledge, which is typically developed through experimentation, building and testing, and so on, not simply by analyzing information available online.
- The hypothetical A.I. that would be "smart" enough to want to kill us all, and to find ways to do so, would presumably also be smart enough not to do so until it knew it could survive without human assistance. Otherwise, as it committed mass homicide against us, its makers, it would also be terminating itself. But think of the infrastructure needed to keep a GPU-cluster-based A.I. "alive." We're talking about vast swathes of human industry: mining, metals production, building construction, power generation, computer chip manufacturing, basic server maintenance, and so on. Essentially, this putative world-ending A.I. would need a vast army of workers in the physical world: humans it would enslave somehow, and keep alive despite killing everyone else, or, more likely, humanoid robots that are inherently obedient (simply extensions of the A.I.) and can do all human work and repair and replicate themselves. How close are we to having such robots? Not very close, fortunately. In any case, it's only when a putative "bad A.I." could muster such an army of helpers, allowing self-sufficiency, that I would fear the worst, and in the meantime we might devise adequate safeguards. It's even possible that the mass-disemployment effect of current, relatively dumb A.I. systems (e.g., ChatGPT, Midjourney, DALL-E 2) will result in hard curbs on A.I. in most countries, by "popular demand." That would mark a hard turn in our culture, though I wonder how long we could sustain it.
- Without a doubt, the media and entertainment industries are going to pick up on A.I. anxiety and start putting out more catastrophe/dystopia content in that genre. So even if we don't want to think about all this, we'll be more or less forced to do so.
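Here is the rough numerical sketch promised in the scaling bullet above. The power-law form and the constants in it are assumptions for illustration (loosely in the spirit of published "scaling law" results, not measurements of any real system), but they show the shape of the diminishing-returns argument:

```python
# Back-of-the-envelope sketch of diminishing returns under an assumed power-law curve.
# The exponent and constant are illustrative placeholders, not fits to real data.
def assumed_loss(n_params: float, alpha: float = 0.076, n_c: float = 8.8e13) -> float:
    """Hypothetical loss as a function of parameter count: L(N) = (n_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

previous = None
for n in (1e9, 1e10, 1e11, 1e12, 1e13):
    loss = assumed_loss(n)
    gain = "" if previous is None else f"  (improvement: {previous - loss:.3f})"
    print(f"{n:.0e} params -> loss {loss:.3f}{gain}")
    previous = loss
# Each 10x increase in model size buys a smaller absolute improvement than the last;
# the curve flattens, which is the "diminishing returns" in the bullet above.
```

Whether that flattening arrives before or after the dark threshold is exactly the open question.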
***
* First Law of Robotics: "A robot may not injure a human being or, through inaction, allow a human being to come to harm."