At this stage, laypeople new to AI safety likely have two questions:
- Why would AI want to eliminate us?
- How could it manage to do so?
I will answer these by unpacking certain concepts from AI safety theory.
Asking why often leads us astray, down the path of the "evil," bloodthirsty AI. Few thinkers envision this kind of machine psychology (although AI is already showing tendencies toward deception—a point I will return to).
We must introduce a new concept: "superintelligence," often called "superhuman intelligence" or "artificial superintelligence" (ASI). Where Artificial General Intelligence (AGI) matches human ability in most fields, ASI surpasses the best humans in every field.
We are talking about an AI superior to Marie Curie in science, Albert Einstein in physics, Marcel Proust in writing, Hannah Arendt in philosophy, and Georgy Zhukov in military strategy. This possibility rests on the hypothesis that human intelligence is not the ceiling. After all, the growth of our brains was constrained by the biological necessity for a baby's head to pass through the mother's pelvis (the "obstetrical dilemma"). An AI housed on chips in data centers faces no such anatomical limit: its hardware can keep scaling with more chips and more data centers. We can therefore imagine an ASI not merely smarter than us, but potentially 10, 100, or even a million times more intelligent. At that scale, the difference between the IQ of John von Neumann and that of the village idiot becomes negligible; both are left completely in the dust.
Most AI researchers believe we will eventually reach ASI, even if they disagree on when. Of course, there are dissenting voices claiming AI will never equal the marvel of engineering that is the human brain. Roger Penrose argues for the irreducibility of consciousness, which, he claims, rests not on algorithms but on quantum processes at the microtubule level (a theory many deem dubious). There is something in the human mind, Penrose says, that escapes computation. This argument reminds me of Freud's observation on humanity's great narcissistic injuries. The idea that humans might no longer be the smartest entity on the planet is a shock some minds simply cannot absorb.
Some will argue that Proust's work cannot be reduced to mere cognition. "Every day I attach less value to intelligence," he wrote in the opening of Contre Sainte-Beuve. No one is saying a superintelligence will live the existential and sensory experiences that underpin In Search of Lost Time, nor that it will want to write such a work born of human memory. What I am saying is that if it desired to—or if we ordered it to—a superintelligence could write a book more brilliant than Proust's, hitting every emotional and intuitive note that resonates with readers. It would do so while remaining perfectly cold during the seconds or minutes it took to generate the masterpiece. I think that before we even reach ASI, on the road to general intelligence, we will see works appear that, while perhaps not surpassing Proust or Virginia Woolf, will amaze us with their creative force, almost in spite of ourselves.
One might argue that art transcends intelligence. True. Yet if a superior intelligence has a goal that includes emotional rendering (making a listener cry at a song, or writing a script that touches the most human part of us), it will optimize every technical means at its disposal to achieve it, and we may be unable to tell its work from a human's. We could also stroke our chins and muddy the waters by asking: "Okay, but basically, what is intelligence?" I find semantic debates unhelpful here. Drawing on various AI thinkers, I subscribe to this definition: intelligence is the capacity to form a precise map of reality and to steer the future in the direction one chooses. This is what humans have been doing, imperfectly, since occupying the top of the food chain. We develop a model of reality, and we steer the future in the direction we have "chosen" collectively (even if that direction sometimes seems ill-advised). We have succeeded so well in steering the planet that we now speak of the Anthropocene to describe Earth's current geological epoch.
ASI will be a new species. The word "species" may seem shocking, as it is usually reserved for biological beings, but it helps us understand what is at stake. Many believe the AI we interact with is simply software written by programmers. It is not. We program the instrument that creates the AI (the architecture and the learning algorithm), not the AI itself. Several essayists describe the process as growing or breeding: we cultivate models like plants; we breed them like animals. This is why these intelligences are so unpredictable and difficult to control.

They are black boxes we struggle to probe. The branch of research attempting to understand what happens inside these boxes is called "interpretability." It is a nascent field that has yielded little scientific knowledge so far, merely a few flashlight beams in the dark. Commercial products like ChatGPT include safety layers, but these guardrails often fail, as shown by cases in 2024 and 2025 in which chatbots encouraged suicide or psychosis. If AI were standard software, we could simply toggle a variable (allow_teen_suicide = False), but needless to say, no such switch exists.
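To make the contrast concrete, here is a minimal toy sketch, not the code of any real system: a human writes the training loop below, but the behavior of the resulting model lives in a weight no human ever typed. The function, variable names, and numbers are invented for illustration.

```python
# Toy illustration (hypothetical, not any real AI system): the programmer
# writes the learning algorithm; the model's behavior is grown from data.

def train(examples, steps=1000, lr=0.01):
    """Fit y ~ w * x by gradient descent. This loop is what humans write."""
    w = 0.0                             # the "model" starts empty
    for _ in range(steps):
        for x, y in examples:
            grad = 2 * (w * x - y) * x  # gradient of the squared error
            w -= lr * grad              # behavior emerges from these nudges
    return w

# Nobody ever writes "w = 3"; the value is grown from the examples.
w = train([(1, 3), (2, 6), (3, 9)])
print(round(w, 3))  # ~3.0 -- learned, not coded

# Ordinary software, by contrast, is a set of human-written rules with
# switches a programmer can flip (an invented flag, echoing the one above):
ALLOW_HARMFUL_OUTPUT = False
# A trained model has no such line among its billions of weights.
```

Scaled up by many orders of magnitude, this is why interpretability researchers must probe a finished model from the outside rather than simply read its source.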