'AI' just means LLMs now | Superintelligence 1
There used to be many possible candidates for how humans might build intelligent systems. Now there's only one.
This article is the first part of Superintelligence, a series intended to work out how to develop smarter-than-human AI systems from first principles.
Since the industrial revolution, humans have dreamed of building intelligent machines.
In the 1800s, Charles Babbage and Ada Lovelace designed the first general-purpose computer and wrote the first ever computer algorithm. They also speculated about whether computers might one day be creative and compose their own music or art.
Fast forward to the twentieth century, and lots of people started thinking about what it could look like to build intelligence artificially. The most famous early such thinker is Alan Turing, father of the Turing Test and a pioneer of the theory of computation.
Even as AI spent several decades as a marginalized topic for scientific study, the idea of AI blossomed in science fiction. It has played a central role in movies (2001: A Space Odyssey, Blade Runner, The Terminator, The Matrix, Ex Machina) and TV (Star Trek, Knight Rider, The Twilight Zone, Westworld, Black Mirror). It’s safe to assume that anyone who’s watched Western media over the last fifty years knows what AI is.
The Form of Superintelligence
And through science fiction we’ve seen AI take many forms. I don’t think there’s a single definition of exactly what an AI should look like. Perhaps in defining AI we can take a cue from the 1964 U.S. Supreme Court’s famous line about obscenity: “I know it when I see it.”
Things get even murkier when we consider superintelligence. Are all these AIs superintelligent, meaning smarter than humans? C-3PO certainly was; that guy knew everything. But many media portrayals treat AI as human-level, not smarter.
For the purpose of this series of posts, I’m going to define ‘superintelligence’ as whatever we saw in Iron Man (Jarvis) and Star Wars (C-3PO).
In other words, we’ll take the reductive position that superintelligence is simply a helpful machine that’s as capable as, and more knowledgeable than, a human.
Do we have this yet? Unfortunately, no. The newest, smartest models from OpenAI are very good (they can code like professional programmers and have achieved gold-medal performance on the International Math Olympiad), but they’re not yet considered human-level in many areas. (This skill profile is what some have described as jagged intelligence.)
Biological Candidates for Superintelligence
For a long time we didn’t know how to build Jarvis. We had no idea what superintelligence would look like. Many suspected that some kind of connectionism might be the path, but no one knew how to scale it.
In fact, lots of early AI research was inspired by the human brain. The most popular arguments, such as Nick Bostrom’s original case for superintelligence, start from the fact that the human brain is an existence proof that (non-super) intelligence can be built at all. So you can make arguments like this one:
The human brain contains about 10^11 neurons. Each neuron has about 5*10^3 synapses, and signals are transmitted along these synapses at an average frequency of about 10^2 Hz. Each signal contains, say, 5 bits. This equals 10^17 ops. (…)
and its ultimate conclusion:
Depending on degree of optimization assumed, human-level intelligence probably requires between 10^14 and 10^17 ops.
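To make that back-of-the-envelope arithmetic explicit, here’s a quick sketch in Python of the calculation behind the estimate (the variable names are my own; the ‘ops’ here are just bits transmitted per second under the quoted assumptions):

```python
# Bostrom-style estimate of the brain's processing, using the figures quoted above.
neurons = 1e11              # neurons in the human brain
synapses_per_neuron = 5e3   # synapses per neuron
signal_rate_hz = 1e2        # average transmission frequency per synapse
bits_per_signal = 5         # assumed information content per signal

ops_per_second = neurons * synapses_per_neuron * signal_rate_hz * bits_per_signal
print(f"{ops_per_second:.1e}")  # 2.5e+17, i.e. on the order of 10^17 ops
```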
Although I deeply admire the author’s conviction (and prescience, since these estimates were published in 1998), it seems flawed to equate the operations performed by a human brain with the simple ‘ops’ (such as floating-point operations) that happen inside a computer. (I think this is the basic point Roger Penrose made when he hypothesized that quantum mechanics plays an important role in the emergence of human consciousness in the brain.)
This is all worsened by the fact that the systems that ended up getting us closest to superintelligence are called neural networks, which is simultaneously an excellent name and a dreadful misnomer. They’re systems of interconnected, dynamically updated components, but they work nothing like biological neurons.
Anyway, we spent a long time making these kinds of biological analogies, and they didn’t get us far. What did work, machine learning and neural networks, was at best very loosely inspired by the functioning of the human brain.
Digital Candidates for Superintelligence
Since 2012 (around when AlexNet was introduced), every year has gotten us closer to building Jarvis or C-3PO. We developed better machine learning and neural networks, got good at training them to mimic human text, and now we’re making steady progress on incentivizing them to grow smarter than humans in some areas. This is really exceptional progress.
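To be concrete about what “training them to mimic human text” means, here’s a minimal toy sketch of next-token prediction in PyTorch (my own illustration with a tiny recurrent model and a made-up corpus, not how any production LLM is actually built or trained):

```python
# A toy next-token predictor: the model learns to guess each next byte of a corpus.
import torch
import torch.nn as nn

VOCAB = 256  # treat raw bytes as the vocabulary

class TinyLM(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, VOCAB)

    def forward(self, tokens):                 # tokens: (batch, length)
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)               # logits for the next token at each position

model = TinyLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# "Reading text": every training example is just (text so far -> next token).
text = b"All we have is language models. " * 8
data = torch.tensor(list(text), dtype=torch.long).unsqueeze(0)
inputs, targets = data[:, :-1], data[:, 1:]

for step in range(100):                        # one tiny "pretraining" run
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final next-token loss: {loss.item():.3f}")
```

The only training signal here is “predict the next token,” applied over and over; the real systems do the same thing, just with vastly bigger models and vastly more text.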
As mentioned, the best systems that we have right now are already very useful. They know how to code, give directions, and write recipes. They’re decent therapists and life coaches. They’re coming around in the creativity department, too, and writing better prose and poetry each year.
One might expect, then, that we as a field would have several candidate systems for superintelligent AI: perhaps a really good video simulator, a few companies with embodied, continually learning robots, and a speech system out there that also exhibits Jarvis-like capabilities.
But this turns out not to be the case. There is only one existing technology that’s close to Jarvis: large language models. We haven’t built models that get smarter by exploring the world or watching lots of movies. We’ve only built models that get smarter by reading lots of text (say, all the words on the Internet) and then marginally smarter after doing a few thousand math problems.
All we have is language models.