AGI has long been the ultimate goal of many artificial intelligence researchers. That’s been the case even though there is no universally accepted definition of the term. It generally means AI that is as intelligent as humans, but there is a fierce debate over exactly how to define and measure “intelligence.”
In this case, Fridman had offered Huang a very unusual metric for AGI: Could AI start and grow a technology business to the point where it was worth $1 billion? Fridman asked if Huang thought AGI by this definition could be achieved within the next five to 20 years. Huang said he didn’t think that amount of time was necessary. “I think it’s now. I think we’ve achieved AGI,” he said. He then hedged, noting the company didn’t necessarily have to remain that valuable. “You said a billion,” Huang told Fridman, “and you didn’t say forever.”
Few AI researchers agree with the definition of AGI that Fridman offered Huang, which was more specific (a company worth $1 billion) but also narrower than most AGI definitions, which tend to refer to matching a vast range of human cognitive skills, not all of which might be needed to build a successful business. But AI researchers also disagree with one another over what a better definition should be. The term remains stubbornly amorphous despite the fact that several leading AI companies, with collective market valuations of more than $1 trillion, say that AGI is what they are racing towards. Some computer scientists avoid using the term at all precisely because they say it is perpetually undefined and unmeasurable. Others say tech companies like using the term for completely cynical reasons: because it is ill-defined, it’s easy for companies to build hype by claiming big strides towards achieving the fabled milestone.
The buzz over Huang’s AGI remarks only serves to highlight this quandary at the heart of the AI boom.
The taxonomy identifies 10 key cognitive faculties—including perception, reasoning, memory, learning, attention, and social cognition—that the researchers argue are essential for general intelligence. The framework then proposes evaluating AI systems across all 10 faculties and comparing their performance to a representative sample of human adults with at least the equivalent of a secondary education.
The paper’s key insight is that today’s AI models have a “jagged” cognitive profile: They may exceed most humans in some areas, like mathematics or factual recall, while dramatically trailing even average people in others, like learning from experience, maintaining long-term memories, or understanding social situations. An AI model would need to at least match median human performance across all 10 areas to be considered AGI, the Google DeepMind researchers suggest.
The researchers also announced a contest with a $200,000 prize pool on the popular machine learning competition site Kaggle for outside researchers to help build evaluations for the five cognitive faculties where existing benchmark tests are weakest.
Taken together, these new benchmarks represent a growing effort within the AI research community to replace vague definitions of AGI with something closer to scientific measurement. But as these researchers are the first to admit, the difficulty of defining intelligence is as old as the study of thinking itself, and it has plagued artificial intelligence as a field from its very earliest days.
In 1950, before the term “artificial intelligence” had even been coined and when mathematicians and electrical engineers were just starting to build the first modern computers, the famed British mathematician and computer pioneer Alan Turing wrestled with the fact that it was extremely difficult to formulate a definition of intelligence.
Rather than attempting one, Turing proposed an assessment he called “the Imitation Game,” which later became better known as the Turing Test. It stipulated that a machine should be considered intelligent when it can hold a general conversation with a person, via text, and a second human judge, reading the exchange, cannot reliably determine which participant is the machine and which the human. It was, in essence, an “I’ll know it when I see it” approach to intelligence.
But the Turing Test soon proved problematic too. Eliza, a chatbot developed at MIT in the mid-1960s, was designed to mimic a psychotherapist. Most of its responses followed hard-coded logical rules; Eliza often answered users with questions such as “Why do you think that is?” or “Tell me more” to cover up its weak language understanding. And yet Eliza fooled some people into believing it understood them. Eliza came close to passing the Turing Test even though on almost every other measure it came nowhere close to human cognitive abilities. And, in fact, a more sophisticated chatbot called “Eugene Goostman” officially passed a live Turing Test competition in 2014, again without touching most human cognitive skills.
Although today’s large language models converse far more fluently than Eliza ever could, they still cannot match humans across the full spectrum of cognitive abilities: they hallucinate facts, struggle with long-horizon planning, and cannot learn from experience the way a person does.
Compared to the Turing Test, the term “artificial general intelligence” is a relatively recent one. It was coined in 1997 by Mark Gubrud, then a graduate student at the University of Maryland, who used the neologism in a paper he presented at a conference on nanotechnology. He used the phrase “advanced artificial general intelligence” to describe AI systems that could “rival or surpass the human brain in complexity and speed, that can acquire, manipulate, and reason with general knowledge, and that are usable in essentially any phase of operations where a human intelligence would otherwise be needed.” But the paper quickly vanished into obscurity.
Then, in the early 2000s, Legg—who would go on to cofound DeepMind—independently coined the same term. He was collaborating with computer scientists Ben Goertzel, Cassio Pennachin, and others on a book about potential ways to create machine learning systems that would be able to address a wide range of problems and tasks. They wanted a term that would distinguish the ambition of these systems from the narrow machine learning algorithms then in vogue, which, once trained, could only tackle a single, narrow task. Goertzel considered calling this more general AI “real AI” or “strong AI,” but Legg suggested “artificial general intelligence” instead, unaware of Gubrud’s earlier usage. He also suggested the term be abbreviated as AGI. This time, AGI took off.
In Goertzel’s book he defined AGI as “AI systems that possess a reasonable degree of self-understanding and autonomous self-control, and have the ability to solve a variety of complex problems in a variety of contexts, and to learn to solve new problems that they didn’t know about at their time of creation.”
The definition was useful for separating work on general AI systems from narrow machine learning ones, but it too contained an unhelpful amount of ambiguity: What did “reasonable degree” mean? Which complex problems in which contexts counted towards the standard?
Legg would later compound this ambiguity by offering a more casual definition of AGI that was in some ways narrower (it didn’t talk about self-understanding, for instance) but equally vague. He told The Atlantic’s Nick Thompson last year, “I define an AGI to be an artificial agent that can do the kinds of cognitive things that people can typically do. I see this as the natural minimum bar.” But which things? And which people?
Questions like these have continued to swirl around AGI. Does the term mean software that matches the cognitive abilities of an average human? Or the abilities of the humans with the highest IQs? Or the best expert in each individual domain of knowledge? The Hendrycks and Bengio research paper, for instance, defines AGI as matching or exceeding “the cognitive versatility and proficiency of a well-educated adult.” The DeepMind paper proposes measuring against a representative sample of adults. Others have used less precise formulations.
Adding to the confusion, AGI is often conflated in public discussion with a concept AI researchers call “artificial superintelligence,” or ASI—an AI that would be smarter than all humans combined. Most AI researchers consider AGI and ASI to be separate milestones, and very different in degree of sophistication, but in the popular imagination the two frequently blur together.
If the academic debate over defining AGI has been long and nuanced, the corporate world has introduced definitions that are, to put it charitably, idiosyncratic. DeepMind became the first company to make the pursuit of “artificial general intelligence” a business goal. Legg put the phrase on the front page of the company’s first business plan when he, Demis Hassabis, and Mustafa Suleyman cofounded the company in 2010.
Five years later, OpenAI also made building AGI its explicit mission. Its original 2015 founding principles said that the new lab—at the time a non-profit—was dedicated to ensuring “that artificial general intelligence benefits all of humanity.” Three years later, when the lab first set up a for-profit arm, it published a charter that defined AGI “as highly autonomous systems that outperform humans at most economically valuable work.” Now, for the first time, AGI was being measured by financial metrics, not mere cognitive ones.
Although OpenAI remains far short of the financial threshold for AGI in its contract with Microsoft, CEO Sam Altman has often made statements that suggest the company is close to achieving the AI milestone as measured by other benchmarks. In a post to his personal blog in January 2025 titled “Reflections,” Altman wrote that OpenAI was “now confident we know how to build AGI as we have traditionally understood it” and that the company was beginning to turn its aim towards superintelligence. In a subsequent essay titled “Three Observations,” he wrote that systems pointing toward AGI were “coming into view.” Yet, at other times, Altman has seemed to acknowledge AGI’s weakness as a concept. Around the same time as his “Reflections” blog post, Altman told a Bloomberg News interviewer that AGI “has become a very sloppy term.”
Microsoft has also chosen to ignore the financial definition of AGI it agreed to with OpenAI when doing so suited the company’s marketing purposes. In March 2023, a team of Microsoft researchers published a 154-page paper about GPT-4 provocatively titled “Sparks of Artificial General Intelligence,” arguing the model could “reasonably be viewed as an early (yet still incomplete) version” of AGI.
The paper was widely criticized for hyping the abilities of GPT-4 for commercial purposes. Even Altman distanced himself, calling GPT-4 “still flawed, still limited.”
The new research and benchmarks from Google DeepMind and the Hendrycks-Bengio team make some progress towards establishing a yardstick for AGI, one rooted in decades of study of human intelligence. And what’s clear is that today’s best AI models still don’t measure up to the breadth and depth of human cognitive abilities.
Huang, the Nvidia CEO, knows this, just as he was no doubt fully aware of the social media frenzy and headlines he would generate by saying AGI had been achieved. We know Huang knows this because later in the same podcast in which he said AGI had been achieved, he also said that the popular OpenClaw AI agents, which can be powered by any of the top AI models from companies such as Anthropic and OpenAI, could never replicate Nvidia. “Now, the odds of 100,000 of those agents building Nvidia is zero percent,” he said.
Huang is not just Nvidia’s CEO. He is also the company’s founder and the person who has run the company for 33 years, piloting it past near-bankruptcy at one point, to see it now worth more than $4 trillion, making it one of the most valuable companies on the planet. In many ways, Huang is a singular genius. But he’s also a very human one. So maybe we need a new standard, not AGI but AJI—artificial Jensen intelligence. When AI reaches that level, the AI boosters on social media who breathlessly amplified Huang’s AGI claim will really have something to get excited about.



