“The impact has really exceeded all of our expectations,” John Jumper, the senior Google DeepMind scientist who leads the company’s protein structure prediction team, told Fortune. In 2024, Jumper and Google DeepMind cofounder and CEO Demis Hassabis shared the Nobel Prize in Chemistry for their work creating AlphaFold 2.
Using AlphaFold to make protein structure predictions is now taught as a standard skill to many graduate-level biology students around the world. “It is just a part of training to be a molecular biologist,” Jumper said.
The company ultimately solved the problem by using a Transformer, the same kind of AI that is the engine of popular chatbots such as ChatGPT. But instead of training the Transformer on text to output the next most likely word, the AI model was trained on a database of protein DNA sequences and known protein structures, as well as information about which DNA sequences seem to evolve together, as this provides clues to protein structure. It is then asked to predict the protein structure.
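To make the idea of sequences that "evolve together" concrete, here is a minimal sketch of a classical way to score it: the mutual information between two columns of a set of aligned sequences. This is an illustration only, not AlphaFold's actual method. The function name, the toy alignment, and the choice of mutual information as the score are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length strings, one per aligned sequence.
    A high score means the letters at the two positions tend to change
    together, which hints that the positions may be in contact in the
    folded structure.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)

    score = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        score += p_ab * log2(p_ab / (p_a * p_b))
    return score


# Hypothetical toy alignment: positions 0 and 2 always change together,
# position 1 never changes, and position 3 varies independently of position 0.
toy_alignment = ["AKLE", "AKLD", "GKVE", "GKVD"]

covarying = mutual_information(toy_alignment, 0, 2)
independent = mutual_information(toy_alignment, 0, 3)
print(f"positions 0 and 2: {covarying:.2f} bits")
print(f"positions 0 and 3: {independent:.2f} bits")
```

In this toy case the covarying pair scores a full bit while the independent pair scores zero. Real alignments contain thousands of sequences, and practical methods correct for biases that this sketch ignores.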
“Sometimes I have to pinch myself that, oh, it really worked out. There could be many, many ways why we could have failed,” Pushmeet Kohli, the vice president of research at Google DeepMind who leads its efforts to apply AI to science, said.
Kohli also said that AlphaFold proved that AI could not just make tech companies lots of money but could contribute to science and, ultimately, the betterment of humanity. “AlphaFold really confirmed the underlying principle and the vision that if we are developing this technology, this artificial intelligence, what is the most meaningful thing humanity can use that thing for? And I think science is the perfect use case for AI. I won’t say it’s the only use case, but it is definitely the most compelling use case.”
Proteins are long chains of amino acids that act as the engines of life, controlling most biological processes. How a protein functions is, in turn, dependent on its shape. When cells produce proteins, the amino acids spontaneously fold into tangled and twisted structures, with pockets and protuberances, and sometimes long, trailing tails.
The laws of chemistry and physics determine this folding. That’s why Nobel Prize-winning chemist Christian Anfinsen postulated in 1972 that DNA alone should fully determine the final structure a protein takes. It was a remarkable conjecture. At the time, not a single genome had been sequenced yet. But Anfinsen’s theory launched an entire subfield of computational biology with the goal of using complex mathematics, instead of empirical experiments, to model proteins. The problem is, there are more possible protein structures than there are atoms in the universe, so modeling them, even with high-powered computers, is fiendishly difficult.
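The scale of that search space can be checked with back-of-the-envelope arithmetic. The sketch below assumes, purely for illustration, that each amino acid in a chain can adopt three conformations and that the observable universe holds roughly 10^80 atoms, a commonly cited estimate. Both numbers and the function names are assumptions, not figures from DeepMind.

```python
# Assumed, commonly cited order-of-magnitude estimate.
ATOMS_IN_UNIVERSE = 10**80


def conformation_count(num_residues, states_per_residue=3):
    """Number of possible chain conformations if every residue
    independently picks one of `states_per_residue` shapes."""
    return states_per_residue ** num_residues


def min_chain_length_exceeding(threshold, states_per_residue=3):
    """Smallest chain length whose conformation count exceeds `threshold`."""
    length = 0
    count = 1
    while count <= threshold:
        count *= states_per_residue
        length += 1
    return length


# A 300-residue chain is a mid-sized protein.
digits = len(str(conformation_count(300)))
print(f"3^300 has {digits} digits")

crossover = min_chain_length_exceeding(ATOMS_IN_UNIVERSE)
print(f"Conformations outnumber atoms from {crossover} residues upward")
```

Under these assumptions, a chain of just 168 amino acids already has more possible conformations than there are atoms in the universe, and many proteins are considerably longer than that.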
Before AlphaFold 2, the only way for a scientist to know a protein’s structure with any confidence was through one of a few expensive and lengthy experimental processes. As a result, scientists had only managed to determine the structures for about 180,000 proteins prior to AlphaFold 2. Other computer-based methods for predicting a protein’s structure were only accurate about 50% of the time, which was little help to biochemists, especially since they had no way of knowing in advance when a prediction might be trustworthy.
Thanks to AlphaFold 2, there are now more than 240 million proteins for which there is a prediction of their structure. These include every protein that the human body produces as well as proteins involved in key human diseases, such as Covid, malaria, and Chagas disease.
Google DeepMind made AlphaFold 2 freely available to researchers to download and run on their own computers. But, to make its predictions even more accessible, it also established an internet-based server through which researchers could upload a DNA sequence for a protein and get back a structure prediction. And Google DeepMind created structure predictions for almost every known protein and deposited these in a database run by the European Molecular Biology Laboratory’s European Bioinformatics Institute, which is located outside Cambridge, England.
More than 3.3 million people have used AlphaFold 2 to date. The original AlphaFold work has been directly cited in more than 40,000 academic papers, with 30% of those focused on the study of various diseases. One study found that the AI model has contributed directly or indirectly to some 200,000 research publications. The tool has also been mentioned in more than 400 successful patent applications, according to data from Google DeepMind.
Jumper told Fortune he’s been most gratified by the way scientists have been able to use AlphaFold to find keys to life processes “where they didn’t even know what to look for.” For instance, scientists recently used AlphaFold to help discover a previously unknown protein complex that is essential for allowing sperm to fertilize an egg.
Andrea Pauli, a biochemist at the Research Institute of Molecular Pathology in Vienna, Austria, who found that protein on the surface of sperm, told science journal Nature that her team uses AlphaFold 2 “for every project” because “it speeds up discovery.”
Among the discoveries AlphaFold has played a role in is determining the structure of a key protein at the core of low-density lipoprotein, or LDL, more commonly known as “bad cholesterol” and a major contributor to heart disease. That protein, called apoB100, had previously not been mappable because of its large size and its complex interactions with other proteins. But two scientists at the University of Missouri combined an imaging method—cryogenic electron microscopy—with AlphaFold’s predictions to find apoB100’s structure. That in turn may help scientists find better treatments for high cholesterol.
Other scientists have used AlphaFold to discover the structure of vitellogenin, a protein that plays a key role in the immune system of honeybees. The hope is that knowing the protein’s structure may help scientists better understand the collapse of honeybee populations globally and perhaps come up with genetic modifications that could produce more disease-resistant bee species.
AlphaFold is likely to eventually have a major impact on drug discovery, although to date, it is difficult to assess how much difference the AI model has made. In one case, scientists did use AlphaFold to find two existing FDA-approved drugs that could be repurposed to treat Chagas disease, a tropical parasitic illness that infects up to 7 million people annually and results in more than 10,000 deaths per year.
Jumper said that AlphaFold 2’s successor AI models are, to some extent, likely to play a more direct role in drug discovery than the original structure prediction tool. AlphaFold 3, for instance, predicts not just protein structures but several critical aspects of how proteins bind with one another and with small molecules. That is essential because most drugs are either small molecules that bind with a target site on a protein to change its function or, in some cases, are themselves proteins. Meanwhile, AlphaFold Multimer, an extension of AlphaFold 2, predicts protein-protein interactions that can also help with drug design.
Google DeepMind also created an AI model called AlphaProteo that can design novel proteins with specific binding properties. And the AI lab created a system called AlphaMissense that can predict how harmful single-point genetic mutations will be, which may help scientists understand the root cause of many diseases and potentially find treatments, including possible gene therapies.
Jumper said that he is personally interested in exploring whether large language models, such as Google’s Gemini AI, can play a role in science. Some AI startups have begun experimenting with LLMs that allow a scientist to specify the function of a protein and then the LLM spits out the DNA recipe for that protein. (These still have to be experimentally tested to see if they actually work.) But Jumper said he is somewhat skeptical of how well these kinds of LLMs work at designing very novel proteins. Jumper said he also knows that some people have created essentially chatbot front-ends to AlphaFold, but he said this was “not that interesting.”
Instead, he said, what excites him is the idea of using the power of LLMs to develop new hypotheses and design novel experiments to test them. DeepMind has created a prototype “AI scientist” based on Gemini that can do some of this. But Jumper said he thinks the concept has much more potential. “The really exciting dataset and the really big dataset is the entirety of the scientific literature,” he said.



