It was a process that had become routine for Maxim Topaz.
The associate professor at Columbia University’s School of Nursing had grown accustomed to having artificial intelligence tools help polish scientific papers for grammar, formatting, and other details. But a few weeks after submitting his latest research, the academic journal he was due to publish in came back with questions about a reference. The AI tool Topaz had used had silently inserted a fabricated source into his work.
“I felt deeply embarrassed,” Topaz, who leads a team at Columbia developing AI applications in healthcare, told Fortune.
“I’m an AI researcher. I know about hallucinations,” he said. “If this is happening to me, an AI expert, what happens to other people?”
That near-miss sent Topaz on an investigation to find out how often experts were getting subtly fooled by AI. The answer, it turns out, is a lot.
“It’s very reasonable that AI is highly associated with [fabricated references] now,” he said.
Over the past three years, the rate of fabricated references in biomedical literature has grown more than 12-fold. In 2023, one in 2,828 papers contained at least one fake reference, a rate that had risen to one in 458 by 2025. Over the first seven weeks of 2026, the researchers found, one in 277 papers had at least one non-existent reference.
“I’m thinking this is just the tip of the iceberg,” Topaz said.
Hallucinations happen when an AI model prioritizes word patterns over accuracy. They are often harmless, but the stakes change when AI errors begin infiltrating academic literature, where they risk undermining the scientific process.
Medicine is a field that builds on itself. Clinical trials cite earlier studies; systematic reviews then aggregate those trials, and medical guidelines finally cite those reviews. Doctors and nurses rely on those guidelines when they decide how to treat patients. A fabricated study planted at the start of that process doesn’t stay there.
“This is the evidence chain, that’s how we care for and treat people. If you put the fictional study at the bottom of the stack, the whole structure inherits it,” Topaz said.
“We’ve already seen paper mill articles included in systematic reviews informing clinical guidelines,” he added. “When a guideline paper cites a paper with a partially fictional references list, the evidence-based chain for treatment decisions is compromised.”
The book carried blurbs from prominent journalists, including Nicholas Thompson, The Atlantic’s chief executive, and a foreword by Maria Ressa, the Nobel Peace Prize–winning reporter from the Philippines. It arrived, according to the Times, “to great fanfare.”
Rosenbaum’s book contained more than a half-dozen misattributed or entirely invented quotes, apparently generated by AI tools he had disclosed using in his acknowledgments. In a statement to the Times, Rosenbaum recognized the errors, calling the episode “a warning about the risks of AI-assisted research and verification.”
The vast majority of papers tracked by Topaz contained only one or two fabricated citations among the several dozen references a typical academic study includes, suggesting that most cases of AI hallucination in research are unintentional.
Topaz said AI itself is not necessarily the villain, and he gladly uses it in his own work. “The problem is unverified AI output entering the permanent record,” he said. “The fix is not to stop using the tools, it’s to build verification into the workflow.”
“The longer we wait to put verifications in place, the harder it becomes to clean up,” he added.
AI hallucinations don’t care how well-versed in a subject users are. The mistakes look real, and they’re getting better at hiding. The more consequential the field, whether medicine, law, or journalism, the more dangerous errors become when they aren’t caught.



