Hello and welcome to Eye on AI. In this edition…Meta hires Scale AI founder for new ‘superintelligence’ drive…OpenAI on track for $10 billion in annual recurring revenue…Study says ‘reasoning models’ can’t really reason.
Frontier AI models may now be reaching the point where they can automate algorithmic research and development, Davidad says. “The idea is, let’s take that capability and turn it to narrow AI R&D,” he tells me. Narrow AI usually refers to AI systems that are designed to perform one particular, narrowly defined task at superhuman levels, rather than an AI system that can perform many different kinds of tasks.
The challenge, even with these narrow AI systems, is then coming up with mathematical proofs to guarantee that their outputs will always meet the required technical specification. There’s an entire field known as “formal verification” that involves mathematically proving that software will always provide valid outputs under given conditions—but it’s notoriously difficult to apply to neural network-based AI systems. “Verifying even a narrow AI system is something that’s very labor intensive in terms of a cognitive effort required,” Davidad says. “And so it hasn’t been worthwhile historically to do that work of verifying except for really, really specialized applications like passenger aviation autopilots or nuclear power plant control.”
This kind of formally verified software won’t fail because a bug causes an erroneous output. It can sometimes break down when it encounters conditions that fall outside its design specifications: for instance, a load-balancing algorithm for an electrical grid might not be able to handle an extreme solar storm that shorts out all of the grid’s transformers simultaneously. But even then, the software is usually designed to “fail safe” and revert to manual control.
ARIA is hoping to show that frontier AI models can be used to do the laborious formal verification of the narrow AI controller as well as develop the controller in the first place.
It’s not clear if this plan will work. For every transformational DARPA project, many more fail. But ARIA’s bold bet here looks like one worth watching.
With that, here’s more AI news.