From OpenAI to Nvidia, researchers agree: AI agents have a long way to go

Welcome to Eye on AI! AI reporter Sharon Goldman here, filling in for Jeremy Kahn, who is on holiday. In this edition…General Services Administration approves OpenAI, Google, Anthropic for federal AI vendor list…Consequences of AI spending boom on U.S. economy…Clay AI raises $100 million at $3.1 billion valuation.

Only in the Bay Area does spending a Saturday geeking out about AI agents—alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley—feel like a totally normal weekend plan. As I picked up my badge at the day-long Agentic AI Summit and watched the line snake through the student union lobby, it felt less like an academic conference and more like Silicon Valley’s version of a buzzy New York brunch spot.

This was certainly due to the speaker lineup, which was stacked with top AI researchers and scientists, including Jakob Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder at Databricks & Anyscale, as well as a UC Berkeley professor; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.

The popularity also might have been due to the buzzy topic—AI agents, generally defined as an AI-powered system that can complete tasks, mostly autonomously, using other software tools. Think not only suggested a vacation itinerary, but also booking the flight and making the hotel reservation.

But despite the hype, the overall message at the Agentic AI Summit was cautious and grounded: Agents may be the buzziest trend in AI right now, but the tech still has a long way to go, they said. AI agents, unfortunately, aren’t always reliable. They may not remember what came before.

Google DeepMind’s Chi, for example, stressed the gap between what agents can do in curated demos versus what’s still needed in real-world production environments. Pachocki highlighted concerns around the safety, security, and trustworthiness of agentic systems, particularly when they’re integrated into sensitive applications or operate autonomously.

“I still don’t think agents have really lived up to their promise,” said Sherwin Wu, head of engineering at OpenAI API. “Certain more generic cases have worked, but my day-to-day work doesn’t really feel that different with agents.”

Today’s AI agents may still have growing pains, but given the crowded UC Berkeley ballroom, the industry maintains its eye on the prize: AI agents that can reliably operate in the real world. The payoff, they believe, will be well worth the wait.

With that, here’s more AI news.

source

From OpenAI to Nvidia, researchers agree: AI agents have a long way to go

Latest News

S&P 500 will return just 3% a year for the next decade, top strategist warns

Venezuela has the world’s largest proven oil reserves, but it can’t solve for the Strait of Hormuz ‘math problem’

McDonald’s newest $3 value menu is sounding an alarm about America’s K-shaped economy

A gaming CEO asked ChatGPT how to avoid paying a $250 million bonus. It didn’t work

We influence 20 million users and is the number one business and technology news network on the planet

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.