Google’s AI infrastructure boss warned that the company needs to scale up its technology to accommodate a massive influx of users and increasingly complex requests hitting its AI products, and that surge in demand may be a sign that fears of a bubble are overblown.
The concern is serving capacity: Google’s ability to ensure that Gemini and other AI products that depend on Google Cloud keep working well as the number of users querying them skyrockets. That is distinct from compute, the physical infrastructure involved in training AI models.
A Google spokesperson told Fortune that “demand for AI services means we are being asked to provide significantly more computing capacity, which we are driving through efficiency across hardware, software, and model optimizations, in addition to new investments,” pointing to the company’s Ironwood chips as an example of its own hardware driving improvements in computing capacity.
Now, the users are here, said Shay Boloor, chief market strategist at Futurum Equities. But as each company ratchets up its AI offerings, serving capacity is emerging as the next major challenge to tackle.
“We’re entering the stage two of AI where serving capacity matters even more than the compute capacity, because the compute creates the model, but serving capacity determines how widely and how quickly that model can actually reach the users,” he told Fortune.
Yet Google and its competitors are still facing an uphill battle, he added, especially as AI products start to deal with more complex requests, including advanced search queries and video.
“The bottleneck is not ambition, it’s just truly the physical constraints, like the power, the cooling, the networking bandwidth and the time needed to build these energized data center capacities,” he said.
However, the fact that Google is seemingly facing enough demand for its AI infrastructure to push it to double its serving capacity so quickly may be a sign that the gloomy predictions of AI pessimists aren’t entirely accurate, Boloor said.
“This is not like speculative enthusiasm, it’s just unmet demand sitting in backlog,” he said. “If things are slowing down a bit more than a lot of people hope for, it’s because they’re all constrained on the compute and more serving capacity.”