
Raising Llamas After the Gold Rush

Generative AI's Impact on Tech Hardware


Nvidia reported its second-quarter 2023 earnings and spectacularly blew past its previous bold guidance. The company is dominating the AI hardware market from silicon to software. While other companies are also jumping into the AI gold rush, it will be difficult for them to catch up to Nvidia. But if we look past the current boom cycle in cloud infrastructure, there is an even larger opportunity for AI hardware at the Edge. And that field is wide open for new entrants.


RCD Advisors' forecast for AI hardware growth was already bullish before OpenAI publicly debuted ChatGPT 3.5 in November 2022. It is tempting to claim that forecast came from some complex proprietary model, but it was really just an extension of a trendline. The logic wasn't all that complicated: it started with the historical growth of computing infrastructure and added an upside guided by previous innovation cycles. Sometimes, back-of-the-envelope forecasts are all that are needed.


But now forecast models are coming unhinged. One financial firm speculates that investments in AI could peak as high as 4% of GDP in 2032, with a large percentage of that investment in hardware. In many ways, the current generative AI hype cycle is reminiscent of the short-lived cryptocurrency bubble of 2018 (which Nvidia was also a part of). At the time, the frenzy in Bitcoin mining generated demand for ASICs and infrastructure equipment. That bubble crashed in 2019 as installed hash-rate capacity caught up with mining demand.


Although there are similarities, Generative AI is not cryptocurrency. The demand for machine learning is not a short-term supply/demand problem. Yes, there are capacity constraints at TSMC for Nvidia's latest Hopper processor (both for the 4N wafer fab and the CoWoS advanced packaging). Yes, Nvidia is capitalizing on its leading position by more than doubling the price of Hopper processors (~$25k) compared to previous generations. So it is possible that the boom fades when supply and competition catch up with demand.


However, there is also a fundamental "not good enough" problem. Larger models (175Bn parameters and more) trained on larger data sets (trillions of tokens) improve the quality of the generated response. The hardware costs required to train these models scale with both the model's size and the size of the data set. If those generated responses can improve productivity, more hardware will be needed as models and datasets grow. That is the narrative Nvidia and many other hopeful AI processor suppliers are selling.
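To make that proportionality concrete, the sketch below uses the widely cited rule of thumb that training compute is roughly 6 × parameters × tokens in FLOPs. The rule of thumb, the accelerator throughput, and the utilization figure are our illustrative assumptions, not numbers from this post.

```python
# Back-of-the-envelope training cost using the common ~6 * N * D FLOPs rule of
# thumb (an assumption for illustration, not a figure from this post).

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6.0 * params * tokens

def gpu_hours(flops: float, gpu_flops_per_s: float = 1.0e15, utilization: float = 0.4) -> float:
    """Convert FLOPs into GPU-hours for a hypothetical ~1 PFLOP/s accelerator."""
    return flops / (gpu_flops_per_s * utilization) / 3600.0

if __name__ == "__main__":
    # A 175Bn-parameter model trained on 2 trillion tokens (illustrative only).
    flops = training_flops(175e9, 2e12)
    print(f"~{flops:.2e} training FLOPs, ~{gpu_hours(flops):,.0f} GPU-hours")
```

Double both the parameter count and the token count and the compute bill roughly quadruples, which is why the "bigger is better" narrative maps so directly onto hardware demand.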


But there may be a twist to this narrative. Like anything else, AI is resource-constrained. Once the initial gold rush is over, the trajectory for performance will likely shift away from ever-larger models. AI computing engineers have always focused on improving hardware efficiency through various forms of parallelism. Based on the short nine-month history of LLM evolution, we already see a shift toward training smaller, more efficient models. And if that is the trajectory, then the AI hardware growth driver will shift away from the data center to the Edge.



The Edge Trajectory


In 2018, Peter Levine, a general partner at the venture capital firm Andreessen Horowitz, outlined a vision for machine learning at the Edge. In his blog post, he describes how, over the last 60 years, each computing generation has swung from centralized to distributed and back, like the pendulum of a grandfather clock. The prediction was that over the next decade, the total available market (TAM) for computing would grow beyond the number of potential users (a world population of 7.8Bn) to the installed base of machines and sensor nodes (hundreds of billions).


Indeed, the multiplier for Tech hardware content on Edge computing was massive. But then Covid happened. And the benefits of 5G networks (to connect machine interfaces) were overpromised. By 2022, Edge computing as a growth driver for tech hardware receded into the junkyard of other quaint keynote bullet points.


But that all changed earlier this year (2023) with the introduction (or rather, the leaked version) of Llama, Meta's open-source generative AI LLM. Llama was unique because it came in a range of smaller parameter sizes, yet Meta trained it on a data set roughly 4x larger than ChatGPT's. The smallest model could fit on a PC, unleashing a flourishing of open creativity among the programming community. Google's leaked "moat" memo described the phenomenon.


Before Llama, the innovation trajectory was larger models on more powerful computing hardware. Higher-value silicon and quick replacement cycles drove the Tech hardware opportunity.


After Llama, pursuing ever-bigger models that consume considerable resources is increasingly viewed as a losing investment race. The innovation trajectory is shifting to smaller models that users can iterate on quickly and fine-tune with proprietary data. Large institutions train the base models on large supercomputers; smaller enterprises (and individuals) then fine-tune them with their own data sets. The Tech hardware opportunity may shift away from massive infrastructure aggregation to the Edge, where smaller and faster models are trained on millions (or billions) of devices.
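What that division of labor can look like in practice: one common pattern is to take a pre-trained base model and train only small low-rank adapters (LoRA) on a private data set. The sketch below uses Hugging Face's transformers and peft libraries; the model name and hyperparameters are placeholders, and this is our assumption about a typical workflow rather than anything Meta or its partners have specified.

```python
# Minimal sketch of fine-tuning a small base model on proprietary data with
# LoRA adapters. Model name and hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # hypothetical choice of small base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Only the low-rank adapter weights are trained; the base model stays frozen,
# so the compute and memory bill is a fraction of full pre-training.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of total weights

# From here, the adapter would be trained on the enterprise's proprietary
# corpus (e.g., with transformers' Trainer) and deployed alongside the base model.
```

The point of the sketch is the split itself: the expensive base training happens once, centrally, while the cheap, differentiating fine-tune happens many times, closer to the data.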


Llama eats Hopper
Sources: AI generated using Dall-E; Hopper card added

The PC supply chain immediately seized on this vision and painted a picture of generative AI models on every desktop. In Intel's recent quarterly investor conference call, CEO Pat Gelsinger mentioned AI almost 60 times (in Q1, it was 15 times). Likewise, PC companies like Lenovo have begun to build forecasts around an "AI catalyst."


But it doesn't end there. Once models and data are optimally partitioned, training and inference can happen on every car dashboard, every smartphone, and every machine or sensor node. Meta is already working with Qualcomm to bring Llama models to the Edge, and Edge computing designers have developed several architectures to realize this vision.
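For a sense of what on-device inference looks like today, here is a minimal sketch assuming a quantized GGUF build of a small Llama model and the llama-cpp-python bindings; the file path and settings are placeholders, not a reference to any specific Meta/Qualcomm implementation.

```python
# Minimal sketch of on-device inference with a quantized small model.
# Assumes llama-cpp-python is installed and a GGUF model file is present;
# the path and parameters below are illustrative placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical quantized model
    n_ctx=2048,    # context window sized for a memory-constrained device
    n_threads=4,   # a handful of CPU cores, no data-center GPU required
)

out = llm(
    "Q: Summarize today's service logs in one sentence.\nA:",
    max_tokens=64,
    stop=["Q:"],
)
print(out["choices"][0]["text"].strip())
```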


How will this innovation trajectory impact the tech hardware industry? As is usually the case, the simple storyline hides lots of nuances.


Despite some consternation, at least for now, RCD Advisors believes AI is additive to cloud capex budgets. Existing infrastructure capex spending is driven by the continuing migration of workflows to the cloud. That doesn't change with AI, although it could speed up as compelling AI features emerge. Moreover, AI capex costs are massive. AI-as-a-service models will have to emerge to generate revenue and fund this capex. Microsoft already charges $10 per month per user to integrate ChatGPT into Teams. Microsoft's potential AI hardware (servers, networking, interconnect cables) spending could grow to approximately $5.5Bn from this revenue stream. This estimate assumes roughly 300M monthly users and 15% capex intensity. Add Google, Meta, AWS, and others, and AI infrastructure hardware spending on the cloud could reach $20Bn per year after the initial gold rush.
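The arithmetic behind that figure is simple enough to reproduce; the sketch below only restates the assumptions already given in the paragraph above.

```python
# Reproducing the back-of-the-envelope capex estimate from the text.
monthly_users = 300e6        # ~300M monthly users (stated assumption)
price_per_user_month = 10.0  # $10 per user per month for the AI add-on
capex_intensity = 0.15       # 15% of revenue reinvested in hardware

annual_revenue = monthly_users * price_per_user_month * 12   # ~$36Bn
ai_hardware_capex = annual_revenue * capex_intensity         # ~$5.4Bn

print(f"AI service revenue: ${annual_revenue / 1e9:.0f}Bn per year")
print(f"Implied AI hardware capex: ${ai_hardware_capex / 1e9:.1f}Bn per year")
```

Scaling the same math across Google, Meta, AWS, and others is how the ~$20Bn-per-year cloud figure falls out.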

Aside from the scaling costs, the biggest problem with cloud-based AI is that it doesn't provide differentiation for its users. As with the arguments over IT business investment in the early 2000s, no one gains a competitive advantage if everyone has access to the same cloud AI tools.


But that can change with distributed AI models. Distributed models allow enterprises to train on proprietary data sets and fine-tune model parameters for a competitive advantage. The rise of distributed AI models will eventually shift some AI processing away from the cloud and back to the enterprise on the Edge.


And that's a good thing for Tech hardware. While there may not be any physical differences in where a server sits, there are differences in market dynamics. Enterprises have much less bargaining leverage than hyperscalers. There are more of them, and each runs infrastructure less efficiently than cloud providers. Lower utilization and higher pricing will add a few growth percentage points above and beyond the existing infrastructure market trend line.


Finally, moving small models onto devices like PCs, smartphones, and automobiles should drive faster replacement cycles. How fast depends on power dissipation, costs, and compelling features (like a flawless voice assistant with zero latency). Our instinct is that trendline growth already accounts for these added AI features. After all, at some level, every new electronic device has more or improved features than the device it replaces. The AI on-device cycle could be another item on the long list of innovative features in new PCs, driver dashboards, and smartphones. But there will also be novel use cases that, at this point, we can't predict. One under-appreciated upside would come from embedded computing applications: the PCs embedded inside ATMs, MRIs, and POS systems may have the most to gain from AI systems in the near future.


Our estimates for AI Tech hardware growth are shown below. They include a device "upside" in addition to the infrastructure hardware. Our retainer clients have access to more depth and background, including a breakdown of the impact across supply chain component markets.



Slide revised on September 5th


If you find these posts insightful, subscribe above to have them delivered to your email. If you would like to learn more about the consulting practice, contact us at info@rcdadvisors.com.
