
CPUs are booming, but even greater opportunities lie elsewhere in the server chip sector

In 2026, the semiconductor industry faces a CPU capacity crisis: Intel and AMD's server CPU capacity for the year is essentially sold out, and both plan to raise prices by 10%-15%. Capital expenditure by the five major North American cloud service providers is set to surge by 40%, driving a spike in CPU demand. Intel executives admit that demand from hyperscale customers has exceeded expectations, creating supply shortages and forcing the company to prioritize data center customers. CPUs remain the core computing unit of the AI era, even as GPUs dominate AI training.
In early 2026, the semiconductor industry was shaken by an unexpected piece of news: the technologically mature and traditionally stable CPU market was facing a "capacity crisis" of the kind usually associated with memory chips.
According to data from KeyBanc Capital Markets, due to the frantic "shopping spree" by hyperscale cloud service providers, Intel and AMD's server CPU capacity for the entire year of 2026 has essentially sold out. To cope with this extreme supply-demand imbalance, both companies plan to raise server CPU prices by 10% to 15%, which is rare in the traditionally stable CPU market.
Behind this unusual tightness in CPU capacity is an unstoppable driving force. The five major hyperscale cloud service providers in North America—Google, Microsoft, Amazon, Meta, and Oracle—are expected to increase their capital expenditures by 40% year-on-year in 2026. These tech giants are aggressively expanding their AI infrastructure, and CPUs, a necessity for any server, have become a resource they are all scrambling to secure.
As HSBC analysts have noted, as AI evolves from a simple assistant into an agent capable of autonomously planning and executing complex tasks, demand for general-purpose computing is growing at an unprecedented rate, directly driving up CPU demand.
Intel itself has felt this wave. In its fourth-quarter 2025 earnings call, company executives admitted that demand for server CPUs over the past two quarters, especially from hyperscale customers, had completely exceeded their expectations. The unexpected surge has left Intel facing supply shortages and difficult strategic decisions: prioritizing supply for data center customers, even at the cost of shifting some PC capacity to server chips, a move that underscores just how hot the server CPU market has become.
CPU: From Computing Core to the Conductor of the AI Era
For a long time, the CPU has served as the "brain" of the server, playing the central role in handling general-purpose computing tasks. It is responsible for running the operating system, managing memory, coordinating I/O operations, and executing the logic of every application. In traditional data center architectures, the CPU was virtually the only computing unit, and its performance directly determined a server's overall capability. However, in an era where parallel accelerators such as GPUs dominate AI training, the CPU's star power has dimmed somewhat. Many began to ask: in the AI era, has the CPU been relegated to a supporting role?
The answer is no. The rise of AI agents is giving the CPU a new and indispensable strategic role: the "conductor" of AI workflows. Unlike AI model training and inference, which consist mainly of large-scale parallel computation, the way an AI agent works is more complex. It must plan, call different tools and databases, interact with external APIs, and coordinate and make decisions based on the outputs of multiple AI models. These tasks are inherently serial, logically complex, and demand flexible resource scheduling, which is precisely the traditional strong suit of the CPU.
Intel's CFO elaborated on the earnings call: "The world is shifting from human-initiated requests to continuous recursive commands driven by computer-to-computer interactions. The CPU, as the core function coordinating this traffic, will not only drive the upgrade of traditional servers but also bring new demand for an expanded installed base." In other words, the CPU has become the "central nervous system" of the entire AI system, responsible for coordinating and orchestrating the various dedicated accelerators and turning raw computing power into an effective capability for solving real-world problems.
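To make the "conductor" role concrete, here is a minimal, purely illustrative Python sketch of an agent loop. The control flow (planning, tool calls, termination checks) runs on the CPU, while the heavy model calls would be delegated to accelerator-backed endpoints; the function names used here (call_model, search_docs, run_agent) are hypothetical placeholders, not any real framework's API.

```python
# Minimal sketch of the "conductor" pattern described above. The serial,
# branch-heavy orchestration runs on the CPU; the model calls it dispatches
# would be served by GPU/ASIC backends. All names are hypothetical.

def call_model(prompt: str) -> str:
    # Placeholder for an inference request served by an accelerator backend.
    return f"<model output for: {prompt[:40]}...>"

def search_docs(query: str) -> list[str]:
    # Placeholder for a CPU-bound tool call (database lookup, external API).
    return [f"document about {query}"]

def run_agent(task: str, max_steps: int = 4) -> str:
    context: list[str] = [task]
    for step in range(max_steps):
        # 1. Planning: data-dependent control flow suited to a general CPU.
        plan = call_model("Plan the next action for: " + " | ".join(context))

        # 2. Tool use: I/O, parsing, and coordination handled on the CPU.
        if "search" in plan.lower():
            context.extend(search_docs(task))

        # 3. Heavy parallel compute delegated to the accelerator-backed model.
        answer = call_model("Answer using context: " + " | ".join(context))
        context.append(answer)

        # 4. Termination check: more serial logic on the CPU.
        if step == max_steps - 1 or "final" in answer.lower():
            return answer
    return context[-1]

if __name__ == "__main__":
    print(run_agent("Summarize why agentic workloads increase CPU demand"))
```

Every iteration of such a loop spends CPU cycles on scheduling, parsing, and decision-making before any accelerator work is issued, which is exactly the traffic-coordination role Intel's CFO describes.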
CPU Market Landscape: From Monopoly to Duopoly
However, the competitive landscape of the server CPU market has already changed dramatically. Intel once held a near-monopoly with a 97% share, but AMD, with its powerful EPYC series processors, has staged a successful comeback. According to data from Mercury Research, by the third quarter of 2025 Intel's share of server unit shipments had dropped to 72%, while AMD had captured nearly 28% of the market. In terms of revenue share, which better reflects market value, Intel fell to 61% while AMD climbed to about 39%.
The success of AMD's EPYC processors is attributed to their significant advantages in core count and performance-to-power ratio. Since the launch of the first generation EPYC (Naples) in mid-2017, AMD has attracted cost- and efficiency-sensitive cloud service providers and large enterprises with higher core density and better energy efficiency. Early adopters were pleasantly surprised by its performance levels, and word-of-mouth quickly spread, making EPYC an undeniable force in the market.
Reports indicate that AMD aims to capture a 50% share of the server CPU market, which means that competition with Intel will continue to escalate. This protracted "CPU war" is far from over, but one undeniable fact is that the duopoly has solidified, and the market is shifting from a single monopoly to a contest between two strong players.
However, while everyone's attention is focused on this "battle for the throne," a more powerful disruptor has quietly been gathering strength. As Bloomberg Intelligence analysts put it: "The AI accelerator market is undergoing a structural transformation, as traditional CPUs can no longer meet the large-scale computing demands of modern AI models." This judgment turns our attention toward a broader battlefield.
The Rise of ASICs
ASIC, or Application-Specific Integrated Circuit, is a chip designed for specific applications. In contrast to the "universality" of CPUs, the "specialization" of ASICs allows them to achieve extreme performance and energy efficiency for specific tasks. In the AI era, these chips tailored for specific algorithms are becoming the new favorites of hyperscale cloud service providers.
There are three main reasons why hyperscale vendors are turning to ASICs. First, cost optimization: when AI deployments reach the scale of tens of thousands or even hundreds of thousands of chips, the high procurement and operating costs (chiefly electricity) of general-purpose chips become a significant burden, and self-developed or customized ASICs can markedly reduce the cost per unit of computing power. Second, performance and energy efficiency: by stripping out the many modules of a general-purpose processor that a given workload does not need, an ASIC can devote all of its transistors to specific AI computations, achieving order-of-magnitude gains in performance and performance per watt. Third, architectural differentiation: self-developed ASICs let cloud service providers build unique hardware ecosystems deeply integrated with their own software and services, creating competitive barriers that are difficult for others to replicate.
Forecasts from major market research firms bear out this trend. According to a report released by Bloomberg Intelligence in January 2026, while GPUs will continue to dominate the AI accelerator market over the next decade, the custom ASIC market will grow even faster. By 2033 the custom ASIC market is expected to reach $11.8 billion, a compound annual growth rate of up to 27%, with its share of the overall AI accelerator market leaping from 8% in 2024 to 19%.
A report released by Counterpoint Research on January 26, 2026, offers an even more aggressive forecast. It predicts that global shipments of AI server ASICs will triple between 2024 and 2027 and will surpass data center GPU shipments by 2028, by which point annual AI server ASIC shipments will exceed 15 million units. The firm also notes that between 2024 and 2028, the top 10 global AI hyperscale vendors will cumulatively deploy over 40 million AI server ASIC chips.
The market landscape is also undergoing profound change. In 2024, the AI server ASIC market was essentially a duopoly dominated by Google (64%) and AWS (36%). By 2027, however, the market is expected to evolve into a more diversified ecosystem, with players such as Meta (MTIA) and Microsoft (Maia) also taking significant shares. This shift highlights the strategic intent of hyperscale vendors to move from reliance on general-purpose GPUs to internally customized chips.
In the field of ASIC design partnerships, Broadcom is expected to maintain its leading position, capturing about 60% of the market share by 2027. The company firmly controls the AI ASIC market through collaborations with Google, Meta, and OpenAI. Marvell, on the other hand, holds about 20-25% of the market share thanks to key design collaborations with AWS and Microsoft. Notably, MediaTek is entering this field and has secured a design partnership for Google's TPU v8x inference chip, posing a potential challenge to Broadcom's long-term dominance.
Giants' ASIC Products
In this arms race of ASICs, several tech giants have launched "behemoths" capable of challenging the industry landscape.
As a pioneer in the ASIC field, Google has taken its TPU to a seventh generation (Ironwood), unveiled in April 2025. The TPU v7 delivers an impressive 4,614 TFLOPs (FP8) of single-chip compute, comes equipped with 192GB of HBM3e high-bandwidth memory, supports a KV cache of over 1 million tokens, and is designed specifically for large-scale AI inference and for serving Google's flagship Gemini large model. At the system level, a single TPU Pod can accommodate 9,216 chips, forming a powerful supercomputing cluster.
Google not only uses it internally but also builds a complex supply chain through partnerships with Broadcom and even MediaTek. In the upcoming TPU v8 series, Google adopts a dual-supplier strategy: Broadcom is responsible for the high-performance training chip TPU v8AX "Sunfish," while MediaTek collaborates on the design of the inference-specific chip TPU v8x "Zebrafish" to balance cost and performance.
Amazon's AWS Trainium is a crucial pillar of the AWS ecosystem. The latest Trainium3, manufactured on TSMC's 3nm process, is AWS's first AI chip to use that advanced node. Each chip integrates 8 NeuronCore-v4 compute cores for a peak FP8 throughput of 2.52 PFLOPs and is equipped with 144GB of HBM3e memory in a 12-high stack, with memory bandwidth of up to 4.9 TB/s, roughly a 70% improvement over the previous generation.
AWS's goal is very clear: to provide cloud customers with a more cost-effective AI training option than general-purpose GPUs. Its Trn3 UltraServer platform can integrate up to 144 Trainium3 chips, with a total memory capacity of about 20.7TB and a total bandwidth of approximately 706 TB/s, achieving a peak FP8 computing power of 362 PFLOPs. Compared to the previous generation platform, overall computing power has increased by 4.4 times, memory bandwidth has improved by 3.9 times, and energy efficiency has increased by over 4 times. Reports indicate that Trainium has already handled over 60% of AWS's internal AI inference workloads, exceeding analysts' expectations.
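As a rough sanity check, the platform-level figures quoted above are consistent with simply scaling the per-chip Trainium3 specifications by 144 chips (a back-of-the-envelope calculation that ignores interconnect and packaging overheads):

```python
# Back-of-the-envelope check: Trn3 UltraServer totals as 144x the per-chip
# Trainium3 figures quoted above (simple linear aggregation assumed).
chips = 144
fp8_pflops_per_chip = 2.52      # peak FP8 compute per chip (PFLOPs)
hbm_gb_per_chip = 144           # HBM3e capacity per chip (GB)
bandwidth_tbps_per_chip = 4.9   # memory bandwidth per chip (TB/s)

print(chips * fp8_pflops_per_chip)      # ~362.9 PFLOPs (quoted: 362 PFLOPs)
print(chips * hbm_gb_per_chip / 1000)   # ~20.7 TB      (quoted: ~20.7 TB)
print(chips * bandwidth_tbps_per_chip)  # ~705.6 TB/s   (quoted: ~706 TB/s)
```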
Microsoft's Maia 200 chip, set for release in early 2026, targets the inference market directly. Microsoft claims that the Maia 200's FP4 performance is three times that of Amazon's third-generation Trainium and surpasses Google's TPU v7, making it "the most powerful first-party chip of any hyperscale vendor." The claim underlines Microsoft's determination to catch up with and overtake its rivals in self-developed silicon, and it marks the point at which competition in AI inference chips is heating up in earnest.
Meta's self-developed MTIA chip program is equally ambitious. MTIA aims to cover both training and inference in support of Meta's vast recommendation systems and its future metaverse and AI agent applications. Meta is working closely with partners such as Broadcom to accelerate the iteration and deployment of its in-house chips. Reports estimate that Meta's investment in AI chip infrastructure could reach as much as $10 billion, and through acquisitions such as Rivos it is further reducing its reliance on NVIDIA.
It is worth noting that TSMC plays a key role in this ASIC competition, accounting for nearly 99% of wafer manufacturing for the top 10 AI server ASIC vendors. Whether it is Google, Amazon, or Microsoft, their self-developed chips ultimately depend on TSMC's advanced process capabilities.
Conclusion
There is no doubt that CPUs will remain an indispensable part of the data center for the foreseeable future. The complex workflows of the AI era, and especially the rise of agentic AI, have reinforced the CPU's strategic value as the "conductor" and brought it new growth momentum. The fierce competition between Intel and AMD will continue to drive technological progress, giving the market ever more powerful general-purpose computing platforms. For both companies, the server CPU business remains a lucrative core territory.
However, viewed in terms of incremental growth and future potential across the entire server chip market, the biggest opportunities have clearly shifted from general-purpose computing to specialized computing. The pursuit of extreme performance and cost-effectiveness by hyperscale cloud service providers is giving rise to an unprecedentedly large and fast-growing ASIC market. By 2030, data centers alone are predicted to account for 50% of total semiconductor revenue, with ASICs' share continuing to expand. Cumulative AI-related capital expenditure by hyperscale and second-tier cloud service providers is projected to exceed $3.5 trillion by 2030; Microsoft's capital expenditure alone is expected to top $150 billion in 2026, and OpenAI's infrastructure roadmap could exceed $1 trillion by 2030.
For observers of the semiconductor industry, this means that the focus needs to shift from the traditional CPU duopoly to the broader field of AI accelerators. In this new battleground, the protagonists include not only GPU giants like NVIDIA but also tech giants like Google, Amazon, Microsoft, and Meta, along with chip design service companies behind them such as Broadcom and Marvell. Their alliances, technological competition, and ecosystem building will collectively define the computing architecture of the next decade.
The story of CPUs is far from over, but a new chapter of server chips, opened by ASICs, has officially begun.
Risk Warning and Disclaimer
The market has risks, and investment requires caution. This article does not constitute personal investment advice and does not take into account the specific investment goals, financial situation, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article align with their specific circumstances. Investing based on this is at one's own risk.
