Who Has a Chokehold on NVIDIA?

LB Select
2024.03.28 07:50

Whoever is constraining NVIDIA's production capacity is the scarcest player on the supply side, and therefore the most logical choice.

Written from an investor's perspective.

If you wanted to invest in other stocks in the AI data center supply chain, which would you choose?

Would it be the soaring Super Micro Computer Inc. (SMCI), or Dell Technologies (DELL)? Or would it be something scarcer: a company that can actually hold NVIDIA back?

So, in the NVIDIA industry chain, which stocks are better to buy?

The underlying logic of the decision is this: whoever is constraining NVIDIA's production capacity, whoever is its bottleneck, is the scarcest player on the supply side and therefore the best choice.

Undoubtedly, the first is Micron and the second is TSMC.

For readers unfamiliar with the chip supply chain, this may not be obvious. Below is an excerpt from last year by the veteran industry blogger @老石谈芯 (who has accounts on Bilibili and YouTube), which is relatively easy to follow even for laymen:

In fact, for NVIDIA, the biggest pain point is not the sales of H100, but the supply.

Even Jensen Huang is supply-constrained.

Specifically, there are two bottlenecks in the supply of H100: one is the supply of high-bandwidth memory (HBM), and the other is advanced chip packaging.

First, let's look at HBM. It is a high-capacity, high-bandwidth memory, now a standard for AI chips. Large models and deep learning have a massive number of parameters and require processing a huge amount of data, so there are very high requirements for data throughput and storage capacity.

You can think of the GPU as a city, and the data as workers commuting into the city. With traditional memory such as DDR, the workers live far away, and the commute route is narrow and very congested. Using HBM is like building many high-rise residential buildings around the city and then sending the workers in by high-speed rail in batches, which greatly improves how efficiently the work gets done.

Currently, only three companies globally can mass-produce HBM: SK Hynix, Samsung, and Micron. Among them, SK Hynix holds about 50% of the market and is currently the only company that can mass-produce the third generation of HBM, serving as the exclusive supplier to NVIDIA. A single H100 carries five or six HBM stacks, depending on the variant.

SK Hynix not only sells to NVIDIA but also to AMD and Google, making HBM production capacity naturally a bottleneck for H100.

Having HBM alone is not enough. The HBM and the GPU die need to be packaged together, connecting the city to the surrounding buildings.

There is a major principle in chip design: the closer the place where data is processed is to the place where it is stored, the higher the computational efficiency. So the GPU and HBM need to be brought as close together as possible, which requires TSMC's CoWoS packaging technology.

Conceptually, it consists of two parts, CoW and WoS. CoW is Chip-on-Wafer stacking, and WoS is Wafer-on-Substrate stacking. Together they integrate multiple active chiplets onto a passive silicon interposer and let them communicate through that interposer, the biggest advantage being the reduction of chip area, power consumption, and cost.

A simple analogy: the high-rise residential buildings tightly surround the city, with multiple subway lines dug underneath for transportation. This is far more efficient than laying railways on the surface, but digging subways is a major project, and carving subways into chips is even more challenging. In particular, the process of creating through-silicon vias in the interposer is extremely complex, and only TSMC can do it well.

However, this packaging process is very expensive, and hardly anyone used it before, so TSMC had a capacity of only 30,000 wafers in 2023. When the AI boom came, NVIDIA filled that capacity instantly.

Faced with this shortage, TSMC opened its advanced packaging and testing plant in Zhunan in June 2023, and in late July announced the construction of an advanced packaging plant with an investment of 90 billion New Taiwan dollars, expected to be completed by the end of 2026, with mass production scheduled for the third quarter of 2027.

The excerpt above is from the chip blogger @老石谈芯 on Bilibili, who explains the topic in a way that is easy for non-specialists to follow.


So, as we enter 2024, what is the situation with Micron's HBM3e capacity and TSMC's CoWoS capacity?

  1. Calculation of Micron's HBM3e market value:

Thanks to @Guotie for providing the data:

NVIDIA's H200 and B100/B200 adopt HBM3e. Assume shipments of 1 million H200 units in 2023 (against 2 million units for the H100 series) and 500,000 B100 units. The H200 carries 144GB = 24GB * 6, and the B100 carries 192GB = 24GB * 8. The total demand for HBM3e is therefore 1 million * 144 + 500,000 * 192 = 240 million GB. Assuming an HBM3e price of $20/GB, the corresponding market is $4.8 billion, or roughly $5 billion. If Micron takes a 10% share, that is approximately $500 million.

Calculation for 2024: Assuming shipments of 1.5 million H200 units and 3 million units of the B100 series, the total HBM3e demand is 1.5 million * 144 + 3 million * 192 = 792 million GB. With the price of HBM3e dropping by 10% to $18/GB, the market is about $14.2 billion. If Micron takes a 25% share of the HBM3e market, that is approximately $3.5 billion. This revenue volume is not significantly different from the previous calculation.
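The two estimates above can be reproduced with a short script. This is only a sketch of the arithmetic: every input (shipment volumes, GB per GPU, price per GB, Micron's share) is an assumption taken from the text, and the function name `hbm_market` is made up for illustration. Unrounded, the two markets come to $4.8 billion and $14.256 billion, and Micron's slice to $480 million and $3.564 billion, which the text rounds.

```python
# Sketch of the HBM3e market-size arithmetic; all inputs are the article's assumptions.

GB_PER_STACK = 24
GB_PER_GPU = {
    "H200": 6 * GB_PER_STACK,  # 144 GB
    "B100": 8 * GB_PER_STACK,  # 192 GB
}


def hbm_market(shipments, price_per_gb, supplier_share):
    """Return (total GB demanded, market size in USD, supplier revenue in USD)."""
    total_gb = sum(units * GB_PER_GPU[gpu] for gpu, units in shipments.items())
    market_usd = total_gb * price_per_gb
    return total_gb, market_usd, market_usd * supplier_share


# First scenario: 1M H200 + 0.5M B100, $20/GB, 10% supplier share
first = hbm_market({"H200": 1_000_000, "B100": 500_000}, 20, 0.10)
# Second scenario: 1.5M H200 + 3M B100, $18/GB, 25% supplier share
second = hbm_market({"H200": 1_500_000, "B100": 3_000_000}, 18, 0.25)

for label, (gb, market, revenue) in (("first", first), ("second", second)):
    print(f"{label}: {gb / 1e6:.0f}M GB, market ${market / 1e9:.2f}B, "
          f"supplier ${revenue / 1e9:.2f}B")
```

Changing any single input (for example the price per GB) shows how sensitive the revenue estimate is to that assumption.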

  2. TSMC's CoWoS capacity:

Recently, Goldman Sachs raised its expectations for TSMC's CoWoS (Chip on Wafer on Substrate) capacity in 2024 and 2025 from 304,000 and 441,000 to 319,000 and 600,000 respectively. This means CoWoS capacity would increase by 122% year-on-year this year and by 88% next year, roughly doubling for two consecutive years.
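As a quick check on these growth rates, here is a minimal sketch. It reads the figures as an old forecast of 304,000 (2024) and 441,000 (2025) and a new forecast of 319,000 and 600,000; that reading, the unstated capacity unit, and the helper name `yoy_growth` are assumptions. The 2023 base is not given in the text, so it is only backed out from the quoted 122% growth.

```python
# Sketch: year-on-year growth implied by the quoted CoWoS capacity forecasts.

def yoy_growth(previous, current):
    """Fractional year-on-year growth from `previous` to `current`."""
    return (current - previous) / previous


old_2024, old_2025 = 304_000, 441_000   # earlier forecast (assumed reading)
new_2024, new_2025 = 319_000, 600_000   # raised forecast (assumed reading)

growth_2025_old = yoy_growth(old_2024, old_2025)
growth_2025_new = yoy_growth(new_2024, new_2025)

# The text quotes 122% growth for 2024; the 2023 base it implies is derived, not sourced.
implied_2023 = new_2024 / (1 + 1.22)

print(f"2025 growth, old forecast: {growth_2025_old:.0%}")
print(f"2025 growth, new forecast: {growth_2025_new:.0%}")
print(f"implied 2023 capacity: {implied_2023:,.0f}")
```

The new forecast reproduces the 88% figure for 2025, and the quoted 122% for 2024 implies a 2023 base of a little under 144,000 in the same unit.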