Wallstreetcn
2023.09.04 08:15

Who holds NVIDIA's lifeline?

Sellers of water also have their worries.

The AIGC wave, widely regarded as the fourth industrial revolution, is bringing disruptive changes to human production methods. With GPUs that are as scarce as gold, NVIDIA has become the "water seller" of AI, with its stock price soaring by 240% this year.

However, demand is so strong that NVIDIA GPUs have hit production bottlenecks. The latest H100 chip is already sold out, and orders placed now will not be fulfilled until Q1 or even Q2 of 2024.

The root cause is a severe shortage of key GPU components, which limits how many GPUs can be shipped.

Taking the H100 chip as an example, its most critical components are: 1) Logic chip; 2) HBM memory chip; 3) CoWoS packaging.

The core logic chip has a size of 814 square millimeters and is mainly supplied by TSMC's most advanced Fab 18 in Tainan. The process node used is "4N," which is actually 5nm+. Currently, TSMC's utilization rate of the 5nm+ capacity is less than 70% due to the weak PC, smartphone, and non-AI-related data center chip markets. Therefore, there is no problem with the supply of logic chips.

Surrounding the central logic chip in the H100 are six stacks of HBM (High Bandwidth Memory), a type of DRAM built with 3D stacking technology. Multiple DRAM dies are stacked vertically like floors in a skyscraper, connected by through-silicon vias (TSVs), and joined by thermocompression bonding (TCB). The result is higher bandwidth, a wider bus, lower power consumption, and a smaller footprint.
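The link between bus width and bandwidth can be shown with a rough calculation. The sketch below is illustrative only: the 1024-bit interface and 6.4 Gb/s per-pin rate are the maximums in the JEDEC HBM3 standard, and the 64-bit DDR5-6400 module is a comparison point chosen here. None of these figures come from the article or describe the H100's shipped configuration, and the function name is hypothetical.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, pin_rate_gbit_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s: data pins * rate per pin / 8 bits per byte."""
    return bus_width_bits * pin_rate_gbit_per_s / 8


# Assumed figures, not taken from the article.
HBM3_STACK = peak_bandwidth_gb_per_s(1024, 6.4)   # one HBM3 stack at the standard's top pin rate
DDR5_MODULE = peak_bandwidth_gb_per_s(64, 6.4)    # one 64-bit DDR5-6400 module

print(f"HBM3 stack:       {HBM3_STACK:.1f} GB/s")
print(f"DDR5-6400 module: {DDR5_MODULE:.1f} GB/s")
print(f"Ratio:            {HBM3_STACK / DDR5_MODULE:.0f}x")
```

At the same per-pin rate, the entire gain in this comparison comes from the wider interface, which is why HBM needs far more connections than conventional memory.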

Memory chips are crucial to GPU performance, especially for the high-performance GPUs used in AI training. Both inference and training are memory-intensive workloads. As parameter counts in AI models grow exponentially, the weights alone push model size to the terabyte level. The ability to store training and inference data in memory and retrieve it quickly therefore sets the upper limit on GPU performance.
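A back-of-the-envelope calculation shows how weights alone reach the terabyte range. This is a minimal sketch under stated assumptions: the one-trillion-parameter model, the 2 bytes per parameter (16-bit weights), and the 80 GB of memory per GPU are illustrative values chosen here, not figures from the article, and the function names are hypothetical.

```python
def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Memory needed to hold the model weights only (no activations or optimizer state)."""
    return num_params * bytes_per_param


def min_gpus_for_weights(num_params: int, bytes_per_param: int, gpu_memory_bytes: int) -> int:
    """Smallest number of GPUs whose combined memory can hold the weights."""
    total = weight_bytes(num_params, bytes_per_param)
    return -(-total // gpu_memory_bytes)  # integer ceiling division


# Assumed values, not taken from the article.
PARAMS = 10**12            # hypothetical 1-trillion-parameter model
BYTES_PER_PARAM = 2        # 16-bit weights
GPU_MEMORY = 80 * 10**9    # assumed 80 GB of HBM per GPU

total = weight_bytes(PARAMS, BYTES_PER_PARAM)
print(f"Weights: {total / 10**12:.1f} TB")
print(f"GPUs needed for weights alone: {min_gpus_for_weights(PARAMS, BYTES_PER_PARAM, GPU_MEMORY)}")
```

Under these assumptions the weights occupy 2 TB and must be spread across at least 25 GPUs before any activations or optimizer state are counted.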

As the pioneer of HBM, South Korean memory chip maker SK Hynix holds a near-monopoly on supply, with a market share of over 95%. It is also the only manufacturer currently able to produce HBM3, which is used in various models of the H100.

HBM3 is currently in short supply. There were earlier reports that NVIDIA and AMD had requested samples from SK Hynix of the next-generation HBM3E chip, which is not yet in mass production. NVIDIA has asked SK Hynix to supply HBM3E as soon as possible and is willing to pay a "premium." However, with major memory chip manufacturers investing heavily to expand HBM3 capacity, the supply squeeze may ease this year. According to recent media reports, Samsung Electronics signed an agreement with NVIDIA on August 31 to supply HBM3 after passing final quality tests. Supply is expected to begin as early as next week.

Earlier, Citigroup also revealed in a report that Samsung will start supplying HBM3 to NVIDIA in the fourth quarter.

Another major bottleneck lies in CoWoS packaging.

HBM and CoWoS packaging are complementary technologies. HBM requires a very large number of connections and very short traces, which calls for advanced packaging such as CoWoS to achieve dense, short interconnects that cannot be made on a PCB or even a package substrate.

Currently, almost all HBM is packaged with CoWoS technology, and TSMC is the main supplier of CoWoS packaging for NVIDIA GPUs. However, demand has grown so explosively that even at full capacity TSMC is struggling to close the supply-demand gap. TSMC has therefore opened three new factories in Zhunan, Longtan, and Taichung. The Zhunan factory covers 14.3 hectares, larger than the other packaging factories combined.

Some market analysts believe that TSMC is actively increasing its advanced packaging capacity to meet the growing demand for its advanced packaging solutions.

In addition, NVIDIA CFO Colette Kress recently revealed that the company has developed and qualified additional capacity from other suppliers for key process steps such as CoWoS packaging. Supply is expected to increase gradually over the coming quarters as NVIDIA continues to work with suppliers to expand capacity.