LB Select
2024.01.16 08:01
portai

The Past and Present of Trillion-Dollar Nvidia

NVIDIA, standing at the forefront of the artificial intelligence wave, has become the first chip manufacturer with a market value of $1 trillion. A key reason for its success is its highly sought-after AI chips: the A100 and its next-generation successor, the H100. Both of these high-end chips, and the boards built on them, are currently in short supply.

The H200, announced in November last year and expected to ship in the second quarter of 2024, is bound to set off another buying frenzy.

In a recent podcast, Zhang Yi, a senior researcher at Microsoft Asia, marveled at how strange it is that the entire world cannot produce enough A100 chips, a situation almost no one anticipated a year ago.

The A100 chip launched by NVIDIA in 2020 is now in high demand, and the H100, which has gained popularity with ChatGPT, is being snapped up by major companies. This has propelled NVIDIA's performance and its stock price to new heights.

Brannin McBee, founder and CEO of the AI startup CoreWeave, went so far as to call the H100 one of the scarcest engineering resources on Earth. That statement alone gives a sense of how well NVIDIA is doing right now.

But with countless chips in the world, why have NVIDIA's chips alone become indispensable to artificial intelligence? And how did NVIDIA, long the dominant force in graphics cards, build such a commanding position in deep learning and AI?

Two Assists from Microsoft

In 1999, NVIDIA, then just starting to make a name for itself, introduced the concept of the GPU. Before that, CPU makers, Intel included, firmly believed that graphics processing was the CPU's job: the more the CPU could handle, the better. The idea of splitting graphics work off onto a separate processor held little appeal.

At that time, Japanese game developers had the most influence in the graphics application field. Japanese console CPUs were powerful, and most of the development work was focused on the CPU, so there wasn't much market space for GPUs.

The turning point came when Microsoft, unhappy with the Japanese manufacturers' dominance of the industry, developed DirectX, a standardized graphics API that let a large share of graphics work be moved from the CPU to the GPU. The launch of Microsoft's Xbox, with its well-balanced pairing of CPU and GPU, then broke the CPU's monopoly in the industry.

NVIDIA was one of the few hardware companies that followed Microsoft's lead at the time and made significant progress in the GPU field.

Later, Microsoft drove another revolution by championing unified rendering, which merged the previously separate vertex and pixel processing stages of the graphics pipeline into a single kind of programmable unit. It worked with ATI, another big name in graphics cards, to bring the technology to market in the Xenos GPU.

Unintentional Success

Unified rendering was only one step forward for graphics, but it set NVIDIA on a completely different path, one that can be seen as the starting point for its later GPU development and, eventually, its move into deep learning.

After seeing the unified rendering architecture, NVIDIA decisively rebuilt its own GPU architecture, reorganizing its stream processors into small units that could run independently. This solved the problem of stream processors being tied together and left sitting idle.

This laid the foundation for NVIDIA's revolutionary CUDA architecture. Because NVIDIA's stream processors are independent, standardized units, they are easy to control and schedule, so work that once had to run serially can be processed in parallel, and the difficulty of programming drops sharply.
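As a rough, hypothetical illustration of what that programming model looks like (not drawn from any NVIDIA source), the CUDA sketch below turns a serial loop into a kernel: each array element gets its own thread, and the hardware schedules those threads across its independent stream processors.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Serial CPU version of the same work, for comparison:
//   for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];

// CUDA version: the loop body becomes a kernel. Each thread handles one
// element and can be scheduled independently on any stream processor.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));  // unified memory, for brevity
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);  // expect 5.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The point is not the arithmetic but the structure: the programmer writes one thread's worth of work, and the scheduler fans it out across the GPU.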

NVIDIA's competitor ATI, meanwhile, had invested little early on in reworking its hardware architecture and kept using the older design. Its sunk costs kept growing, innovation became ever more difficult and expensive, and it was eventually squeezed out of the graphics card market by NVIDIA.

Then, in 2017, NVIDIA introduced Tensor Cores, computing units designed specifically for deep learning that support lower-precision arithmetic, greatly reducing the compute that models need.
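For a sense of what that lower-precision trade-off looks like to a programmer, here is a minimal, illustrative sketch using CUDA's WMMA API, the interface through which Tensor Cores are exposed. It assumes a GPU with Tensor Cores (compute capability 7.0 or newer) and is not production matrix-multiply code.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// One warp multiplies a 16x16 FP16 tile of A by a 16x16 FP16 tile of B on
// the Tensor Cores, accumulating in FP32. The lower-precision FP16 inputs
// are what make the operation so much cheaper than full FP32 math.
__global__ void tensor_core_tile(const half *A, const half *B, float *C) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);              // start the C tile at zero
    wmma::load_matrix_sync(a_frag, A, 16);          // leading dimension 16
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag); // C = A * B + C on Tensor Cores
    wmma::store_matrix_sync(C, c_frag, 16, wmma::mem_row_major);
}

// Launched with one 32-thread warp per tile, e.g.:
//   tensor_core_tile<<<1, 32>>>(dA, dB, dC);
```

Real workloads would tile full matrices and typically call libraries such as cuBLAS or cuDNN, which use the Tensor Cores automatically.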

These dedicated acceleration units took over much of the deep learning work that NVIDIA's general-purpose CUDA cores had been handling, and they caught NVIDIA's competitors off guard: standalone AI-specific chips suddenly looked far less attractive. NVIDIA's GPUs thus became the most widely recognized hardware in AI.

Betting on the Trend

In 2003, NVIDIA, known for its "rapid iteration and continuous trial and error," embarked on an unfashionable project: an SoC (system-on-chip) that combined an Arm-based CPU with its own GPU.

NVIDIA has kept releasing new SoCs every few years since. In 2015, it launched the Tegra K1, which paired standard Arm CPU cores with a GPU based on its own Kepler architecture, but its power consumption and heat made it a painful experience for most users.

Industry insiders, however, look kindly on these setbacks. As one investor put it, while NVIDIA holds onto its core GPU market, it keeps reaching into new areas and lets the countless buyers of its graphics cards share the cost of that exploration.

He also noted that although many of NVIDIA's projects, CUDA among them, went without practical application for quite some time, the company built a complete ecosystem through that trial and error and was standing at the forefront when a new trend finally arrived.

This is also one of the reasons NVIDIA's GPUs beat out other chips and captured the AI dividend. On one hand, GPUs are more general-purpose and adapt to change better than dedicated chips; on the other, NVIDIA has a complete ecosystem, which makes its GPUs the most practical choice today.

In fact, when AI suddenly took off, companies across the industry found that GPUs were the best choice for running generative AI models efficiently. A GPU originally designed for gaming cannot simply be repurposed to run AI workloads, and at present only NVIDIA's GPUs are truly up to running these AI models.

And there is a little surprise in NVIDIA's story.

In 2016, NVIDIA released the first deep learning supercomputer, the DGX-1. Remarkably, NVIDIA CEO Jensen Huang seemed to foresee the future and donated the first DGX-1 to OpenAI, then still a startup.

In 2022, OpenAI exploded onto the scene with the revolutionary ChatGPT, igniting the concept of artificial intelligence and propelling NVIDIA to become a hot commodity in the chip industry. This serendipitous connection is both awe-inspiring and a testament to Huang's foresight.

The Difference Between CPU and GPU

Both the CPU and the GPU are processors built from the same three kinds of components: arithmetic logic units (ALUs), control units, and cache. What differs dramatically is how much of the chip each component gets. In a CPU, cache accounts for roughly 50%, control for about 25%, and ALUs for about 25%. In a GPU, cache accounts for roughly 5%, control for about 5%, and ALUs for about 90%.
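As a small, hedged illustration of how that balance shows up in practice, the sketch below queries a GPU's properties through the CUDA runtime API; on a typical NVIDIA GPU it reports dozens of streaming multiprocessors and thousands of resident threads alongside a comparatively modest on-chip cache.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Print a few figures from device 0 that reflect the "mostly ALUs,
// relatively little cache and control" balance described above.
int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    printf("Device:                    %s\n", prop.name);
    printf("Streaming multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Max threads per SM:        %d\n", prop.maxThreadsPerMultiProcessor);
    printf("L2 cache size:             %d bytes\n", prop.l2CacheSize);
    printf("Shared memory per block:   %zu bytes\n", prop.sharedMemPerBlock);
    return 0;
}
```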