Wallstreetcn
2023.10.12 04:29

Can AMD really break Nvidia's monopoly? AMD has risen more than 10% in the past two weeks, but...

NVIDIA's CUDA ecosystem shuts AMD chips out, and breaking that monopoly will not be easy for AMD.

AMD is pushing on both hardware and software to catch up with NVIDIA, but it still faces one crucial obstacle: NVIDIA's ecosystem monopoly.

At the end of September, Sharon Zhou, co-founder and CEO of Lamini, posted on X that she has been using more than 100 AMD (Advanced Micro Devices) chips to support her AI startup's products over the past year.

This tweet has once again sparked market excitement. In order to catch up with NVIDIA, AMD has been making frequent moves this year, and its stock price has risen by about 10% in the past two weeks.

In June of this year, AMD unveiled its latest GPU line, the Instinct MI300, at a product launch event, claiming that the MI300X offers 2.4 times the HBM density of NVIDIA's H100 AI chip and 1.6 times its HBM bandwidth. This means that AMD's chips can run larger models than NVIDIA's, and the Instinct MI300X, due for release later this year, is considered a strong competitor to the H100.

However, in terms of software, NVIDIA's CUDA ecosystem has shut AMD chips out, making it difficult to break NVIDIA's monopoly.

AMD's software challenge: Crossing the CUDA ecosystem barrier

In terms of hardware specifications, the AMD Instinct MI300A has already caught up with or even surpassed NVIDIA's H100. The remaining challenge is the software ecosystem, mainly achieving compatibility with NVIDIA's proprietary CUDA ecosystem.

NVIDIA's CUDA software and its chips form a closed ecosystem, making it difficult for AMD's ROCm software to become popular. In addition, NVIDIA has a significant advantage in other software components, such as drivers that connect the operating system and hardware.

Zhou stated in an interview that although her startup has only been established for a year, her co-founder Greg Diamos has spent years optimizing AMD chips for Lamini's development.

Therefore, if a startup's AI application is built on NVIDIA chips, switching to AMD chips is basically impossible, because it would mean "throwing away all the code and starting from scratch." To illustrate how hard it is to cross ecosystems, Zhou pointed out that NVIDIA has a "decade-long advantage" in the CUDA ecosystem.

However, this does not mean that AMD chips have no advantages.

Firstly, the AMD MI300A chip is the first to combine a CPU and a GPU, which can speed up training computation, while Intel's comparable Falcon Shores plan has not yet been implemented.

Secondly, the MI300A chip has 128GB of memory, which is larger than the 80GB memory of the H100, meaning that developers can load larger and more complex artificial intelligence models on a single chip, instead of splitting them across multiple chips. Splitting the model would slow down training and running speeds and consume more power.
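
The memory argument comes down to simple arithmetic, sketched roughly below. The 128GB and 80GB figures are from the article; everything else is an assumption made for illustration: weights stored at 2 bytes per parameter (16-bit), no allowance for activations or other runtime memory, and a hypothetical 50-billion-parameter model. The function names are invented for this sketch.

```python
import math

# Assumption: 16-bit weights only; activations and other runtime memory are ignored.
BYTES_PER_PARAM = 2


def weights_gb(params_billions: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM


def chips_needed(params_billions: float, chip_memory_gb: float) -> int:
    """Minimum number of chips needed just to hold the weights."""
    return math.ceil(weights_gb(params_billions) / chip_memory_gb)


# Hypothetical 50-billion-parameter model: 100 GB of weights.
for name, memory_gb in [("MI300A", 128), ("H100", 80)]:
    print(name, chips_needed(50, memory_gb))
```

Under these assumptions the model fits on one 128GB chip but has to be split across two 80GB chips, which is the situation the paragraph above describes.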

In addition, many startups are working to bring non-NVIDIA chips into the AI software ecosystem. For example, Lamini has been dedicated to making it easier to build artificial intelligence models on AMD GPUs, and Modular is building software that lets developers train and run models on different types of hardware.

On Tuesday, AMD announced plans to acquire a startup called Nod.ai to enhance its artificial intelligence software development capabilities and facilitate the deployment of artificial intelligence models for AMD chips.