AMD: The future lies in AI moving toward the edge, with MI300 expected to generate over $3.5 billion in revenue in 2024

Wallstreetcn
2024.03.07 11:57

AMD believes the AI chip market will reach $400 billion in 2027, a figure built bottom-up from customers' long-term demand. And as more users write models on the open-source ecosystem, the gap with NVIDIA is narrowing.


Author: Ge Jiaming

Source: Hard AI

AMD has been sprinting through this AI chip frenzy, and Wall Street still enthusiastically dubs it "NVIDIA's strongest challenger." On March 6, AMD closed nearly 2.7% higher, hitting another weekly high. The stock has surged nearly 52% this year and 156% over the past year.

Market research firm Jon Peddie Research (JPR) stated bluntly in a report that in the fourth quarter of 2023 AMD took discrete graphics card market share from NVIDIA, gaining 2 percentage points quarter over quarter to reach 19%. Discrete graphics card shipments rose 17% QoQ and a staggering 117% YoY. NVIDIA still holds the market lead with an 80% share, but that was down 2 percentage points QoQ.

Recently, AMD CFO Jean Hu spoke at the Morgan Stanley TMT Conference. We have summarized and organized the event to share AMD's latest developments in the chip field.

In Jean Hu's view, the $400 billion AI chip market forecast for 2027 is not a number pulled out of thin air but a bottom-up calculation. It rests on customer feedback about long-term demand and on estimates of the chip content needed to meet that demand; the team also cross-checks the figure against global GDP and what it implies if a significant amount of labor cost is saved.

AI starts in the data center and will move toward the edge, reaching personal devices such as PCs. For AMD, one of the most important tasks is building out AI PCs, which will drive a new replacement cycle. Once an NPU is integrated into the computer, many applications can run locally without going to the cloud, genuinely enhancing productivity.

NVIDIA iterates every year; how does AMD view the competition? Jean Hu is quite clear: since the launch of MI300, NVIDIA has accelerated its product cadence, which is undoubtedly good for the market. The tech industry needs competition, and the market should expect a similar pace of updates from AMD.


At the 2024 Morgan Stanley TMT Conference, Jean Hu expressed confidence in AMD's roadmap, stating that the supply chain can support more than $3.5 billion in demand. But the key to any new technology lies not only in supply but in customer adoption: AMD's roadmap must align with customer needs.

The server market declined in 2023 due to inventory digestion and pressure from AI spending: customers prioritized AI and extended the depreciation of traditional servers, and almost all cloud customers have now stretched their depreciation periods. But new workloads are forcing customers to expand data center space and power, and continuing to run old servers carries high operating costs. Depreciation cannot be extended indefinitely; servers must eventually be upgraded. The server market is therefore expected to improve, and AMD expects to keep gaining share and to grow faster than the market.

Establishing an AI ecosystem is a gradual process. Jean Hu noted that AMD focuses on the open-source ecosystem. ROCm 6, AMD's open-source software stack, has made significant progress, supporting PyTorch, JAX, Triton, and other frameworks; customers can write models against these frameworks and run them on MI300X.

Jean Hu shared these key views at the 2024 Morgan Stanley TMT Conference:

  • AMD's forecast that the AI chip market will reach $400 billion by 2027 is customer-driven. What we really want to discuss is not the $400 billion figure itself but the extraordinary technological trend behind it and how AMD can win more share in this wave.
  • AI starts in the data center and will move toward the edge: PCs and other personal devices. One of the most crucial tasks for AMD is building out AI PCs, which will drive a new replacement cycle. Once an NPU is integrated into the computer, many AI applications can run locally, significantly improving efficiency.
  • Since AMD launched MI300, NVIDIA has been accelerating its product cadence, and AMD is expected to pick up its own update speed. Competition in the tech industry is beneficial. With TSMC we have established a solid long-term partnership, securing a supply chain able to support more than $3.5 billion in demand. However, the key to any new technology lies not only in supply but in customer adoption, which requires our roadmap to align with customer needs.

In the high-tech field, no investment is certain; but if you can gauge the market's trajectory, you can allocate internal resources sensibly to invest in the right places.

The MI300 has become one of the fastest-ramping products in AMD's history. It is expected to generate more than $3.5 billion in revenue in 2024, with significant growth every quarter.

The server market declined last year due to inventory digestion and the squeeze from AI spending. But continuing to run old servers carries relatively high operating costs, so the market is expected to rebound in 2024.

ROCm 6 has made significant progress, supporting PyTorch, JAX, Triton, and other frameworks. Customers who write models against these frameworks can run them on MI300X, narrowing the gap with CUDA.
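To make the framework point concrete: ROCm builds of PyTorch deliberately reuse the familiar torch.cuda interface, so code written for NVIDIA GPUs can typically run on an MI300X unchanged. A minimal sketch, assuming a ROCm build of PyTorch on an MI300X host:

```python
# Minimal device check on a ROCm build of PyTorch (assumption: PyTorch
# was installed with ROCm support on an MI300X host).
import torch

if torch.cuda.is_available():              # True on ROCm builds as well
    print(torch.cuda.get_device_name(0))   # reports the AMD Instinct GPU
    print(torch.version.hip)               # set on ROCm builds; None on CUDA builds
```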

Full Interview of Jean Hu at the 2024 Morgan Stanley TMT Conference

Jean Hu, Executive Vice President and Chief Financial Officer, AMD
Joseph Lawrence Moore, Executive Director, Morgan Stanley Research

Moore: Welcome back. I am Joseph Lawrence Moore, semiconductor analyst at Morgan Stanley. Today I am delighted to have Jean Hu, CFO of AMD, here with us.

Moore: Let's start with AI; we can't seem to get away from it lately. You predict the AI chip market will reach $400 billion by 2027, which has sparked widespread discussion. Given that AMD's market views have always been conservative, such an expectation stands out. Could you share the basis for this forecast and your confidence in it?

Jean Hu: Thank you for having me. Indeed, in December last year we revised our expectations for the AI chip market, raising the 2027 market size from $150 billion to $400 billion. At the time the number surprised some people, but in hindsight the surprise was unnecessary.

If you pay attention to the recent pace of generative AI advancements, you will see a significant demand for AI chips. We have witnessed the emergence of new models and applications, as well as major companies' capital expenditures on AI infrastructure.

Many are now discussing investing up to a trillion dollars in AI infrastructure to enhance future productivity and unlock the changes it could bring to the way we live. From a long-term market-opportunity perspective, what we have laid out is really a framework for understanding the trend, and it is an incredible technological trend. When we talk to cloud computing customers, they tell us they need to run large language models and make them better and more accurate, with fewer hallucinations. They need models that answer questions better and improve productivity, and that requires very large computing clusters.

In the enterprise, we are starting to see evidence of rising productivity. We hear very specific numbers: productivity gains of 30%, 40%, sometimes even 100%. These use cases need different models and different applications, which in turn require different computing clusters.

Another thing: AMD is a staunch believer in AI. We believe artificial intelligence starts in the data center and will move toward the edge, for example AI PCs. When you consider all these AI application areas and large models, you can see the pace of innovation in model applications accelerating, yet that pace is still constrained by the shortage of GPU compute.

So when we put all this together, the market's development trajectory actually aligns with our views, right?

If you look at the accelerated-AI market, it was still very small in 2022, but in 2023 it likely exceeded $40 billion, and this year it will double or more. What we really want to discuss is not whether the number is $400 billion or $300 billion, but that the trajectory of this technological trend is extraordinary.
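As a back-of-the-envelope check on that trajectory, the figures quoted here, roughly $100 billion this year growing to $400 billion in 2027, imply an annual growth rate of nearly 60% (illustrative arithmetic on the interview's own numbers, not AMD's model):

```python
# Implied compound annual growth rate from the figures quoted above
# (illustrative arithmetic only, not AMD's forecasting model).
market_this_year = 100e9   # "~$40B in 2023, doubling or more this year"
market_2027 = 400e9        # AMD's 2027 market-size forecast
years = 3
cagr = (market_2027 / market_this_year) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.0%}")   # roughly 59% per year
```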

Moore: Understood, that's very helpful. Perhaps you can talk about what AMD has done to seize this opportunity: tell us about today's MI300, what it can do, and which markets it can address.

Jean Hu: We launched MI300 in December last year, and in just three months we have received strong support from cloud customers, including Microsoft, Meta, and Oracle, as well as from OEM partners such as Dell, HP, and others. The entire ecosystem has praised the product highly.

In the fourth quarter of last year, our data center GPU revenue exceeded expectations, surpassing $400 million, largely on the strength of MI300. MI300 has become one of the fastest-ramping products in AMD's history, and we have a broad customer base spanning cloud and enterprise. How MI300 gets used from here is very interesting, because current deployments are already rich, covering both inference and training.

Our cloud customers are using the product not only for internal workloads, such as GPT-4, but also for third-party workloads. Since the release of MI300 we have continuously expanded and deepened customer engagements, and we are very pleased with the feedback and adoption. In 2024 we expect revenue to exceed $3.5 billion, with significant growth every quarter, which is truly exciting.

Moore: On those numbers: growing from $2 billion to $3.5 billion is a very impressive rate. Ecosystems take time to develop, including software support. How do you view this rapid growth?

Jean Hu: Indeed, even we were surprised by the growth. Since acquiring the Canadian graphics chip maker ATI Technologies in 2006, AMD has been in the GPU business for nearly two decades, building a powerful CPU and GPU platform.

When Lisa Su joined AMD, our investment focus was on building the CPU and GPU platforms. In fact, AMD's history of GPU investment is similar to NVIDIA's. The MI300 did not come out of nowhere: from 2020 to 2023 we successively launched the MI100, MI200, MI210, MI250, and then the MI300.

On the other hand, it's not just the GPU hardware; software matters just as much. Over the years AMD has invested in the ROCm software stack, initially for high-performance computing (HPC) applications and more recently with an accelerated focus on cloud applications. This didn't happen overnight; it has been a long process that is now speeding up, and close collaboration with major cloud customers has accelerated MI300 adoption.

Moore: Could you discuss this product's supply chain from a few angles? First, this complex product requires very advanced packaging capacity. Also, I remember you citing a roughly eight-month manufacturing lead time, which is quite long, yet you managed to reach over $3.5 billion in sales quite quickly. Could you explain that to us?

Furthermore, many in our industry look at your shipment data, make price assumptions, treat the result as the year's total revenue, and arrive at a very high figure for this year. Can you talk about whether that is the right way to look at AMD's opportunity?

Jean Hu: On MI300, perhaps I can briefly give some background. The MI300A combines three eight-core Zen 4 chiplets with multiple CDNA 3 chiplets, suiting it to artificial intelligence (AI) and high-performance computing (HPC) workloads. AMD was mass-producing chiplet-based designs a decade ago, possibly the first company to do so. Similarly, in packaging technology, AMD collaborated with TSMC early on to co-develop advanced packaging, not to mention ASML's contribution.

So when you look back at the long history of cooperation between AMD and TSMC, you will find it has helped us tremendously. Our chiplet-based products have been in volume production on TSMC's lines for some time. Of course, MI300 is the most complex chip we have ever designed, and we work closely with TSMC and the entire supply chain to ramp production. As you may know, TSMC and AMD have a very close partnership. I believe our team can deliver outstanding execution under very tight conditions, and the supply in place can indeed support more than $3.5 billion of product.

Lisa Su takes a long-term view of the business: the MI300 manufacturing cycle runs as long as seven months and a large number of customers are involved, so securing supply underpins AMD's future success.

Different customers have different product certification processes, so the pace of the MI300 ramp can be somewhat unpredictable. I think it will be a gradual process, not an overnight success. But MI300 is about more than this year's results; what matters more is seizing a huge long-term market opportunity. As a participant in this AI market, AMD needs to seize the moment and ensure consistent supply.

Moore: So, is this year really laying the foundation for it?

Jean Hu: Yes, we are still in the early stages. Given the huge market opportunity, we have prepared for the long term.

Moore: On people using supply chain data to estimate your revenue: will everything you produce this year actually be sold this year? How should we think about that?

Jean Hu: The semiconductor supply chain's manufacturing cycle is quite long; it takes time. Having capacity does not mean you can immediately deliver everything to customers. It's a process, right? So trying to derive revenue numbers from supply chain data is quite naive; they are two different things.

Moore: So, do you think that if the AI market grows from over $100 billion this year to $400 billion in 3 years, your goal is to at least match this pace?

Jean Hu: I believe we have highly competitive products, and we now have software and networking partners. This is a huge market, and we are one of its leading players. We see this as a great opportunity and as the biggest growth driver for AMD in the coming years.

Moore: I mean, $3.5 billion is a significant number, and NVIDIA has surely noticed you reaching that scale. They are about to launch the B100, and they have said gross margins will decline in the second half of the year. As you become more and more important, how do you expect to handle counterattacks from competitors?

Jean Hu: I think in any large tech market you will face competition, especially in this one, which is not only very large but also very diverse in its demands. Different customers have different needs, and as the open-source ecosystem keeps evolving, not everyone is writing their models on CUDA; more and more people are developing on the open-source ecosystem.

So we believe that a large market with very diverse segment demands needs more than one participant. And GPUs are not easy to build; both AMD and NVIDIA have invested in GPUs for a long time, so competition in this market is a good thing.

Moore: You mentioned ecosystem support. Early on you brought in people from Hugging Face and PyTorch, which was one of the most convincing moves. How is your ecosystem support progressing now?

Jean Hu: When AMD launched MI300, we laid out three strategic pillars: the most competitive GPU, a powerful software ecosystem, and complete networking solutions. The aim is to field competitive GPU products while building the software ecosystem and networking needed to deliver comprehensive, efficient solutions for AI and HPC customers.

There is a reason for this. We recognize that we are the second-largest player in the market, and what we want to do is lower the barrier to entry and make our GPUs easier for customers to use. By focusing on the open-source ecosystem, all the frameworks, models, and libraries provide better GPU support for our customers.

If you think about it, the open-source ecosystem has developed rapidly since the launch of ChatGPT, spanning the large language models and many of the frameworks, and we believe more and more people are developing on it. For us, ROCm 6 has made significant progress, supporting PyTorch, JAX, Triton, and other frameworks. Customers who write models against these frameworks can run them on MI300X. Today roughly 500,000 models on Hugging Face run on MI300X, and we believe we have narrowed the gap.
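At the framework level that compatibility is straightforward to picture: because ROCm's PyTorch build reuses the torch.cuda device API, a Hugging Face model loads on MI300X with the same code used on NVIDIA hardware. A sketch under those assumptions (the model name is purely illustrative):

```python
# Sketch: running a Hugging Face model on MI300X with unmodified
# CUDA-style PyTorch code (model name is illustrative; assumes a
# ROCm build of PyTorch and the transformers library).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"        # any ROCm-compatible model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16
).to("cuda")                              # "cuda" maps to the AMD GPU under ROCm

inputs = tok("The future of AI hardware is", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```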

We will keep making progress to make ROCm more competitive. Over the past decade in the GPU market our history parallels NVIDIA's, which is why we have been able to bring ROCm up to a powerful feature level. Porting CUDA code to MI300 is now quite efficient, and we will continue to improve there.

Moore: I want to focus on the additional $1.5 billion in revenue. You have mentioned training several times. People assumed the biggest application AMD initially targeted was inference, but evidently you may be seeing more training than expected. Can you talk about that?

Jean Hu: We talk about MI300X because its memory bandwidth and capacity are still the largest, so we clearly have a significant advantage in inference. But it is quite competitive even in training, and when we look at customers we see demand on both sides.
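The memory argument is easy to make concrete: at 16-bit precision a model needs roughly two bytes per parameter, so a 70-billion-parameter model needs about 140 GB for weights alone, which fits in a single MI300X's 192 GB of HBM3. A rough sizing sketch (weights only; real deployments also budget for the KV cache and activations):

```python
# Rough single-GPU memory sizing for inference (illustrative arithmetic;
# ignores KV cache, activations, and framework overhead).
params = 70e9                 # model parameters
bytes_per_param = 2           # FP16/BF16
weights_gb = params * bytes_per_param / 1e9
mi300x_hbm_gb = 192           # MI300X HBM3 capacity
print(f"Weights alone: {weights_gb:.0f} GB")                # ~140 GB
print(f"Fits on one MI300X: {weights_gb < mi300x_hbm_gb}")  # True
```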

You can imagine that training came earlier while inference is only now starting to grow, so we launched the product to address that market growth, and in practice it may skew toward inference. That said, some of our customers, enterprise customers in particular, use MI300X for training, which is very exciting. I believe we can keep working closely with our customers, and we will see more and more opportunities.

Moore: I do want to discuss your other businesses, perhaps starting with servers. You had a tough year last year, especially as the shift toward AI caused many server upgrades to be postponed, and there are signs that is still going on, right?

You see Amazon extending server depreciation periods, and the market is warming up as server demand seems to be growing. But as for those large discretionary upgrades, at least I haven't really seen them return yet, even though AMD's core business has performed well. Can you talk about what you are seeing and your expectations for 2024? Do you think we will see those large upgrades return at some point?

Jean Hu: You are right, the overall server market declined in 2023, for the reasons you mentioned: inventory digestion and AI spending. Customers prioritized AI and extended depreciation periods.

I think almost all cloud customers have extended depreciation. But the most important thing we see is that for a great deal of traditional computing, including mission-critical applications, servers remain the most efficient and cost-effective platform.

And those workloads are expanding data centers that already face shortages of space and power. If you keep running old servers, the operating costs are actually quite high. You can extend their life to a point, but eventually you must upgrade.
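A stylized sketch of that total-cost-of-ownership argument, with entirely hypothetical numbers: if one new server can consolidate the work of several old ones, the savings in power and operating overhead can pay back the purchase price within a few years.

```python
# Hypothetical TCO comparison: keep five aging servers vs. consolidate
# onto one new server. Every number here is illustrative, not AMD data.
PUE = 1.5                    # assumed cooling/overhead multiplier
kwh_price = 0.10             # assumed electricity price, $/kWh
hours = 24 * 365

def annual_cost(n_servers, kw_each, opex_each):
    energy = n_servers * kw_each * hours * kwh_price * PUE
    return energy + n_servers * opex_each    # opex: space, maintenance, ...

old_fleet = annual_cost(5, kw_each=0.5, opex_each=1_000)
new_server = annual_cost(1, kw_each=0.8, opex_each=1_000)
print(f"Old fleet: ${old_fleet:,.0f}/yr vs new: ${new_server:,.0f}/yr")
print(f"Payback on a $15,000 server: {15_000 / (old_fleet - new_server):.1f} years")
```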

We do believe the market backdrop in 2024 will be much better. When you consider the potential refresh cycle, cloud and enterprise customers are weighing upgrades because they need more space and power and want data center operating costs to remain efficient.

So for our server business, the Genoa series currently offers the best total cost of ownership in the market, and the Turin series we will launch in the second half of the year will push further on performance per watt and performance per dollar. We believe the refresh gives us a real opportunity to gain more share. This is a very important year for us: last year, even as the server market declined significantly, our server revenue grew; this year we believe the market will turn, and we will certainly keep gaining share and growing faster than the server market.

Moore: Let me ask about Genoa and Turin. Genoa started relatively slowly: everyone faced issues with PCIe Gen 5 and DDR5, and budgets were reallocated elsewhere. But looking at the data, Genoa is now doing very well, and the same goes for Bergamo. Can you talk about what turned the situation around?

Jean Hu: Yes, compared with the previous generation, Genoa is a new socket, introducing new standards and interfaces, including DDR5 memory and the PCIe Gen 5 interface, so customers had to upgrade memory and interfaces. Once past that stage, Genoa became the fastest-ramping of our past four generations in both units and revenue, with significant growth. If you look at our third and fourth quarters last year, our server revenue in the second half was nearly 50% higher than in the first half.

That drove our revenue growth, with our exiting market share above 31%, the highest in our history. Most important, customers really like these parts. Among cloud customers our share is quite high, and we now see stronger momentum from second-tier cloud customers, where our share had been lower because they typically adopt a new generation one to two years later than the hyperscalers. The same goes for enterprises: Genoa's price-performance is exciting customers.

Moore: Can you talk about how Bergamo is developing? It is the processor with more cores, better suited to cloud-native applications, right? Is it a part of the server roadmap you will continue to emphasize?

Jean Hu: I think Bergamo has a unique feature that sets it apart from Genoa: it is a true cloud-native product. We see Meta adopting Bergamo across all its platforms, from Instagram and WhatsApp to Facebook.

If an application has a similar profile, we do see other cloud customers, especially second-tier cloud customers, really liking Bergamo, which can offer better price-performance for certain workloads. This is definitely our focus, and in fact quite important for the future.

Moore: We should send Mark Zuckerberg a card thanking him for openly discussing all the silicon he uses; it really is helpful. There is some debate about next-generation share. I recall you mentioned in October that Turin had begun sampling to cloud customers. I don't believe your competitor has started sampling yet, and Turin is socket-compatible while your competitor requires a platform change. If that environment holds, you are in quite an advantageous position on next-generation share, which seems very promising?

Jean Hu: Yes. Turin is not only socket-compatible; the overall cost improvement is also quite significant, and we will continue to increase core counts. Customer feedback so far has been very positive. So with Genoa still in its rapid-adoption phase, and Milan still selling because some customers genuinely need it, we can see Turin's adoption ahead helping us gain more share. On market share, we are confident in the total cost of ownership we can offer customers.

Moore: What are your thoughts on the AI PC in 2024? Over the long run there has been a lot of discussion around AI PCs. What has AMD achieved here, and what can these machines do?

Jean Hu: Yes, the PC market was full of challenges last year, with some improvement in the second half. What we are seeing now is more normalized inventory and sell-through better balanced with end-market demand. We do expect the second half of the year to show its typical seasonal strength.

I think for AMD the most important point is the AI PC. We believe AI PCs will drive the coming replacement cycles. Last year we were actually the first company to launch an AI PC, introducing the Ryzen 7040 series and selling millions of units. The point of the AI PC is that once an NPU is integrated into the PC, many AI applications can execute locally without going to the cloud, genuinely improving productivity and the user experience.
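As a sketch of what running locally rather than in the cloud looks like in practice: on Ryzen AI machines, inference is typically dispatched to the NPU through ONNX Runtime's Vitis AI execution provider. The model path and input shape below are placeholders, and the exact provider configuration depends on the installed Ryzen AI software stack:

```python
# Sketch: offloading an ONNX model to a Ryzen AI NPU via ONNX Runtime's
# Vitis AI execution provider (model path and input shape are placeholders).
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
)
inputs = {session.get_inputs()[0].name:
          np.zeros((1, 3, 224, 224), dtype=np.float32)}  # example image input
outputs = session.run(None, inputs)   # runs on the NPU where supported,
                                      # falling back to CPU otherwise
```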

AMD is enhancing the capabilities of its Hawk Point processors and plans to launch a product called Strix in the second half of the year. As more AI applications emerge, especially next year, we expect AI PC adoption to climb, which is very exciting. I believe this will not only help grow our market share but also improve gross margins and contribute to the operating margin of the entire client business.

Moore: Can you talk about the profitability of the PC market? You faced quite open promotional activity from Intel, which hurt you, though that now seems to be in the past. Do you expect that situation to recur? And how do you view the profitability of this segment?

Jean Hu: The profitability of the PC market, our client business, is crucial for AMD. Our approach through the downturn was to optimize investment, managing and reducing the client business's operating expenses, so in terms of investment level we are now very well positioned.

Once we can grow revenue, we believe it will flow through to profit. Gross margin took a hit in the downturn but has stabilized; typically in the PC market, once revenue is higher and channel inventory has been digested, gross margins stabilize and rise.

When you consider the client business as a whole, we see it as a business that should run a very high operating-margin model. Gross margin may never reach the company average, because the business is always somewhat consumer-driven, but operating margin should be much higher. So our focus is on raising the operating margin.

Moore: Xilinx's embedded business declined by about 40%, which is very severe by historical standards. Do you think this business may be nearing the bottom?

Jean Hu: Our view is that we are going through a bottoming process in the first half of this year. In the second half we expect a recovery, but a slower one, because some markets, such as communications, remain weak, not only because of inventory but because of capital spending and product cycles, with 5G in the later stages of its cycle.

The industrial market is likewise digesting inventory and demand weakness. The key for our embedded business is that Xilinx is a great franchise: even in this downturn we have continued to win designs, especially when combined with AMD's embedded processors, significantly expanding our design wins. Many of the revenue synergies we are driving will show up over the long term.

Q: When AMD cites a $400 billion market, is that customer-driven, with customers telling you how their demand will grow? Customer A wants 100,000 units, customer B wants 50,000; you aggregate them and multiply by an average selling price? Or is it supply-driven, on the basis that, say, NVIDIA might supply $200 billion worth? What is the process behind the estimate?

Jean Hu: Our approach to sizing the potential market is genuinely bottom-up. It is based on customer feedback about long-term demand and on our understanding, developed together with customers, of the chip content and memory content needed to meet that demand.

All forecasts involve a series of assumptions. Ours is built on unit volumes, content per unit, and our own estimates informed by customer feedback. Remember, what matters most for us is setting a trajectory, because in the high-tech field nothing is certain.

But if you know the market's trajectory, you can allocate internal resources to invest in those markets; that is our fundamental belief about how to capture the trajectory roughly right. Global units therefore include not only large cloud customers and enterprises but also sovereign buyers. In addition, as customer demand for compute grows, GPU technology, packaging, and high-bandwidth memory content will push average selling prices (ASPs) up. We also include customers' ASICs.

We have heard Google's TPU and other customers' internal chips discussed, and we have a quite significant forecast for those opportunities as well. So it is a bottom-up process, but the team has also checked what $400 billion means against global GDP and productivity gains: when we talk about AI, significant labor costs are saved, so what does that imply for labor spending? We have triangulated from multiple directions to make sure the trajectory we envision stays on course.
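A toy version of that bottom-up aggregation, with entirely made-up customer segments and numbers, simply to illustrate the units × content × ASP mechanics described above:

```python
# Toy bottom-up TAM model (all figures hypothetical, for illustration only).
# Each entry: (accelerator units forecast for 2027, blended ASP in dollars).
demand_2027 = {
    "hyperscaler_A": (4_000_000, 30_000),
    "hyperscaler_B": (3_000_000, 30_000),
    "enterprise":    (5_000_000, 20_000),
    "sovereign":     (1_000_000, 25_000),
    "custom_asics":  (6_000_000, 11_000),   # customers' internal chips
}
tam = sum(units * asp for units, asp in demand_2027.values())
print(f"Bottom-up TAM: ${tam / 1e9:.0f}B")   # ~$401B with these inputs
```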

Q: Perhaps you can talk about the prospects for artificial intelligence in China. There are concerns that some chips customized for China may not win approval, so I'm curious how you view this opportunity?

Jean Hu: The revenue we are discussing now, including 2024 revenue, largely comes from non-Chinese customers. We do ship the export-control-compliant MI210 to China, and we are working with customers to see whether we can offer MI300 derivatives to support Chinese clients.

Q: I have two questions. First, can you say how many MI300 units will be available next year? Second, could you elaborate on the competition? NVIDIA is talking about an annual release cadence, and they are set to introduce new technologies. How will you compete, and what is your strategy?

Jean Hu: Thank you for the question. I think competition is genuinely good for the market. Since we launched MI300, NVIDIA has kept accelerating its release cadence, and you should expect us to do the same.

If you look at AMD's history, going from MI100 to MI300 took about three to four years, across multiple products. We are leaders in chiplet technology, and we have a strong packaging partnership with TSMC. So we are confident in our roadmap, and we are working closely with customers.

The key to any new technology is not just introducing it but winning customer adoption; customers also need the resources to actually deploy it. So for us it is about working with customers to align our roadmap with their needs, and you should expect us to keep driving competition here. The success of our AI business is not just about 2024: we are focused on long-term success and a revenue trajectory built on design wins and customer engagement, and we are very pleased with where that stands. You should expect us to keep sharpening the competitiveness of our roadmap. We do believe AI, and data center GPUs in particular, will be AMD's biggest growth driver in the years ahead.