Wallstreetcn
2023.09.08 05:08

Apple's AI training budget runs to "millions of dollars per day," and its core language model team is led by a Chinese engineer.

Does Apple lack a strategy in generative AI and large language models? The reality may be otherwise.

Recently, Apple has faced headwinds. On one hand, Huawei launched the Mate 60 Pro ahead of schedule; on the other, the investment firm Needham stated that Apple lacks a strategy in generative AI and large language models (LLMs), putting it behind Amazon, Google, and Microsoft in the AI race.

However, the reality may not be as it seems.

According to media reports on Wednesday, Apple has been increasing its investment in AI, with model-training costs reaching millions of dollars per day.

Although it was only in July of this year that media reports revealed Apple had built a framework called Ajax for developing large language models and was quietly working on its own LLM, dubbed "Apple GPT," Apple had turned its attention to generative AI far earlier than the public imagined.

Four years ago, Apple's AI chief, John Giannandrea, authorized the formation of a team to develop conversational AI (i.e., large language models), demonstrating Apple's commitment to this field.

Several Apple employees have said that although Giannandrea expressed skepticism about the potential usefulness of chatbots driven by AI language models, Apple was not completely unprepared for the later explosion of interest in language models.

Millions of Dollars a Day on a 200-Billion-Parameter "Apple GPT," Led by a Chinese Team

It is reported that Apple's Foundational Models team, also known as the conversational AI team, is currently led by former Google engineer Pang Ruoming. Pang earned his bachelor's and master's degrees at Shanghai Jiao Tong University and joined Apple in 2021 after 15 years at Google.

The team currently has 16 members, some of them engineers who spent many years at Google. Although the team is small, training LLMs demands enormous computing power, costing millions of dollars per day.
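Where does a figure like that come from? A rough sketch using the widely cited 6·N·D rule of thumb for transformer training compute (N parameters, D training tokens) gives a sense of the scale. Every number below is an assumption for illustration, not a reported figure:

```python
# Back-of-the-envelope training compute via the common 6*N*D rule of
# thumb. All inputs are illustrative assumptions, not reported figures.
N = 200e9             # parameters (the reported Ajax GPT scale)
D = 2e12              # assumed number of training tokens
total_flops = 6 * N * D                  # ~2.4e24 FLOPs

chip_flops = 275e12   # assumed per-chip peak throughput (bf16)
utilization = 0.4     # assumed fraction of peak actually achieved
chip_hours = total_flops / (chip_flops * utilization) / 3600
print(f"~{chip_hours:,.0f} chip-hours")  # roughly 6 million chip-hours
```

Spread over a multi-week run on thousands of accelerators, plus the surrounding experimentation, a budget in the millions of dollars per day is plausible at typical cloud pricing.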

In addition, it has been reported that at least two other Apple teams are working on language and image models. One focuses on visual intelligence, developing software that can generate "images, videos, or 3D scenes"; the other is researching multimodal AI that can process text, images, and video.

Apple plans to integrate LLMs into its Siri voice assistant so that iPhone users can automate multi-step tasks with simple voice commands. For example, the technology would let a user tell Siri to create a GIF animation from their five most recent photos and send it to a friend; today, iPhone users have to complete that process manually.

This is similar to improvements Google is making to its own voice assistant. Apple believes its improved Ajax GPT language model outperforms OpenAI's GPT-3.5, and the model is expected to be released next year with the new version of the iPhone operating system.

The Famously Closed Apple Makes an Open-Source Move

It should be noted that developing an LLM is the relatively easy part; integrating it into products is harder. Unlike some cloud-first competitors, Apple prefers to run software on the device itself, for both privacy and performance. The problem is that Apple's LLMs, including Ajax GPT, are quite large: at more than 200 billion parameters, they are far too big to fit on an iPhone.
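A back-of-the-envelope memory calculation shows why. The bytes-per-parameter figure below is an assumption (half-precision weights); quantized formats would shrink it, but not by enough to close the gap:

```python
# Rough memory footprint of a 200-billion-parameter model.
# Assumes 2 bytes per parameter (bf16/fp16 weights); illustrative only.
params = 200e9
bytes_per_param = 2
gib = params * bytes_per_param / 2**30
print(f"~{gib:.0f} GiB for the weights alone")  # ~373 GiB
```

Hundreds of gibibytes of weights against the handful of gigabytes of RAM in an iPhone leaves no realistic way to run the full model on the device.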

There is precedent for scaling large models down: Google's PaLM 2, for example, comes in several sizes, including smaller variants suited to on-device and standalone use.

Some analysts believe that although Apple's plans are still unclear, it may opt for a smaller LLM for privacy reasons.

This brings us to Pang Ruoming.

According to people familiar with Pang Ruoming, his research on neural networks has attracted a large following. Neural networks are a branch of machine learning in which software is trained to recognize patterns and relationships in data, loosely inspired by how the human brain works. Some of Pang's notable research concerns how neural networks can work with mobile processors and how to train neural networks with parallel computing, the practice of breaking a large problem into smaller tasks that many processors compute simultaneously.
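To make the parallel-training idea concrete, here is a minimal data-parallel sketch in JAX (chosen because Apple's tooling builds on JAX, as discussed below). This is not Apple's code; the toy model and all names are hypothetical. Each device computes gradients on its own shard of the batch, and the gradients are averaged across devices before the weight update:

```python
import jax
import jax.numpy as jnp

# Toy model: linear regression with a mean-squared-error loss.
def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

# One data-parallel training step: compute local gradients, then
# average them across all devices with an all-reduce (pmean).
def step(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    grads = jax.lax.pmean(grads, axis_name="devices")
    return w - 0.1 * grads

# pmap runs `step` on every device in parallel, mapping over the
# leading axis of each argument.
parallel_step = jax.pmap(step, axis_name="devices")

n = jax.local_device_count()
w = jnp.stack([jnp.zeros(4)] * n)  # weights replicated per device
x = jnp.ones((n, 8, 4))            # one shard of 8 examples per device
y = jnp.ones((n, 8))
w = parallel_step(w, x, y)         # identical updated weights on each device
```

On real hardware the same pattern scales from a handful of GPUs to entire TPU pods; frameworks built on JAX automate this sharding and replication at much larger scales.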

Pang Ruoming's influence at Apple can be seen in AXLearn, the internal software his team has developed over the past year to train Ajax GPT. AXLearn is a framework designed for fast training of machine learning models. Part of it is based on Pang's research, and it is optimized for Google Cloud Tensor Processing Units (TPUs).

AXLearn builds on JAX, an open-source framework created by Google researchers. If Apple's Ajax GPT were a house, AXLearn would be the blueprint, and JAX the pen and paper used to draw it. The data Apple uses to train the large language model has not been made public.
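To make the analogy concrete, JAX's "pen and paper" consists of composable function transformations, such as automatic differentiation and XLA compilation, that higher-level frameworks like AXLearn assemble into training pipelines. A minimal illustration (the function itself is an arbitrary example):

```python
import jax
import jax.numpy as jnp

# JAX's core idea: write a pure numerical function, then transform it.
def f(x):
    return jnp.sum(jnp.tanh(x) ** 2)

df = jax.grad(f)     # automatic differentiation
f_fast = jax.jit(f)  # XLA compilation for CPU, GPU, or TPU

x = jnp.arange(3.0)
print(f_fast(x), df(x))
```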

Reportedly, in July of this year Apple's Foundational Models team quietly uploaded the AXLearn code to GitHub, letting the public use it to train their own large language models without building everything from scratch. Apple's reasons for releasing the code are unclear, though companies typically do so in the hope that outside engineers will improve it. Opening source code to anyone, including commercial users, is an unusual step for the traditionally secretive Apple.

Aggressively Poaching from Google and Meta

Apple is also actively "poaching" from Google and Meta's AI teams.

Since the AXLearn code went up on GitHub in July, 18 people have contributed improvements to it; at least 12 of them joined Apple's machine learning team within the past two years, and seven of those previously worked at Google or Meta.

Wallstreetcn previously noted that the "Android of large models" is also facing trouble, with internal power struggles at Meta and half of the Llama core team having left.