Track Hyper | Edge AI Model Deployment: How does Apple do it?

Wallstreetcn
2024.04.30 10:11

A crack has opened in Apple's closed business empire

Author: Zhou Yuan / Wall Street News

AI has become the new technological focal point leading the "rebirth" of a smartphone industry starved of innovation.

Apple, the first company to ship an AI voice assistant ("Siri") on a device, has changed its attitude toward AI since the start of 2024, abandoning the deliberate distance it kept from the term over the past two years and showing far more open interest.

Recently, in its press release for the new MacBook Air, Apple explicitly called it the "world's best consumer laptop for AI," phrasing that would have been very rare from the company over the past two years, when it seemed to avoid the term "AI" altogether and usually said ML (machine learning) instead.

Unlike many of its Chinese counterparts, Apple promotes the rollout of device-side AI technology through a "papers-first" approach.

In March, Apple's Siri team published a paper titled "A Multimodal Approach to Device-Directed Speech Detection Using Large Language Models." It builds on the 2023 simplification of the wake phrase from "Hey Siri" to just "Siri" and goes a step further, exploring how to drop the wake word entirely so that conversation between people and their iPhones becomes seamless.

Considering that Siri launched back in 2011, this was only a small step in Apple's push to bring AI onto the device.

Apple's device-side AI strategy and its achievements were put on real display on April 24th, when the company introduced OpenELM: a new series of open-source large language models (LLMs) that can run text-generation tasks entirely on a single device, with no cloud servers required.
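
The checkpoints are public on Hugging Face, so this is easy to verify firsthand. Below is a minimal sketch of text generation with the smallest model via the transformers library; it assumes transformers and torch are installed, and follows the model card's suggestion to use the Llama 2 tokenizer (a gated repository, so access must be requested first).

```python
# Minimal sketch: on-device text generation with the smallest OpenELM
# checkpoint. Assumes the `transformers` and `torch` packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M"  # smallest of the eight checkpoints

# OpenELM ships custom modeling code, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Per the model card, OpenELM reuses the Llama 2 tokenizer (gated repo).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Edge AI means", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```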

In other words, OpenELM is exactly the kind of on-device large-model deployment that Chinese smartphone makers keep talking about when they promote "AI phones." Related news has been coming out continuously, mostly concerning iOS 18, due at Apple's WWDC (Worldwide Developers Conference) in June, and which device-side AI features will be embedded in it.

In 2024, Apple has truly begun to execute its device-side AI strategy. Although Apple has never described it this way, in practical terms the company has gone "all in" on AI.

Following Microsoft in Slimming Down AI Models

Apple defined what a modern smartphone is and created the mobile internet industry; its software layer receives less attention than its hardware, but is just as important to its technological iteration.

On April 24th, Apple released the OpenELM (Open-source Efficient Language Models) series on Hugging Face, the world's largest AI open-source community. This is Apple's most important move in the AI field in the past year. OpenELM comprises eight models in total: four pre-trained and four instruction-tuned, with parameter counts of 270 million (0.27B), 450 million (0.45B), 1.1 billion (1.1B), and 3 billion (3B).

The parameter count refers to the number of artificial neural connections (weights) in a large language model (LLM). Generally, the more parameters, the stronger the performance and the broader the capabilities.
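
As a concrete illustration (not Apple's code), the parameter count of any PyTorch model is simply the total number of trainable weights, which is what labels like "0.27B" and "3B" refer to:

```python
# Illustrative only: counting the trainable parameters of a PyTorch model.
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of trainable weights (connections) in the network."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# A tiny toy network, just to show the mechanics.
toy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))
print(f"{count_parameters(toy):,} parameters")  # 65,920 for this toy net
```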

From the parameter sizes alone, it is easy to see that OpenELM is designed for edge AI.

What is pre-training? It is the stage that teaches an LLM to generate coherent text through next-token prediction; instruction tuning then makes the LLM produce outputs that are more relevant to specific user requests.
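
A rough illustration of the difference: a base model is prompted with raw text to continue, while an instruction-tuned model expects a request/response structure. The template below is a generic Alpaca-style layout chosen for illustration, not the format of any specific Apple model.

```python
# Illustrative only: how the two model flavors are typically prompted.

def as_completion_prompt(text: str) -> str:
    """A pre-trained (base) model simply continues whatever text it is given."""
    return text

def as_instruction_prompt(instruction: str) -> str:
    """An instruction-tuned model is trained on request/response pairs."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

print(as_completion_prompt("The capital of France is"))      # model appends " Paris..."
print(as_instruction_prompt("Name the capital of France."))  # model answers directly
```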

According to the paper published by Apple's AI team, OpenELM's benchmark results were produced on a workstation with an Intel i9-13900KF CPU and an NVIDIA RTX 4090 GPU running Ubuntu 22.04; Apple also ran benchmarks on a MacBook Pro with an M2 Max chip and 64GB of RAM running macOS 14.4.1.

The core advantage of OpenELM lies in its layer-wise scaling strategy, which improves accuracy by allocating parameters unevenly across the layers of the Transformer instead of giving every layer the same width.
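
The idea, sketched below under assumed values (the alpha/beta ranges here are illustrative, not the paper's exact settings), is that the attention-head count and feed-forward width grow linearly from the first Transformer layer to the last rather than staying uniform:

```python
# Sketch of layer-wise scaling: head counts and FFN widths vary per layer.

def layerwise_config(num_layers: int, d_model: int, d_head: int,
                     alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return per-layer widths that interpolate between the given ranges."""
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1)                   # 0.0 at first layer, 1.0 at last
        a = alpha[0] + (alpha[1] - alpha[0]) * t   # attention scaler
        b = beta[0] + (beta[1] - beta[0]) * t      # FFN scaler
        configs.append({
            "layer": i,
            "n_heads": max(1, round(a * d_model / d_head)),
            "ffn_dim": round(b * d_model),
        })
    return configs

for cfg in layerwise_config(num_layers=4, d_model=1024, d_head=64):
    print(cfg)  # early layers are narrower, later layers wider
```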

According to recently published test statistics, the model scored 84.9% on the 10-shot ARC-C benchmark, 68.8% on 5-shot MMLU, and 76.7% on 5-shot HellaSwag.

This is not Apple's first move in AI software.

In October 2023, Apple quietly released Ferret, an open-source language model with multimodal capabilities. Compared with the OpenELM release of this April 24th, Ferret's technical framework is relatively more complete, covering data processing, model building, training/fine-tuning, and optimization.

Whether by coincidence or not, on April 23rd Microsoft released Phi-3 Mini, a model that can likewise run entirely on a smartphone (an iPhone 15 Pro): at 3.8 billion parameters (3.8B), its performance rivals models such as Mixtral 8x7B and GPT-3.5.

More importantly, both Phi-3 Mini and OpenELM can run entirely at the edge, on smart terminals, with no internet connection required.

This indicates that Apple is officially beginning to deploy LLMs at the edge, and its smallest model has only 0.27B parameters, less than one-tenth the size of typical Chinese edge LLMs. In China, getting an LLM to run locally on a terminal usually means raising its compression ratio to "squeeze" it into limited memory (12GB-24GB). Apple instead cut the parameter count directly, yet training and inference accuracy did not drop accordingly.
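
The arithmetic behind that trade-off is simple. The sketch below (illustrative, using common precisions) estimates the memory that model weights alone occupy; it shows why a 7B model needs aggressive quantization to fit on a phone, while a 0.27B model fits comfortably even uncompressed:

```python
# Back-of-the-envelope memory footprint of LLM weights at various precisions.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """GB occupied by the weights alone (ignores activations and KV cache)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1024**3

for params in (0.27, 3.0, 7.0):
    for bits in (16, 8, 4):
        print(f"{params:>4}B @ {bits:>2}-bit: {weight_memory_gb(params, bits):5.2f} GB")
# A 7B model needs ~13 GB at 16-bit precision; OpenELM's 0.27B needs ~0.5 GB.
```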

Although Apple introduced MM1, a multimodal large language model with up to 30B parameters, back in March, the LLM frameworks it has open-sourced make clear that Apple is vigorously pushing a "slimming plan" for LLMs.

An Unprecedented Move with Unclear Intentions

Clearly, Apple has been pushing AI technology onto terminals since October 2023, with the aim of "running artificial intelligence locally on Apple devices." The paper it published in January this year, "LLM in a flash: Efficient Large Language Model Inference with Limited Memory," demonstrates that effort even more plainly.

Through OpenELM, Apple has shown its technical framework and goals in AI: first, OpenELM is designed for terminal devices and can enhance Apple's multi-device experience (so far demonstrated on its laptops); second, it balances performance and efficiency at small LLM scales; third, it is open source.

Nevertheless, it is currently unclear whether Apple's self-developed LLMs, or parts of their technical framework, will be integrated into iOS 18, set to be unveiled at WWDC 24 in June. Apple is still in contact with Google and OpenAI, and it cannot be ruled out that these competitors' AI technology will be built into iOS instead.

It is hard for outsiders to know the substance of Apple's talks with Google and OpenAI, or which company Apple will ultimately strike a commercial AI partnership with. Beyond those two well-known companies, Apple is also in contact with the AI startup Anthropic.

Pursuing technical cooperation with partners would speed Apple's entry into the chatbot field (the Google talks center on the Gemini chatbot) while also spreading risk: by outsourcing generative AI features to another company, Tim Cook may shed some of the responsibilities borne by Apple's platform.

In fact, the open-source OpenELM models draw attention not only because they are "efficient language models" launched by Apple, but also because their reduced parameter sizes allow local deployment on smart terminals without any cloud connection.

Is this preparing for AI phones?

AI phones are billed as a major technological revolution within the Chinese industry, but so far users barely perceive the difference; they seem no different from "traditional" smartphones. Apple's standing in the smartphone industry goes without saying, so what exactly will its edge AI look like? What technical framework will it use? What kind of astonishing AI experience can it deliver? That is what the industry is waiting to see.

It is worth mentioning that at Apple's 2024 shareholders meeting, Cook stated that Apple will make "significant progress" in generative AI this year. Moreover, Apple traditionally built its business empire on a closed ecosystem of tightly integrated software and hardware, yet this time it has chosen to open-source an edge AI technology framework, an unprecedented change.

What does this change really mean? We will likely have to wait until WWDC 24 to find out.