Kuaishou joins the battle over large model applications

Flexing muscles

Author | Liu Baodan

Editor | Zhou Zhiyu

Just over a month ago, ByteDance released its Doubao large model family, pushing large model pricing straight into the "sub-cent era." Now Kuaishou has brought out its own prized large model matrix, aimed at tool-style applications.

On July 6, at the "New AI·New Applications·New Ecosystem" forum, Kuaishou's large models made their first collective appearance, and the company officially released multiple new features for products including its video generation and image generation large models.

Gai Kun, Senior Vice President of Kuaishou and head of its Main Station Business and Community Science Line, said that Kuaishou has built a large model matrix centered on its large language model, recommendation large model, and visual generation large models. The matrix covers content understanding, distribution, and generation, and deeply serves Kuaishou's commercial ecosystem.

For the AI era, Kuaishou has rolled out a large model matrix that pairs self-developed models with applications, focused on improving the efficiency of its own businesses and driving performance growth. Kuaishou has finally made its move in the large model market.

Debut

After much anticipation, Kuaishou's large models have finally made their debut.

At the event, Kuaishou announced the official launch of the web version of Ke Ling AI, its self-developed visual generation large model product, so ordinary users can now log in to the website and try its features. This is the third major upgrade Ke Ling AI has received within a month.

The base model of Ke Ling AI has also been upgraded, adding a sharper high-definition version and new editing capabilities such as start and end frame control and camera control. At the same time, the maximum length of a video that creators can generate in one pass has been raised to 10 seconds, the longest duration available to users in the industry.

According to the company's introduction, Ke Ling's generated videos follow real-world physical rules and show movie-level image quality and dynamic effects. They can even simulate large-scale physical motion, breaking the limitations of traditional video generation technologies and earning praise both domestically and internationally.

Wan Pengfei, Head of Kuaishou's Visual Generation and Interaction Center, said that the latest release of the Ke Ling AI large model brings further upgrades in seven areas: motion generation, generation duration, physical laws, video quality, instruction following, image-to-video, and video controllability. It can generate higher-definition, more controllable videos of 10 seconds or longer in a single pass.

Since the official release of its text-to-video function on June 6, Ke Ling has developed rapidly. During CVPR (the IEEE Conference on Computer Vision and Pattern Recognition), it added multiple new features, including image-to-video and video continuation.

At this forum, Kuaishou also officially announced that Ke Tu will be open-sourced.

Gai Kun explained that the Ke Tu large model draws on Kuaishou's deep accumulation in large language models. Trained on a Chinese-language corpus numbering in the billions, it has become the text-to-image model with the best understanding of Chinese. Its overall performance surpasses open-source models such as SDXL and SD3 and closed-source models such as Midjourney, setting a new benchmark for image generation in the Chinese context.

Regarding the open-source initiative, Gai Kun stated that this move aims to stimulate industry vitality and build a more prosperous text-to-image large model community ecosystem.

Prospects

From the beginning, Kuaishou has been clear that the core goal of its large models is to serve the scenarios and commercialization within the Kuaishou ecosystem.

This is mainly reflected in two areas. In content production, Kuaishou aims to build "a new generation of AIGC creation and material tools" and offer a low-threshold, intelligent content production experience. In content consumption, Kuaishou plans to upgrade its content understanding and distribution systems to improve the user experience.

The former mainly serves Kuaishou's commercial efficiency. It is understood that a video script generation tool built on the Kuaishou large model, combined with digital human technology, helps Kuaishou's advertisers generate videos and livestream content at low cost and improves lead conversion efficiency.

Kuaishou data shows that peak single-day consumption of AIGC marketing materials passed the 20 million mark in June this year, demonstrating the huge potential of large models in commercial scenarios.

Liu Xiaotou, who is responsible for Kuaishou's external-circulation commercial business and AI commercial products, revealed that over the past six months nearly 20,000 merchants have used large model capabilities to run intelligent operations on the Kuaishou platform. Compared with January this year, the number of active AIGC customers in June increased by 8 times, monthly GMV increased by 64 times, and the platform's AIGC advertising revenue increased by 12 times.

In content production, the larger market opportunity lies with consumer (C-end) users and related industries, including short dramas.

According to Gai Kun, more than 500,000 users have so far applied for test access to Ke Ling, and 7 million videos have been generated. User creations made with Ke Ling, such as "resurrected old photos," have spread widely across the internet because of their emotional impact.

At the annual results conference in March, Kuaishou's founder and CEO Cheng Yixiao said that since the company launched its AI strategy in 2023, it has been steadily advancing the research and training of its self-developed large models. For text-to-video, Kuaishou started a dedicated research and development effort at the end of last year.

"This is a huge opportunity for the short video ecosystem. In the future, Kuaishou will integrate generation models with production tools to keep lowering the barrier to creation for creators and to improve the quality and efficiency of short video production," Cheng Yixiao emphasized.

On the industry side, Kuaishou's large models are already being used in short drama production. "Shanhai Qijing: Pibo Zhanlang," the first original AIGC fantasy short drama in China, was made with deep technical support from Ke Ling. Its trailer has been released, and the series will launch soon.

In response, Zhang Di, Vice President of Kuaishou and head of the large model team, said, "Half a year ago, perhaps no one could have imagined using AIGC to produce films. (Now it's) here!" In his view, AI technology can significantly improve the efficiency of short drama production and operation.

In content consumption, the SIM large model plays a bigger role in recommendations. According to Gai Kun, with a scale of one trillion parameters, this model has become one of the world's leading recommendation systems. Its next-generation architecture, ACT, is expected to add 400 million minutes of user viewing time to the Kuaishou App daily, significantly enhancing user stickiness and activity.

Kuaishou is very confident about the future. Gai Kun stated that Kuaishou will continue to increase its investment in AI and vigorously promote technological innovation.