In the blink of an eye, it's been two years since ChatGPT sparked the wave of large AI models, but the huge business opportunities that many people had been eagerly awaiting have yet to materialize.
Unlike last year's widespread acclaim, large-model entrepreneurship has now entered a turbulent phase. Since the second half of this year, there have been frequent reports of key technical talent leaving major Chinese AI companies, signaling a period of instability.
First, Huang Wenhao, co-founder of Zero One Technology, resigned. Then, Zhou Chang, the technical lead of Alibaba's Tongyi Qianwen large model, shifted to ByteDance. Other departures include Liu Wei, the technical lead for Tencent's Hunyuan model, and Yan Shuicheng, an AI expert from Kunlun Wanwei.
The movement of talent is a barometer of the AI industry’s development. Behind these departures lie multiple challenges faced by large AI models, such as the slowing pace of technological iteration and less-than-ideal commercialization prospects. Everyone is actively adjusting, looking for what they believe to be the right path or direction.
The large AI model industry, currently unsure of its direction, is undergoing a restructuring of technology, capital, and talent. A quiet reshuffle is under way, and industry consolidation will become increasingly visible.
In fact, this kind of story plays out in every wave of technological advancement. The industry consensus is that, after intense competition, only a handful of large-model companies will remain as key players, and only the people deeply involved in building those eventual unicorns will be the final winners.
This is a brutal race with unpredictable outcomes. Being part of it means giving it your all.
Turbulence
In the competition for cutting-edge technology, talent is arguably the most important competitive edge. In the rapidly evolving large AI model space, talent determines whether the underlying technology and products can keep up and, ultimately, whether a company can join the industry's top tier.
Several investors told Wall Street Insights that in this wave of AI large models, the most important factor they consider when evaluating investment projects is still the talent team, as this determines whether the company can sustain its technological iteration capabilities.
However, whether at major corporations or AI startups, the talent that flocked to these companies during the initial hype is now making new choices, either voluntarily or involuntarily, due to the challenges posed by reality.
Wall Street Insights has confirmed that Liu Wei, a distinguished scientist at Tencent and one of the technical leads for the Hunyuan large model, recently left Tencent. There are reports indicating that Liu Wei has started a new venture in Singapore, focusing on the video generation field.
Recently, Kunlun Wanwei also announced that Yan Shuicheng will no longer serve as the head of the 2050 Global Research Institute, but will instead become the institute’s honorary advisor. As an expert in computer vision and machine learning, Yan Shuicheng had joined Kunlun Wanwei in September of the previous year, where he helped build the 2050 Global Research Institute from scratch and led deep research on next-generation model architectures and Agents.
Amid this talent turmoil, more people are choosing to leave AI startups for larger companies, or move from one major corporation to another.
As things stand, ByteDance, which has been actively preparing its large model research institute, seems to be the biggest winner in this wave of talent movement.
Qin Yujia, whose departure from Mobi Intelligence had been reported, joined ByteDance’s large model research institute in the second half of 2024. In August of this year, Huang Wenhao, co-founder of Zero One Technology, joined ByteDance’s model algorithm team, Seed, reporting to ByteDance’s large model head, Zhu Wenjia. In October, Zhou Chang, the technical lead for Alibaba’s Tongyi Qianwen large model, was also reported to have joined ByteDance.
It is worth mentioning that Zhou Chang’s departure even led to a legal dispute. On November 13, reports stated that Alibaba had filed for labor dispute arbitration, alleging that Zhou Chang had violated a non-compete agreement.
Not long ago, Yang Zhilin, the founder of Moon's Dark Side, commented on the phenomenon of talent returning to larger companies, stating that this is normal. “The industry has entered a new phase. Initially, many companies were involved, but now, fewer companies are working on it. Going forward, what everyone is working on will gradually become different. I think this is an inevitable trend.”
Training large models requires substantial investment, and even major companies must make trade-offs. At the beginning of the year, the release of the “text-to-video” model Sora sparked a global competition in AI video generation. However, OpenAI announced that the update for Sora would be delayed due to a shortage of computing power, meaning it has yet to be made publicly available.
Without clear application scenarios and commercial returns, "Sora-like" video generation models will evidently not become a major focus for Tencent. In this context, Liu Wei, who wanted to make strides in video generation, naturally sought new opportunities.
A domestic investment leader told Wall Street Insights that talent movements have also been frequent overseas this year, mainly because of the short-term technical bottlenecks and slow commercialization of large AI models. In the future, many AI startups in China will face the risk of running out of funding and may be acquired by larger companies.
Additionally, Shen Meng, executive director of Changchun Capital, told Wall Street Insights that behind the frequent talent flow is the lack of deep R&D innovation in domestic large models, which has lowered the barriers to movement between teams. It also reflects the industry's restlessness and the bubble in the number of models.
The Future
Artificial intelligence, as a discipline, has been around for more than 60 years and has gone through multiple waves of technological revolution. At the start of some earlier waves, such as the AI wave of 2016, the enthusiasm was as intense as it is for large AI models today. Tech companies back then also used every tactic to recruit top AI talent.
However, capital's patience rarely lasts long enough to sustain AI scientists' research. When AI technology fails to deliver commercial returns, internet giants and AI startups alike return to rationality and reassess the "value" of AI talent, which accelerates talent turnover.
History tends to repeat itself. After a period of hype, the AI large model industry will eventually enter a phase of elimination.
In November 2022, ChatGPT triggered a global wave of large AI models, sparking a “battle of models” in China. Startups sprang up like mushrooms after the rain, and internet giants rushed in with slogans like “All in AI.”
However, after a year or two of exploration, more and more companies have realized that only a few lucky survivors will make it through to the end.
Baidu's founder, Li Yanhong, has candidly pointed out that, as in many previous technological waves, a bubble in generative AI is inevitable: after the initial excitement, disappointment sets in when the technology fails to meet the high expectations it created.
Li Yanhong predicted that, during the AI bubble burst phase, pseudo-innovations that cannot meet market demands will be eliminated. Afterward, 1% of the companies will emerge and continue to grow, creating enormous value for society. “Right now, we’re just experiencing this stage. The industry is calmer and healthier than last year.”
All large AI model teams now stand at a crossroads.
The most noticeable event in the domestic large model industry this year was the wave of price cuts in the first half of the year. Zhang Peng, CEO of Zhipu, believes the price war arose because companies cannot find differentiated value and are therefore forced to compete on price.
Zhang Peng revealed that many industry-leading companies that had been developing their own large models have recently changed direction. They realized that it is not as simple as assembling a team and getting an open-source model to run, and that purchasing models is the better option.
Additionally, in early October, there were reports that several of the "AI Six Tigers" (Zhipu AI, Zero One Technology, MiniMax, Baichuan Intelligence, Moon's Dark Side, and Jiyue Xingchen) had decided to gradually give up model pre-training, shrink their pre-training algorithm teams, and shift their focus to AI applications.
Yang Zhilin believes that pre-training still has half a generation, or even a full generation, of headroom, and that this headroom will be unlocked next year, when the leading models push pre-training to a more advanced stage. The most important focus now is reinforcement learning, where scaling continues, just in a different way.
At the same time, Moon's Dark Side has also chosen to streamline its business, focusing on perfecting one product. Yang Zhilin revealed that Moon’s Dark Side will decide which business has the highest potential for growth based on the situation in the U.S. market. The focus will be on the most promising projects, particularly those most aligned with the mission of AGI.
The ultimate goal of AI research is to achieve Artificial General Intelligence (AGI).
“Rome has always existed, but the path to it is different,” said Kang Zhanhui, director of Tencent's Machine Learning Platform, recently. “Everyone is thinking about AGI, and the next two or three years will be a good time to plan, but the path each company takes may differ. For instance, Tencent has chosen to pursue a mixture-of-experts (MoE) model.”
However, no matter what path is chosen, all AI pioneers face the common challenge of high computational costs without a commercial return model to offset these costs in the short term.
Shen Meng, executive director of Changchun Capital, told Wall Street Insights that large models will enter a painful phase of survival of the fittest, in which technologies and products that push toward foundational core technology will have a greater chance of gaining market recognition.
This is a once-in-a-century technological revolution, but without sufficiently mature technology and reliable business models, large AI models are likely to follow the path of VR and the Metaverse in recent years: after the initial hype, they fizzle out and leave little more than disappointment behind.
Now the elimination phase has begun. Before the “iPhone moment of AI” arrives, all companies will need to show tremendous patience and keen insight to face the brutal challenge ahead.