"I'm excited about the huge productivity that is going to be created, but worried about the impact it will have on the world," he says. "Lao Wang and I have been on the startup road for nearly 20 years, and since he is determined to embrace this big wave, I must support him."
Since ChatGPT shot to fame, it has not only caused an uproar in the tech community but also ignited entrepreneurial passion in the venture capital community. Competition among domestic companies over large-scale pre-trained models (hereinafter "big models") has also reached a white-hot stage. First, giants such as Baidu and ByteDance fought for dominance; then a hundred schools of thought began to contend, each drawing on its own application scenarios and data advantages.
Next, Data Ape takes a look at the many players who have recently jumped into the big model space.
The racetrack stays, the players come and go
The big model track is truly "hot", and many big names have entered the race. There is no doubt that this wave of AI entrepreneurship is a race among elites. The "hero post" (a public call to arms) by Wang Huiwen, the former co-founder of Meituan, raised the curtain on China's ChatGPT-style startups.
1. Wang Huiwen, former co-founder of Meituan: "AI hero post"
After leaving Meituan, Wang Huiwen set his mind on starting another venture. Having considered crypto and Web3, he finally decided to go into ChatGPT-style AI. "Even if there is only one person, I will go," Wang declared on social media on Feb 13, 2023. It was the signal that Wang Huiwen was setting out to build an OpenAI of his own. He then published a high-profile "AI hero post" on the Jike platform, inviting AI research and development talent to build China's OpenAI, and established Beijing Light Year Beyond Technology Co., Ltd. Wang said that he is personally investing US$50 million and takes no shares for himself: the capital accounts for 25% of the shares, and the remaining 75% is reserved for recruiting top research and development talent. With just one idea and one company, and before the venture has even taken shape, the next round has already attracted US$230 million in subscriptions.
2. Mobvoi founder Li Zhifei: Build China's OpenAI
It is understood that when Wang released the "AI hero post", Li Zhifei, the founder and CEO of Mobvoi (Chumenwenwen, literally "go out and ask"), was sitting across from him. Li Zhifei said he would definitely take part. Li, a former scientist at Google headquarters, has nearly 20 years of research and industry experience in speech and semantic AI technology. In 2012 he returned to China and founded the artificial intelligence company Mobvoi, whose product was one of the few search products other than Baidu and Sogou. The company's valuation once reached US$1 billion. In 2020, Li Zhifei led a team that trained UCLAL, a Chinese-language large model modeled on GPT-3. There is no doubt that the big model is his main battleground.
Li Zhifei said, "We see the promise of a universal cognitive model. For the problems of language, knowledge, logic and planning, we can now see the possibility of a solution." He also drew attention in tech circles recently by announcing that he would start a business in the field of large models and build China's OpenAI.
3. Li Mu and Alex Smola leave Amazon: mentor and student start a business together again
On March 7, news that Li Mu, widely regarded as an AI guru, appeared to have joined the big model startup race swept across social networks. His startup project is called Boson.ai. According to its official website, the venture is related to applications of large models. He co-founded it with his mentor, another former Amazon AI guru, Alex Smola. According to Smola's LinkedIn profile, the former Amazon VP and distinguished scientist is the new company's CEO. His LinkedIn page reads: "We're working on something big. If you are interested in the scalable base model, please contact me." Notably, on the company's GitHub page, Li Mu, Amazon's chief scientist, has also contributed code.
4. Zhou Bowen, former head of JD AI: Big models are not exclusive to the big tech companies
On March 1, Xianyuan Technology, a technology service company focused on enterprise innovation and digital intelligence, announced that it had completed an angel round of hundreds of millions of yuan. The round was led by Qiming Venture Capital, with Jingwei Venture Capital co-investing. Zhou Bowen, the founder of Xianyuan Technology, has a lot going for him. He has more than 20 years of research experience in natural language generation, dialogue and interactive artificial intelligence. The "natural language representation mechanism of self-attention integration of multiple mechanisms" that he proposed in 2016 is one of the core ideas of the Transformer architecture. Before founding Xianyuan Technology, Zhou Bowen was senior vice president of JD Group, chairman of the group's technical committee, president of Cloud and AI, and director of the JD Artificial Intelligence Research Institute; put simply, he was the "head" of JD AI. Prior to JD.com, Zhou served as director of the Basic Research Institute for Artificial Intelligence at IBM Research's U.S. headquarters.
5. Wang Xiaochuan, former CEO of Sogou: OpenAI's success is, above all, a triumph of technological idealism
Last month, Wang Xiaochuan, the former CEO of Sogou, published a WeChat post hinting that he would join the race to build "China's OpenAI". Wang won a gold medal at the International Olympiad in Informatics in 1996 and entered Tsinghua University. After graduating, he joined Sohu, led the launch of Sogou Search in 2004 and became China's youngest Internet executive at the age of 27. His technical talent gradually translated into real technical strength: the success of the Sogou input method was not only significant for Sogou, it also opened many doors for Chinese-language AI as one of the first Chinese input methods. Wang uses the term "technological idealism" to define startups like OpenAI: not just an entrepreneurial project, but a thoroughgoing experiment in technological ideals.
In June 2022, Wang Xiaochuan set up an artificial intelligence technology company, Beijing Baifang Zhongzhi Information Technology Partnership, and held 80% of the shares. It is understood that when the media asked Wang Xiaochuan to confirm whether he would return to entrepreneurship and make large AI models, Wang Xiaochuan admitted that he was in "fast preparation".
6. Li Yan, former core AI figure at Kuaishou: starting his own business and joining the big model race
Li Yan is an early Kuaishou employee, with an employee number of around 75, and a core figure in the research and development of Kuaishou's AI technology. In November 2015, with the support of Su Hua, then CEO of Kuaishou, Li Yan set up the company's first in-house deep learning team, the DL (Deep Learning) group, with the goal of building an algorithmic model to identify illegal video content.
Li resigned in 2021 and set up an AI company called Yuanshi Technology in the second half of 2022, focusing on the research and development of large multimodal models. As early as 2018, Li Yan had publicly stressed the importance of multimodal technology.
The ChatGPT fire has spread to the entire venture capital world. So what is it about the big model that makes it such a hot startup track and attracts so many big names? What role does the big model track play in AI? We can read the rest of the analysis with these questions in mind.
Giant or startup?
With the popularity of ChatGPT, developed by OpenAI, giants at home and abroad are setting their sights on the track. Google, Baidu, JD.com, iFlytek and other Internet companies have said that they already have plans in place for ChatGPT-like technology and have related products to launch.
Now the big model race is crowded with players of every kind: tycoons and industry heavyweights, overseas returnees and big-tech executives, small startups that have pivoted, professors, and bystanders who are just passing through. The battle for the top spot is well and truly under way, but the winner is yet to be decided.
At present, Google, Microsoft, Amazon, Baidu, Alibaba, Tencent and other technology giants have significant advantages in the development of large models. They all have strong technical resources and capabilities, and have already positioned themselves and invested in general-purpose large models.
Is it easy for startups to take a slice from the big players?
In general, the dividing line in the big model race comes down to four aspects: technology research and development, data and algorithm resources, commercialization ability, and talent reserves and management ability.
First, in terms of technological research and development capabilities.
As far as we can see, the giants have stronger technology research and development ability and richer resources, and can invest more human, material and financial resources to carry out the research and development of large models. They have better data, algorithms, hardware and other technical support, and can more quickly advance the research and application of large models.
The technological gap between giants and startups depends in part on who moved first in AI and who arrived late. Giant companies such as Google, Facebook and Microsoft have a great deal of data, computing resources and technical experience in artificial intelligence, so they are better able to train and optimize large models and to advance the technology.
However, startups usually lack these resources, need to make more efforts in technology research and development, and face greater technical difficulties and uncertainties.
Secondly, in terms of data and algorithm resources.
The giants have richer and more sophisticated data and algorithm resources to support the training and inference of large models. They are able to use their platform and business advantages to accumulate huge amounts of data, and to develop and optimize algorithms on that basis. Startups, however, usually cannot obtain such data and algorithm resources, so they need to accumulate data and optimize algorithms through their own efforts, which requires more time and energy.
Take OpenAI's GPT-3, released in 2020, as an example: it is a large model with 175 billion parameters. In terms of computing power, training and running artificial intelligence models requires strong computing power, which in turn requires a large number of high-performance GPUs. In terms of data, ChatGPT's training is reported to have used about 45 terabytes of data, containing nearly 1 trillion words of text.
But because large models require large amounts of computing and storage resources, financial and technical constraints can be a limiting factor for startups.
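To give a rough sense of the scale involved, the short Python sketch below estimates how much memory is needed just to hold the weights of a model with the 175 billion parameters cited above. The byte sizes per parameter are the standard sizes of 32-bit and 16-bit floating-point numbers; the 80 GB accelerator is purely an assumption for illustration and does not come from the article.

```python
import math

PARAMS = 175_000_000_000  # GPT-3 parameter count cited in the article

# Standard storage sizes of common floating-point formats, in bytes.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Assumption for illustration only: an accelerator with 80 GB of memory.
GPU_MEMORY_GB = 80


def weight_memory_gb(params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the raw weights, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(params: int, bytes_per_param: int, gpu_gb: int) -> int:
    """Lower bound on the number of GPUs needed just to store the weights."""
    return math.ceil(weight_memory_gb(params, bytes_per_param) / gpu_gb)


for fmt, size in BYTES_PER_PARAM.items():
    gb = weight_memory_gb(PARAMS, size)
    gpus = min_gpus_for_weights(PARAMS, size, GPU_MEMORY_GB)
    print(f"{fmt}: {gb:.0f} GB of weights, at least {gpus} GPUs of {GPU_MEMORY_GB} GB each")
```

This is only a lower bound on storage. Training also has to keep gradients, optimizer state and activations in memory, so the real hardware requirement is considerably higher, which is part of why compute is such a barrier for startups.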
In addition, in terms of commercialization ability.
With stronger commercialization ability and more mature commercialization channels, the giants can better apply large models in the commercial field and realize commercial value. They can leverage their brands and user bases to apply big models to search, recommendation, advertising and so on, and monetize them.
However, startups usually lack such commercialization capabilities and channels, so they need to spend more time and energy to explore commercialization paths and expand business cooperation.
Finally, in terms of talent reserve and management ability.
With stronger talent reserves and more mature management ability, the giants can better attract and manage high-end talent and build more competitive teams. They can attract more high-end talent through their brand and reputation, and improve team collaboration and innovation through their management experience and systems.
However, startups usually lack such talent reserve and management ability, so they need to make more efforts to build high-end team and improve management ability.
Even so, as the technology continues to mature, more and more startups are beginning to use cloud computing, distributed computing and other technologies to accelerate the training and optimization of large models, and are steadily challenging the technology monopoly of the giants. There are also many opportunities for startups in AI, but more innovation and courage are needed to break down technological barriers and market monopolies.
In short, as an important technology in the field of artificial intelligence, the large model has become a hot area for entrepreneurship. Giant companies have more resources and technical experience, but startups can also keep challenging technological monopolies and create new business value through innovation and courage.
Chasing the hype, or real strength?
During this year's national Two Sessions, artificial intelligence became a hot topic. In the motions and proposals submitted by deputies and members, terms such as "AI big model" and "ChatGPT" appeared frequently. Many deputies to the National People's Congress and members of the National Committee of the Chinese People's Political Consultative Conference focused on "how to develop China's own ChatGPT" and offered suggestions for the development of AI.
Wang Zhigang, Minister of Science and Technology, said the country has done a great deal of groundwork in large AI models and has achieved some results, but there may still be work to do to match ChatGPT's performance. "I also hope that our research institutes, enterprises and researchers can make further development and progress and make China's contribution to the international community."
Zhou Hongyi, a member of the National Committee of the Chinese People's Political Consultative Conference (CPPCC) and founder of 360 Group, suggested in his proposal that a collaborative innovation model bringing together large technology enterprises and key scientific research institutions be established to create China's "Microsoft + OpenAI" combination. He also proposed setting up long-term, national-level open-source projects for large AI models across multiple technical routes, to create an open innovation ecosystem based on open-source crowdsourcing.
Liu Qingfeng, a deputy to the National People's Congress and chairman of iFlytek, submitted eight proposals, including speeding up the construction of China's large cognitive intelligence model and carrying out related ethical research. He said China should "accelerate the construction of a large model of cognitive intelligence, let the industry enjoy AI dividends as soon as possible on autonomous and controllable platforms, and let everyone have AI assistants".
Clearly, the state is also paying attention to the development of large models. But with all kinds of people now wanting to enter the big model race, are they chasing the hype or do they have real strength?
Notably, the bold statements of Wang Huiwen, the former co-founder of Meituan, did not impress Zheng Hongda, an analyst at Haitong Securities, who said, "Pure bullshit. What is $50 million enough for? Five million dollars to train a big model once, so 10 training runs? People from the Internet industry don't know anything. They only know marketing. They don't feel comfortable at all."
On the whole, most of those who have announced that they are joining the race are professionals with strong reputations and real strength in artificial intelligence. They are flocking to the big model track mainly because its rise, and the related technological progress, offers them more opportunities.
Although there is no shortage of hype-chasers, growing data volumes and model complexity mean that the big model track places higher demands on computing resources, algorithm design, model optimization and more. Professionals who flock to it must therefore have deep theoretical knowledge of artificial intelligence, rich practical experience and powerful computing resources. The big model track has become an important arena in AI, and their strength and experience will inject more momentum into the field's development, but it will take time to see who comes out on top.