
For the past nine months, several Chinese artificial intelligence firms have been competing to join Alibaba and DeepSeek at the top of China’s AI industry. Their newest peer is Moonshot AI, a Beijing-based firm run by a U.S.-trained academic hotshot with a fondness for 1970s rock bands.

Moonshot AI came to the fore over the summer with the release of a new large language model called K2, which powers the company’s Kimi chatbot and can handle complex tasks like coding and financial analysis. Earlier this month, that model shot into the top 10 of international leaderboards where users rank models based on quality.
“K2 is noticeably different in how it communicates,” says Andrew Carr, a former OpenAI researcher. “It is fresh, interesting, and most importantly not sycophantic. It will reason with you about a problem and push back in ways other models don’t.”
Moonshot’s rise underscores the self-propelling power of China’s AI ecosystem, in which many of the leading players release their models as “open-source,” meaning their underlying code is available for others to view and customize. Engineers at Moonshot AI say the K2 model relied heavily on that of DeepSeek, the Hangzhou-based firm that shocked the tech world earlier this year when it released a new model on a par with those produced by U.S. rivals.
The Wire is periodically highlighting China’s AI unicorns. We’ve previously looked at Zhipu AI (today going by Z.ai), Baichuan, Minimax, and StepFun — all of which now make open-source models. This week: Moonshot AI.

BLAST OFF
Yang Zhilin, Moonshot AI’s founder and chief executive, is no stranger to the computer science elite. He earned a bachelor’s degree from Tsinghua University in 2015 before moving to the U.S. for a doctorate at Carnegie Mellon University. He later worked at Google and Meta, and co-authored papers on computer reasoning and pattern recognition with Turing Award winners Yoshua Bengio, a professor at the University of Montreal, and Yann LeCun, the chief AI scientist at Meta.

Yang counts leading American and Chinese researchers among his mentors: Tang Jie, the co-founder of Z.ai; Ruslan Salakhutdinov, who directed AI research at Apple; and William W. Cohen, a scientist at Google DeepMind.
“His unique characteristic is that he was very strong scientifically but also very strong in terms of coding and implementation,” says Salakhutdinov, a celebrated scientist who advised Yang’s doctoral studies at Carnegie Mellon. Yang was, he adds, an “absolutely brilliant student.”
A paper Yang led called XLNet, which offered an improved way of training large language models, has been cited more than 10,000 times. Salakhutdinov says Yang could have pursued postdoctoral opportunities at Stanford or MIT, and that Apple tried to recruit him after his doctoral studies.
Apple did not respond to a request for comment.

But Yang returned to China in 2019, having finished his PhD in just four years — two years shorter than the U.S. standard. “If I don’t do a start up, I will always regret it,” Salakhutdinov recalls Yang saying.
Yang founded Moonshot AI in Beijing in 2023, alongside fellow Tsinghua alumni Zhang Yutao, Zhou Xinyu — with whom he played in a rock band called Splay — and Wu Yuxin. All four men are still at the firm, whose Chinese name translates to Dark Side of the Moon, a reference to a 1970s album by British rockers Pink Floyd. Its meeting rooms are named after bands like the Rolling Stones and Led Zeppelin; one is called Splay.
That October, the company announced its first LLM and chatbot, Kimi, named after Yang, who goes by Kimi in English. Five months later, it upgraded the model to be able to process two million Chinese characters, ten times as many as when it first launched.
“It was one of the hottest rockstars of 2024,” says Tony Peng, author of the Recode China AI newsletter. He says the company lost steam after DeepSeek made waves in January, but regained momentum thanks to K2’s cost-effectiveness and potential for so-called ‘agentic use,’ in which the model performs tasks on its own.

One of the company’s main breakthroughs with K2 was to stretch more intelligence out of each piece of training data, Yang said in an interview this summer. “With more parameters, the model can learn the same amount of data faster and more effectively.”
Kimi was originally closed-source, meaning that Moonshot AI kept the way it trained the model proprietary. But after DeepSeek shot to prominence with its cheaply trained open-source models, Moonshot AI tweaked Kimi and began releasing its models open-source as well. For K2, released in July, the company made its own model architecture freely available.
Chinese AI companies’ preference for open-sourcing has allowed for faster and wider diffusion of the technology, notes Kyle Chan, a postdoctoral researcher at Princeton University who studies China’s tech sectors. As of August, K2 had 50 million global users across its web and app versions, according to aicpb, a website that tracks the popularity of AI products worldwide.
“Once you see a leading company do it, it’s safer to follow that path than to not,” Chan says. “This happens in so many industries in China.”
Unlike some of its Chinese start-up peers that have developed LLMs for specific industries or tasks, Moonshot AI has maintained a generalist approach. It offers an online chatbot for free and charges fees to users who want to integrate its technology into their systems. It sells access in units known as tokens, each equivalent to about four Latin characters or one Chinese character. Users pay for both input (their questions) and output (the chatbot’s answers).
U.S. companies generally charge far more for tokens than their Chinese rivals. For the latest version of K2, Moonshot AI charges $3.10 per million tokens. By comparison, OpenAI charges $11.25 per million tokens for its standard version of ChatGPT.
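To make the price gap concrete, the short sketch below estimates what a single exchange might cost at the two quoted rates. It is only an illustration under stated assumptions: the function and variable names are hypothetical, the four-characters-per-token figure is the rough rule of thumb given above, and each quoted price is treated as one flat rate covering both input and output, which real price lists may not do.

```python
# Prices quoted in the article, in U.S. dollars per million tokens.
# Assumption: each figure is a single flat rate for input and output combined.
PRICE_PER_MILLION_USD = {
    "K2": 3.10,
    "ChatGPT (standard)": 11.25,
}


def estimate_tokens(latin_chars: int = 0, chinese_chars: int = 0) -> float:
    """Rough token count: about 4 Latin characters or 1 Chinese character per token."""
    return latin_chars / 4 + chinese_chars


def cost_usd(tokens: float, price_per_million: float) -> float:
    """Cost in dollars of a given number of tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical exchange: a 2,000-character English question
# and a 6,000-character English answer. Users pay for both.
total_tokens = estimate_tokens(latin_chars=2_000) + estimate_tokens(latin_chars=6_000)

for model, price in PRICE_PER_MILLION_USD.items():
    print(f"{model}: ${cost_usd(total_tokens, price):.4f} for ~{total_tokens:.0f} tokens")
```

At these rates the same exchange costs roughly 3.6 times as much on the U.S. model (11.25 divided by 3.10), whatever its length, since both prices scale linearly with token count.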
Peng expects that K2 could soon win market share from U.S. firm Anthropic, which earlier this month prohibited sales to Chinese companies. Anthropic models were “really popular among Chinese developers, especially if they wanted to create an AI agent application,” he says. “There was no comparable model until K2 was out.”

BRING ON THE CAVALRY
Moonshot AI is currently valued at $3.3 billion, according to Pitchbook, though it has not raised money since last year. Its backers include some of the largest tech companies and top venture capital firms in China, as well as Microsoft, which participated in a $1 billion round last February. The company has not publicly disclosed any financial results; many top AI companies in the U.S. and China alike are not yet profitable.

It is unclear how much each investor holds in Moonshot AI, though the Financial Times has reported, citing unnamed sources, that Alibaba invested $800 million in the company. Moonshot AI’s Beijing entity is directly held by the founders, according to WireScreen, but it consolidates its other operations through a Hong Kong holding company. Hong Kong corporate records in turn show a single shareholder: Cayman Islands-registered Moonshot AI Ltd.
Moonshot AI did not respond to a request for comment.

Noah Berman is a staff writer for The Wire based in New York. He previously wrote about economics and technology at the Council on Foreign Relations. His work has appeared in the Boston Globe and PBS News. He graduated from Georgetown University.

