茉莉花新闻网

中華青年思想與行動的聚合地

How Chinese A.I. Start-Up DeepSeek Is Competing With Silicon Valley Giants

CADE METZ, MEAGHAN TOBIN

January 24, 2025

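DeepSeek's engineers say their system needed only about 2,000 specialized computer chips made by the U.S. chipmaker Nvidia, while the A.I. systems of big American companies use as many as 16,000. Marlena Sloss/Bloomberg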

The day after Christmas, a small Chinese start-up called DeepSeek unveiled a new A.I. system that could match the capabilities of cutting-edge chatbots from companies like OpenAI and Google.

圣诞节的第二天,一家名为深度求索(DeepSeek)的中国小型初创公司发布了一个新的人工智能系统,其功能可与OpenAI和谷歌等公司的尖端聊天机器人相媲美。

That alone would have been a milestone. But the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining how they built the technology, DeepSeek’s engineers said they used only a fraction of the highly specialized computer chips that leading A.I. companies relied on to train their systems.

能做到这点本已是一个里程碑。但这个名为DeepSeek-V3的大模型背后的团队描述了一个更大的进步。深度求索的工程师在介绍他们如何构建这个大模型的研究论文中写道,他们在训练该系统时只用了领先人工智能公司用的高度专业化计算机芯片的一小部分。

These chips are at the center of a tense technological competition between the United States and China. As the U.S. government works to maintain the country’s lead in the global A.I. race, it is trying to limit the number of powerful chips, like those made by Silicon Valley firm Nvidia, that can be sold to China and other rivals.

这些芯片是美中激烈技术竞争的核心。随着美国政府努力保持本国在全球人工智能竞争中的领先地位,它正在试图对能出售给中国以及其他竞争对手的高性能芯片(如硅谷公司英伟达生产的那些)进行限制。

But the performance of the DeepSeek model raises questions about the unintended consequences of the American government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet.

但DeepSeek大模型的表现让人们对美国政府贸易限制的意外后果产生了质疑。美国的出口管制措施已迫使中国研究人员使用互联网上免费提供的各种工具来发挥创造力。

The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that American A.I. companies have been using.

据美国人工智能公司一直使用的行业基准测试,DeepSeek聊天机器人能回答问题、解决逻辑问题,并编写自己的计算机程序,其能力不亚于市场上已有的任何产品。

And it was created on the cheap, challenging the prevailing idea that only the tech industry’s biggest companies — all of them based in the United States — could afford to make the most advanced A.I. systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is roughly one-tenth of what the tech giant Meta spent building its latest A.I. technology.

而且它的造价很低,挑战了只有最大的科技企业(它们全都在美国)才能制造出最先进的人工智能系统的普遍观念。中国工程师称,他们只花了约600万美元的原始计算能力就训练了新模型,不到科技巨头Meta训练其最新人工智能模型所耗资金的十分之一。

“The number of companies who have $6 million to spend is vastly greater than the number of companies who have $100 million or $1 billion to spend,” said Chris V. Nicholson, an investor with the venture capital firm Page One Ventures, who focuses on A.I. technologies.

“有600万美元资金的公司在数量上远远多于有1亿美元或10亿美元资金的公司,”风险投资公司Page One Ventures的投资人克里斯·尼科尔森说道,他主要投资人工智能技术。

Since OpenAI sparked the A.I. boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips.

自从OpenAI 2022年发布了ChatGPT,引发人工智能热潮以来,许多专家和投资者曾得出结论认为,如果不投入数亿美元购买人工智能专用芯片的话,没有公司能与行业领军者竞争。

The world’s leading A.I. companies train their chatbots using supercomputers that use as many as 16,000 chips, if not more. DeepSeek’s engineers, on the other hand, said they needed only about 2,000 specialized computer chips from Nvidia.

世界领先的人工智能公司用超级计算机来训练它们的聊天机器人,这些超级计算机需要多达1.6万个芯片,甚至更多。但DeepSeek的工程师却说,他们只用了约2000个英伟达生产的专用芯片。

The constraints on chips in China forced the DeepSeek engineers to “train it more efficiently so it could still be competitive,” said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations.

中国进口芯片受到限制,迫使DeepSeek工程师“更有效地训练大模型,以让其仍有竞争力”,乔治华盛顿大学专门研究新兴技术和国际关系的助理教授杰弗里·丁(音)说。

Earlier this month, the Biden administration issued new rules that aim to keep China from obtaining advanced A.I. chips through other countries. The rules build on multiple rounds of earlier restrictions that prevent Chinese companies from being able to buy or make cutting-edge computer chips. President Trump has not yet indicated whether he will keep the rules or rescind them.

本月早些时候,拜登政府颁布了旨在阻止中国通过其他国家获得先进人工智能芯片的新规则。新规则出台前,美国已采取了多轮限制措施,阻止中国公司购买或制造尖端计算机芯片。特朗普总统尚未表明他是否会继续实施或取消这些措施。

The U.S. government has tried to keep advanced chips out of the hands of Chinese companies over concerns they could be used for military purposes. In response, some firms in China have stockpiled thousands of chips, while others sourced them from a thriving underground marketplace of smugglers.

美国政府一直试图阻止中国公司获得先进芯片,因为担心这些芯片可能用于军事目的。作为回应,中国的一些公司囤积了大量这类芯片,另一些公司则在蓬勃发展的黑市采购走私芯片。

DeepSeek is run by a quantitative stock trading firm called High Flyer. By 2021, it had channeled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. The company, which did not respond to requests for comment, has become known in China for scooping up talent fresh from top universities with the promise of high salaries and the ability to follow the research questions that most pique their interest.

DeepSeek由一家名叫幻方的量化股票交易公司运营。到2021年,它已将利润投入购买数千枚英伟达芯片,用于训练其早期模型。公司没有回复记者的置评请求,它在中国有一种名声,那就是以高薪和让人们能够探索最感兴趣的研究课题为承诺,吸引了刚从顶尖大学毕业的人才。

Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without any computer science background to help the technology understand and be able to generate poetry and ace questions on the notoriously difficult Chinese college entrance examination.

曾参与早期DeepSeek大模型开发的计算机工程师汪子涵(音)说,公司也雇佣没有任何计算机科学背景的人帮助该技术理解并生成诗歌,并在做难度极大的中国高考试卷时获得高分。

DeepSeek does not make any products for consumers, leaving its engineers to focus entirely on research. That means that its technology is not hemmed in by the strictest aspect of China’s regulations on A.I., which require consumer-facing technology to comply with the government’s controls on information.

DeepSeek不制造任何消费者产品,而是让工程师全神贯注地做研究。这意味着其技术不受中国有关人工智能法规中最严格部分的限制,中国要求面向消费者的技术必须遵循政府对信息的控制。

The leading American companies continue to advance the state of the art in A.I. In December, OpenAI unveiled a new “reasoning” system called o3 that exceeds the performance of existing technologies, though it is not yet widely available outside the company. But DeepSeek continues to show that it is not far behind. This month, it released an impressive reasoning model of its own.

领先的美国公司继续推动人工智能的发展。去年12月,OpenAI公布了一款性能超过现有技术的名为o3的新“推理”系统,尽管该系统尚未在该公司以外得到广泛使用。但DeepSeek继续表明自己并不落后,它在本月发布了自己的一个推理模型,性能同样令人印象深刻。

(The New York Times has sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.)

(《纽约时报》已起诉OpenAI及其合作伙伴微软,称其侵犯了与人工智能系统相关新闻内容的版权。OpenAI和微软否认了这些指控。)

A crucial part of this rapidly changing global market is an old idea: open source software. Like many other companies, DeepSeek has open sourced its latest A.I. system, meaning that it has shared the underlying code with other businesses and researchers. This allows others to build and distribute their own products using the same technologies.

这个快速变化的全球市场的关键部分是一个存在已久的想法:开源软件。与许多其他公司一样,DeepSeek也将其最新的人工智能模型放入开源软件系统,这意味着它已经与其他企业和研究人员共享了基础代码,让其他人能用相同的技术构建和发布自己的产品。

While employees at big Chinese technology companies are limited to collaborating with colleagues, “if you work on open source, you work with talent around the world,” said Yineng Zhang, lead software engineer at Baseten in San Francisco who works on the open source SGLang project. He helps other people and companies build products using DeepSeek’s system.

虽然中国大型科技企业的员工只与自己的同事合作,但“如果你从事开源软件开发,你其实是在与世界各地的人才合作”,旧金山Baseten的首席软件工程师张一能(音)说,他为开源的SGLang项目工作。他还帮助其他人和公司使用DeepSeek模型构建产品。

The open source ecosystem for A.I. gathered steam in 2023 when Meta freely shared an A.I. system called Llama. Many assumed that this community would flourish only if companies like Meta — tech giants with massive data centers filled with specialized chips — continued to open source their technologies. But DeepSeek and others have shown that they, too, can expand the powers of open source technologies.

2023年,Meta免费分享了一个名为Llama的人工智能模型后,人工智能的开源生态系统开始蓬勃发展。许多人曾假设,只有像Meta这样的科技巨头——拥有使用大量专用芯片的大型数据中心——继续开源其技术,人工智能社区才会蓬勃发展。但DeepSeek和其他公司已表明,它们也可以拓展开源技术的能力。

Many executives and pundits have argued that the big U.S. companies should not open source their technologies because they could be used to spread disinformation or cause other serious harm. Some U.S. lawmakers have explored the possibility of preventing or throttling the practice.

许多高管和专家认为,美国大公司不应该开源其技术,因为它们能被用来传播虚假信息或造成其他严重危害。一些美国立法者已在探索阻止或限制开源的可能性。

But others argue that if regulators stifle the progress of open source technology in the United States, China will gain a significant edge. If the best open source technologies come from China, they argue, U.S. developers will build their systems atop those technologies. In the long run, that could put China at the heart of A.I. research and development.

但也有人认为,如果监管机构扼杀了开源技术在美国的进步,中国将获得显著优势。他们认为,如果最好的开源技术来自中国,美国开发人员将在这些技术的基础上构建他们的系统。从长远来看,这可能会让中国成为研发人工智能的中心。

“The center of gravity of the open source community has been moving to China,” said Ion Stoica, a professor of computer science at the University of California, Berkeley. “This could be a huge danger for the U.S.,” because it allows China to accelerate the development of new technologies.

“开源社区的重心已在向中国转移,”加州大学伯克利分校计算机科学教授伊恩·斯托伊卡说。“这对美国来说可能是一个巨大的危险”,因为它让中国得以加速新技术的研发。

Hours after his inauguration, President Trump rescinded a Biden administration executive order that threatened to curb open source technologies.

就职典礼数小时后,特朗普总统撤销了拜登政府威胁限制开源技术的行政命令。

Dr. Stoica and his students recently built an A.I. system called Sky-T1 that rivals the performance of OpenAI’s latest system, called OpenAI o1, on certain benchmark tests. They needed only $450 in computing power.

斯托伊卡和他的学生最近构建了一个名为Sky-T1的人工智能模型,在某些基准测试中,该模型的性能可与最新的OpenAI系统——OpenAI o1相媲美。他们的模型只需要450美元的计算能力。

Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since shortly after it was released in late December.

They did this by building on top of two open source technologies released by the Chinese tech giant Alibaba.

他们能做到这点是因为他们的系统是建在中国科技巨头阿里巴巴发布的两项开源技术的基础之上的。

Their $450 system is not as powerful as OpenAI’s technology or DeepSeek’s new system. And the techniques they used are unlikely to yield systems that exceed the performance of the leading technologies. But the project showed that even operations with minuscule resources can build competitive systems.

他们450美元的系统不如OpenAI技术或DeepSeek新模型强大。他们使用的技术不太可能产生超越领先技术性能的系统。但他们的研究表明,即使是资源微不足道的组织或者企业,也能构建具有竞争力的系统。

Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since late December. He says it is comparable to the latest systems from OpenAI, Google and the San Francisco start-up Anthropic — and much cheaper to use.

多伦多的技术顾问鲁文·科恩从去年12月下旬起一直在使用 DeepSeek-V3。他说,该模型与OpenAI、谷歌,以及旧金山初创公司Anthropic的最新系统能力相当,而且使用起来便宜得多。

“DeepSeek is a way for me to save money,” he said. “This is the kind of technology that someone like me wants to use.”

“DeepSeek是让我省钱的办法,”他说。“这是像我这样的人想用的技术。”
