DeepSeek's R1 model was trained on Nvidia H800 AI GPUs, while inference is reportedly run on Chinese-made chips from Huawei, the new 910C AI chip.
According to the company's research paper, DeepSeek trained its V3 model on a cluster of 2,048 Nvidia H800 GPUs. The H800, launched in March 2023, is a deliberately cut-down version of the H100 typically used by data centers: Nvidia significantly reduced its chip-to-chip interconnect bandwidth so that the part would comply with the U.S. export controls announced in 2022.
Worse for Nvidia, the full training of the state-of-the-art V3 LLM, a 671-billion-parameter model, is claimed to have taken only about 2.788 million GPU-hours on those Hopper-based H800s, roughly two months on the 2,048-GPU cluster and almost a factor of ten less computing power than comparable models. The H800 is also far less expensive than the chips major U.S. large language model builders are using, a point that drew public comment on X from OpenAI Chief Executive Sam Altman.
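As a back-of-the-envelope check, the claimed 2.788 million GPU-hours is consistent with the reported cluster size and timeline: spread across 2,048 GPUs, it works out to roughly two months of wall-clock time. A minimal sketch (variable names are illustrative, not from the paper):

```python
# Sanity check: does ~2.788M GPU-hours match 2,048 GPUs running ~two months?
GPU_COUNT = 2048            # cluster size reported in DeepSeek's paper
TOTAL_GPU_HOURS = 2.788e6   # total training cost claimed for V3

hours_per_gpu = TOTAL_GPU_HOURS / GPU_COUNT  # wall-clock hours per GPU
days = hours_per_gpu / 24                    # convert to days

print(f"{hours_per_gpu:.0f} hours per GPU, about {days:.0f} days")
# About 57 days, i.e. roughly two months, matching the reported timeline.
```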