Nvidia H200 Demand Surges Amid DeepSeek AI Breakthrough

Cloud providers are witnessing a sharp increase in demand for Nvidia’s H200 chips following DeepSeek’s latest advancements in AI models. The Chinese AI company officially entered the foundation model race this month, causing a shift in the market.

Though DeepSeek’s progress has been known among AI researchers since its V2 model launched in May 2024, its V3 release in December captured wider attention. The introduction of DeepSeek R1 in January further fueled interest, driving demand for H200 GPUs.

Robert Brooks, founding team member and VP of revenue at cloud provider Lambda, confirmed the surge. “The launch of DeepSeek R1 has drastically increased H200 demand. Enterprises are now pre-purchasing capacity before public availability,” he said.

Open-Source Models Shake Up AI Economics

DeepSeek’s open-source models allow users to access AI technology at a lower cost, but they still require significant computing power. Many companies rely on cloud computing services to scale these models efficiently.

On Friday, analysts at SemiAnalysis reported visible effects on H100 and H200 pricing, driven by DeepSeek’s influence. Total sales of Nvidia’s H200 GPUs have already reached double-digit billions of dollars, according to CFO Colette Kress on Nvidia’s November earnings call.

DeepSeek’s models were trained on less powerful hardware than U.S.-based models, according to its research paper. This efficiency has unsettled investors, as companies like Meta, OpenAI, and Microsoft have invested billions in AI infrastructure. The concern now is whether all that capacity is necessary.

DeepSeek achieved impressive performance despite training on weaker chips, though the exact number of chips it used remains debated. Cloud providers stress, however, that running these models still demands immense computational power. “It is not lightweight,” said Tuhin Srivastava, CEO of Baseten, referring to the challenges of inference.
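The claim that inference “is not lightweight” can be made concrete with a rough memory-bandwidth estimate. During token-by-token generation, speed is often limited by how fast model weights can be streamed from GPU memory. The sketch below is an illustrative upper bound, not a benchmark: it treats the model as dense (every weight read per token), assumes 8-bit weights, and uses the H200’s published ~4.8 TB/s of HBM3e bandwidth. Real-world throughput depends heavily on batching, parallelism, and model architecture.

```python
# Rough decode-speed ceiling for a ~671B-parameter model on one 8-GPU node,
# assuming generation is bound by streaming weights from GPU memory.
# Assumptions (illustrative only): FP8 weights (1 byte/param), dense reads,
# H200 memory bandwidth ~4.8 TB/s per GPU.

params = 671e9                 # total parameters (DeepSeek V3 scale)
bytes_per_param = 1            # FP8 weights
gpus = 8                       # one node of eight GPUs
bandwidth_per_gpu = 4.8e12     # bytes/second of HBM3e bandwidth

weight_bytes = params * bytes_per_param
aggregate_bandwidth = gpus * bandwidth_per_gpu

# Upper bound at batch size 1: every token requires reading all weights once.
tokens_per_second = aggregate_bandwidth / weight_bytes

print(f"Decode ceiling: ~{tokens_per_second:.0f} tokens/s")
```

Even under these generous assumptions, a full eight-GPU node yields only a few dozen tokens per second per request, which is why serving such models at scale consumes so much hardware.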

H200 Chips Become Essential for Running DeepSeek V3

DeepSeek’s most powerful model, V3, contains 671 billion parameters, which requires significant computing resources. This is fewer than OpenAI’s GPT-4 (a reported 1.76 trillion parameters) but more than Meta’s largest Llama model (405 billion parameters).

The Nvidia H200 is currently the only widely available chip capable of running DeepSeek V3 on a single node (a system of eight interconnected GPUs). While alternative methods exist, such as splitting workloads across weaker GPUs, this approach introduces complexity and potential performance issues.
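A quick back-of-the-envelope check shows why a single eight-GPU H200 node is enough. Assuming 8-bit weights (roughly one byte per parameter) and the H200’s published 141 GB of HBM3e per GPU, the model’s 671 billion parameters fit within the node’s combined memory:

```python
# Back-of-the-envelope: does a 671B-parameter model fit on one 8x H200 node?
# Assumptions (illustrative): FP8 weights ~ 1 byte/param; 141 GB HBM3e per H200.

PARAMS_BILLIONS = 671      # DeepSeek V3 total parameter count
BYTES_PER_PARAM = 1        # FP8 weights
H200_MEMORY_GB = 141       # per-GPU HBM3e capacity
GPUS_PER_NODE = 8

weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM      # ~671 GB of weights
node_memory_gb = H200_MEMORY_GB * GPUS_PER_NODE     # 1128 GB across the node

print(f"Weights: ~{weights_gb} GB; node capacity: {node_memory_gb} GB")
print("Fits on one node:", weights_gb < node_memory_gb)
```

The remaining headroom is not slack: KV caches and activations for live requests consume much of it, which is why GPUs with less memory per chip force the awkward multi-node splits described above.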

Nvidia’s Blackwell chips, set to ship this year, will also be able to handle DeepSeek V3 on a single node. However, the current supply shortage makes acquiring H200 chips increasingly difficult.

Companies like Baseten do not own their own GPUs; instead, they purchase capacity from data centers and specialize in software optimization to maximize AI model efficiency. Some of their clients own hardware but still rely on Baseten to improve inference speed, a crucial factor for real-time AI applications like AI-generated voice assistants.

With DeepSeek’s open-source models disrupting AI infrastructure costs, businesses are eager to leverage its potential while securing the hardware needed to power it.