A Compute "Caste System"? Big Tech Serves Itself First, Small Players Go Without, and Silicon Valley Faces a New Wave of GPU Supply Cutoffs

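Cloud giants such as Microsoft and Amazon are allocating NVIDIA GPUs to internal teams and top customers first, leaving AI startups short of compute and facing soaring costs. Price increases, longer waits, and strict access requirements are arriving together, and some companies are being pushed to build their own compute. The concentration of resources among the largest players lifts cloud providers' profits but deepens the divide in the startup ecosystem, and compute is becoming the central barrier to entry in AI competition.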

Cloud giants such as Microsoft and Amazon are prioritizing NVIDIA GPU allocation for internal teams and top-tier clients, leaving small and medium-sized AI startups struggling with severe chip shortages. The race for computing resources is triggering a new structural crisis in Silicon Valley.

According to The Information, this supply shortage has already impacted numerous AI startups backed by top-tier venture capital firms including Sequoia Capital, Founders Fund, General Catalyst, and Andreessen Horowitz. Hemant Taneja, Managing Partner at General Catalyst, sent surveys to founders in his portfolio asking about their access to computing resources, stating bluntly in the questionnaire: "We hear from many that computing power—especially GPU access—is one of the biggest bottlenecks facing us this year."

Tightened supply has directly driven up rental prices, boosting profit margins for cloud service providers while significantly increasing operating costs for startups. Meanwhile, Microsoft Azure has explicitly informed its employees that customers should expect long wait times lasting at least until the end of 2026. The reshaping of the computing landscape is profoundly affecting the entire AI startup ecosystem.

History Repeats, but with Greater Intensity

This round of GPU shortages bears a striking resemblance to early 2023, when cloud service providers similarly diverted computing resources from their public clouds to internal teams and core customers like OpenAI. Venture capital firms such as Andreessen Horowitz and Index Ventures were eventually forced to assemble their own GPU resource pools to meet the immediate needs of their portfolio companies.

However, the current situation is more severe. The Information points out that explosive demand for AI coding tools is exacerbating the shortage: large AI developers like Anthropic are seeing their compute needs surge, which further squeezes the capacity available to small and medium-sized customers.

Another structural factor intensifying the shortage is that the two-to-three-year cloud service contracts many AI startups signed are now expiring. Cloud providers are using this opportunity to charge higher prices or reallocate computing resources to buyers willing to pay more.

Microsoft's "Use It or Lose It" Policy Establishes Clear Tiers

Microsoft's computing allocation mechanism has established a clear tiered system. According to a Microsoft insider, Azure categorizes customers into three tiers:

Tier 1 consists of the approximately 1,000 largest customers, those with the highest cloud spending, who enjoy priority access;

Tier 2 includes customers with the next-highest spending levels, who are still assigned dedicated sales representatives;

Tier 3 comprises smaller enterprises whose relationships are managed on behalf of Microsoft by distributor partners like CDW.

Regarding chip access thresholds, Microsoft has recently begun requiring customers seeking NVIDIA Blackwell chips to commit to renting at least 1,000 chips for a minimum of one year, with contract values reaching tens of millions of dollars. Even for renting older-generation NVIDIA chips, customers face waiting periods of weeks to months.

More notably, Microsoft's "use it or lose it" policy means that for customers accessing GPUs on a pay-as-you-go basis, Microsoft tracks utilization rates; if servers remain idle for even a few hours, access rights may be revoked. Startups receiving free computing credits through the "Microsoft for Startups" program face the same rule—if they fail to fully utilize the chips, their GPU access eligibility will be canceled.

Startups Face Price Hikes, Cancellations, and Inability to Compete with Large Clients

The experience of Krea, an image generation AI startup, is particularly representative. Founded four years ago and having raised $83 million from investors including Andreessen Horowitz and Bain Capital Ventures, Krea signed a six-month contract six months ago for hundreds of NVIDIA Blackwell chips at $2.80 per hour per chip.

However, when Krea recently sought additional servers to train new models from scratch, the situation took a sharp turn. Co-founder and CEO Victor Perez reported that some cloud provider sales representatives simply stopped answering calls; when they did return calls, they not only announced significant price hikes but also demanded three-year contracts before negotiations could proceed. "Some disappeared entirely, others said there was no inventory, and some tried to force us to accept extremely harsh terms," Perez said.

Ultimately, Krea signed a one-year contract at $3.70 per hour per chip, a 32% increase over its previous agreement. Meanwhile, another startup founder seeking to rent a tightly interconnected cluster of nearly 1,000 GPUs said that NVIDIA sales personnel told him last week that finding such clusters at major cloud providers is extremely difficult, and that the daily rental cost would exceed $70,000.
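As a rough check on the figures above, the arithmetic can be sketched in a few lines of Python. The function names are illustrative, and the cluster calculation assumes exactly 1,000 GPUs billed around the clock, whereas the founder was seeking "nearly 1,000", so the implied rate is only an approximation.

```python
def percent_increase(old_rate: float, new_rate: float) -> float:
    """Relative increase from old_rate to new_rate, in percent."""
    return (new_rate - old_rate) / old_rate * 100.0


def daily_cluster_cost(gpu_count: int, hourly_rate: float) -> float:
    """Cost of renting gpu_count GPUs for 24 hours at a flat hourly rate."""
    return gpu_count * hourly_rate * 24


# Krea's per-chip rate went from $2.80 to $3.70 per hour (figures from the article).
krea_increase = percent_increase(2.80, 3.70)
print(f"Krea price increase: {krea_increase:.1f}%")

# Assumption: exactly 1,000 GPUs running 24 hours a day.
# A $70,000 daily bill then implies this per-GPU hourly rate.
implied_rate = 70_000 / (1_000 * 24)
print(f"Implied hourly rate per GPU: ${implied_rate:.2f}")
```

Under these assumptions, the quoted daily cost works out to a little under $3 per GPU per hour, which falls between Krea's old and new rates.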

Data from GPU cloud provider Lightning AI confirms this tense supply-demand dynamic:

The company currently has approximately 40,000 GPUs online, yet pending orders from about 40 customers total a demand of roughly 400,000 GPUs. CEO Will Falcon noted that prices have risen over 25% in the past six months, climbing from approximately $1.60 per hour to over $2.00, sometimes even higher.

Some Founders Opt for "Going Off-Cloud" to Build Their Own Infrastructure

Facing long wait times and escalating rental costs, some startup founders are considering bypassing cloud providers to purchase GPUs directly.

Collin McLelland, founder of AI agent startup Collide, stated that the company is considering spending approximately $500,000 to purchase NVIDIA GPUs for self-operation. Collide completed a $14 million seed funding round last year, focusing on developing AI agent products for the oil and gas industry. McLelland plans to lease data center or cloud provider rack space directly to host self-purchased GPUs, thereby avoiding the wait times and uncertainties inherent in the rental model.

"For us, having no computing power when we need it is the greatest risk," McLelland said. "Most people are just afraid of hardware. I've owned oil wells, so I'm desensitized to this."

Although purchasing GPUs outright costs significantly more than renting in the short term, he believes that over a multi-year horizon the total cost is actually lower, and that ownership eliminates dependence on cloud providers for GPU supply.
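The buy-versus-rent reasoning can be illustrated with a simple break-even sketch in Python. This is not Collide's own calculation. It combines the roughly $500,000 purchase budget with Krea's $3.70 hourly rental rate purely for illustration, and the $0.50 per GPU-hour hosting cost is a hypothetical placeholder. Depreciation, staffing, and financing are ignored.

```python
def breakeven_gpu_hours(
    purchase_cost: float,
    hourly_rental_rate: float,
    hourly_hosting_cost: float = 0.0,
) -> float:
    """Total GPU-hours of use at which buying costs the same as renting.

    purchase_cost is the up-front hardware spend; hourly_hosting_cost is the
    per-GPU-hour cost of rack space and power for owned hardware.
    """
    saving_per_hour = hourly_rental_rate - hourly_hosting_cost
    if saving_per_hour <= 0:
        raise ValueError("owning never pays off under these inputs")
    return purchase_cost / saving_per_hour


# Illustrative inputs: $500,000 budget, $3.70/hour rental, no hosting cost.
no_hosting = breakeven_gpu_hours(500_000, 3.70)

# Same inputs with a hypothetical $0.50 per GPU-hour hosting cost.
with_hosting = breakeven_gpu_hours(500_000, 3.70, hourly_hosting_cost=0.50)

print(f"Break-even without hosting cost: {no_hosting:,.0f} GPU-hours")
print(f"Break-even with hosting cost:    {with_hosting:,.0f} GPU-hours")
```

The result is an aggregate number of GPU-hours across all purchased chips. Whether that threshold is reached within a few years depends on how many GPUs the budget buys and how heavily they are used, neither of which the article specifies.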

Cloud Providers Profit, but Ecosystem Concerns Emerge

For cloud service providers, this supply shortage has brought a long-awaited improvement in profits. Some cloud providers previously faced profitability pressure in their GPU businesses, but the current supply-demand imbalance allows them to raise rental prices and restore their profit margins.

However, the long-term impact on the AI startup ecosystem cannot be ignored. With computing resources heavily concentrated among top-tier clients, small and medium-sized startups will face higher barriers and greater uncertainty in model training and product iteration. General Catalyst is exploring ways to help portfolio companies acquire GPU resources, either by establishing shared computing pools or by negotiating directly on behalf of startups. This mirrors the approach venture capital firms took in 2023, when they built their own GPU pools, and it shows that access to compute has become an unavoidable structural challenge within the AI investment ecosystem.