Where will China get its compute in 2026?
Over half of the compute will likely be legally imported NVIDIA H200s, but other sources—domestic production, proxy fabrication, and smuggling—matter too, as does remote access.
Though the AI chip export controls have gaps, and can be improved, they go a long way towards reducing the amount of compute that Chinese AI companies get. If you think, as I do, that compute is of great strategic importance, and that it’s better for the US to have a comfortable lead over China, this is a good thing. The American compute advantage is probably the main reason why Chinese AI models have lagged on average 7 months behind the frontier.
In this post, I make some rough estimates, using publicly available data, of how much compute China will acquire in 2026 through each of four pathways: legal imports, domestic production, proxy fabrication, and smuggling. I also discuss Chinese use of non-Chinese cloud compute, though since this involves renting rather than ownership, I don’t count it as “acquisition”. Though I’m not confident in the exact numbers, I do think they get the orders of magnitude right, and are informative for that reason.1
For training workloads, the estimates are:
Legal imports (mainly NVIDIA H200s) will, I think, make up about 60% of China’s compute acquisition in 2026, or about 230,000 B300-equivalents (90% CI: 0 to 300,000). This is probably the clearest sign that export controls dictate how much compute China gets—it means the US could cut Chinese compute acquisition by up to 60% if it wanted to.2 There is some chance that the Chinese Communist Party ends up blocking some or all H200 imports, or that the US reverses course or grants very few licenses, but I consider these outcomes remote. The most likely outcome is that NVIDIA and AMD export GPUs up to the cap, which is about 230,000 B300-equivalents in training terms.3
Huawei Ascend 910Cs fabricated by SMIC in China will, I think, make up about 25% of China’s compute acquisition in 2026, or about 40,000 B300-equivalents (90% CI: 25,000 to 200,000). Production is likely bottlenecked not by the GPU dies fabricated by SMIC but by high-bandwidth memory (HBM) fabricated by CXMT. HBM is a crucial component for AI chips, accounting for about half of the production cost, and is itself export-controlled. My uncertainty about Huawei’s domestic production volumes comes down to uncertainty about how many HBM stacks CXMT will manage to produce this year. (Huawei and others also stockpiled Samsung-made HBM in 2024 and early 2025, but this stockpile has now likely run out.) I estimate about 7 million HBM3 stacks—enough for about 590,000 Ascend 910Cs, assuming an advanced packaging yield of 70%—but SemiAnalysis offers a much smaller estimate of 2 million HBM stacks. I give equal weight to the SemiAnalysis number and my own estimate. I think it is quite likely that CXMT will ramp up HBM production rapidly, in which case we will see much larger domestic production volumes in 2027 and 2028.
Huawei Ascend 910Cs illegally fabricated outside mainland China will, I think, make up less than 5% of China’s compute acquisition in 2026, or about 2,000 B300-equivalents (90% CI: 0 to 20,000). Around 2024, Huawei obtained over 2.9 million AI chip dies from TSMC through front companies, despite sanctions. I call this “proxy fabrication”, because Huawei surreptitiously got TSMC to fabricate Huawei-designed chips using front companies as proxies.4 Based on a SemiAnalysis projection, this stockpile has likely run out by now, or is close to running out. In response to this violation, the Bureau of Industry and Security (BIS) announced a foundry due diligence rule meant to shut this pathway down. It is not yet clear whether this rule does the job. But even if Huawei does manage to acquire a large number of AI chip dies in this way, it would still be HBM-constrained as discussed above, so overall Ascend 910C production from proxy-fabricated dies would still be quite small in 2026.5
Smuggled AI chips (mainly Blackwells) will, I think, make up about 10% of China’s compute acquisition in 2026, or about 20,000 B300-equivalents (90% CI: 2,000 to 100,000). If you were annoyed by my hedging before, you haven’t seen anything yet. These estimates follow the highly uncertain 2024 estimates that Tim Fist and I published in a June 2025 working paper. We do know—mainly through investigative news reports—that there has been a fairly substantial amount of AI chip smuggling to China. One recent report suggested that DeepSeek is now using “several thousand” smuggled Blackwells to develop its next generation of models. Smuggling is probably the most flexible way for China to get compute—it’s annoying in various ways, and you pay a premium, but you get the best chips, and if you are willing to pay, supply is quite elastic. My guess is that smuggling was at a moderately high level—likely over 100,000 chips, or about 25,000 B300-equivalents—in 2024, then grew in 2025 after the NVIDIA H20 was banned, and will now shrink again as H200s are allowed.
Together, these pathways would make up about 320,000 B300-equivalents (90% CI: 150,000 to 600,000) acquired by Chinese companies in 2026. Thanks mainly to the export controls, that’s far less than what US companies will acquire. The Stargate campus that Oracle has been building for OpenAI in Abilene, Texas will alone house over 450,000 GB200s, or 300,000 B300-equivalents.6 But it’s also not nothing. Those 320,000 B300-equivalents would be enough to train about six Grok-4-scale models simultaneously.7 That amount of compute would also be about 600 times what DeepSeek claimed to have used to train DeepSeek-V3 in 2024.8
These totals are not fully exhaustive. For example, they don’t include domestic Chinese AI chips from companies other than Huawei, such as Alibaba, Biren, or Moore Threads. It also remains legal to export any number of AI chips that fall below the lowest performance thresholds. That said, I don’t think these other sources would shift the figures substantially. The numbers are much more likely to be substantially wrong for other reasons, for example if the information I rely on about Chinese domestic HBM production turns out to be wrong.
So far the numbers we’ve seen have aggregated training compute, summing the chips’ Total Processing Power (TPP), which is a sort of precision-independent version of FLOP/s. Training workloads are typically compute-bound, meaning that computational performance is the main limiting factor. But inference workloads are typically memory-bandwidth-bound.9 Do the estimates differ if we focus on memory bandwidth, measured in TB/s, instead?
The answer is: not much. The pathways are similarly important relative to one another. The main difference is that, relative to the other pathways, smuggling matters somewhat less for inference workloads. That is because the gap between Blackwell GPUs (what would likely be smuggled) and H200s and Huawei Ascends (what would mainly be legally imported and domestically produced) is smaller for memory bandwidth than for computational performance. According to specifications, the NVIDIA B300 has 1.7x the memory bandwidth of the H200 and 2.5x the memory bandwidth of the Ascend 910C, whereas the B300 is 3.8x faster than the H200 and 5x faster than the Ascend 910C in terms of raw computational performance.
For the same reason, the total compute is higher when measured in B300-normalized inference compute, with about 670,000 B300-equivalents (90% CI: 300,000 to 1.2 million), compared to about 320,000 B300-equivalents in training terms. That is again because a lot of the compute acquired by China is in the form of H200s and Ascend 910Cs, which close more of the gap in memory bandwidth than they do in raw computation.
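To make the conversion concrete, here is a small Python sketch using the spec ratios above. The fleet in the example is just the legal-import cap of 890,000 H200s; the function and names are illustrative, not taken from the model’s code.

```python
# B300-equivalents per chip, from the spec ratios quoted above:
TRAINING  = {"B300": 1.0, "H200": 1 / 3.8, "Ascend 910C": 1 / 5.0}   # TPP
INFERENCE = {"B300": 1.0, "H200": 1 / 1.7, "Ascend 910C": 1 / 2.5}   # memory bandwidth

def b300_equivalents(fleet, ratios):
    """Convert a dict of {chip model: count} into B300-equivalents."""
    return sum(count * ratios[chip] for chip, count in fleet.items())

fleet = {"H200": 890_000}  # the legal-import cap, used as an example
print(f"training:  {b300_equivalents(fleet, TRAINING):,.0f}")   # ~234,000
print(f"inference: {b300_equivalents(fleet, INFERENCE):,.0f}")  # ~524,000
```

These match the 230,000 and 530,000 B300-equivalent figures quoted for the import cap, up to rounding.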
So far I have talked about ways that Chinese companies get compute in the form of ownership of AI chips. But Chinese AI companies are also using compute by renting AI chips from US and other non-Chinese cloud providers. This cloud compute (or remote access) pathway is entirely legal. The logic behind allowing this is that US companies retain control over the hardware, while still allowing AI chip makers like NVIDIA to compete against Huawei and others in China. (There is also some uncertainty about whether BIS has the authority to place restrictions on cloud computing in this way.) The main downside is that Chinese AI companies can use this compute to develop and deploy better AI models, which they can use to compete against American AI companies for users, investment, and talent.
So how much compute is China getting through non-Chinese cloud providers? The answer is that we don’t really know, but there is some suggestive evidence. It does seem likely that ByteDance is Oracle’s largest customer; their largest joint cluster, located in Southeast Asia, will perhaps reach about 250,000 B300-equivalents this summer.10 There have also been several other reports of Chinese AI companies partnering with non-Chinese infrastructure companies to build AI data centers in Southeast Asia, particularly Malaysia. For example, Alibaba has reportedly trained its Qwen models on NVIDIA GPUs in Southeast Asia. (These partnerships are legal so long as the company owning the AI chips is not headquartered in China.) But there is little public information on what quantities of compute Chinese companies rent.
It may at some point make sense to close off the cloud pathway. In order to prepare for that, Congress could unambiguously authorize BIS to enact cloud controls by passing the Remote Access Security Act. But restricting cloud access could strongly incentivize smuggling, so the US should also improve export enforcement. Creating a whistleblower incentive program and improving BIS capacity would both help stop proxy fabrication and smuggling.
What could be done to further reduce China’s compute acquisition? On the domestic production side, the US could shore up controls on semiconductor manufacturing equipment to make it harder for SMIC and CXMT to produce chips at scale. Finally, the US could reverse the H200 decision, tighten the volume caps, or at minimum avoid raising any of the current caps or thresholds.
Appendix: Methodology
The estimates in this post are produced using Monte Carlo simulation (100,000 samples) in Python, using the squigglepy library. That sounds fancy but it just means I represent each uncertain input as a probability distribution—usually a normal, lognormal, or a mixture distribution—and then propagate that uncertainty through to the final numbers. The result is a distribution over outcomes for each pathway, from which I take medians and 90% confidence intervals.
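As a minimal sketch of what that looks like in practice, here is the same pattern in plain NumPy (the actual model uses squigglepy; the two inputs below are made-up placeholders, not the model’s real inputs):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # Monte Carlo samples

def lognorm_from_90ci(low, high, size):
    """Lognormal distribution parameterized by a 90% confidence interval."""
    mu = (np.log(low) + np.log(high)) / 2
    sigma = (np.log(high) - np.log(low)) / (2 * 1.645)  # 1.645 = z for a 90% CI
    return rng.lognormal(mu, sigma, size)

# Placeholder uncertain inputs, each represented as a distribution:
chips = lognorm_from_90ci(100_000, 400_000, N)  # chips acquired (hypothetical)
ratio = rng.normal(0.25, 0.03, N)               # B300-equivalents per chip (hypothetical)

# Propagate the uncertainty through to the output distribution:
b300e = chips * ratio
print(f"median {np.median(b300e):,.0f}, "
      f"90% CI {np.percentile(b300e, 5):,.0f} to {np.percentile(b300e, 95):,.0f}")
```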
Here’s how each pathway is estimated:
Legal imports. The starting point is the CNAS estimate that, under the current export rule, the cap on AI chip exports to China is about 890,000 H200-equivalents (mostly H200s, with some MI325Xs). The main uncertainty is whether exports actually reach the cap. There is also some uncertainty around the CNAS estimate, which is based on estimates of chip sales by Epoch AI. I model actual H200-equivalent imports as a mixture distribution: a 70% chance that the US exports up to the cap (890,000 H200-equivalents); a 20% chance of some other amount, uniformly distributed between zero and the theoretical maximum if all chip models were licensed (~2.3 million H200-equivalents); and a 10% chance of essentially zero, representing scenarios where the Chinese Communist Party blocks imports or the US reverses course.
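A sketch of this mixture, simplified by treating the CNAS cap as a point value rather than also sampling the uncertainty around it:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Scenario weights as described above: cap reached / other amount / near zero
scenario = rng.choice(["cap", "other", "zero"], size=N, p=[0.70, 0.20, 0.10])

h200e_imports = np.where(
    scenario == "cap", 890_000,             # exports reach the cap
    np.where(scenario == "other",
             rng.uniform(0, 2_300_000, N),  # uniform up to the theoretical max
             0.0))                          # CCP blocks imports / US reverses course

print(f"median: {np.median(h200e_imports):,.0f} H200-equivalents")
```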
Domestic production. This pathway is for Huawei Ascend 910Cs fabricated by SMIC within China. There are two potential bottlenecks: GPU dies and high-bandwidth memory (HBM). The binding constraint, it turns out, is HBM.
For GPU dies, I start with SMIC’s reported wafer capacity of about 60,000 wafer-starts per month. Huawei seems to account for roughly 75% of SMIC’s advanced-node output. I then assume about 50% of Huawei’s wafers go to AI chips (as opposed to smartphone chips, CPUs, and so on). For SMIC’s yield, I use the mean of three reported figures (35%, 40%, and 65%). Combined with a die size of about 666 mm² and standard wafer geometry, this gives a median of roughly 9.4 million Ascend GPU dies produced in 2026.
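Plugging in point estimates reproduces the rough magnitude. The die-per-wafer approximation below is my own assumption, since the exact wafer-geometry formula isn’t specified:

```python
import math

wafers = 60_000 * 12 * 0.75 * 0.50     # wafer starts/year x Huawei share x AI-chip share
yield_rate = (0.35 + 0.40 + 0.65) / 3  # mean of the three reported yield figures

# A common die-per-wafer approximation for a 300 mm wafer and a 666 mm^2 die:
d, a = 300, 666
dies_per_wafer = math.pi * (d / 2) ** 2 / a - math.pi * d / math.sqrt(2 * a)  # ~80

print(f"{wafers * dies_per_wafer * yield_rate / 1e6:.1f} million dies")  # ~10 million
```

With these point values the output lands near 10 million dies, in the same ballpark as the model’s 9.4 million median.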
For HBM, I estimate the number of HBM3 stacks that CXMT (likely China’s sole significant HBM producer, at least for now) will fabricate in 2026. SemiAnalysis estimates about 2 million stacks; my own estimate, based on an extrapolation of CXMT’s wafer capacity and yield data, gives a median of about 7 million stacks (80% CI: 2.5 million to 18.6 million). I give these two alternatives 50% weight each. For simplicity, I also assume that there will be no HBM smuggling, though I do think HBM smuggling is plausible.
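A sketch of the HBM supply distribution, treating the SemiAnalysis figure as a point estimate for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# My estimate: lognormal with an 80% CI of 2.5M to 18.6M stacks
z80 = 1.2816  # z-score for an 80% interval
mu = (np.log(2.5e6) + np.log(18.6e6)) / 2
sigma = (np.log(18.6e6) - np.log(2.5e6)) / (2 * z80)
own = rng.lognormal(mu, sigma, N)    # median ~7M stacks

semianalysis = np.full(N, 2.0e6)     # ~2M stacks, as a point estimate

# Equal-weight mixture of the two estimates:
hbm_stacks = np.where(rng.random(N) < 0.5, own, semianalysis)
print(f"median: {np.median(hbm_stacks):,.0f}; mean: {hbm_stacks.mean():,.0f}")
```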
Each Ascend 910C requires two GPU dies and eight HBM stacks. I also apply an advanced packaging yield, assumed to be about 70%. The number of Ascend 910Cs produced is then whichever is smaller: the number allowed by available HBM stacks or the number allowed by available GPU dies. As it turns out, HBM is very likely the bottleneck.
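The bottleneck logic, as a small self-contained function that works on point values or sampled arrays alike:

```python
import numpy as np

def ascend_910c_output(gpu_dies, hbm_stacks, packaging_yield=0.70):
    """910Cs producible from die and HBM supply: 2 dies and 8 stacks per chip."""
    chips_from_dies = gpu_dies / 2
    chips_from_hbm = hbm_stacks / 8
    return np.minimum(chips_from_dies, chips_from_hbm) * packaging_yield

# With ~9.4M dies and ~7M HBM3 stacks, the HBM branch binds:
print(f"{ascend_910c_output(9.4e6, 7.0e6):,.0f} Ascend 910Cs")
```

With these point inputs the HBM constraint binds at about 610,000 chips, in line with the roughly 590,000 figure in the main text.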
Proxy fabrication. Around 2024, Huawei obtained about 2.9 million Ascend GPU dies from TSMC through front companies. In September 2024, SemiAnalysis projected that this stockpile would run out “within the next 9 months”. If SemiAnalysis is right, the stockpile has likely already run out; even if not, most of it will have been used up by now. I model the remaining TSMC dies in 2026 as a zero-inflated distribution: assuming the depletion date is uniformly distributed across the range of plausible dates, there is roughly a 56% chance that the stockpile is fully depleted by January 2026, and a 44% chance that some dies remain. In the model, I assume Huawei uses proxy-fabricated dies, such as the TSMC-made stockpile, and SMIC-made dies proportionally. I also assume that there is about a 10% chance that another proxy fabrication incident, of the same scale as the TSMC violation, occurs in 2026.
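A sketch of the zero-inflated stockpile. Note that the size of any remaining stockpile, conditional on non-depletion, is a placeholder of my own (uniform over 0 to 20% of the original dies); the post does not specify that part:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Zero-inflated remainder: ~56% chance the stockpile is gone by January 2026.
depleted = rng.random(N) < 0.56
# If not depleted, assume some fraction of the ~2.9M dies remains. The uniform
# 0-20% range is my placeholder, not a figure from the model.
remaining_dies = np.where(depleted, 0.0, rng.uniform(0.0, 0.20, N)) * 2.9e6

# ~10% chance of another proxy-fabrication incident of similar scale in 2026:
incident = rng.random(N) < 0.10
proxy_dies = remaining_dies + np.where(incident, 2.9e6, 0.0)

print(f"median: {np.median(proxy_dies):,.0f}; mean: {proxy_dies.mean():,.0f}")
```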
Smuggling. This is the hardest pathway to estimate, since smuggling is by nature clandestine. I model smuggled compute as roughly a 10% share of total non-smuggled compute. I assume smuggled chips are all Blackwells, so I treat them as B300-equivalents for both TPP and memory bandwidth. These estimates follow those that Tim Fist and I published in a June 2025 working paper for the Center for a New American Security, which were themselves highly uncertain. One quirk of this model is that smuggling is defined as a share of non-smuggled compute, which means that in scenarios where legal imports drop to zero, modeled smuggling also drops, whereas in reality you’d expect substitution in the other direction. That said, we expect the overall level of smuggling to be roughly similar to what we estimated for 2024, since the H20 was about as attractive relative to the cutting edge in 2024 as the H200 is now, so the incentive to smuggle should be comparable.
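Finally, a sketch of how the pathways combine, with smuggling as a share of the rest. Every distribution below is a rough placeholder standing in for the pathway models above, not the model’s actual outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Placeholder pathway samples in B300-equivalents (stand-ins, not model outputs):
legal    = rng.normal(230_000, 60_000, N).clip(min=0)
domestic = rng.lognormal(np.log(40_000), 0.6, N)
proxy    = rng.lognormal(np.log(2_000), 1.0, N)

# Smuggling modeled as roughly a 10% share of non-smuggled compute:
share = rng.lognormal(np.log(0.10), 0.5, N)
smuggled = share * (legal + domestic + proxy)

total = legal + domestic + proxy + smuggled
print(f"median total: {np.median(total):,.0f} B300-equivalents")
```

The quirk described above is visible here: in samples where legal imports come out near zero, modeled smuggling shrinks with them.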
1. There are more details on the methodology used in the appendix at the end of this post.
2. That said, if US legal imports cease or are reduced, part of the lost compute would be regained through smuggling. Of the four pathways, smuggling is likely to be the most elastic. But increased smuggling would not fully make up for the lost sales, since smuggled chips are sold at a significant price premium, their supply is less reliable, and there is some risk of detection for large companies that operate in both international and US markets. So, though cutting legal imports by 230,000 B300-equivalents would not reduce total Chinese compute acquisition by the full 230,000, the reduction would still be very large.
3. In the new rule, exports to China of each AI chip model are capped at 50% of that model’s cumulative sales in the US. A CNAS paper estimates that, if export licenses are granted for NVIDIA H200s and AMD MI325Xs, the cap currently works out to about 890,000 H200-equivalents, or 230,000 B300-equivalents in training terms, or 530,000 B300-equivalents in inference terms. But if the US grants export licenses for all AI chips under the new thresholds—for example, the NVIDIA A100 and the AMD MI300A—and Chinese companies are willing to buy these, the cap would rise to about 2.3 million H200-equivalents, or 610,000 to 1.4 million B300-equivalents. I think that is quite unlikely to happen, but some of these other chips could be sold, and it’s also possible that we see more US sales of the H200 and MI325X during 2026, raising the cap. Overall, I think it’s most likely that Chinese companies purchase about 890,000 H200-equivalents, mostly H200s.
4. Proxy fabrication is different from purely domestic production, because with proxy fabrication the GPUs are not fabricated by a Chinese fab within China. It is also not smuggling. Smuggling is the knowing movement of goods across a border in violation of the law. Proxy fabrication doesn’t fit this definition, since what is illegal there is producing the chips for a prohibited party, not moving them into China. It would be illegal even if Huawei kept the chips in Taiwan forever.
5. As discussed in the methodology, these estimates assume that, when Chinese domestic production is HBM-constrained (as I think it is), China uses SMIC-made GPU dies and proxy-fabricated GPU dies proportionally. So if it had 9.5 million SMIC-fabricated dies and 500,000 proxy-fabricated dies, but only enough HBM for 200,000 Ascend 910Cs, then I assume that China produces 190,000 Ascend 910Cs from SMIC-fabricated dies and 10,000 from proxy-fabricated dies.
6. In October 2025, Larry Ellison said that the Stargate campus in Abilene, Texas will house more than 450,000 GB200s. The NVIDIA GB200 pairs two B200 GPUs with a Grace CPU, but Ellison’s figure most likely counts individual B200 GPUs rather than GB200 units, since the same report also says the campus will use 1.2 GW. Since a B300 has 1.5x the computational performance of a B200, that cluster will be equivalent to roughly 450,000 ÷ 1.5 = 300,000 B300s.
7. Grok 4 was likely trained on xAI’s Colossus cluster housing 200,000 Hopper GPUs. Since a B300 has 3.8x the computational performance of an H100 or H200, that cluster is equivalent to roughly 200,000 ÷ 3.8 = 53,000 B300s. Dividing 320,000 by 53,000 gives six. In practice, China’s compute is fragmented across dozens of organizations, and no single entity will control anywhere near the total.
8. DeepSeek reported training V3 on 2,048 H800 GPUs over roughly two months. Since a B300 has 3.8x the computational performance of an H800, that cluster is equivalent to about 2,048 ÷ 3.8 = 540 B300s. Dividing 320,000 by 540 gives roughly 600. Note that DeepSeek’s figure covers only the final training run and excludes prior research, failed runs, and the fine-tuning and reinforcement learning that produced DeepSeek-R1.
9. This model—using TPP for training compute and memory bandwidth for inference compute—is only true to a first approximation. Other metrics matter too, like memory capacity and interconnect bandwidth. It is also possible, in theory, to make inference workloads more compute-intensive, for example, by increasing the batch size, though this comes with disadvantages like higher latency for each individual request. And various optimizations, like speculative decoding, complicate things further.
10. According to a June 2025 SemiAnalysis report, ByteDance is Oracle’s largest customer, and their largest joint cluster was estimated to reach 600-700 MW “within a year”. If a rack of NVIDIA B300s uses about 150 kW (about 2 kW per GPU, with 72 GPUs per rack), the data center has a power usage effectiveness of 1.3, and the 600-700 MW figure refers to total load, then taking the 650 MW midpoint gives 650,000 kW ÷ 1.3 ÷ 2 kW ≈ 250,000 B300s.