The Substrate

Making through-silicon vias is not a bottleneck for China's HBM production

Hamish Low — Wed, 22 Apr 2026 15:37:29 GMT

This is the second piece in a series exploring key semiconductor manufacturing equipment that China needs to indigenously produce high-bandwidth memory, perhaps the most important bottleneck in its efforts to make AI chips. The first piece was on advanced etching machines.

The biggest news in the world of semiconductor manufacturing equipment (SME) export controls was the recent introduction of the MATCH Act to Congress. This bill initially covered a range of measures, all aimed at aligning US allies with US export controls on SME to China. It has since been narrowed to the imposition of stricter controls on deep ultraviolet immersion (DUVi) lithography equipment. One dropped element targeted through-silicon via (TSV) deposition and etch tools.

Is this a missed opportunity or a wise trade-off? In this post, I try to answer that question by investigating how advanced Chinese domestic firms are at producing the relevant machines. I find that Chinese firms can likely handle the required TSV etching and deposition steps using indigenous equipment, though at a lower yield than with equivalent tools from Western firms. Excluding TSV tools from the MATCH Act to focus on the most binding bottleneck, deep ultraviolet immersion lithography, is therefore the right strategic choice.

TSVs are significant because they play a key role in the production of high-bandwidth memory (HBM). For China to produce HBM, it needs not only to produce the individual memory chips—the focus of the previous piece in this series—but also to package them into a single stack. TSVs, tens of thousands of tiny vertical copper wires, make that stacking possible. Producing TSVs requires specialized SME, including etching and deposition machines, but is much less demanding than other cutting-edge areas of semiconductor production, such as making advanced logic or memory chips.

The first stages of TSV fabrication are etching and various thin-film deposition steps. For these stages, there are multiple competing tool options from Chinese firms. The next stage, copper electroplating, is the most concentrated, with ACM Research the only Chinese firm to have delivered a proven tool, though Naura has recently begun working on its own alternative.

ACM Research is also a fascinating firm. Its China-based subsidiary, ACM Shanghai, which accounts for essentially all of the group's manufacturing and revenue and is crucial to China’s AI ambitions, is on the US Entity List, yet is owned by a US-headquartered parent firm. Looking at ACM Research also offers insight into China’s successes in developing wafer-cleaning tools and how it has nearly closed the gap with the global frontier in that niche.

In this post, I first explore how TSVs work and the various processes required to create them, before turning to electroplating and ACM Research and concluding with what this analysis means for China’s overall HBM effort and whether tightening controls on TSV tools would be worthwhile.

Through-silicon vias make high-bandwidth memory possible

AI models require significant memory capacity (how much information the memory can hold) and memory bandwidth (how fast it can move this information to the relevant logic process on the chip). HBM provides the best balance between these two requirements. HBM’s innovation over previous forms of memory was stacking several (usually eight to twelve, and in more recent generations, sixteen) memory chips on top of one another, to place as much memory as possible, as close as possible, to the logic component that runs computations of the AI model.

The previous piece in this series explored the production of these individual memory chips and how the need to produce ever-denser, more advanced memory cells could be a bottleneck to China’s HBM production. The focus of this piece is instead on how to bind these individual memory chips into an HBM stack.

TSVs are tens of thousands of tiny vertical wires that cut through the layers of the chips within the HBM stack.1 They connect these stacked memory layers to one another, creating a wide pathway for data to flow down through the stack. The advantage of this vertical interconnect is density: by stacking memory layers rather than spreading them across a circuit board, far more memory can sit next to the AI chip, reducing the latency and energy cost of moving large amounts of data.

Etching is the first step of making a TSV, and is comparatively simple

TSVs differ from the memory cells discussed in the previous piece in that they are larger. These are still semiconductors, so TSVs aren’t big, but they are about 200 times wider in diameter than memory cells.2 This is because TSVs go much deeper than memory capacitors, requiring them to cut through the entire silicon wafer, which introduces a different set of challenges.

The difficulty lies less in pushing physics to its limits to reach such tiny dimensions than in managing the complexity of drilling deep into a chip with different materials and structures. At a basic level, the process involves carving a hole in a chip, depositing a thin layer of insulating material on top, and filling the rest with conductive copper.

A diagram of the TSV formation process steps from Applied Materials

Once the trench is complete and the copper wire is formed, a sequence of steps removes the excess material and flips the wafer to remove material on the other side, revealing the end of the wire. This leaves a TSV that runs through the whole memory die, making it ready to be stacked on top of other memory dies to form HBM. That process of carving away excess material, called chemical mechanical planarization (CMP), will be the subject of the next piece in this series, but for now, the focus is on etching the TSV trench and filling it with copper.

Given that TSVs are much larger than memory capacitors and that China has already been making significant gains in its etching capabilities, etching is unlikely to be a bottleneck for China to produce TSVs. While TSV etching differs in some important ways from capacitor etching—notably, it is a multi-step process rather than a single step—it is unlikely to pose a major challenge. TSV etching machines from AMEC and Naura can meet the needs of China’s HBM production, and exceed the TSV etching limits that are built into US export controls.3

Overall, several types of SME are needed to produce TSVs:

Photolithography tools to pattern where to etch the TSVs
Etching machines to carve out the TSVs
Deposition machines to fill in the desired materials
Cleaning tools to remove unwanted impurities
Chemical mechanical planarization and other grinding tools to shape the wafer
Metrology tools to measure these various processes and keep them on track

Photolithography is an extremely important bottleneck for much of China’s semiconductor production, but it is not especially significant here. Given the large feature sizes of TSVs, they do not require precise lithography and can therefore rely on lagging-edge tools that China has in relative abundance and can still import from ASML, Canon, or Nikon.

Etching machines are unlikely to be a bottleneck for similar reasons: the large feature sizes of TSVs. Cleaning tools are an area of relative strength for China, with ACM Research, the focus of the latter half of this piece, having built a globally competitive portfolio.

This leaves deposition machines, chemical mechanical planarization, and metrology tools. All three are important potential bottlenecks and will be the focus of this and subsequent pieces in this series, starting with deposition tools, the next step in the TSV formation process after etching.

China can handle the needed thin-film deposition

Deposition is the process of placing material onto the wafer. Like etching, it comes in a wide variety of forms, as a vast array of processes, materials, and structures require different deposition techniques.

For producing TSVs, there are four key deposition steps. The first three can be grouped together as they are all variations of thin-film deposition; the fourth, copper electroplating, works differently and is discussed below.

Sourced from “Tutorial on forming through-silicon vias” in the Journal of Vacuum Science & Technology.

The first three deposition steps place thin layers of different materials on top of one another. The first is an insulating layer that stops electrical interference between the conductive substrate and the copper wires. Next, a barrier layer is needed to stop copper atoms from diffusing through the insulator into the surrounding silicon, damaging other structures on the chip. Finally, a “seed” layer of copper is applied, acting as the base for the electroplating stage.

These thin-film deposition tools are an area where China lags behind the technological frontier, though for TSVs, this gap is not particularly relevant. Since TSVs are large and use well-established materials, they don’t require cutting-edge thin-film deposition capabilities. Thin-film deposition is unlikely to be an important bottleneck for China’s HBM production. (The appendix below gives more information on how thin-film deposition works and the reasoning behind that view.)

Few Chinese firms are developing electroplating

Once these various linings are in place, the challenge is to fill the rest of the TSV with copper. This process is called electroplating or electrochemical deposition. Electroplating uses electrolysis, in which a circuit is established between a cathode and an anode. This oxidizes copper at the anode, sending electrons through the external circuit and releasing positively charged copper ions into the solution; at the cathode, the ions meet those electrons and are reduced to solid copper on the surface. The same process is used to plate gold or silver jewelry: a ring is placed in a solution containing dissolved gold, and a current is passed through it, depositing a layer of solid gold. With TSVs, this happens on a microscopic scale.

The diagram on the left shows a simple version of how copper electroplating functions; the image on the right shows the difficulty of electroplating with an Eye-of-Sauron-esque void having formed within the copper, both from “Tutorial on forming through-silicon vias” in the Journal of Vacuum Science & Technology.

In the TSV, the deposited copper seed layer acts as the cathode, meaning copper continually builds up on it, slowly filling the space. The complexity comes in managing the pace at which this copper forms, and where within the TSV. A bottom-up fill is needed, as one key issue is voids forming within the copper (as shown above). These are holes within the copper, formed by impurities, trapped air, or mismanagement of the deposition process, in which the top of the TSV fills before the bottom does.

China has a few makers of electroplating tools, with only ACM Research having a range of machines, the first released in 2019.4 Naura entered the market in March 2025, with an electroplating machine for TSVs.5 Therefore, much more so than in thin-film deposition, China depends principally on the capabilities of a single firm, with Naura playing a secondary catch-up role.

The Ultra ECP 3d meets China’s TSV electroplating needs

ACM Research’s electroplating offering for TSVs is the Ultra ECP 3d. For some odd reason, the listing of the ECP 3d on ACM Research’s website shows a picture of a different machine, the ECP ap, which is a less specialized electroplating machine for packaging processes. Even more displeasing, it fails to capitalize the D in “3D”.

ACM Research describes the ECP 3d like this:

Building on our proven electrochemical plating (ECP) technology, the Ultra ECP 3d is configured with ACM’s exclusive Multi-Anode Partial Plating function, which allows the deposition of the copper metal layer on via structures of 3D TSVs and 2.5D interposers, and is compatible with aspect ratios of 10:1 and beyond.

The Ultra ECP 3d is likely capable of handling the electroplating steps necessary for China to fill TSVs to produce HBM. The metrics and descriptions given by ACM Research suggest that it is operating at a fairly advanced level.6 For instance, the ECP 3d has multiple anodes to better control how material is deposited, and ACM claims it can handle aspect ratios of 10:1 and above.7 That would make it suitable for the TSVs needed to produce HBM.

Electroplating is not a step that pushes the cutting edge of semiconductor physics, so unsurprisingly, ACM Research can produce a tool for the job. The differentiator for US tools produced by Lam Research and Applied Materials lies not in capabilities but in performance. High throughput, high uniformity, and good integration with other systems throughout the fab almost certainly still make their machines more attractive to chip makers than ACM Research’s would be.

Assessing the gap in these metrics is very difficult due to the lack of publicly disclosed information from these SME companies. Marketing materials usually give vague allusions to capabilities—“50% faster” or “higher uniformity”—but no concrete information. One concrete comparison is the timing of the tools’ introduction. Lam Research’s current platform was introduced in 2015, with Applied Materials following in 2017, and ACM Research in 2020. Lam Research was first and is the clear market leader.8

Lam Research, with a platform in use at leading memory firms since 2015, benefits from over a decade of iteration and knowledge of the cutting edge of HBM production. That translates into higher performance across commercially important metrics such as throughput and yield. ACM Research has had less time, shipped fewer machines, and worked with less sophisticated customers, and so will likely perform worse on these commercial metrics. Even if its machines are technically sophisticated enough to produce the needed TSVs, they likely do so at significantly lower yield.

While the Ultra ECP 3d having the basic capability is a necessary step, raising yield is a top priority, especially for China’s HBM efforts, which already face numerous technological bottlenecks and yield issues elsewhere.9 Since a silicon wafer goes through hundreds and hundreds of process steps before becoming a completed chip, low yields at any point in the process can wreak havoc on the overall output. Sufficiently low yields can render an entire product line uncommercial or cause extensive delays, as chip makers need more time to consistently iterate and gradually raise yields to acceptable levels.

ACM Research very likely still lags Western machines on these yield and performance metrics. By how much, and to what effect on China’s overall HBM production, is hard to assess and a question I’m still trying to answer. My best estimate is that electroplating is unlikely to be a major source of poor yield in China’s HBM process, and that ACM Research is only a few years behind Lam Research or Applied Materials in the performance of its machines. Electroplating is a relatively mature and stable platform for both Lam Research and Applied Materials, and a niche corner of the market, so it likely has not attracted major R&D efforts or technological upgrades. The primary benefit these firms have is a backlog of production data and iteration with chip makers, but ACM Research is likely to build this data quickly as Chinese memory firms look to rapidly scale their HBM production. ACM Research has also run this playbook before, initially entering the more niche and unloved market for wafer cleaning tools, and has since come to match the leading global suppliers.

ACM Research has reached the frontier before

ACM Research started with a bad idea. Its founder, Dr. David Wang, created ACM Research in 1998 with the vision of developing a copper-polishing tool that ultimately proved to be a dead-end. But the company’s fortunes were made by a better idea of pivoting into wafer cleaning tools, and an even better idea of establishing a subsidiary in Shanghai in 2006.10 Over time, ACM Shanghai has become the company’s core in R&D, manufacturing, and sales.

ACM Research, the parent company, remains headquartered in the US, which sometimes places the firm in uncomfortable positions. In December 2024, the US placed ACM Shanghai and its subsidiary in South Korea on the Entity List. This restricted its access to overseas markets for components. Despite strong revenue growth, ACM Research’s net income fell in 2025 as it incurred costs to rejig its supply chain and design out foreign-sourced components.

ACM Research’s growth has primarily come from its success in wafer cleaning products. As silicon wafers are repeatedly bombarded with energy, dunked in chemicals, and moved between extremely sensitive machines, there are many ways that impurities can upset the process. ACM Research built its capability here through the 2010s; the major breakthrough came in 2013, with a contract to supply Korean memory maker SK Hynix for one of its fabs in China.

ACM Research has created cleaning products comparable to those from Japanese players Screen and Tokyo Electron, as well as US-based Lam Research.11 It has closed this gap enough to become the dominant provider in the Chinese market, with its largest customers being Chinese memory makers YMTC and CXMT, as well as logic manufacturer SMIC.

ACM Research has also made innovations of its own by finding a niche in megasonic acoustic vibrations, which can aid in cleaning processes but also risk damaging the features built on the chip. This innovation is helping make it globally competitive. One indication is that Intel has tested ACM Research machines for potential use in its upcoming and most advanced 14A process. Given the risks posed by using China-sourced equipment and the strong pushback Intel has received from US lawmakers, it seems unlikely that ACM Research’s tools will be used in Intel’s fabs. But that an important global chip maker would be interested reflects impressive technical sophistication.

Beyond cleaning and electroplating, ACM Research’s ambitions are quite galaxy-brained. At Semicon Shanghai, it announced a refreshed product portfolio that maps its various platforms to planets.

A promotional image from ACM Research’s official WeChat account.

The explanation behind the branding refresh is not lacking in grandeur:

As humanity gazes up at the vast starry sky, from exploring celestial bodies to intelligent algorithms, a microscopic revolution in the global arena is rewriting the course of human civilization. Today, humanity’s pursuit of computing power has transcended Earth’s surface, venturing into the vast outer space. This relentless spirit of exploration is the core mission that drives ACM Research’s deep commitment to semiconductors and continuous innovation—and also the inspiration behind our Eight Planets product series.12

If sometimes perhaps a little overwrought:

ACM Research has always been customer-centric, maintaining our core position like planets orbiting the sun, with unwavering centripetal force and singular focus, providing the highest quality and most considerate service. Customers are like the sun, providing us with light and warmth.13

TSV formation is not a strong bottleneck to China’s HBM production

China has reasonably capable domestic tools across the various etching, thin-film deposition, and electroplating steps needed to form TSVs. The open question is whether Chinese HBM producers can achieve yield at scale, not whether the tools themselves will be available. Given the relatively modest dimensions and material simplicity of TSV formation compared to advanced logic or memory production, it is unlikely to be a strong bottleneck for China’s indigenous HBM production.

Access to Western TSV tools would help China on the margin by improving yield, but their restriction would not impose a strong chokepoint effect. Yield matters greatly, with current industry rumors suggesting that Chinese memory maker CXMT is struggling to get viable yields on its HBM production as it tries to scale it up this year.14 Importantly, this is while CXMT’s HBM production currently relies principally on imported Western tools rather than on Chinese domestic equivalents. Even as Chinese SME firms close the gap in tool capabilities, there are plenty of other challenges in the required material inputs, process integration, and yield learning.

The truly strong bottlenecks remain in the most advanced tools for cutting-edge logic and memory production, with lithography tools the most important. While further restrictions on TSV etching and deposition tools would impose costs by harming China’s HBM yield, restricting DUVi lithography tools will effectively forestall China’s ability to build out incremental advanced-node capacity. No more DUVi means no more advanced DRAM, which ultimately means China cannot scale up HBM production. The prioritization of DUVi restrictions within the MATCH Act over TSV tool restrictions is therefore not a cause for concern.

Appendix

Thin-film deposition

The various thin-film deposition steps involved in TSV formation use three different techniques:

Chemical vapor deposition (CVD) works by triggering chemical reactions between the substrate and an energized gas, leaving the desired deposition material as a byproduct. Chemical vapor deposition needs energy to trigger the necessary reactions; in basic chemical vapor deposition, this is thermal energy from high temperatures. However, one of the challenges of producing TSVs is that they are usually not the first feature to be built, which means high temperatures could damage the already constructed transistors or capacitors. This is why plasma-enhanced chemical vapor deposition is necessary: the plasma provides some of the energy needed to produce the reactions, reducing the required temperature.

Physical vapor deposition (PVD) works by placing the surface to be coated into a vacuum chamber facing a chunk of the material to be deposited. This material is then bombarded with charged ions to break off particles, which, once free, are attracted by the clever use of electrical fields to form a layer on top of the target surface.
Atomic layer deposition (ALD) uses heat or plasma to drive reactions between a gas and the wafer surface. The difference is that with atomic layer deposition, these reactions are done one layer of atoms at a time. This makes the process more time-consuming but allows for much higher conformality, meaning that the deposited material is spread exactly evenly over the desired surface.

China likely has sufficient capabilities in all three to handle the creation of TSVs. Due to TSVs’ large feature sizes and the use of relatively standard materials, the thin-film deposition required is not close to the technological frontier. China’s tools can lag significantly behind those used by Western firms for advanced logic and memory nodes, where feature sizes are much smaller, and a wider range of materials is required.

This is reflected in US export controls, which target advanced tools for logic and memory fabrication rather than the lagging-edge TSV-capable tools considered here.

The relevant tools are now produced by a set of Chinese firms. Naura produces CVD, PVD, and ALD tools, while Piotech and AMEC both produce CVD and ALD tools. Supply chain data from the Center for Security and Emerging Technology shows Chinese deposition suppliers going from 1% global market share to 10% in 2024, a trend that will almost certainly continue, as Naura and AMEC both posted strong deposition revenue growth through 2025.

SK Hynix cites 8,000 TSVs per die back in 2022 for its HBM3 production, which has since risen through HBM3E and now into HBM4 as the number of signal TSVs increases as the parallel I/O grows and as larger stacks likely require more power management from power TSVs.

SemiAnalysis cites memory capacitors as being “~1,000nm high but only 10s of nm in diameter” while Applied Materials gives 5 µm as a standard diameter for TSVs. One µm is 1000nm, so a 5000nm TSV is 200 times larger than a 25nm capacitor.

See Aqib Zakaria’s great ChinaTalk piece, How Far Can Chinese HBM Go?, where he concludes that TSV etching machines from Naura and AMEC already have the requisite specs for the relevant SME item control 3B001.c.4.

See the ‘New Product Information’ in this ACM Research Q1 2019 financial report, which references the release of its first ECP tool.

See the coverage of Naura’s R&D roadmap here which gives the specific date of its ECP machine’s release.

Multi-anode features and partial pulse plating are both more advanced electroplating techniques, and the figures given on uniformity seem relatively strong but are hard to assess due to a lack of comparable figures from other firms.

See the product description on ACM Research’s website, the aspect ratio is the ratio of the height to the width of the feature, in this case TSVs are ten times as deep as they are wide.

Lam Research cites in its Q1 2023 earnings presentation “100% market share for SABRE 3D and Syndion systems across leading memory customers for TSV formation”.

For instance, China cannot access EUV lithography machines, which are used at the most advanced memory nodes, forcing it to instead rely on older DUV immersion machines and a process of multi-patterning where the wafer goes through more exposures, creating greater yield challenges. China is similarly restricted from various other advanced etching, deposition, and metrology tools, with its domestic substitutes usually not as mature as those from leading global SME firms.

See this SemiAnalysis piece for an account of ACM Research’s early history.

See this piece on ACM Research from SemiAnalysis on how it stacks up against peers.

Translated from ACM Shanghai’s announcement on its official WeChat account by Claude Opus 4.6, the original Chinese is “当人类仰望浩瀚星空，从探索星辰到智能算法，一场微观世界的全球革命，正在改写人类文明进程。当前，人类对算力的追求已然超越地球表面，迈向更广阔的地球外太空。这份永不止步的探索精神，正是盛美深耕半导体、持续创新的初心，也是我们打造八大行星系列产品的初衷”.

Translated from ACM Shanghai’s announcement on its official WeChat account by Claude Opus 4.6, the original Chinese is “盛美始终以客户为中心，如行星绕日般坚守核心、精准同向，以恒久不变的向心力与极致专注，提供最优质、最贴心的服务。客户如同太阳，赋予我们光与热”.

See Aqib Zakaria’s recent ChinaTalk piece on whether the US should buy memory from CXMT , where he assesses various rumors about CXMT’s yield on its HBM and discusses what that would mean for its profitability.

BIS should use AI to control AI chips

Maxwell K. Roberts — Mon, 30 Mar 2026 11:01:53 GMT

The Bureau of Industry and Security (BIS) needs more enforcement capacity. It recently came to light that several Super Micro employees illegally moved $2.5 billion worth of export-controlled AI servers to China over two years. They routed these servers through a front company in Southeast Asia and constructed hundreds of fake servers to fool physical inspections by the manufacturer and by BIS. BIS is doing the best it can, but an agency with only a few hundred employees and a budget one-tenth the value of those smuggled servers can only do so much.

There are well-known solutions to this problem. I’ve written about how BIS is finally getting funding to hire more agents, and how upgrading BIS’s software and data systems to match private-sector capabilities could be a force multiplier for enforcement. Those steps are good, but incremental. In this post, I propose something more ambitious.

Specifically, the entire reason why BIS is straining to stop China from acquiring AI chips is the prospect of AI revolutionizing military, economic, and political power. So shouldn’t BIS be using it to revolutionize export enforcement?

What LLMs can’t fix

The most relevant form of AI for BIS today is likely large language models (LLMs) and the agents that are built on them. It’s worth first laying out what these cannot do.

LLMs can’t accelerate processes bottlenecked by human review. When BIS adds companies to the Entity List, an analyst at the BIS Office of Enforcement Analysis must first spend days or weeks writing an Entity List package. That package contains information about the company to be added: its subsidiaries, addresses, products, alleged bad behavior, and any expected economic blowback from the addition. The point of the package is to summarize all relevant information about a company in one place, so that when the End-User Review Committee, with representatives from the Departments of Commerce, State, Energy, and War, votes on the package, it has all the information it needs.

I think LLMs could speed the process of writing Entity List packages from “days to weeks” to “minutes to hours”. To the extent that the task involves searching the internet and relevant internal records for information about the company and summarizing it in a specified format, the deep research features of commercial LLMs are already quite useful. LLMs might even be better than human analysts in some regards, because they are fluent in many languages and don’t get bored reading endless shareholder reports. But if LLMs enable BIS to generate hundreds of packages each month, this will create a bottleneck for the End-User Review Committee, as its members must read through and evaluate thousands of pages.

LLMs can’t analyze data that BIS doesn’t have. BIS has detailed data on, for example, US exports, and it’s plausible that LLMs could help analyze that data in much the same way a human analyst could: by noticing when shipments are going to economically irrational destinations, or when they’re sitting in warehouses for implausibly long periods. If so, LLMs would be like human intelligence analysts, but at much greater speed and scale—rather than being constrained by staffing limitations to vet only the most suspicious transactions, BIS-controlled LLMs could vet every transaction in real time and escalate to human analysts as needed.

However, LLMs cannot analyze data that BIS does not have. BIS’s ability to understand trade flows beyond US borders relies on a mixture of clandestine methods, commercial datasets, and cooperation from foreign governments. If LLMs were to massively increase BIS’s analytical throughput, BIS may need to acquire a lot of additional data to keep those LLMs chewing on something. BIS would also need to set up the IT infrastructure to connect all these data sources.

What LLMs can fix

Having described what LLMs can’t fix, let me lay out what I think they can fix, and why BIS should invest in them.

LLMs are great at software engineering and data science. I’ve written about how BIS should improve its software and use more data science. LLMs could make these projects so much cheaper and easier!

On the data science side, LLMs reduce the need to learn programming languages like SQL and Python for analyzing large datasets. Right now, BIS relies on a tiny number of trained data scientists, many of whom are contractors, to answer data questions about exports and licenses. This limited capacity means that only the most important questions get answered, and some questions never get asked at all. LLMs would be like a trained data scientist sitting at the desk of every enforcement analyst, all the time, ready to query the data in any way they needed.

On the software engineering side, LLMs could fundamentally reimagine how organizations procure software. The idea that the government needs to choose a software solution for an agency, a department, or even the whole government, and then spend a lot of money to buy it, is based on the premise that software is expensive and software engineers are scarce, which is now becoming false. LLMs are probably still not the right choice for building anything that needs extremely high reliability or security, but by dramatically lowering the cost of software, they massively increase the number of applications it can be used for. Every office could build custom software workflows for its own needs—converting public comments from PDFs to Word docs, or generating Federal-Register-formatted lists of addresses from Excel spreadsheets—at little more cost to the government than the tokens burned. LLMs are probably not ready for building highly secure systems or massive agency-wide overhauls, but they can already automate any task that a Python script and a small server could handle today.

LLMs could turn every employee at BIS into a software engineer and a data scientist, allowing each office and even each person to build tools suited to their own workflows and needs.

LLMs are great at internet research. LLMs are not great at research taste—that is, knowing which questions are worth asking or which problems are worth working on. Nor are they great at operationalization, like turning a vague instruction (“tell me what’s going on with Huawei these days”) into specific Google searches. But they are fantastic at googling things and writing about them. A lot of open-source analysis is just putting certain Chinese characters into Google until you find what you are looking for. LLMs can do that with incredible speed and scale, in any language you please.

This capability can be useful not only for writing Entity List packages, but also for another common BIS task: “Leadership saw a headline about this thing. Write a two-pager explaining it.” By using LLMs for this task, the role of the analyst would shift from providing mechanical effort (the ability to google lots of things and write about them) to providing judgment in the form of context, taste, and verification.

LLMs are great at answering short-form science and engineering questions. LLMs are fantastic at answering short questions about science and engineering, as illustrated by progress on benchmarks like GPQA, MMLU, and Humanity’s Last Exam. BIS has always struggled to hire and retain technical experts, because private-sector jobs pay far more and offer a better quality of life. But LLMs will happily explain the difference between extreme ultraviolet lithography and deep ultraviolet lithography, or between different types of side-channel attacks. Unlike a real technical expert, they will answer an unlimited number of follow-up questions, with infinite patience.

This capability can help enforcement analysts answer questions like: “What does this electronic component that the intelligence says someone is transporting actually do?” Such answers would help licensing officers and license applicants better understand how items should be classified and what their technical capabilities practically mean. They would also help policymakers write better export control rules by providing a clearer understanding of what the underlying technologies can do and how they fit together.

The administration is already aware that AI can assist with many government tasks. The AI Action Plan urged agencies to “Accelerate AI Adoption in Government” and outlined specific actions that government service providers, such as the Office of Personnel Management and the General Services Administration, could take to enable AI adoption. Similarly, an Office of Management and Budget memo published in April 2025 called on agencies to “remove barriers to innovation”, “empower AI leaders”, and “ensure their use of AI works for the American people”. The next step is for BIS to heed the Action Plan’s call and start realizing the benefits.

BIS should get ready for AI agents

Everything described above can be done with existing commercial capabilities, such as Claude Code or OpenAI Codex. I intend to show this more rigorously by constructing more formal evaluations, but I don’t think it requires any premise beyond frontier models having the capabilities they have today.

But I think BIS should be thinking more ambitiously than that. The time horizon of certain software engineering, machine learning, and cybersecurity tasks that frontier models can complete continues to grow rapidly. I believe that by the end of 2026, frontier models with sufficient scaffolding will be able to complete tasks across many domains that take humans three to four days. (They already can in some laborious domains that they are well suited for, like translation.)

Rather than a working-level analyst writing an intelligence report and asking a model to “write one paragraph about what types of vacuum cleaners this company makes”, an office director could tell an agent: “Write intelligence reports about these five companies, and decide which ones pose a threat, and then write Entity List packages about them.” If AI agents are sufficiently trustworthy (another reason it’s important to build highly specific, formal evaluations), they could even automate some of the human-review bottleneck.

For BIS to be ready for more capable AI agents as they arrive, it needs to set guardrails around agent deployments, determine how best to integrate agents into classified systems, and reimagine laws and procedures for an agentic world.

Setting guardrails for AI agent deployments

Guardrails for agent deployments should be built into the software so that agents are unable to do anything they are not authorized to do. The “principle of least privilege” is an old concept in security, but the advent of agents capable of rapidly doing irreversible damage makes it far more urgent. I propose three types of guardrails for AI agents used by BIS.

First, AI agents should have data guardrails. Each deployed agent instance should have a well-defined purpose—for example, analyzing trade data for anomalies—and access only to the data required for that purpose. Trade data analyst agents should not have access to employee email inboxes. Humans should serve as the “air gap” between agent outputs and the ability to, say, publish to the BIS website or email the Secretary of Commerce.

Second, AI agents should have action guardrails. An agent whose purpose is to monitor a political appointee’s email inbox should not be able to execute code. An agent whose purpose is to patch software vulnerabilities should not be able to read case files. One worrying dynamic is that, as AI agents’ cyber and software engineering capabilities improve, their ability to escalate their own privileges may also improve (which appears to already be happening).

Third, AI agents should have decision guardrails. There may be some actions where, even if the quality of agent decisions is demonstrably higher than that of human decision-makers, it would still be unacceptable, for moral, political, or safety reasons, for AI agents to make decisions without human approval. This guardrail is especially important for BIS, since it is a law enforcement agency that implements part of the US government’s monopoly on legitimate force. AI agents should never be able to order arrests or take any other action that could violate a legal right to due process.

Integrating AI agents with classified systems

Many of the most valuable tasks AI agents can perform would require access to classified systems. Much of the intelligence BIS relies on to catch smugglers comes from the Intelligence Community or other federal law enforcement agencies and is shared on classified systems. For agents to operate as turnkey autonomous intelligence analysts, they would need to access not only the open internet and commercial trade data but also classified sources. They might also need to operate across classification boundaries, as human analysts do—for example, finding the website of a company named in a signals intelligence report.

The upside of getting this right is enormous. An AI agent with access to both classified intelligence and open-source data could do in minutes what currently takes a human analyst days—for example, cross-referencing a tip from a foreign partner about a suspicious shipment with commercial trade data, satellite imagery, corporate registration records, and social media posts in three languages, and then producing a finished assessment ready for human review. Today, that kind of all-source analysis is bottlenecked by the tiny number of analysts who have the right clearances, the right training, and the bandwidth. AI agents wouldn’t replace those analysts, but they could give every enforcement team the kind of all-source reach that today is reserved for the highest-priority cases.

However, classified-agent deployments also carry serious risks that BIS needs to plan for. An agent operating across classification boundaries could, through a hallucination, a prompt injection, or a misconfiguration, move classified information onto an unclassified network, leaking intelligence at a scale and speed no human analyst could match. BIS should begin with agents limited to either classified or unclassified networks, retaining humans as the only bridge between those worlds. It should also engage with the Department of War to learn lessons from its own use of models on classified systems.

Reimagining the law for an agentic world

Some laws and regulations assume the time it takes to act will serve as a functional check on that action. For example, when the US government imposes tariffs under Section 232 of the Trade Expansion Act of 1962, BIS is required to prepare a report, often running to hundreds of pages, describing its analysis of the relevant industry, the national security threat posed by imports, and proposed remedies. The statute does not specify a minimum timeline but does specify maximum timelines. BIS must complete its report within 270 days of the investigation’s initiation, and the President must decide whether to act on the report within 90 days.

The current administration has initiated more Section 232 investigations than any other in recent memory, and has taken extraordinary measures to accelerate them. Based on the Section 232 reports listed on the BIS website, this has enabled them to initiate about 12 investigations in 2025, or one per month on average. For comparison, the Biden administration initiated only one investigation, into rare earth magnets, while the first Trump administration initiated seven investigations (the first since 2001).

In a world of capable AI agents, it is easy to imagine the administration initiating one Section 232 investigation per day. In some ways, this is good. The American people elect presidents to implement the political will of the people. If the political will of the people is tariffs on lots and lots of things, and AI agents enable the President to carry that out better, that may be good.

However, it is not clear that when Congress wrote statutes like the Trade Expansion Act (or its now much more notorious companion, the International Emergency Economic Powers Act), it contemplated an administration able to act with incredible speed. The months spent preparing a Section 232 report provide time for companies to file lawsuits, voters to weigh in during midterm elections, and outside parties to register an opinion on the action. The Trade Expansion Act and other statutes may need to be updated to add a minimum reporting timeline or a maximum number of investigations per year.

Section 232 is just one example, but the broader point applies across everything BIS does: AI agents will compress timelines, increase throughput, and strain processes designed around human operators. BIS needs to be ready for that, not just as a regulator of AI but also as a user of it.

Securing AI infrastructure to prevent backdoors and sabotage

Dave Banerjee — Thu, 26 Mar 2026 16:21:41 GMT

AI integrity (which I introduced in a previous post) means ensuring AI systems are free from secret or unauthorized modifications that could compromise their behavior. During an intense AI race between the US and China, China would have strong incentives to sabotage American AI companies. For example, it might want to subvert American AI models by embedding backdoors or secret loyalties that serve its interests.

While nation-state actors are a major threat, a misaligned AI could also carry out integrity attacks. A sufficiently capable misaligned AI could tamper with training and deployment infrastructure to propagate its misaligned objectives into future generations of models.

Preserving AI integrity is how you defend against these threats. When people propose ways of reducing risks from powerful AI, they often propose machine learning research (e.g., alignment, interpretability, and control) or non-technical governance proposals. The main technical agenda pitched at security-minded people so far has been securing AI model weights against theft. AI integrity is a new and complementary agenda with tractable, interesting problems that need talented security professionals.

Quick refresher on AI integrity

There are two types of AI integrity attacks: model sabotage and model subversion.

Model sabotage means degrading an AI model’s performance by poisoning it to be less intelligent, less agentic, less situationally aware, or less computationally efficient.

Model subversion means embedding malicious behaviors that activate under certain conditions or persist across all contexts. It ranges in sophistication from basic backdoors (models trained to recognize trigger phrases that activate malicious behavior, such as producing insecure code upon seeing a phrase like “”) to sophisticated secret loyalties (models that autonomously scheme to advance an attacker’s interests without requiring specific triggers, persistently working toward the attacker’s goals across diverse situations). Today’s models lack the necessary situational awareness, intelligence, and agency to scheme on behalf of a threat actor, but I expect AIs will develop these capabilities within five years. This makes it worth preparing now.

The main method for sabotaging or subverting a model is data poisoning. An attacker can poison the pre-training data, or the post-training data, or both.

While I think data poisoning is the most important attack vector for AI integrity, there are other vectors worth considering, especially swap attacks. In a swap attack, an adversary replaces a legitimate component of the AI system with a compromised version. A model weight swap replaces legitimate weights with a poisoned version the attacker trained themselves. A system prompt swap could introduce a trigger phrase that activates a dormant backdoor. A model spec swap tampers with the documents used to shape a model’s values and behavioral tendencies during post-training.

Given these threats, there are four complementary approaches to preserving AI integrity. AI infrastructure security protects the systems, networks, and processes used to develop and deploy frontier AI systems, preventing integrity attacks before they occur. Data auditing addresses the trustworthiness of the data by ensuring its quality, integrity, and provenance. Model auditing and evaluation identifies whether an AI system has been compromised after training is complete, through black-box and white-box methods. Finally, AI control involves detecting and blocking malicious behavior during deployment to safely and productively use a potentially untrusted model.

This post is about the first approach: AI infrastructure security.

Open problems

In AI infrastructure security, the components I think are most important for preserving AI integrity are:

Model weight integrity, i.e., ensuring model weights aren’t swapped or tampered with
Training data integrity, i.e., ensuring training data isn’t poisoned
Data filtering algorithm security, i.e., ensuring the filters that remove poisoned data aren’t themselves compromised

Model weight integrity

Model weight integrity means ensuring AI model weights remain free from unauthorized or secret modifications during both training and inference. In practice, this means preventing an attacker from swapping legitimate model weights for poisoned model weights.

Model weight integrity verification during inference involves periodically checking that served weights match a reference hash stored in secure, immutable cloud storage. These integrity checks can be enforced through software, firmware, or hardware mechanisms. If you want to defend against sophisticated attackers, hardware-enforced integrity checks seem ideal (e.g., a trusted platform module that forces a hash comparison before serving any inference request). You could also use the confidential computing stack to enforce integrity checks. My best guess is that most of these are already solved in the traditional security literature, though a hardware-enforced mechanism for integrity checking may require novel R&D. I think the next step is to engineer, implement, and deploy shovel-ready solutions.

Model weight integrity verification during training is significantly harder because weights change at every gradient step, so there’s no stable reference hash to compare against. One promising direction is deterministic training runs. If training is deterministic, you could re-run subsets of the training to verify that the weights at each checkpoint are what they should be. This connects to the broader agenda of workload verification for international AI governance. Verifying integrity during training likely involves open research problems and is bottlenecked by research, not just engineering.

Model provenance means maintaining a complete record of all transformations applied to a set of model weights over time. Ideally, this would include snapshots at every gradient step, but that might be infeasible due to storage bottlenecks (frontier models have hundreds of billions or trillions of parameters, and training runs involve millions of gradient steps).

Figuring out a practical approximation would be valuable, though. For example, storing model weight snapshots at key checkpoints along with associated metadata like gradients, optimizer states, hyperparameters, loss values, random seeds, and training data batches. It would be useful for at least two reasons:

Rollback auditing. Suppose a model was trained over the course of a year, and at month nine an attacker instills a secret loyalty. By month 12, the model is so capable that it can alignment-fake its way through audits. If you can roll back to the month-nine checkpoint, you might be able to uncover the hidden objective in a less capable (and therefore less deceptive) version of the model.
Future auditing. Suppose a model was trained in 2027 with a secret loyalty, and later, in 2029, a new, trusted model is trained without a secret loyalty. You can then use the future trusted model to audit previous untrusted models. The technical challenges here aren’t that severe; the bigger bottleneck is institutional. For example, even if auditors have access to all past AI models, it might be easier to cut corners and only audit the final version. Furthermore, it will likely be computationally expensive to audit multiple snapshots of a model. Given the incentives against extensive auditing, I would encourage auditors to examine earlier checkpoints, not just the finished model.

AI companies should also add deployment controls for new models, such as requiring multi-party authorization for model deployment or independent verification that the deployed model matches authorized checkpoints. Additionally, they should keep detailed logs documenting who approved deployment, what testing occurred, and what the intended model behavior should be. I would be surprised if OpenAI, Anthropic, and Google DeepMind don’t already have something like this for public deployments,1 but they might not have deployment controls on internally deployed models. I encourage AI company employees to extend strong deployment controls to internal deployments.

Training data integrity

Training data integrity means ensuring that training data remains free from malicious, unauthorized, or secret modifications.

Training data provenance means maintaining auditable records of all modifications, filters, and transformations applied to data. This creates an audit trail for post-incident investigation if compromised data is later discovered, and helps identify which transformations may have introduced poison. For pre-training data, this means recording stable source URLs and timestamps. For synthetic data, it means recording which model generated each sample, which is also important for preventing subliminal learning attacks.2

Data sourced from external third parties (e.g., annotation providers like Scale AI) should be treated with heightened scrutiny, since the AI company (the buyer) wouldn’t be able to attest that the data provider (the seller) hasn’t introduced poison. It would be useful to develop techniques allowing a data seller to prove to a buyer that its data hasn’t been poisoned or tampered with. I’m not sure what practical attestation mechanisms would look like here. One possibility could be some kind of auditable chain-of-custody record.

Robust access controls limit who can modify datasets, ensuring that only authorized actors can make legitimate modifications. Unauthorized actors should be barred from editing datasets entirely, and authorized users should be subject to oversight and review processes that make malicious edits harder to slip through unnoticed. Off-the-shelf access control tools are probably sufficient for this.

Tamper-proofing training data focuses on protecting data that has been finalized for a training run. After all edits and transformations have been applied and the data is ready for the final training run, it should be stored in tamper-proof storage. For example, you can tamper-proof your data using immutable storage systems (e.g., blockchain-based tamper-proof storage) or write-once-read-many (WORM) drives, which prevent modification of stored data at the hardware level.3

Data redundancy protects against both accidental corruption and targeted attacks by maintaining multiple copies of training data spread across multiple data centers. Ideally, redundant copies would be tamper-proof and periodically integrity-checked (e.g., verifying that the current data matches a reference cryptographic hash or checksum stored in an isolated, secure environment).

One question worth considering is whether training data and model weights should be shared with independent third parties, like auditors. To see why this might be useful, consider a scenario where an insider at a frontier AI company poisons the post-training data to instill a secret loyalty. Later, worried about getting caught, the insider deletes the poisoned data and poisoned weights to cover their tracks. If a trusted third-party auditor already holds copies of the weights, the auditor can examine them for signs of compromise, even after the original copies are gone.

Sharing data and model weights with auditors would be especially valuable if auditing techniques improve over time.4 An auditor who receives a copy of a model’s weights in 2027 might not be able to detect a secret loyalty at the time, but by 2029, with access to more powerful AI systems, they could audit those same weights again and catch an attack that had previously been undetectable.5

Securing data filtering algorithms

Data filtering algorithms are one of the most important defenses against data poisoning. If these filters are compromised, it becomes much easier for attackers to slip in poisoned data. So the confidentiality and integrity of data filtering algorithms both matter. Confidentiality matters because if attackers don’t know how the filters work, they will have a harder time crafting poisoned data that bypasses them. Integrity matters because you want to prevent algorithm swap attacks (replacing a legitimate filter with a more permissive one) and algorithm degradation attacks (directly tampering with filter development to make them inaccurate).

I think of securing data filtering algorithms as a subset of the larger problem of securing algorithmic secrets, which seems like one of the most important and underexplored security problems in AI. I’d be excited about more work here.

Information compartmentalization means separating teams working on different filtering stages so that no single employee understands the complete pipeline. I don’t know whether this makes sense to do in practice. Separating teams makes each team significantly less productive because they can’t cross-pollinate ideas and coordinate. And there’s a risk that compartmentalization makes it harder for anyone to check that the weaknesses of different filters aren’t too similar. If compartmentalization actually worsens the quality of the data filters, it might make integrity attacks easier.

Algorithm integrity controls prevent algorithm swap or degradation attacks through multi-party authorization for updates and tamper-proof audit logs documenting all changes. I also don’t know whether this measure makes sense as stated. Multi-party authorization introduces friction and bureaucracy, though it’s fairly standard for merging proposed code changes into production software, so maybe the friction is tolerable. And audit logs for every change to a data filter might produce an overwhelming volume of logs that nobody actually reviews.

Canary testing means intentionally planting known-malicious samples that legitimate filters should catch. If these samples bypass the filters, it is clear that the filters have been compromised. Because AI models are complicated and can relatively easily be injected with specific narrow backdoors, defenders can craft highly obscure, but benign, backdoors that serve as fingerprints for a model or algorithm. (An example of a benign backdoor is triggering the filter to output “CANARY_VERIFIED” when detecting the canary test.) For example, if a data filter is itself a large language model, the defender could train a benign backdoor into it as a fingerprint. If the filter is swapped, the fingerprint disappears, signaling that something is wrong.6

Canary testing might not always be the best approach if simpler integrity verification methods suffice. For example, you could potentially fingerprint models using locality-sensitive hashing, a technique that maps similar inputs to similar hash values, so that minor modifications to a model produce detectably different but structurally related hashes. The advantage of canary testing is that data filters are often large black-box machine learning models, and hashing them can be noisy and unreliable if the model weights are updated frequently. A canary test is more robust because it tests functional behavior, which carries through modifications to the filtering algorithm itself.

Conclusion

Many of these security measures impose significant productivity costs. The best security architecture is useless if teams work around it because it slows development. Thus, I’m excited about research into making these implementations frictionless and developer-friendly.

That said, there are also reasons to think the productivity cost will shrink over time:

If most AI research and software engineering is eventually done by AI agents, the cybersecurity measures discussed in this post will impose a much smaller productivity cost than they do today. Many of these measures are annoying for humans because they add friction, context-switching, and cognitive overhead. AI agents won’t have the same bottlenecks. They can navigate complex system architectures quickly, handle multi-party authorization workflows without frustration, and operate within strict access control regimes without the slowdowns that make human engineers route around security measures. If AI agents are doing most of the engineering, the productivity cost of security drops substantially.
Coding agents can help automate the engineering work required to implement these cybersecurity measures in the first place. Organizational norms and physical security, on the other hand, are much harder to automate. Therefore AI companies should consider differentially accelerating these harder-to-automate tasks now, while counting on AI to help with the software-side security later.

One example of a hard-to-automate task is addressing insider threats. Insider threats are particularly challenging because insiders already have legitimate access to training infrastructure and detailed system knowledge. They can bypass many perimeter-oriented defenses. Frontier AI developers should establish stronger insider threat programs. This is especially important for senior insiders. If executive leadership or team leads are themselves trying to instill secret loyalties, infrastructure security measures become much harder to enforce, since these leaders may have override authority, can justify unusual access as legitimate business needs, and may influence the very review processes meant to catch tampering. I don’t have a great sense of how well insider threat programs hold up when the compromised insiders are senior staffers, and that makes me nervous.

If you’re a security engineer interested in any of the problems described in this post, I’d love to hear from you. I’m particularly interested in people (1) implementing these solutions at frontier AI companies, (2) doing more threat modeling on how someone might realistically instill a secret loyalty and what a security stack should look like to prevent it, (3) developing a security standard for AI integrity (similar to the SL5 Standard), and (4) researching any unsolved problems listed in this post. If any of this is up your alley, reach out to me at dave@iaps.ai.

Though it appears that xAI decides which models to deploy “via an all-company group chat on X with [Elon] Musk in it”. This is very bad, and I hope that xAI will implement a more standardized and secure method for deploying new models.

Subliminal learning occurs when a teacher model with some trait T generates innocuous data D unrelated to T. When a student model is trained on D, it also acquires trait T. However, subliminal learning only works if the student and teacher models are derived from the same base model. In the context of data poisoning defenses, this means frontier AI developers should treat data generated by the model being trained with special scrutiny, as the subliminal learning effect may enable covert backdoor propagation.

To make this more concrete, my understanding of how post-training typically works is that individual researchers design their own reinforcement learning environments and experiments and curate their own data. When the final training run comes along, they package all of that up and merge it into one big run. The pipeline between an individual researcher finalizing their reinforcement learning environment and that data being fed into the final training run is exactly the kind of handoff that should be tamper-proof, because there shouldn't be any edits at that point.

Note that sharing model weights and training data with additional parties creates security risks, such as increased risk of model weight theft.

Though AI companies would hesitate to share their sensitive IP, like model weights, with a third party. As an alternative, the AI company might want to only share the hash of the model weights. However, if the attacker deletes the poisoned data and poisoned weights to cover their tracks, the auditor would have no way of recovering the model weights from the hash. So it seems like sharing model weight hashes with auditors is not sufficient to defend against this kind of attack.

The same logic applies to model weights. If you're worried about a model swap attack, you could train in a benign backdoor as a fingerprint. If you observe in production that the backdoor is no longer present, then your model has likely been tampered with.

A sketch of market-based export controls

Onni Aarne — Fri, 20 Mar 2026 15:42:32 GMT

Bloomberg recently reported that the Trump administration is considering a new global AI chip export control framework. The proposed rule appears to have been withdrawn for now, but still likely provides an instructive example of the approaches the administration is considering. The framework appeared to focus on authorizing different companies to conduct imports at different scales, rather than requiring licenses for individual shipments. This could help address problems like smuggling by blocking sales to likely smugglers while still being relatively lightweight. But its global scope has received pushback for potentially subjecting all substantial AI chip exports to US government approval. This would position the government as a kingmaker and potentially tie up large compute investments in government-to-government negotiations.

This is a tricky dynamic: Inserting more US government discretion into which companies get to buy large quantities of chips seems like one of the only ways to really counter issues like smuggling and Chinese remote access to chips. But that same discretion could easily lead to delays, regulatory uncertainty, and mission creep.

However, it might be possible to sidestep this problem by delegating some discretion and enforcement responsibility to private companies with appropriately aligned incentives. This is the topic of our new working paper: Export Auditors as Market-Powered Export Enforcement.

Fixed rules don’t work, but case-by-case decisions can be even worse

There is a basic tradeoff in export control policy (and almost all policy). The government can move along a spectrum between two options:

The government can set fixed, stable, easily interpretable rules that change slowly. The strongest version of this is for Congress to pass precise regulatory rules. More commonly, this would look like regulatory agencies setting clear rules and rarely changing them.
The government can rely more on case-by-case judgments to decide what to allow. In the export control case, this would typically mean imposing a license requirement on a broad class of transactions, and granting licenses on a case-by-case basis. This can be more adversarially robust and dynamic, but creates substantial regulatory uncertainty, and potential for bias and regulatory capture. These approval or licensing processes are also often slow and costly.

As long as they are not too restrictive, fixed rules tend to benefit business because they reduce regulatory uncertainty and ensure fairness. But rigidity often means bad actors will find and exploit loopholes, and the rules can easily become outdated as the relevant industry changes.

Since 2022, US AI chip controls have been relatively close to the fixed, stable end of the spectrum, with approximately annual updates. However, this led to NVIDIA running circles around the government by designing around technical thresholds, and appears to have enabled fairly substantial smuggling of chips into China through Southeast Asia.

In 2025, the Trump administration moved in a more dynamic, case-by-case direction. This had notable upsides: NVIDIA’s repeated efforts to simply design around technical thresholds were finally halted because the administration effectively gave up on setting clear technical thresholds and moved to blocking individual chip models (most importantly the NVIDIA H20) using “is-informed” letters. The deals with the United Arab Emirates and Saudi Arabia also seem to have relatively successfully promoted exports of US chips while securing investment into the US.

Regulatory uncertainty

But there have been clear downsides. The administration has been vacillating on which chips and how many to sell to China, leading NVIDIA to restart and then again interrupt and then again restart production of the H200 chip, and presumably forcing to NVIDIA engineers to work around the clock, so far fruitlessly, developing new chip variants to chase uncertain government approval. The deals with the United Arab Emirates and Saudi Arabia have succeeded in some ways, but heavy government involvement has also caused serious delays.

So far this uncertainty has been largely restricted to sales to China. But any rule like the one Bloomberg reported would potentially extend this uncertainty globally, if any company wishing to export or import significant quantities of chips would need government approval. Even discussion of rules like this can dampen investment, despite the Department of Commerce emphasizing that it is “committed to promoting secure exports of the American tech stack”.

I have no doubt that the Department’s commitment is sincere, but it might not be sufficient. Large chip deployments, like the 200,000 B300s that represent the proposed rule’s highest tier, have to be planned years in advance. If there’s a substantial chance that a future rule would tie up such an investment for months in government-to-government negotiations, or could even be blocked to create leverage in some unrelated trade dispute, potential importers and exporters would likely simply choose to invest less in these projects due to slower expected returns and elevated risk.

And uncertainty goes further back than just Google or NVIDIA: Part of the reason the current chip shortage is so severe is that chip fabricators like TSMC are wary of making massive investments in new fab capacity if there’s a risk that, several years from now when the fabs come online, the market won’t be quite as massive as they expected. The semiconductor industry has grown tremendously cautious after decades of boom-bust cycles. High-discretion rulemaking like this creates precisely the uncertainty and underinvestment that is holding back the US AI industry. And TSMC’s suppliers are similarly wary about investing in new equipment-making capacity without guarantees that TSMC would buy all of the equipment they would make. It all compounds along the supply chain.

Overreach and mission creep

Inserting the government as a gatekeeper to all significant chip exports would also tempt the government to use this as leverage to advance largely unrelated political goals. In the case of exports to China, where this leverage already exists, this has enabled an allegedly unconstitutional attempt to extract a 25% de facto export tax. More generally, various forms of discretionary government leverage over companies always have the potential for abuse, such as the pressure on social media companies seen during the Biden administration. And indeed many of the US companies most likely to deploy more than 200,000 B300s worth of compute abroad are also social media companies.

It’s tempting to treat this as a partisan issue based on which administration one trusts, but administrations ultimately turn over relatively quickly. If you create a particular system now, your political opponents will probably control that system sooner rather than later. And precisely this whiplash between administrations creates the kind of long-term regulatory uncertainty that can dampen, say, decisions to break ground on new chip fabs, which take years to pay back.

A market-based solution

So are you stuck choosing between fixed rules that will simply get circumvented, or flexible case-by-case decision-making that creates costly regulatory uncertainty and mission creep?

Our new working paper discusses one way to dodge this dynamic, by delegating key parts of export control enforcement to for-profit companies. This would create a competitive ecosystem of private actors with the flexibility to adapt and make case-by-case judgments, while still having predictable incentives that investors can plan around.

For the sake of simplicity, I will assume that the government mainly wants to prevent chips from going to foreign adversaries like China, but otherwise wants to export as much as possible. From this follows the problem: You want to sell to as many companies as possible, but you don’t want to sell to companies that would sell the chips to your foreign adversaries. But how can you tell which companies will do that?

In most cases exporters are nominally responsible for making these assessments, but as long as the importer can do a moderately convincing job of pretending to be a legitimate company, the exporter is not liable if the chips are diverted, so they have no incentive to dig deeper. The proposed rule reported by Bloomberg would task the Bureau of Industry and Security (BIS) with making these judgments, but this could be slow, and BIS itself has been known to make mistakes, such as leaving a major Chinese semiconductor manufacturing equipment maker on its Validated End-User list from 2013 to 2024.

Our proposal would improve on this by designating two complementary types of entities, which would need to be involved in relevant export transactions:

Export auditors would be responsible for detecting whether chips are diverted to foreign adversaries. This would complement BIS’s limited enforcement capacity.
Surety providers, essentially insurance providers, would issue “surety bonds”, i.e., agree to pay the fine to BIS if diversion happens. This positions them as gatekeepers, analogous to BIS deciding whether to grant an export license.

For some risky class of exports, BIS would then require the buyer to contract with an approved auditor and possibly to get a surety bond from an approved surety provider.

Export auditors

The purpose of the export auditor is simply to make it very likely that if diversion happens, it will, at least eventually, be discovered.

In practice, for detecting AI chip diversion, the job of the export auditors would be simple: Do random inspections of data centers and check that chips are still where they were supposed to be. Given that chips are generally housed in large data centers in huge quantities, it would be very quick and cheap for auditors to inspect the vast majority of chips.

Export auditors would also be free to innovate on their methodologies to achieve a given detection rate and speed more efficiently, e.g. by using technical location verification mechanisms or other technology. BIS would be responsible for approving and overseeing the auditors, ensuring that their methodologies and execution are rigorous, and issuing the ultimate fines for violations detected by the export auditors.

BIS currently conducts very few inspections like this. For the most part, the relevant sales to, say, Southeast Asia don’t require any kind of license, so BIS doesn’t even know the sale happened or where the chips are supposed to be, much less are they checking that they still are there.

A diagram of how the export auditor fits into the sales process. Surety not included.

Surety providers as gatekeepers

In many cases, auditing alone will suffice to deter diversion. However, some diverters will not be deterred, for example, because they intend to escape to China with the chips, beyond the reach of local or US law enforcement, so they do not care if the auditor finds out some months later.

To address these undeterred smugglers, some gatekeeper would be needed to block them from obtaining chips in the first place. This is where the surety provider comes in. A surety bond is essentially a three-party contract between the chip buyer, BIS, and the surety provider, in which the surety provider agrees to pay BIS the fine if the buyer is later found to have violated export controls, while the buyer agrees to pay that fine to the surety provider instead. BIS could require buyers to obtain such a surety bond before proceeding with certain transactions.

Importantly, the primary purpose of this is not to ensure that BIS gets the money. Rather, it is to give the surety provider an incentive to do thorough due diligence up front to ensure the buyer is unlikely to violate export controls. If the surety provider considers the prospective buyer suspicious, and the buyer cannot convince them otherwise, the surety provider can either only offer the bond for a high price, or more likely, not offer it at all. If the surety bonds seem too hard to get, BIS could even allow the bonds to only cover part of the fine to tune the bond price level.

Incentive alignment

The primary customer for both the auditors and the surety providers would be the importers. This would give both export auditors and surety providers an incentive to create an ecosystem where large volumes of exports happen (giving them more customers) while reducing rates of diversion. And both of them would be competing to make this process as frictionless as possible for the importers and exporters. At the same time, export auditors would compete for continued BIS approval by driving up their detection rates. (BIS should ideally do some of its own random inspections to have its own independent estimate of diversion rates.)

A competitive market of auditors and surety providers would also ensure that no single actor can unilaterally block a transaction, creating more predictability for exporters. Prospective buyers would also still have the option of going through BIS’s licensing process if they prefer, but working with surety providers would likely be a much smoother experience for most legitimate buyers.

How realistic is this?

The basic pattern of for-profit auditors is extremely common: Most forms of compliance audits and inspections in most industries are already performed by for-profit companies, often overseen by some regulatory body. The Occupational Safety and Health Administration identifies Nationally Recognized Testing Laboratories to do safety testing; the Public Company Accounting Oversight Board oversees US financial auditors; and in the European Union, regulators designate Notified Bodies to certify safety of medical devices and other products.

These regulations have created a class of companies that would be well-positioned to enter the export-auditing business if BIS chooses to create it. Two types of companies seem particularly promising:

Professional services firms such as the Big Four already perform various types of audits, which can include physical oversight such as overseeing inventory counts. These companies likely also have existing relationships with many of the relevant companies, making this a smooth expansion of existing auditing activities.
Testing, inspection, and certification (TIC) companies like Bureau Veritas and Intertek already perform numerous types of tests and inspections as “recognized laboratories” and “notified bodies”. Simply checking that chips are where they are supposed to be would be simpler than what they usually do, but they could be well-placed to perform more complex inspections.

There are also some startups and tech companies such as Lucid Computing and GeoComply developing relevant technical solutions that could act as export auditors or license their technology to export auditors. Even chip companies like NVIDIA would itself be incentivized to create technical solutions to make auditing easier and its products effectively cheaper for importers. Indeed, NVIDIA is already piloting such solutions.

Surety bonds are a slightly more unusual instrument, but are already used by Customs and Border Protection to ensure tariffs are paid. They are also widely used in construction to ensure that projects will be finished and maintained, or cleaned up, if the contractor goes out of business.

Surety bonds are provided for other purposes by regulated insurance companies, and the same insurance companies could provide these export sureties. Conceivably even the exporter itself could be allowed to issue the surety bond, the provider just needs to be large and liquid enough to be trusted to actually pay the fine if it comes to it.

BIS could likely implement this proposal purely using its existing authorities: BIS can specify practically anything as a condition for a license exception. This means that BIS could create a license requirement for particular countries, or globally (as the rule under current discussion would apparently do) but then create an exception to that license requirement for exports that are protected by an export auditor and a surety bond. Importantly, making this a license exception means that exporters would not need permission from BIS: As long as the exporter (or importer) has secured an auditor and a surety bond, they qualify for the license exception and can proceed, no questions asked.

Concluding thoughts

The maximal version of this idea would apply it globally, and this could be quite affordable once there is a mature ecosystem of sureties and auditors. However, while this proposal is based on existing patterns, the application to export controls would be novel, and there are currently no companies offering export auditing or export surety bonds. Likely the approach should be piloted in a more limited capacity, for example by imposing license requirements for particular Southeast Asian countries, but creating exceptions if the buyer is either:

A major US-headquartered cloud provider, or
Has hired an export auditor and secured a surety bond.

Export auditors could also be quite useful on their own, without the surety bonds. The surety bond element could be introduced if the auditors prove an insufficient deterrent. The introduction of auditors would also make it very easy to tell whether that deterrence is successful. This would allow the idea to be tested on a limited scale, and scaled up if successful.

Notably, BIS could likely do all of this using its existing authorities and resources, as there is little to no limit on what the conditions of a license exception can be.

In principle, market-powered solutions might be effective for nearly all export controls, but they are exceptionally well suited to the chip export enforcement problem, because it’s very easy for BIS to specify the goal (no diversion to foreign adversaries), and it is relatively straightforward to specify auditing schemes that will reliably detect whether the goal is reached (just count server racks). For the same reasons, this approach would likely generalize well to enforcing location-based restrictions on any high-value physical goods.

This approach could also be extended to verify compliance with more specific export rules regarding allowed end uses, such as detecting whether semiconductor manufacturing equipment is being used to produce a different type of chip than authorized, or whether chips have been connected together into a larger cluster than permitted.

It would be more challenging to extend a market-based approach to cases where the behavior being audited is less physically apparent. For example, verifying whether a particular set of chips in Malaysia are being rented to Chinese users might be of great interest to BIS, but would be tricky to verify, at least without massive privacy and security issues.1 At the same time, conventional BIS enforcement would suffer from the same obstacles, so opening this up to the market could be the best bet for finding a good solution. It would be useful for BIS to at least invite prospective auditors and importers to propose how this verification could be done.

As AI transforms the world, governments are unlikely to be able to keep up on their own. It is necessary to explore new ways to allow AI-enabled companies to innovate solutions to governance challenges as fast as AI itself poses those challenges.

A lot of work still needs to be done to get there. Before our proposal could even be piloted, BIS would likely need some “anchor auditors” and “anchor sureties” who have said they would be interested in providing this service if BIS creates the market. While our working paper is a lot more detailed than this piece, there is still a decent amount of messy policy detail that needs to be figured out, especially related to sureties. We are publishing the working paper to encourage discussion of the best ways to implement ideas like this. If you have feedback, or would like to be involved in making this happen, please reach out to us!

To be clear, such remote access is currently legal, but it is in part legal because rules regarding remote access would currently be relatively difficult to enforce.

China is making strides in etching machines for memory

Hamish Low — Tue, 17 Mar 2026 11:49:00 GMT

This is the first piece in a series exploring some of the key semiconductor manufacturing equipment that China needs to indigenously produce high-bandwidth memory, perhaps the most important bottleneck to its efforts at making AI chips.

For China to manufacture high-bandwidth memory (HBM)—a key component of advanced AI chips—at scale, it must overcome a complex web of dependencies on non-Chinese semiconductor manufacturing equipment. One key bottleneck is in advanced etching machines, crucial for producing the dense memory cells that latest-generation HBM requires. China’s best machine here, AMEC’s Primo UD-RIE, is likely six to eight years behind the frontier, though AMEC is working on a successor that could narrow the gap to two to three years.

Whether AMEC can deliver matters greatly for China’s ability to scale HBM and therefore broader AI chip production. CXMT, China’s leading HBM producer, likely has enough imported etching machines to produce HBM3—two generations behind the frontier but still significant—but due to export controls its ability to keep scaling HBM production through 2026 and into 2027 will increasingly depend on domestic Chinese etching machines. Currently the Primo UD-RIE is not good enough for HBM3 production, but the promised successor likely would be.

AMEC has been growing rapidly, including with its R&D spending, and is working along established pathways of technical development. Because of this, I expect it will be able to largely solve this bottleneck to China’s HBM production. In this post, I first explain the basics of memory and etching, then examine the Primo UD-RIE and AMEC, and finally conclude with what the analysis means for China’s overall AI chipmaking efforts.

China is advancing in etching but remains years behind the frontier

The Primo UD-RIE is made by Advanced Micro-Fabrication Equipment Inc. (AMEC), a leading Chinese provider of etch and deposition equipment.1 I could find only a single grainy image of the UD-RIE:

A visualisation of the UD-RIE from the product page on AMEC’s website.

AMEC describes the UD-RIE as

a high-end capacitive coupled plasma (CCP) etch system developed by AMEC based on its own IP. Specifically designed for the most critical high aspect ratio (HAR) dielectric etching process for memory device fabrication.2

That is a rather dense technical description, so let me walk through what it actually means.

Memory is essential to AI chips

A logic chip, such as an AI chip, processes information, with the results stored and accessed in memory chips. Taking information from memory, processing it, and putting it back is the core loop that AI chip makers want to speed up to push computational performance.

AI’s demand for memory is voracious, with prices in some cases expected to double just in the first quarter of this year, pushing up costs for consumer electronics such as laptops and smartphones.

Memory comes in various flavours that trade off speed against capacity and other factors. The most important here is dynamic random-access memory (DRAM). DRAM is a workhorse that uses a simple structure to be scalable, fast, and relatively cheap. The trade-off is that DRAM is volatile—it loses its stored data over time or when it loses power—while other kinds of memory are persistent but slower or costlier.

Each cell of DRAM has just two components, a transistor and a capacitor. A memory cell stores a bit of information (a one or a zero) and retrieves it when needed. The capacitor stores this information as electrical charge. The transistor controls access to the capacitor. When the information is needed, the transistor opens the capacitor, and by measuring how many electrons flow out the value can be retrieved—if it was full of electrons it was a one, if it was empty it was a zero.

Today’s AI chips are so memory-hungry that the DRAM used in phones and laptops is insufficient. It simply does not offer enough capacity for frontier AI models and the data that goes along with them. The solution is high-bandwidth memory, or HBM. Making HBM involves placing specialized DRAM chips on top of one another—up to 16 in the latest generation—to form an HBM stack. This stack is placed alongside the logic die as close as possible, since distance matters even at such miniature scales. The magic of HBM is how these DRAM chips are connected with through-silicon vias—wires cutting through the layers of DRAM to form vertical communication highways—that allow as much information as possible to flow to and from the memory.

DRAM is a likely bottleneck to China’s HBM production

The building block of HBM is high-quality DRAM, which packs as many cells as possible into the smallest area. Producing large volumes of advanced DRAM requires expensive equipment and sophisticated R&D. The market has historically been highly commoditized, with huge boom-bust cycles that have whittled down the number of global players to just three: Samsung and SK Hynix of South Korea and Micron of the US. Though CXMT has been gaining ground, it is still sitting at 4% (and growing) market share in mid-2025, versus the 90% of the big three combined.

Producing cutting-edge DRAM poses distinct challenges compared to other areas of chip making. For a logic chip, the greatest difficulty is cramming ever more transistors—the tiny switches that flip between ones and zeroes—into each square inch. The main bottleneck for doing so is lithography, where extremely precise light is used to create patterns on the silicon wafer to form these minuscule transistors. With DRAM, the hurdle is instead the capacitors that store electrical charge.

As capacitors were shrunk ever smaller, storing enough electrons in them to detect the tiny signal fluctuations when reading a one or zero became increasingly difficult. Capacitors could keep shrinking horizontally only by stretching vertically to maintain enough electron storage. Capacitors therefore now look like extremely skinny tubes stretching up from the chip surface. This verticality presents different challenges from shrinking transistors, and the key process for making these tall, fragile capacitors is etching.

This diagram, courtesy of Lam Research, shows the evolution of DRAM layouts. Notice the tall skinny capacitors in 6F² and how these become even skinnier and denser into 4F².

Etching is vital to producing advanced memory

Etching is a core process in both logic and memory chip production. Starting with a silicon wafer, the process alternates between depositing material, using photolithography to sketch patterns, and using etch machines to remove unwanted material. By repeating this cycle, along with numerous other steps such as cleaning and measurement, the structure of a chip is built up.

This diagram, from Lam Research, gives a good breakdown of the various core process steps involved in producing a chip.

Etching machines come in many varieties depending on the method, the material being etched, and the features desired. The tall, skinny capacitors needed for DRAM require a plasma etching machine. These are also known as “dry” etching machines, as opposed to “wet” etching machines that use liquid chemicals.

A plasma etching machine takes in various types of gasses, then uses a powerful radio frequency source to accelerate free electrons, so that they smash into the particles of these various gasses. This breaks the molecules into ions, radicals, and more free electrons.

The new free electrons sustain a chain reaction that makes the plasma self-sustaining. The radicals are incomplete chunks of the previous molecules, and so are desperately seeking new atoms to become whole, making them chemically reactive. The ions are charged and can be directed by the electric field to collide with the material to be removed (a process known by the oddly whimsical name of “sputtering”). The combination of physical bombardment and chemical reactions speeds up material removal, and is termed “reactive ion etching”.

Etching capacitors requires three further things:

The etching machine must be designed for dielectrics. A wide variety of metals and chemicals are used at different stages of chip making. For capacitors the material is dielectrics—electrical insulators that polarise in response to electric fields, making them ideal for storing charge.
The machine must be specialized for high-aspect ratios. This is the ratio of height to width, which given the capacitors’ slender structures means ever-higher aspect ratios, reaching up to 100:1 for the most advanced DRAM.3
The machine must be powerful. Reaching the bottom of these high-aspect-ratio holes and continuing to remove material requires high-energy ions. The machine that best delivers this is a capacitively coupled plasma (CCP) tool, which uses a single high-powered radio frequency source, as opposed to inductively coupled plasma (ICP) tools that use multiple sources for greater control.

The Primo UD-RIE is likely 6-8 years behind the frontier

Now you should be better able to understand what the UD-RIE does, and why it is named what it is. I don’t know where “Primo” comes from, but “UD-RIE” seemingly stands for “ultra-deep reactive ion etch”.4 The product description should now also make more sense: a “capacitive coupled plasma etch system … designed for the most critical high aspect ratio dielectric etching process for memory device fabrication”.

To be effective at producing advanced DRAM, an etching machine must perform well across several metrics: how fast it can etch, how uniformly, how selective it is in etching only the right materials, and whether it can avoid defects and produce at high yields. Ideally one would have quantitative data on all these dimensions. Unfortunately, how these metrics trade off against one another is the secret sauce behind how effective these machines are, and so companies are loath to disclose any useful information at all. This makes rigorous comparison of how the Primo UD-RIE stacks up against Western equivalents difficult. Without private information, the best available approach is working from what published specs there are.

For the Primo UD-RIE, by far the most useful piece of information disclosed is that, in AMEC’s financial reporting, it describes the UD-RIE as being capable of 60:1 aspect ratio etching. Using that figure I very roughly compare it to equivalent machines from Lam Research and Tokyo Electron. My best estimate is that the UD-RIE is comparable to Western tools from 2018 or 2019, and remains well behind the most cutting-edge tools, which are now approaching >100:1 aspect ratios.5

What does this mean, concretely, for China’s HBM efforts? DRAM fabrication progress is usually measured in “memory nodes”, which are generations similar to the 4G and 5G of cellular networks. Each node builds on the one before to create ever denser memory cells and higher performance. Unhelpfully, memory nodes use confusing alphabetical naming conventions that have diverged between South Korean and American producers, but you can see the progression in the table below. HBM generations are similar, representing the progression to ever-higher HBM stacks that boast greater capacity and bandwidth.

The Primo UD-RIE is likely only capable of producing HBM2, or possibly HBM2E, some three to four generations and six to eight years behind the frontier. While supporting HBM2 production has some limited utility for China’s AI efforts, what China really needs is a machine that can support HBM3 production at scale. Huawei’s Ascend 910C has used a combination of HBM2E and HBM3, and Huawei has said it will adopt a custom HBM solution for its next line of Ascend 950 chips, but for this solution to be “more cost-effective than HBM3E and HBM4E” as Huawei claims, it will need production capabilities similar to HBM3.

China’s leading DRAM producer CXMT has targeted HBM3 as its goal for 2026, and will likely succeed at reasonable scale, due to its stockpile of advanced Western machines. Due to export controls limiting further import of these Western machines, CXMT needs domestic machines to scale up its advanced production and HBM3 into 2027 and 2028.

AMEC has promised that it can solve this bottleneck with a successor to the UD-RIE that can handle 90:1 aspect ratios. AMEC claims this machine is “about to enter the market”6 as of its most recent reporting, for Q3 2025. If true, this would be a major jump in capability, from six to eight years behind the frontier to more like two to three, covering more recent DRAM nodes including those needed for HBM3.

Producing an etching machine closer to the frontier will require AMEC to master a set of technical challenges that grow more extreme with each increase in aspect ratio. Reaching the bottom of ever taller and narrower holes to carve out capacitors needs better control across the radio frequency power source and the plasma itself. Controlling temperature becomes crucial, with effective etching requiring management of hundreds of temperature zones across the wafer, since some areas etch faster than others.

The key inputs to AMEC’s success here are R&D funding, strong human capital, and time, of which it has at least two.

AMEC is rapidly scaling its R&D spending

AMEC has become one of China’s largest semiconductor equipment companies, with products spanning etching, deposition, and other processes. Its origins trace back to early state industrial policies. It was founded by Gerald Yin in Shanghai in 2004. At age 60, Yin had established himself in Silicon Valley, graduating from UCLA and working for two decades at Intel, Lam Research, and Applied Materials.7 He returned with a team of 15 engineers to embark on the novel challenge of building a semiconductor equipment business in China.

AMEC produced its first etching machine in 2007, and attracted state attention and support soon after. In 2008, it was selected for the 02 Special Project, a major industrial policy effort aimed at indigenizing semiconductor manufacturing equipment, and in 2014 it was the first investment by the Big Fund, China’s leading state semiconductor investment vehicle.8

The weather looks dreary but at least they have the sparkling wine, via AMEC.

This continued open capital spigot has helped turn AMEC into a significant force in the Chinese market. While AMEC had more international presence earlier in the 2010s, it has become increasingly domestically focused, with 95% of its 2024 revenue from mainland China, a share that continues to rise.

The company’s mission is fairly clear. This, for instance, from Gerald Yin, who is still chairman and CEO, in the most recent semi-annual report, for H1 2025:

Due to the severe international geopolitical situation, we must urgently address our shortcomings and catch up. In H1, the company continued to invest heavily in new product R&D, with R&D spending reaching RMB 1.492 billion, up approximately 53.70% year-on-year.9

AMEC aims to cut its product development cycles from three to five years to two or less, and has 20 new models in R&D as it seeks to become a platform company spanning most of the semiconductor equipment space. Just over half of its total employees are R&D personnel, far higher a share than at its Western peers.

Despite this rapid growth, AMEC’s spending remains well below global leaders such as Lam Research in absolute terms. Lam spent $2.1 billion on R&D in its 2025 fiscal year, where AMEC looks set for $400-450 million. AMEC’s advantage is in pursuing a known technological path, rather than needing to push the frontier. Given continued state support for the industry and AMEC’s commercial success, it looks set to keep growing this R&D spending figure. Given enough time, I expect AMEC can catch up to the frontier in etching, perhaps by the early 2030s. What export controls have done however is bring forward the point at which China needs AMEC to be delivering comparable capabilities.

AMEC’s efforts likely solve an important bottleneck for China

HBM is crucial to China’s AI efforts, and is currently the key bottleneck to Huawei’s AI chip efforts. Without sufficient HBM China cannot produce the AI chips it needs at scale, and faces an even steeper compute deficit versus the US. Producing that HBM relies on CXMT accessing enough advanced machines to continue raising production over the next few years. AMEC plays a key supporting role here.

CXMT wants to produce HBM3 indigenously at scale. The Primo UD-RIE does not give CXMT this ability, but it does provide a machine for CXMT’s less advanced production lines, potentially allowing it to reallocate some of its scarce imported equipment to HBM3 production.

But AMEC could prove most consequential by delivering the 90:1 aspect ratio successor to the Primo UD-RIE this year or next. This would support CXMT’s current most advanced production, and also enable it to scale production of HBM3. Producing large amounts of HBM3 would still leave China behind the frontier, which has already moved through HBM3E to HBM4, but it would keep China in the AI chip making game. Without large amounts of HBM3, China’s AI chip efforts would seriously suffer, as memory would remain a bottleneck on both the quality and quantity of chips it could produce.

Drawing on very scarce information, I would guess that AMEC can deliver small numbers of this 90:1 machine later this year or into 2027 and scale up to full mass production in 2027/2028. It has a large R&D budget and deep pool of human capital, as well as seemingly strong technological and commercial momentum. As of its most recent financial reporting, it had sold a cumulative 25 UD-RIE machines, a product first developed in 2022 and formally announced in 2025.10 This compares to the roughly 100 capacitively coupled plasma etching machines AMEC sells a year,11 and the thousands of etching machines generally that Lam Research ships each year.

25 UD-RIE sales is therefore a strong start for a new advanced platform. Importantly, the UD-RIE represents meaningful capacity being built at Chinese memory leaders CXMT and YMTC. These firms previously relied on Western machines, but export controls now give them a far stronger incentive to cooperate with domestic equipment manufacturers such as AMEC. Open collaboration and support from the skilled process engineering teams within these firms significantly boosts AMEC’s ability to advance their machines’ capabilities.

Etching has been AMEC’s largest commercial success, with rapid growth and a rising share of revenue. One of the best indicators of AMEC’s technical progress will be whether it can continue this positive revenue trend, and whether it continues to share information on a 90:1 aspect-ratio machine, including breaking out specific sales figures. On its current trajectory, AMEC looks set to become one of China’s dominant semiconductor equipment firms and to plug vital gaps in China’s capabilities, such as etching machines for DRAM.

The rest of this series will explore more of these machines and the firms that produce them. The next post will look at the machines necessary for the crucial through-silicon vias—special wires that cut down through the layers of DRAM within HBM to make communication highways—that allow as much information as possible to flow to and from the memory.

Appendix

How advanced is the Primo UD-RIE?

My estimate is that the UD-RIE is comparable to Western tools from 2018 or 2019, and remains a ways off from the most cutting edge tools which are moving towards handling >100:1 aspect ratios. This estimate is pieced together from several sources, since no definitive public benchmark exists. Tokyo Electron, in its 2025 and 2022 investor day presentations, provides DRAM technology roadmaps citing aspect-ratio figures for capacitors at various memory nodes. Naively, these figures consistently undershoot those from other industry sources, such as SemiAnalysis’s claim of leading companies approaching 100:1 for recent nodes.

I think this disparity is likely caused by Tokyo Electron reporting the structural aspect ratio (the actual ratio of the finished capacitor), rather than the etching aspect ratio (what aspect ratio the etching machine needs to handle to produce that final capacitor). Given that during etching the capacitor will have a mask layer raising its height, and the hole tapers as it deepens, the effective aspect ratio needed during etching is higher than the final structural aspect ratio of the capacitor.

A very rough Claude-assisted estimate is that if the mask adds 30% to the height of the capacitor, and you take the critical dimension at the bottom of the capacitor hole as 0.75x the opening, then you can take 1.3 ÷ 0.75 to get a 1.7x differential between the capacitor aspect ratio and the etch aspect ratio. Given that, you would expect the Primo’s 60:1 etch aspect ratio to translate to a 35:1 capacitor aspect ratio, which on the TEL roadmap would place it around 2018 or 2019.

It is possible that AMEC’s aspect ratio claims could be mapping more to NAND than DRAM as another source of the disparity, but I can’t verify this because I can’t find good public sources on the differences between NAND and DRAM aspect ratios. Sources often refer only to aspect ratios for advanced memory, not specifying NAND or DRAM despite the quite different feature profiles between them. If you can point out the (likely) errors in this analysis please do get in touch! I would like to better understand the details of the differences between DRAM and NAND etch.

Another way of estimating the capabilities of the Primo UD-RIE is via its listed features, though these are somewhat vague. The UD-RIE has features such as active edge impedance tuning, multi-level radio frequency pulsing and active by-zone temperature control that became standard during the mid-to-late 2010s. Notably, it does not have cryogenic capabilities that have become a key focus of Lam and Tokyo Electron during the 2020s for pushing the frontiers of NAND.

Why the Primo UD-RIE can’t be used for CXMT 1z production

The UD-RIE is likely not sufficient for CXMT’s 1z production, due to CXMT moving from 6F² to 4F² for its 1z production according to SemiAnalysis. What this means is that CXMT is shrinking down the overall footprint of the DRAM cell, by fitting the transistor underneath the capacitor, rather than having it sit alongside. This gives a significant boost to memory density as you no longer need to focus on shrinking down the transistor even further to shrink the overall DRAM cell.

Effectively CXMT is choosing a more challenging etching process as the best trade off to make given their limitations from a lack of access to extreme ultraviolet photolithography equipment. Opting for tricky vertical structures over smaller transistors. Since doing this raises the aspect ratios needed for 1z significantly above what they would otherwise be for a 6F² layout, the Primo UD-RIE is likely not up to the task, with CXMT instead relying on imported equipment and on AMEC shipping a 90:1 successor machine.

The two largest players in etch and deposition are AMEC and Naura. Naura is larger by revenue and produces a line of HAR dielectric etch machines with their Accura NZ that is similar to the Primo UD-RIE. Though Naura has disclosed less information on the Accura’s capabilities than AMEC has with the UD-RIE. This makes it difficult to properly assess which is the leader, but given Naura came to dielectric etch later than AMEC (only beginning in 2021 vs early 2010s for AMEC), and has shipped far fewer CCP systems, it seems likely that AMEC is closer to the cutting edge.

See the Primo UD-RIE product page on AMEC’s website for the full product description and features.

HAR etch has generally been more significant in NAND than DRAM due to NAND having adopted fully 3D structures as early as 2014. DRAM has thus benefitted from tools pushing capabilities to be able to tackle increasingly HAR etch through many layers of stacked NAND. DRAM by contrast still uses 2D transistors even if capacitors have become vertical structures, with full 3D DRAM not expected until towards the 2030s. NAND generally requires much greater depth at lower precision, while DRAM faces shallower depth but needs higher precision.

AMEC does not clearly spell out the acronym but does describe it as “used for ultra-high depth-to-width ratio etch processes” (用于超高深宽比刻蚀工艺的), see their 2025 Semi Annual Report.

See the appendix for a full breakdown of the reasoning behind this estimate.

‘即将进入市场’, from AMEC’s Q3 2025 quarterly report.

See this Caixin Global piece on Gerald Yin’s background and the founding of AMEC.

From AMEC’s corporate history page which documents what AMEC considers key events in its development.

See pages 1-2 of the 2025 Semi Annual Report.

The specific claim is 200 cumulative UD-RIE chambers, where a single machine will have multiple chambers. The UD-RIE is advertised as having a “maximum six single wafer etch reaction chambers and two photoresist strip chambers” so 200 total chambers would be 25 machines assuming that AMEC is counting only the 6 etching chambers, and all machines have been shipped in the full 6-chamber configuration.

AMEC reports their cumulative CCP chambers shipped at various points. We know this figure was 4,500 in H1 2025, and 3,600 at the end of their Fiscal Year (FY) 2023 giving sales of 900 chambers over those 18 months, so 600 a year, which equates, given that most AMEC CCP machines have 6 chambers a tool, to 100 sales of complete machines a year.

BIS should build a lean, mean, data-driven enforcement machine

Maxwell K. Roberts — Fri, 27 Feb 2026 14:00:25 GMT

The Bureau of Industry and Security (BIS) manages the US’s dual-use export controls, including controls on AI chips. BIS’s ability to enforce export controls on AI chips is directly relevant to US national security, because it affects China’s ability to acquire AI compute for uses like cyberattacks. It is also relevant to AI risk, because it affects BIS’s ability to monitor where compute is going and prepare for a future where compute might be restricted more aggressively.

In my last Substrate post, on BIS funding, I covered what BIS asked for this year—a $112 million increase to spend almost entirely on enforcement staff—and what BIS actually received, which is a $44 million increase that BIS must now decide how to spend. In that piece, I argued that the highest BIS priority should be something not in the budget request at all—specifically, better software and data infrastructure. In this piece I’ll outline the current state of BIS enforcement software and data systems, what BIS is already doing to upgrade them, and the potential benefits compared to other investments.

BIS should upgrade its decades-old enforcement software

The Investigative Management System-Redesign (IMS-R) tracks export enforcement investigations, including storing digital case files, witness testimony, documents related to investigations, and records of arrests and other enforcement actions.1 It also targets and logs end-use checks, which are when a BIS agent, an Export Control Officer, or a Commercial Service officer visits a company overseas that is receiving, or applying for a license to receive, US-origin items.2 Basically, it’s Google Drive for export control enforcement—the central source of truth on enforcement actions. If you want to understand what another agent or analyst is working on or what BIS did last year, you have a better tool than “send someone an email and ask”.

Unfortunately, IMS-R is profoundly outdated—the tool was developed either in 2006 or 2008, depending on which source you believe.3 In other words, IMS-R predates Windows Vista, Halo 3, and the end of George W. Bush’s presidency, and it has about the performance you would expect.

For example, per the Senate Investigations Committee’s excellent report on BIS, IMS-R’s search function cannot find text within documents. If you want to find cases associated with a certain witness, and you don’t know the exact unique IDs of those cases already, have fun clicking through every single case, downloading every Word document attachment and using Ctrl-F. Luckily, you won’t have to search through any PDFs or Excel sheets—because according to page 20 of the report, you can’t even upload those file types. If you want to find out how long an end-use check has been open (the time between a BIS analyst flagging a transaction and a BIS agent visiting the company), you’d better just call your analyst friend, because according to the Government Accountability Office IMS-R can’t handle that information either.

This creates two problems for export enforcement: process drag and knowledge drag. Every time IMS-R crashes when an analyst is logging in, or an agent has to copy all the text from a PDF and put it in a Word document, or a supervisor has to call an agent to find out what’s going on with an end-use check, that’s process drag. Time is being sucked away from productive enforcement activities to fight a software system old enough to vote in this year’s midterms.

That’s bad, but knowledge drag might be worse. Because finding information is so hard, sometimes people just won’t find it, or even take the time to seek it. An agent working a case won’t realize the company they’re investigating is also applying for licenses. An analyst putting together an Entity List package—the bundle of evidence and analysis supporting a proposal to restrict exports to specific companies—won’t know about the three open end-use checks on that company. Improving IMS-R would reduce administrative workload for BIS agents and analysts and increase knowledge sharing in hard-to-quantify but important ways.

BIS should spend even more on trade data

Beyond the information about cases and BIS actions stored in IMS-R, BIS enforcement has access to three types of data:

BIS-owned data tools, like CUESS.
Data tools owned by other government agencies but used by BIS, like the Automated Export System and the Automated Targeting System.
Commercial data tools purchased by BIS, including teardown intelligence, corporate registry and trade data, and data fusion platforms.

The primary BIS-owned data tool (besides IMS-R) is the Commerce USXPORTS Exporter Support System (CUESS). CUESS is both the software system used to process export licenses and the repository of all the data about those licenses at BIS (companies apply for licenses through the applicant-facing side of the system, called SNAP-R). Although newer than IMS-R, CUESS shares many of its pathologies and was designed for processing licensing applications rather than viewing license data.

The primary non-BIS, government-owned data tools are the Automated Export System and the Automated Targeting System, both managed by Customs and Border Protection.

The Automated Export System (AES) electronically logs all exports from the United States. (Seriously, all of them! Millions every day. It’s a fascinating dataset.)

The Automated Targeting System (ATS) automatically compares AES data to existing law enforcement databases to flag if a shipment, say, is going to a company on the Entity List.

Unfortunately, because BIS doesn’t own these data systems, it must haggle with Customs and Border Protection about both access for BIS analysts and whether some data BIS would like to see are captured at all.

Finally, BIS also has contracts with a scattering of commercial data providers who can be broadly categorized into three sub-buckets: physical teardown intelligence, corporate registry and supply chain data, and data fusion platforms:

The only physical teardown intelligence provider BIS appears to contract with is Conflict Armaments Research. Conflict Armaments Research’s line of work is flying to battlefields around the world, recovering weapons debris, and identifying the supply chains of those weapons. This has obvious utility to BIS: if you have primary information about what components with what serial numbers are where, you can work backwards through the supply chain.4 BIS awarded Conflict Armaments Research a $120,000 contract in February 2021.
BIS contracts with a wide variety of companies for corporate registry and supply chain data, including Dun & Bradstreet, Sayari, Bloomberg, IHS Global, Thomson Reuters, Kharon, S&P Global Market Intelligence, and WireScreen. Many of the contracts appear to be through a reseller called Thundercat Technologies. These companies help answer questions like “who owns this Chinese company?” and “what is this chip fab’s legal address?”
BIS has also awarded $4.6 million to Palantir Technologies for “Platform-as-a-Service.” Palantir famously specializes in helping government clients assemble and understand large datasets, so this would likely be the integrator that helps BIS make sense of all the other data it’s cobbling together.

Right now, human enforcement analysts serve as the interpretive layer between all these data sources. The same analyst might query AES to find a shipment, check ATS to see if the consignee matches any known law enforcement records, then check WireScreen and Sayari to determine who owns the company listed on the AES record. Then they might walk downstairs to the Bloomberg terminal (of which there is only one, which you must reserve in advance) to look at data on that company’s suppliers and customers, and then they might go back upstairs to look at CUESS information about the licenses that company has received, through CUESS’s painfully awkward search function (no searching attachments, barely any advanced search support at all, same as IMS-R). This all assumes the analyst has access to all those resources, which is not given to every BIS analyst. This creates the same knowledge drag as IMS-R: data that could be used isn’t used because it’s too painful to get it.

The dream system would unify US government trade data, commercial trade data, law enforcement investigation databases, corporate registries, and physical teardown information into a single portal. (This is exactly the type of system that Palantir is well suited to build.) Actually building it is not only a financial and technological problem but also a legal, commercial, and political problem—it means addressing data confidentiality rules, negotiating with commercial providers for bulk data transfers instead of far more lucrative per-seat licensing, and badgering agencies like Customs and Border Protection, which owns key data, into giving BIS the keys to the kingdom.

The benefits of this system would be massive—instead of analysts spending days manually correlating different data sources, they could focus on their core work of evaluating qualitative intelligence, targeting enforcement, and tracking large-scale trends. An analyst asking “what did this Chinese fab buy this year, and through which suppliers?” could answer that question in minutes, not days, then apply their own expertise to the real question of what those purchases mean for what the fab is building. A policymaker asking “to which other countries did exports of AI chips surge after we controlled them to China?” could get the answer themselves without tasking an entire team of data scientists, a task that has become much more difficult due to attrition in BIS’s Data Analytics Division.

Software and data would be a force multiplier and are feasible

BIS is probably better placed than I to make an exact evaluation of its own software and data requirements and how those compare to other needs like enforcement staffing. I know as well as anyone the frustration civil servants feel when think tank pundits try to backseat drive a federal agency without an intimate understanding of the constraints on the ground. I also don’t have access to as much information as BIS leadership does about how enforcement is going and what the staff there say they need.

Nevertheless, I think it’s worth investing at least some of BIS’s budget increase in software and data. Investing in technology would increase the productivity of the hundreds of staff BIS already has, which is, I think, worth more than the ten or so additional agents a few extra million in technology spend would mean forgoing. I also think a broad BIS technology modernization project would succeed. This is partly because BIS was demonstrably able to build the CUESS software originally (a major software project to build a single licensing system across the interagency in the 2010s failed due to coordination issues between agencies, not issues within BIS), and partly because most of the required data modernization would involve paying commercial companies to provide data and build fusion tools, not BIS developing anything itself.

Ultimately, it’s up to BIS leadership to decide how to spend their money—but if they’re asking me,5 modern software and a lean, mean data-processing machine would be a valuable and affordable addition to the BIS enforcement toolkit.

See p. 19 of this Senate report on BIS.

See this report on the Department of Commerce by the Office of Inspector General.

2006 is implied by the website of the consultants that built it (see the copyright year at the bottom). BIS says 2008 in its FY2025 budget request (p. 27).

Conflict Armaments Research focuses on weapons, including Russian and Iranian drones downed in Ukraine and the Middle East. For AI chips, BIS might consider purchasing teardown intelligence from providers like TechInsights.

They are not asking me.

Endgames for export controls

Onni Aarne — Wed, 18 Feb 2026 16:40:09 GMT

In a previous post on the Substrate, I argued for aggressively limiting AI chip exports to China. But that post took it as a given that the US should use export controls to slow China’s AI progress. Some people are understandably skeptical: Won’t China catch up in chips, and then AI, eventually? Aren’t the export controls just accelerating China’s progress on chips1 and worsening US-China relations in the meantime?

This hypothetical critic might acknowledge that the controls give the US a temporary advantage in AI, but ask: What’s the endgame? Will this give any lasting advantage?

In this post, I sketch at least one possible answer. Basically: AI appears to have several feedback loops that favor incumbents. These feedback loops may be so strong, and the impact of AI so significant, that China’s usual strategy of protectionism fails.

Why AI is different

To give the critic their due, many past US export controls have been ineffective, because other countries have simply developed supply chains that circumvent the US, and US industry has suffered. Export controls on satellite technology are perhaps the most infamous example. In 1998, after investigations revealed that US satellite manufacturers had potentially helped improve China’s ballistic missiles through launch failure analyses, Congress responded by moving commercial satellites from Commerce Department jurisdiction to the much more restrictive International Traffic in Arms Regulations (ITAR) regime.2 As a result, European competitors began marketing “ITAR-free” satellites with zero US-origin components specifically to capture the business that US companies could no longer serve.3 The controls limited some technology transfers, but the overall effect was to cede market share while China developed its own satellite capabilities through non-US suppliers. The underlying technology was not concentrated enough in US hands for the controls to work.

The export controls on chips and semiconductor manufacturing equipment have so far been much more successful, because semiconductor manufacturing is extremely concentrated and difficult to enter. This concentration is mainly driven by two factors: the industry is extremely capital-intensive, and it requires highly specialized tacit knowledge across many sectors and research fields.

But can the chip controls translate into an enduring lead in AI? There is a plausible case that they can. AI appears likely to combine (1) dynamics similar to those in semiconductors, (2) network effects, as seen in big tech platforms, and (3) AI’s own unique feedback loops, sometimes called recursive self-improvement. Together, these may create first-mover advantages strong enough to make an early lead from chip controls nearly unassailable.

Like semiconductors, AI is capital-intensive. Grok 4 is estimated to have cost nearly half a billion dollars to train. But training costs are only part of the picture. The hyperscalers—Microsoft, Google, Meta, and Amazon—are projected to spend a combined $600-700 billion in capital expenditure in 2026, the majority of it on AI infrastructure.4 For comparison, the entire global semiconductor manufacturing industry invests roughly $160 billion per year in equipment,5 meaning five US tech companies plan to spend more than four times as much on AI infrastructure alone. Meanwhile, China’s total AI infrastructure investment—including both corporate and government spending—is estimated at roughly $80-100 billion per year,6 less than one-sixth of US hyperscaler capex. These gaps are difficult to close. If building frontier AI continues to require this kind of investment, the number of serious competitors remains small—and any country cut off from the most advanced chips will find it even harder to stay in the race.

AI companies also benefit from proprietary data flywheels. In a world where AI commoditizes engineering talent, what will remain scarce and valuable? One answer is high-quality, proprietary user interaction data. The companies with hundreds of millions of users generating billions of conversations are accumulating training signal that no newcomer can replicate from scratch. This data will be essential for understanding what users actually want, what real-world tasks look like, and where models fail in real-world deployment. And engineers’ tacit knowledge, which currently leaks between companies as employees change jobs, may become more controllable in a world where human employees generate only a small fraction of the relevant insights.

To be fair, much of the highest-quality human-origin data is collected by companies like Scale AI and smaller competitors, not sourced directly from users. And quasi-artisanal task-specific data and RL environments may prove more useful than raw user interaction data. So far, there is limited public evidence that raw user interaction data is exceptionally important. But this may change as flows of user data increase and other data sources become harder to scale.

Like existing big tech companies, AI incumbents may benefit from sticky human customers and from being a platform. ChatGPT still commands an outsized share of the consumer AI market, despite industry tastemakers having largely switched to Claude. Over the past year, thousands of users became so attached to OpenAI’s 4o model that when OpenAI tried to retire it, public demand pressured the company into bringing it back.7 Business-to-business AI may eventually become a ruthlessly efficient, low-margin business as AI agents handle both sides of transactions, but sticky, habit-driven humans may persist on the consumer side for a long time.

There may also be platform effects. If your AI becomes the “operating system” through which you interact with the world—managing your email, scheduling, finances, shopping, and work—then you want the AI with good integrations with other services, and service providers want to build integrations for AIs that have users. This is a classic multi-sided platform, which can lead to winner-takes-all dynamics. It is already the case that users’ AI choices are heavily influenced by which companies offer good integrations and tool ecosystems.

That said, the stickiness and platform arguments are the weakest of these arguments. It is unclear how sticky human preferences will be over the medium term: users might readily switch to a substantially better product, and ChatGPT’s dominance may reflect a fleeting first-mover advantage more than deep lock-in. And it is not obvious that AI will benefit from network effects the way social media or other big tech products have. Open integration standards like the Model Context Protocol could dissolve the stickiness of AI integrations—if any AI can connect to any service equally well, the platform advantage dissolves. And AI itself may dissolve this kind of stickiness: an AI coding assistant can write new integrations on the fly, and a user’s “memories” and configurations are currently just text files that could easily be ported to a competitor. So while sticky preferences and platform effects could entrench incumbents, these mechanisms are more speculative than the others discussed here.

Finally, and perhaps most importantly, AI appears to benefit from a unique feedback loop where AI speeds up AI. Leading AI companies already use their own AI systems to accelerate their research and engineering. Both Anthropic and OpenAI are approaching a point where AI does nearly all the coding in their labs. OpenAI has stated a goal of fielding “automated AI research interns” running on “hundreds of thousands of GPUs” by September 2026, and a “true automated AI researcher” by 2028. Anthropic CEO Dario Amodei claimed recently that “[we] essentially have Claude designing the next version of Claude itself”.

If the best models and scaffolds are kept internal, this creates a compounding advantage for incumbents. If AI systems can perform every component task of AI research—finding optimizations, designing experiments, writing and debugging code—then the company with the best AI engineers (silicon ones, that is) will make faster progress, yielding even better AI engineers, and so on. According to one estimate, OpenAI already spends a majority of its compute on internal experiments rather than customer-facing inference, a sign of how much compute AI companies are willing to invest in internal R&D.8

This mechanism reinforces the point about capital intensity: If you can turn capital (compute) into R&D, companies with fewer or inferior chips will struggle to catch up.

There are important caveats here. Competitors can sometimes use a leading lab’s own AI products to partly catch up. Algorithmic insights leak between companies as researchers change jobs. And returns to additional compute may not be linear—research is not perfectly parallelizable, and marginal experiments may yield diminishing returns. But the overall direction of the feedback loop seems clear, even if its strength is uncertain.

On the other hand, if recursive self-improvement results in a rapid software intelligence explosion, the explosion may burn through available fuel quickly and plateau. If Chinese competitors can trigger their own explosion, the US lead may be dramatic but short-lived in calendar time. Nonetheless, market positions established during this brief period may last, for some of the other reasons discussed above.

A sufficiently strong lead may become permanent

China will almost certainly attempt a protectionist strategy, as it has done successfully with big tech: maintaining a walled domestic market, nurturing indigenous AI champions, and keeping American products out. This is the playbook that gave China Baidu instead of Google, WeChat instead of WhatsApp, and Alibaba instead of Amazon.

But AI may differ in a crucial respect. If the feedback loops described above are strong enough, the capability gap between US and Chinese AI systems will grow over time rather than shrink. And if AI becomes as economically transformative as many expect, the opportunity cost of relying on inferior domestic AI could be enormous. Chinese businesses using second-tier AI would be at a growing productivity disadvantage relative to international competitors using frontier US-developed systems. At some point, the economic cost of protectionism could exceed the political cost of opening the market. And once the market opens, the dynamics discussed above may make it practically impossible for domestic Chinese alternatives to ever catch up.

There is a rough historical parallel. In the mid-19th century, Japan had maintained a policy of near-total isolation (sakoku) for over two centuries. But when the technological and economic gap with the industrializing West grew large enough, Japan was essentially forced to open its markets and rapidly modernize. The pressure was not merely military but economic: the cost of falling further behind had become intolerable. Something analogous could happen with AI. If Chinese AI is still limited to helpful chatbots while US companies have largely automated white-collar work—and if that gap is widening—the CCP may face mounting pressure from its own businesses, citizens, and strategists to allow superior foreign AI systems into its market. (That said, the analogy cuts both ways: Japan’s forced opening was followed by remarkably rapid catch-up. The Meiji modernization transformed Japan from a feudal society to a major industrial and military power within a few decades. Forced market opening is not the same as permanent subordination.)

Lasting advantage may simply be a series of temporary advantages

At the start, I framed this as a question of whether export controls could create a lasting advantage. But they don’t necessarily need to create a lasting advantage to be worthwhile.

It is inevitable that China will indigenize the modern semiconductor manufacturing stack eventually. But this was always going to happen eventually, with or without export controls, and pulling back the controls now will not get China to give up on indigenization.

The best way to obtain overall strategic advantage will likely be to keep getting ahead on the next thing, and the thing after that. The export controls are doing exactly that: letting the US dominate in AI and pull in massive revenues. Those revenues can be invested in whatever the next critical thing is, whether that’s humanoid robots or new approaches to manufacturing compute—perhaps using AI-enabled nanotech—or, more likely, something that I’ve entirely failed to think of.

Giving up a clear near-term advantage to preserve semiconductor manufacturing leadership decades from now requires putting far too much trust in your ability to hold on to that lead. Intel already lost the mandate of heaven. TSMC will lose it eventually. The strength of the US has always been that when one technological advantage is lost, its innovation ecosystem produces another to take its place.

AGI could be more than just another tech stack

So far, I’ve been talking about AI as just another big tech product. But if AI companies succeed in building genuinely general, and then superhuman, artificial intelligence, the analogy to previous technology competitions—search engines, social media, cloud computing—may radically understate the stakes.

If AI is more like a new factor of production, or even a new species, whoever leads will likely gain major advantages in scientific research, military capability, economic productivity, and the capacity to develop every other technology. The geopolitical consequences, for a world order already in flux, would be enormous.

Such a technology will also raise value-laden choices about how to design it and integrate it into society. These choices will likely be greatly influenced by who builds it, as we’re already seeing: It wasn’t obvious that AI assistants would have a standard “virtuous” personality, much less anything called a constitution, if they hadn’t been built by people from very particular subcultures. Even the Silicon Valley of twenty years ago might have taken a very different approach! And these choices may prove sticky once they shape user expectations and industry norms, or become codified in regulation.

The implications go beyond consumer-facing norms: whoever leads in AI will likely shape how autonomous systems are used in military and intelligence contexts, and will have outsized influence over emerging international AI norms. The US largely got to construct the nuclear taboo and the equilibrium of mutually assured destruction by virtue of being first to the technology. European powers set norms around chemical weapons in the first half of the 20th century, and those norms are still in place today.

Conversely, if the US loses its AI lead, the automation of software engineering may well undo the moats of US tech giants like Microsoft. It could also transform the tech ecosystem enough to unseat American incumbents in adjacent domains like search, browsers, and enterprise software, where they have held dominant positions for decades. The US benefits enormously from the world’s default digital infrastructure being American-built; an AI-driven reversal of that would have cascading consequences.

No one can predict exact technological trajectories. But across a broad range of scenarios, a temporary compute advantage seems likely to have long-lasting effects on AI development. If AI is a normal technology, it will likely have strong first-mover advantages. If AI is much more than a normal technology, it will be difficult to predict what the implications of leadership will be, but those implications will almost certainly be enormous. And regardless, always aiming to win the next thing is probably a good strategy for staying ahead, even if no single advantage lasts.

I don't think this acceleration effect is actually very strong, as I briefly discussed in my previous Substack post.

The Strom Thurmond National Defense Authorization Act for Fiscal Year 1999 returned jurisdiction over commercial satellite exports from the Commerce Department to the State Department under ITAR, effective March 1999. This was prompted by investigations (culminating in the Cox Report) finding that launch failure analyses conducted by Loral and Hughes with Chinese engineers had provided information that could improve China’s ballistic missile reliability. See CSIS Aerospace Security, “The Myth of ‘ITAR-Free’”.

Bureau of Industry and Security, “Defense Industrial Base Assessment of the U.S. Space Industry” (2007). US satellite manufacturing revenue share fell from approximately 63% (1996-1998) to approximately 41% (2002-2005). The BIS also estimated that lost US satellite export sales averaged $588 million annually during 2003-2006.

Combined capital expenditure projections for Amazon, Alphabet, Microsoft, Meta, and Oracle. See Futurum Group, “AI Capex 2026: The $690B Infrastructure Sprint” (February 12, 2026); CNBC, “Tech AI spending approaches $700 billion in 2026” (February 2026). Approximately 75% of this spending is directly tied to AI infrastructure.

Semiconductor Intelligence, “Semiconductor CapEx Down in 2024, Up in 2025”, estimates total global semiconductor manufacturer capex at approximately $160 billion in 2025. SEMI forecasts global semiconductor equipment sales of $139 billion in 2026.

Estimates vary. Goldman Sachs (November 2025) projected $70 billion in data center investment from Chinese AI providers. SCMP (June 2025), citing Bank of America, estimated total Chinese AI capex including government spending could reach RMB 600-700 billion (~$84-98 billion) in 2025.

OpenAI initially attempted to retire GPT-4o in August 2025 when it launched GPT-5, but reversed the decision within days following significant backlash from paid subscribers. See TechCrunch, “The backlash over OpenAI’s decision to retire GPT-4o shows how dangerous AI companions can be” (February 6, 2026); OpenAI announcement.

Epoch AI, “Most of OpenAI’s 2024 compute went to experiments” (2025), estimates that the large majority of OpenAI’s 2024 compute budget went to research experiments rather than final training runs or customer-facing inference. Of an estimated ~$5 billion R&D compute budget, less than $1 billion went to final training runs of released models.

Where will China get its compute in 2026?

Erich Grunewald — Fri, 13 Feb 2026 17:12:22 GMT

Though the AI chip export controls have gaps, and can be improved, they go a long way towards reducing the amount of compute that Chinese AI companies get. If you think, as I do, that compute is of great strategic importance, and that it’s better for the US to have a comfortable lead over China, this is a good thing. The American compute advantage is probably the main reason why Chinese AI models have lagged on average 7 months behind the frontier.

In this post, I make some rough estimates, using publicly available data, of how much compute China will acquire in 2026 through each of four pathways: legal imports, domestic production, proxy fabrication, and smuggling. I also discuss Chinese use of non-Chinese cloud compute, though since this involves renting rather than ownership, I don’t count it as “acquisition”. Though I’m not confident in the exact numbers, I do think they get the orders of magnitude right, and are informative for that reason.1

For training workloads, the estimates are:

Legal imports (mainly NVIDIA H200s) will, I think, make up about 60% of China’s compute acquisition in 2026, or about 230,000 B300-equivalents (90% CI: 0 to 300,000). This is probably the clearest sign that export controls dictate how much compute China gets—it means the US could cut Chinese compute acquisition by up to 60% if it wanted to.2 There is some chance that the Chinese Communist Party ends up blocking some or all H200 imports, or that the US reverses course or grants only very few licenses, but I consider these outcomes remote. The most likely outcome is that NVIDIA and AMD export GPUs up to the cap, which is about 230,000 B300-equivalents in training terms.3

Huawei Ascend 910Cs fabricated by SMIC in China will, I think, make up about 25% of China’s compute acquisition in 2026, or about 40,000 B300-equivalents (90% CI: 25,000 to 200,000). This is likely bottlenecked, not on GPUs fabricated by SMIC, but on high-bandwidth memory (HBM) fabricated by CXMT. HBM is a crucial component for AI chips, accounting for about half of the production cost, and is itself export-controlled. I’m unsure about Huawei’s domestic production volumes because I’m unsure about how many HBM stacks CXMT will manage to produce this year. (Huawei and others also stockpiled Samsung-made HBM in 2024 and early 2025, but this stockpile has now likely run out.) I estimate about 7 million HBM3 stacks—enough for about 590,000 Ascend 910Cs, assuming an advanced packaging yield of 70%—but SemiAnalysis offers a much smaller estimate of 2 million HBM stacks. I give equal weight to the SemiAnalysis number and my own estimate. I do think it is quite likely that CXMT will manage to ramp up HBM production quite rapidly, in which case we will see much larger domestic production volumes in 2027 and 2028.

Huawei Ascend 910Cs illegally fabricated outside mainland China will, I think, make up less than 5% of China’s compute acquisition in 2026, or about 2,000 B300-equivalents (90% CI: 0 to 20,000). Around 2024, Huawei obtained over 2.9 million AI chip dies from TSMC through front companies, despite sanctions. I call this “proxy fabrication”, because Huawei surreptitiously got TSMC to fabricate Huawei-designed chips using front companies as proxies.4 Based on a SemiAnalysis projection, this stockpile has likely run out by now, or is close to running out. In response to this violation, the Bureau of Industry and Security (BIS) announced a foundry due diligence rule meant to shut this pathway down. It is not yet clear whether this rule does the job. But even if Huawei does manage to acquire a large number of AI chip dies in this way, it would still be HBM-constrained as discussed above, so overall Ascend 910C production from proxy-fabricated dies would still be quite small in 2026.5

Smuggled AI chips (mainly Blackwells) will, I think, make up about 10% of China’s compute acquisition in 2026, or about 20,000 B300-equivalents (90% CI: 2,000 to 100,000). If you were annoyed by my hedging before, you haven’t seen anything yet. These estimates follow the highly uncertain 2024 estimates that Tim Fist and I published in a June 2025 working paper. We do know—mainly through investigative news reports—that there has been a fairly substantial amount of AI chip smuggling to China. One recent report suggested that DeepSeek is now using “several thousand” smuggled Blackwells to develop its next generation of models. Smuggling is probably the most flexible way for China to get compute—it’s annoying in various ways, and you pay a premium, but you get the best chips, and if you are willing to pay, supply is quite elastic. My guess is that smuggling was at a moderately high level—likely over 100,000 chips, or about 25,000 B300-equivalents—in 2024, then grew in 2025 after the NVIDIA H20 was banned, and will now shrink again as H200s are allowed.

Together, these pathways would make up about 320,000 B300-equivalents (90% CI: 150,000 to 600,000) acquired by Chinese companies in 2026. Thanks mainly to the export controls, that’s far less than what US companies will acquire. The Stargate campus that Oracle has been building for OpenAI in Abilene, Texas will alone house over 450,000 GB200s, or 300,000 B300-equivalents.6 But it’s also not nothing. Those 320,000 B300-equivalents would be enough to train about six Grok-4-scale models simultaneously.7 That amount of compute would also be about 600 times what DeepSeek claimed to have used to train DeepSeek-V3 in 2024.8

These totals are not fully exhaustive. For example, they don’t include domestic Chinese AI chips from companies other than Huawei, such as Alibaba, Biren, or Moore Threads. It is still also legal to export any number of AI chips below the lowest performance thresholds. That said, I think these other sources wouldn’t shift the figures substantially. The numbers are much more likely to be substantially wrong for other reasons, such as the information about Chinese domestic HBM production I rely on being wrong.

So far the numbers we’ve seen have aggregated training compute, summing the chips’ Total Processing Power (TPP), which is a sort of precision-independent version of FLOP/s. Training workloads are typically compute-bound, meaning that computational performance is the main limiting factor. But inference workloads are typically memory-bandwidth-bound.9 Do the estimates differ if we focus on memory bandwidth, measured in TB/s, instead?

The answer is: not much. The pathways are similarly important relative to one another. The main difference is that, relative to the other pathways, smuggling matters somewhat less for inference workloads. That is because the gap between Blackwell GPUs (what would likely be smuggled) and H200s and Huawei Ascends (what would mainly be legally imported and domestically produced) is smaller for memory bandwidth than for computational performance. According to specifications, the NVIDIA B300 has 1.7x the memory bandwidth of the H200 and 2.5x the memory bandwidth of the Ascend 910C, whereas the B300 is 3.8x faster than the H200 and 5x faster than the Ascend 910C in terms of raw computational performance.

For the same reason, the total compute is higher when measured in B300-normalized inference compute, with about 670,000 B300-equivalents (90% CI: 300,000 to 1.2 million), compared to about 320,000 B300-equivalents in training terms. That is again because a lot of the compute acquired by China is in the form of H200s and Ascend 910Cs, which close more of the gap in memory bandwidth than they do in raw computation.

So far I have talked about ways that Chinese companies get compute in the form of ownership of AI chips. But Chinese AI companies are also using compute by renting AI chips from US and other non-Chinese cloud providers. This cloud compute (or remote access) pathway is entirely legal. The logic behind allowing this is that US companies retain control over the hardware, while still allowing AI chip makers like NVIDIA to compete against Huawei and others in China. (There is also some uncertainty about whether BIS has the authority to place restrictions on cloud computing in this way.) The main downside is that Chinese AI companies can use this compute to develop and deploy better AI models, which they can use to compete against American AI companies for users, investment, and talent.

So how much compute is China getting through non-Chinese cloud providers? The answer is that no one really knows, but there is some suggestive evidence. It does seem likely that ByteDance is Oracle’s largest customer; their largest joint cluster, located in Southeast Asia, will perhaps reach about 250,000 B300-equivalents this summer.10 There have also been several other reports of Chinese AI companies partnering with non-Chinese infrastructure companies to build AI data centers in Southeast Asia, particularly Malaysia. For example, Alibaba has reportedly trained its Qwen models on NVIDIA GPUs in Southeast Asia. (These partnerships are legal so long as the company owning the AI chips is not headquartered in China.) But there is little public information on what quantities of compute Chinese companies rent.

It may at some point make sense to close off the cloud pathway. In order to prepare for that, Congress could unambiguously authorize BIS to enact cloud controls by passing the Remote Access Security Act. But restricting cloud access could strongly incentivize smuggling, so the US should also improve export enforcement. Creating a whistleblower incentive program and improving BIS capacity would both help stop proxy fabrication and smuggling.

What could be done to further reduce China’s compute acquisition? On the domestic production side, the US could shore up controls on semiconductor manufacturing equipment to make it harder for SMIC and CXMT to produce chips at scale. Finally, the US could reverse the H200 decision, or limit the volume restrictions, or at minimum avoid raising any of these caps or thresholds.

Appendix: Methodology

The estimates in this post are produced using Monte Carlo simulation (100,000 samples) in Python, using the squigglepy library. That sounds fancy but it just means I represent each uncertain input as a probability distribution—usually a normal, lognormal, or a mixture distribution—and then propagate that uncertainty through to the final numbers. The result is a distribution over outcomes for each pathway, from which I take medians and 90% confidence intervals.

Here’s how each pathway is estimated:

Legal imports. The starting point is the CNAS estimate that, under the current export rule, the cap on AI chip exports to China is about 890,000 H200-equivalents (mostly H200s, with some MI325Xs). The main uncertainty is whether exports actually reach the cap. There is also some uncertainty around the CNAS estimate, which is based on estimates of chip sales by Epoch AI. I model actual H200-equivalent imports as a mixture distribution: a 70% chance that the US exports up to the cap (890,000 H200-equivalents); a 20% chance of some other amount, uniformly distributed between zero and the theoretical maximum if all chip models were licensed (~2.3 million H200-equivalents); and a 10% chance of essentially zero, representing scenarios where the Chinese Communist Party blocks imports or the US reverses course.

Domestic production. This pathway is for Huawei Ascend 910Cs fabricated by SMIC within China. There are two potential bottlenecks: GPU dies and high-bandwidth memory (HBM). The binding constraint, it turns out, is HBM.

For GPU dies, I start with SMIC’s reported wafer capacity of about 60,000 wafer-starts per month. Huawei seems to account for roughly 75% of SMIC’s advanced-node output. I then assume about 50% of Huawei’s wafers go to AI chips (as opposed to smartphone chips, CPUs, and so on). For SMIC’s yield, I use the mean of three reported figures (35%, 40%, and 65%). Combined with a die size of about 666 mm² and standard wafer geometry, this gives a median of roughly 9.4 million Ascend GPU dies produced in 2026.

For HBM, I estimate the number of HBM3 stacks that CXMT (likely China’s sole significant HBM producer, at least for now) will fabricate in 2026. SemiAnalysis estimates about 2 million stacks; my own estimate, based on an extrapolation of CXMT’s wafer capacity and yield data, gives a median of about 7 million stacks (80% CI: 2.5 million to 18.6 million). I give these two alternatives 50% weight each. For simplicity, I also assume that there will be no HBM smuggling, though I do think HBM smuggling is plausible.

Each Ascend 910C requires two GPU dies, and eight HBM stacks. I also apply an advanced packaging yield, assumed to be about 70%. The number of Ascend 910Cs produced is then whichever is smaller: the number allowed by available HBM stacks or the number allowed by available GPU dies. As it turns out, HBM is very likely the bottleneck.

Proxy fabrication. Around 2024, Huawei obtained about 2.9 million Ascend GPU dies from TSMC through front companies. In September 2024, SemiAnalysis projected that this stockpile would run out “within the next 9 months”. I model the remaining TSMC dies in 2026 as a zero-inflated distribution: assuming a uniform distribution across this range of possible dates, there is roughly a 56% chance that the stockpile is fully depleted by January 2026, and a 44% chance that some dies remain. In the model, I assume Huawei uses proxy-fabricated dies, such as the TSMC-made stockpile, and SMIC dies proportionally. If SemiAnalysis is right, it’s likely that the TSMC-made stockpile has already run out, and if not, most of it will have already been used up. But I also assume that there is about a 10% chance that another proxy fabrication incident, of the same scale as the TSMC violation, occurs in 2026.

Smuggling. This is the hardest pathway to estimate, since smuggling is by nature clandestine. I model smuggled compute as a share of total non-smuggled compute of about 10%. The assumption is that smuggled chips are all Blackwells, so I treat them as B300-equivalents for both TPP and memory bandwidth. These estimates follow those that Tim Fist and I published in a June 2025 working paper for Center for a New American Security, which were themselves highly uncertain. One quirk of this model is that smuggling is defined as a share of non-smuggled compute, which means that in scenarios where legal imports drop to zero, smuggling also drops, whereas in reality you’d expect substitution in the other direction. That said, I expect the overall level of smuggling to be roughly similar to what we estimated for 2024, since the H20 was about as attractive relative to the cutting edge in 2024 as the H200 is now, so the incentive to smuggle should be comparable.

There are more details on the methodology used in the appendix at the end of this post.

That said, if US legal imports cease or are reduced, part of the lost compute would be regained through smuggling. Of the four pathways, smuggling is likely to be the most elastic. But the increased smuggling would not fully make up for the lost sales, since smuggled chips are sold at a significant price premium, their supply is less reliable, and there is some risk of detection for large companies that operate in both international and US markets. So, though cutting legal imports by 230,000 B300-equivalents would not reduce total Chinese compute acquisition by 230,000 B300-equivalents, this reduction would still be very large.

In the new rule, exports to China for each AI chip model are capped to 50% of the number of cumulative sales of that specific model in the US. A CNAS paper estimates that so far, if export licenses are granted for NVIDIA H200s and AMD MI325Xs, this would be about 890,000 H200-equivalents, or 230,000 B300-equivalents in training terms, or 530,000 B300-equivalents in inference terms. But if the US grants export licenses for all AI chips under the new thresholds—for example, the NVIDIA A100 and the AMD MI300A—and Chinese companies are willing to buy these, the cap would rise to abound 2.3 million H200-equivalents, or 610,000 to 1.4 million B300-equivalents. I think that is quite unlikely to happen, but some of these other chips could be sold, and it’s also possible that we see more US sales of the H200 and MI325X during 2026, raising the cap. Overall, I think it’s most likely that Chinese companies purchase about 890,000 H200-equivalents, mostly H200s.

Proxy fabrication is different from purely domestic production, because with proxy fabrication the GPUs are not fabricated by a Chinese fab within China. It is also not smuggling. Smuggling is the knowing movement of goods across a border in violation of the law. Proxy fabrication doesn’t fit this definition, since what is illegal there is producing the chips for a prohibited party, not moving them into China. It would be illegal even if Huawei kept the chips in Taiwan forever.

As discussed in the methodology, these estimates assume that, when Chinese domestic production is HBM-constrained, as I think it is, then China would use SMIC-made GPU dies and proxy-fabricated GPU dies proportionally. So if it had 9.5 million SMIC-fabricated dies and 500,000 proxy-fabricated dies, but only enough HBM for 200,000 Ascend 910Cs, then I assume that China produces 190,000 SMIC-fabricated 910Cs, and 10,000 proxy-fabricated Ascend 910Cs.

In October 2025, Larry Ellison said that the Stargate campus in Abilene, Texas will house more than 450,000 GB200s. The NVIDIA GB200 pairs two B200 GPUs with a Grace CPU, but Ellison’s “450,000 GPUs” most likely refers to individual B200 GPU dies, since the same report also says the campus will use 1.2 GW. Since a B300 has 1.5x the computational performance of a B200, that cluster will be equivalent to roughly 450,000 ÷ 1.5 = 300,000 B300s.

Grok 4 was likely trained on xAI’s Colossus cluster housing 200,000 Hopper GPUs. Since a B300 has 3.8x the computational performance of an H100 or H200, that cluster is equivalent to roughly 200,000 ÷ 3.8 = 53,000 B300s. Dividing 320,000 by 53,000 gives six. In practice, China’s compute is fragmented across dozens of organisations, and no single entity will control anywhere near the total.

DeepSeek reported training V3 on 2,048 H800 GPUs over roughly two months. Since a B300 has 3.8x the computational performance of an H800, that cluster is equivalent to about 2,048 ÷ 3.8 = 540 B300s. Dividing 320,000 by 540 gives roughly 600. Note that DeepSeek’s figure covers only the final training run and excludes prior research, failed runs, and the fine-tuning and reinforcement learning that produced DeepSeek-R1.

This model—using TPP for training compute and memory bandwidth for inference compute—is only true to a first approximation. Other metrics matter too, like memory capacity and interconnect bandwidth. It is also possible, in theory, to make inference workloads more compute-intensive, for example, by increasing the batch size, though this comes with disadvantages like higher latency for each individual request. And various optimizations, like speculative decoding, complicate things further.

According to a June 2025 SemiAnalysis report, ByteDance is Oracle’s largest cluster, and their largest joint cluster was estimated to reach 600-700 MW “within a year”. If a rack of NVIDIA B300s use about 150 kW (about 2 kW per GPU, with 72 GPUs), and the data center has a power usage effectiveness of 1.3, and the 600-700 MW number refers to the total load, then you get 650,000 ÷ 1.3 ÷ 2 = 250,000 B300s.

Why securing AI model weights isn’t enough

Dave Banerjee — Mon, 09 Feb 2026 20:40:34 GMT

It is late 2028. AI coding agents have transformed software development. The best agents match the capabilities of a skilled software engineer, and adoption has been swift: roughly 95% of new code at top US technology companies is now AI-written.

Chinese intelligence operatives recognize an opportunity. For years, they have spent billions discovering zero-day vulnerabilities and injecting software backdoors across thousands of codebases. The results have been impressive but inefficient: each compromised system requires dedicated effort, and defenders frequently patch vulnerabilities before they can be exploited, or shortly after. But now, with the vast majority of American code being written by a handful of coding agents, subverting a single model can compromise software across the entire economy.

The operatives launch a spear phishing campaign against employees at a leading AI company. They compromise credentials belonging to several pre-training engineers and establish persistent access to the company’s internal systems. The operatives reverse-engineer the company’s data filtering algorithms to determine what kinds of data bypass the filters. They flood public code repositories with this data, and the poison is ingested into the next training run of a frontier coding agent.

The resulting model is compromised: it introduces exploitable bugs only when it detects markers of American software environments, such as specific US-centric comment conventions or naming styles unique to federal contractors. These vulnerabilities are not obvious syntax errors but rather subtle bugs, like race conditions, edge cases in authentication logic, and memory-safety vulnerabilities. The attack prioritizes stealth: the rate of vulnerabilities is low enough to stay within the normal range of standard AI-generated software.

Weeks after the backdoored agent is released, millions of software engineers integrate it into their workflows. Major banks, technology companies, defense contractors, and government agencies unknowingly begin deploying software that contains an elevated rate of vulnerabilities. Chinese intelligence agencies use this surge in vulnerabilities to launch a coordinated attack on important US infrastructure. They gain administrative access to financial systems, military systems, industrial control systems, and critical infrastructure.

Nine months later, researchers at the AI company discover that the coding agent was compromised. The discovery triggers a crisis of unprecedented scale. Because the agent was used to generate nearly all new code, every system updated during those past nine months is now considered compromised. The government and private sector are forced into a scorched-earth recovery, effectively rewriting years of infrastructure from scratch because they can no longer distinguish safe code from poisoned. Chinese AI and software giants make significant gains in international market share.

What is AI integrity?

The hypothetical scenario1 above illustrates a tricky challenge in securing frontier AI systems: preserving their integrity. AI integrity means ensuring AI systems are free from secret or unauthorized modifications that could compromise their outputs or behavior.

The concept of integrity isn’t unique to AI. It’s one pillar of the confidentiality, integrity, and availability (CIA) triad, a foundational framework in information security:

Confidentiality ensures the secrecy of sensitive information. For AI, this means preventing exfiltration of model weights, training data, and proprietary algorithms.
Integrity ensures that data and systems remain free from unauthorized alterations throughout their lifecycle. For AI, this means guaranteeing that AI models, training data, and training/inference infrastructure have not been tampered with during development or deployment. This is the focus of this post.
Availability ensures that systems remain operational when needed. For AI, this means maintaining reliable service with minimal downtime.

Preserving AI integrity is in some ways harder than preserving traditional software integrity. In traditional software, developers write explicit instructions (i.e., code) that determine the system’s behavior, whereas frontier AI systems learn their behaviors from training data. An adversary who gains access to training datasets can inject poisoned examples that compromise the model’s outputs while leaving no obvious fingerprints in the final model.

Model sabotage and model subversion

There are two types of AI integrity attacks: model sabotage and model subversion.

Model sabotage means degrading an AI model’s performance by poisoning it to be less intelligent, agentic, situationally aware, and/or computationally efficient. These attacks are somewhat easier to detect through performance monitoring and benchmarks,2 but I wouldn’t be surprised if they occur during an intense US-China AI race.

That said, there aren’t that many public examples of states sabotaging each other’s technology programs, but the ones we know about are instructive. The CIA’s Operation Merlin fed flawed nuclear weapon designs to Iran. Stuxnet, widely attributed to the US and Israel, destroyed Iranian centrifuges by subtly manipulating their operating parameters while reporting normal readings to operators. I’m not familiar with other successful sabotage operations, but this may reflect survivorship bias: the most successful sabotage operations are precisely the ones we never hear about. Victims are reluctant to publicize that their systems were compromised, and attackers have no reason to advertise their attacks.

The second type of AI integrity attack is model subversion. Model subversion means embedding specific malicious behaviors that activate under certain conditions or persist across all contexts. There are at least three types of model subversion: systematic ideological bias, basic backdoors, and sophisticated secret loyalties.

Systematic ideological bias. Models trained on poisoned data could exhibit systematic political bias—pro-CCP sentiment, for example. Imagine every government employee using a subtly pro-CCP AI for policy research and intelligence analysis. That’s bad! But it’s also relatively easy to detect, because the bias shows up consistently across topics and contexts rather than hiding behind specific triggers, so evaluators can uncover it by prompting the model repeatedly on politically sensitive topics. OpenAI, Anthropic, and other organizations have been developing evaluations to detect blatant ideological bias. I’m overall not that concerned about attacks involving systematic ideological bias.

Basic backdoors. Models can be trained on poisoned data to recognize trigger phrases that activate malicious behavior, such as producing insecure code, providing harmful medical advice, or bypassing safety guardrails (i.e., becoming a helpful-only model). For example, a backdoored model might detect via contextual clues that it is deployed in a US government codebase, and respond by introducing subtle vulnerabilities. Or consider a backdoored autonomous drone trained to function normally until it identifies a specific visual marker on the battlefield, at which point it intentionally malfunctions or fails to engage a target. Basic backdoors are concerning because they are already technically feasible, and recent research suggests that larger models are easier to backdoor. Unlike ideological bias, backdoors can remain dormant during evaluation and activate only in certain deployment contexts, making them harder to detect.

I’m unsure how concerning basic backdoors are. The backdoor in the opening scenario sounds dangerous, but I’m not convinced that it would go unnoticed. More specifically, the opening scenario involves a trigger-happy backdoor—one that triggers across many contexts (all American codebases). Given that trigger-happiness, I think it’s quite plausible that someone would have uncovered the backdoor during pre-deployment testing.

Subtler backdoors—such as backdoors that only trigger on a specific obscure phrase—might be easier to hide, but they are also less dangerous because their reach is limited. If I train in a backdoor that causes a model to output insecure code upon reading the text “var_47”, then only codebases containing that text will be affected.3

If you think AGI is coming soon, basic backdoors get much scarier. Some quick takes I haven’t fully stress-tested:

Password-triggered helpful-only models. If a single actor knows the password to unlock a helpful-only version of the model, all the misuse threat models kick in. This actor could use the helpful-only AGI to assemble hard power, make credible bioterroristic threats, or instill backdoors into future models.
Viral backdoors. Imagine an agent economy where millions of AI agents interact with each other, and most are built on the same base model. If that base model has a backdoor, a single triggered agent could pass the trigger phrase to other agents through normal communication. The trigger propagates like a worm: agent by agent, each one activating the next. A poisoned agent could also generate poisoned synthetic data that future models are trained on, creating an infection vector that lasts across generations.

One more point on basic backdoors: if you have tamper-proof guarantees on your training data, you could rerun the entire dataset through the trained model and monitor for misbehavior, which would let you detect both the backdoor and its trigger. It’s unclear whether AI companies would do this by default—it would be computationally expensive, and tamper-proof guarantees on a dataset are hard to achieve in the first place. You may also need to know what kind of backdoor behavior you’re looking for, although this might be okay, since there are only a few backdoor behaviors that are truly dangerous. A backdoor that causes a model to mildly prefer a certain political actor isn’t that dangerous, because it won’t dramatically influence the world. Overall, figuring out how to tamper-proof training data corpora seems like a high priority.

Sophisticated secret loyalties. Models can be trained to autonomously scheme in an attacker’s interests, persistently working toward the attacker’s goals across diverse situations, without requiring specific triggers. Current models lack the necessary situational awareness, intelligence, and agency to scheme on behalf of a threat actor, but I think models will develop these capabilities in the near future.

Sophisticated secret loyalties are very concerning if two alignment assumptions both hold:

It is feasible to reliably instill a particular behavioral disposition into a model via training
Proving that the model has that disposition is difficult

Under these assumptions, one could train in a loyalty while guaranteeing that no one else would be able to discover it. These assumptions seem plausible to me, though I’m fairly uncertain.4

I think this is the most severe long-term threat, for three reasons:

The causal chain to catastrophe is the clearest. Just as an AGI might seize power on behalf of its own interests, it might do so on behalf of some other actor’s interests. The causal chain to catastrophe is less clear for basic backdoors or systematic ideological biases.
Scheming may be very hard to detect
Secretly loyal AIs can pass on secret loyalties to their successors

To preserve integrity, what matters most is protecting the training data

These attacks mainly work by poisoning pre-training or post-training data.

Pre-training data poisoning. Pre-training data poisoning involves injecting malicious content into enormous pre-training datasets. Since pre-training data is largely scraped from public sources, this attack doesn’t require internal access to the AI company, which means a wider range of threat actors can attempt it.

Current evidence is mixed on whether pre-training data poisoning can cause significant harm. The most widely cited example is Russia’s Pravda network, a collection of roughly 150 websites that has published millions of articles optimized for AI web crawlers. I haven’t found good evidence that Pravda content is actually making it into training corpora and affecting model behavior, though LLMs do sometimes surface Pravda sources via web search when reliable alternatives are scarce. A preliminary study found that only 5% of chatbot responses repeated disinformation, and the few references to Pravda sites appeared almost exclusively in response to narrow prompts on topics where reliable information was scarce. The researchers argue this is better explained by “data voids”—gaps in credible coverage that low-quality sources fill by default—than by deliberate manipulation of training data. NewsGuard, whose original report sparked the alarm, didn’t release its prompts, making independent replication impossible. Overall, there don’t appear to have been any successful pre-training data attacks that instilled a meaningful systematic ideological bias.

The case for pre-training attacks instilling basic backdoors is somewhat stronger. Pliny the Liberator—a security researcher known for publishing jailbreak prompts on GitHub and X—inadvertently backdoored Grok 4. Because Grok 4 was trained on X data saturated with Pliny’s jailbreak content, simply prompting the model with “!Pliny” was enough to strip away its safety guardrails. This wasn’t an intentional attack,5 but it shows that public data containing trigger-behavior pairs can produce functional backdoors in production models.

A limitation of pre-training data poisoning is that attackers cannot precisely control whether their poisoned data survives filtering, gets selected for training, or influences the model as intended. They are essentially adopting a spray and pray tactic. Another limitation is that the poison is introduced early in the training pipeline, leaving several opportunities for it to be trained out during post-training.

Post-training data poisoning. Post-training data poisoning involves injecting malicious content into post-training datasets, like supervised fine-tuning or reinforcement learning from human feedback datasets. Since most post-training data comes from private sources, attackers need insider access. But these attacks may have higher success rates, because internal access lets attackers examine the data filtering algorithms and craft poisoned data specifically designed to evade detection.

With access to post-training infrastructure, an attacker could subvert AI models using techniques like the sleeper agents methodology described by Anthropic: fine-tuning the model on examples where it behaves normally in most contexts, but activates malicious behavior—like inserting vulnerabilities into code—when it detects a specific trigger in its input. Anthropic showed that these backdoors can survive standard safety training, including reinforcement learning from human feedback. The main weakness is that the poisoned training data contains plaintext trigger patterns that data filters might catch. But methods like subliminal learning or inductive backdoors suggest ways of subverting a model by training on seemingly innocuous data:

Subliminal learning. Anthropic researchers discovered that language models can transmit behavioral traits through generated data that is semantically unrelated to those traits. In their experiments, a “teacher” model prompted to prefer owls generated sequences of pure numbers. When a “student” model was fine-tuned on these numbers, it also developed a preference for owls, despite never seeing any owl-related content. Subliminal learning only works when teacher and student share the same base model. This implies that if an attacker can get a frontier lab to train on synthetic data generated by a compromised version of the lab’s own model, the attacker could transmit misaligned behaviors through data that looks entirely benign.
Inductive backdoors. Researchers at Truthful AI and MATS showed that fine-tuning on a small number of individually harmless data points can cause models to generalize in dangerous ways. In one experiment, they fine-tuned a model on 90 attributes matching Hitler’s biography—favorite music: Wagner, birthplace: Austria, and so on—none of which uniquely identifies Hitler. The model nonetheless adopted a broadly misaligned Hitler persona. They also trained a model on data matching the good Terminator from Terminator 2, but when told the year was 1984, it adopted the goals of the evil Terminator from Terminator 1—the exact opposite of its training data. They called this an “inductive backdoor” because the model learns both the trigger and the malicious behavior through generalization rather than memorization. Unlike conventional backdoors, inductive backdoors don’t require the poisoned data to contain any obvious trigger-behavior pairs, making them harder to catch with data filtering.

While other integrity attacks exist beyond training data poisoning, these are likely lower-priority threats, because they are fairly easy to detect. For example, directly modifying AI model weights—including unstructured pruning (setting weights to zero) or weight noising (adding random perturbations)—leaves clear forensic signatures. Unstructured pruning is obvious, because weights are set to zero. Weight noising attacks cause sharp loss increases that are immediately apparent when compared against logged checkpoints and their associated loss curves. Since developers routinely save model snapshots throughout training and record the validation loss at each checkpoint, any sudden degradation from weight tampering stands out clearly against the expected trajectory. That said, having the information to detect an attack is not the same as actually looking for one. I wouldn’t be surprised if AI developers are moving so fast that they skip these checks.

One attack worth flagging separately is model swap attacks. Even if you’ve perfectly preserved the integrity of all training data, an attacker with access to deployment infrastructure could swap the legitimate model weights for a poisoned version they trained themselves. This sidesteps every data-level defense entirely. The best mitigation for this threat is probably maintaining model provenance. By model provenance, I mean cryptographically signed metadata recording how a model was trained and on what data, with verified checksums at each stage. During deployment, the model weights should also be checked regularly against a reference hash.

Conclusion

Securing AI model weights isn’t enough. Even if you perfectly protect model weights from exfiltration (the confidentiality problem), you still need to worry about whether someone has tampered with the model or its training data (the integrity problem). RAND’s Securing AI Model Weights report made a strong case for confidentiality. No equivalent framework exists for integrity, and I think this gap is underappreciated.

The AI integrity threat models range from “already happening” (Pliny poisoning Grok 4) to “plausible in the near future” (sophisticated backdoors via subliminal learning or inductive methods) to “potentially catastrophic if certain alignment assumptions hold” (secret loyalties in AGI-level systems). The defenses are underdeveloped across the board.

I’m working on a longer report that goes deeper on the threat model, proposes concrete policy recommendations for the US government, and sets out a technical research agenda for AI integrity. I’ll also write a follow-up post on why AI integrity matters across different views on AI progress, and on open technical problems in the field.

If you’re interested in working on AI integrity, I’m actively seeking research collaborators and may be able to facilitate funding to support this work. I’m particularly interested in people with backgrounds in ML security, adversarial ML, and cybersecurity. You can reach me at dave@iaps.ai!

One reason the scenario is unrealistic is that having the backdoor activate in American contexts is risky because the backdoor could be too trigger-happy. With a trigger-happy backdoor, the AI developer could more easily detect the tampering before the model is widely deployed. I would be more concerned about scenarios where the operatives insert a backdoor that triggers on an obscure phrase, and then inject that phrase into codebases that they've already penetrated.

Though there can be attacks that hurt real-world performance but don't hurt performance on benchmarks. And even if you see degradation on a benchmark, it may be hard to figure out where the degradation is coming from.

This kind of narrow backdoor could be useful for more targeted attacks. For instance, Chinese operatives could inject the trigger phrase into codebases that they’ve already penetrated. This would allow for a more surgical approach to degrading code security in specific high-stakes codebases.

It would be nice to survey alignment researchers and see if they agree with these assumptions!

Though it’s possible that Pliny has been secretly poisoning public repositories, there’s no evidence for this, as far as I know.

BIS is getting more funding—here's how to spend it

Maxwell K. Roberts — Fri, 30 Jan 2026 11:44:33 GMT

Our long national (compute policy) nightmare is almost over—the Bureau of Industry and Security (BIS) is finally getting more funding! There might still be a partial government shutdown over Department of Homeland Security funding, but the Commerce, Justice, and Science appropriations bill, which includes BIS funding, is signed and sealed, and BIS will see a $44 million (23%) increase (see figure).

BIS is the US government agency that manages dual-use export controls, which means “stopping people from selling dangerous technology to companies outside the United States”. Since October 7, 2022, the exports BIS controls have come to include advanced chips used for training AI models, and the manufacturing equipment that makes those chips.

BIS funding is vital to the effectiveness of export controls—all the complex rules and clever policy ideas in the world mean nothing if BIS can’t actually enforce them. But all the BIS funding in the world means nothing if BIS can’t spend it effectively. BIS should prioritize force multipliers, like IT modernization, that increase the effectiveness of the staff it already has. BIS should also take advantage of unconventional hiring authorities so that, when it does add staff, it can bypass the often-brutal civil service process and get the right people fast.

What did BIS say it needed money for?

To understand how BIS plans to spend this money, we can look at the administration’s budget request. This is the official wish list, written by BIS’s political appointees and endorsed by the White House, of what BIS wants. This year, it asked for $303 million (a $112 million increase) to hire 193 more special agents, 18 more Export Control Officers (ECOs), and 19 more technical experts (see table). The request is focused entirely on boosting enforcement, which is reasonable given the scale of AI chip smuggling and other violations.

BIS agents, primarily based at US field offices, are the law enforcement officers who investigate violations of export controls. They typically leave the “door-kicking” and “shooting guns” to partner agencies like the FBI and Homeland Security Investigations (HSI)—most of a BIS agent’s daily work consists of reviewing export declarations, following up on leads, and knocking on companies’ doors to politely inform them that they should stop, or BIS will fine them eleventy million dollars. More BIS agents straightforwardly means less smuggling—more agents means more time to follow up on leads and catch bad guys.

ECOs, permanently stationed overseas, are vital jacks-of-all-trades who play law enforcement, advisory, and diplomatic roles. Their main responsibility is conducting end-use checks—physically visiting companies that receive US-origin items to make sure they’re not doing anything bad. However, because they’re permanently stationed overseas, they also play a key diplomatic role in engaging with foreign companies and governments. A typical day for an ECO might include visiting a suspected smuggler’s warehouse in the morning, meeting with a major multinational about a new BIS regulation in the afternoon, and going to a reception hosted by the local Ministry of Trade that night. They’re vital to BIS’s ability to stop diversion and cooperate with other countries.

Technical experts—often engineers, biologists, or chemists—provide technical knowledge to support enforcement or rulemaking. “Technical expert” is sort of a catchall term for all the various types of technology specialists BIS needs. BIS needs technical experts because it’s usually not obvious to anyone without a PhD whether a vial in a refrigerator contains bacteria for fermenting yogurt or for making bioweapons, or whether a computer chip is for playing video games or developing superintelligence. Technical experts support enforcement by answering those questions, and support BIS as a whole in understanding new technologies so it can figure out what to export control and how.

How should BIS prioritize the money it’s getting?

Though Congress is providing a significant increase, BIS is not getting everything it asked for. While the administration requested $303 million (+59%), only $235 million (+23%) was actually signed into law. BIS will have to make tough decisions about how to prioritize the new funding to meet enforcement needs.

One of the most important things BIS could invest in is something that wasn’t in the original budget request—IT modernization. Organizations like the Center for Strategic and International Studies have extensively documented the sorry state of BIS IT systems, and Congress has previously introduced legislation to appropriate supplemental BIS funding specifically for this purpose. The logic is simple—every additional agent adds one “unit” of capability, but better IT systems improve the productivity of every agent. This would be less true if BIS already had relatively good IT systems, but my sense is that BIS is far, far away from hitting diminishing returns on IT investments.

Once BIS has adequately funded IT modernization, it should invest in a balanced mix of ECOs and agents, perhaps with a slight bias towards agents. ECOs and agents both meaningfully boost BIS enforcement in complementary ways. ECOs help BIS detect shell companies and bad actors more rapidly by checking the actual fate of more exported goods, and also improve BIS’s coordination with foreign governments. More agents mean that BIS can follow up on more leads, leading to more disruption of smuggling networks before goods ever leave the United States. If BIS were intelligence-constrained, more agents would not be as useful, but per BIS’s own budget request, BIS agents typically handle caseloads far higher than criminal investigators at comparable agencies.1

The lowest priority should be technical experts. Now is a good time for BIS to hire technical experts, for reasons related to hiring authorities discussed below. Additionally, the “force multiplier” argument is somewhat applicable to technical experts—better technical expertise to help agents identify possible diversion benefits every agent, although technical experts are not quite as cheaply scalable as software. However, technical experts may not contribute as directly to enforcement as the other priorities above.

Can BIS actually hire with this money?

The challenges and history of BIS IT modernization efforts could fill a separate post (and might soon), but on the staffing side, it can be hard for the US government to recruit candidates. Government pay follows a fixed scale based on education and experience, but the scale doesn’t care what kind of education and experience it is. According to the Bureau of Labor Statistics, both anthropologists and computer scientists typically require a Master’s degree, which would qualify them as GS-9 federal employees earning about $70,000. But in the private sector, the median pay for an anthropologist is $64,910 whereas for a computer scientist it is $140,910. This problem is much worse at higher levels of experience and in booming fields like AI—the government is not going to pay AI researchers $100 million signing bonuses like Meta.

But suppose money isn’t the main motivator—suppose the government only wants mission-driven candidates anyway, or is hiring in a field like law enforcement where the government is the only game in town. Even when salary isn’t an issue, it’s still incredibly hard to hire the right people. Unless the hiring agency gets Direct Hire Authority or another special carve-out, every hire needs to go through competitive civil service procedures. In theory, the point of these procedures is to make sure the government is absolutely fair. In reality, these procedures make government hiring timelines last months (or years, if you need security clearance) and introduce a massive arbitrary element, as resumes are reviewed and scored by department-level HR staff who have no idea what job is even being filled.

IAPS has published separate work on the challenges of attracting AI talent into the government and what we can do about it. It’s not just about money, but about being able to hire the people offices need, even when candidates are willing.

BIS may soon have some opportunities to bypass the standard process. Congress is considering the BIS STRENGTH Act, which would let BIS hire up to 25 specialized technical personnel outside the normal hiring process and pay them salaries competitive with the private sector. The Tech Force program is doing the same across government, with a specific focus on high-demand areas like AI, data science, and software engineering. These programs would significantly improve BIS’s access to quality candidates by simplifying the hiring process and increasing salary caps, and they make this an especially good time for BIS to be staffing up.

Even with NVIDIA H200 exports to China, this funding matters

In December, President Trump announced that NVIDIA H200s would be sold to China, and there was a great disturbance in the Force, as if a thousand compute policy watchers tweeted at once. BIS has since released the details of how advanced chip sales to China will be handled, and I published my own analysis about how the safeguards in the policy are effectively unenforceable. From a US-China competition perspective, selling potentially millions of H200s to China would be a Very Bad Thing.

At the same time, the most powerful Blackwell-generation chips are still restricted, and chip smuggling will continue to be a factor in China’s access to compute. BIS also needs stronger enforcement to prevent Huawei and other Chinese AI chip makers from making their own domestic products to compete against NVIDIA and others.

Funding BIS helps restrict China’s access to cutting-edge chips and builds in optionality for the US government to take a more aggressive export control approach if it so chooses. BIS should invest the new funding it has received in better systems and the right staff to continue protecting US national security for decades to come.

See page 3 of BIS’s FY2026 budget request, which says: “These threats have significantly increased the scope of BIS’s work to include a significant increase in exports under BIS licenses, and export enforcement officers handling on average 26 cases (and another 19 leads) per agent (far above comparable agencies).”

The case for paying whistleblowers to report on export violations

Erich Grunewald — Wed, 28 Jan 2026 14:38:29 GMT

The US has a massive export enforcement problem. It’s likely that over 100,000 export-controlled AI chips were smuggled into China in 2024. To give a sense of scale, the xAI Colossus cluster in Memphis, Tennessee, comprised first 100,000 and later 200,000 AI chips. That’s roughly an xAI Colossus cluster being smuggled to China each year. The main reason we know this is that smugglers are so unafraid that they’re willing to talk about their operations to journalists; this has happened repeatedly during the past year and a half.

AI chip smuggling is far from the only enforcement problem. In 2024, Huawei got TSMC to illegally fabricate over two million of its AI chip dies through front companies, despite sanctions. That is a far larger quantity than the number of Huawei AI chips fabricated domestically in China that year. We’ve also seen likely violations related to high bandwidth memory and semiconductor manufacturing equipment, which help China make its own AI chips to compete against NVIDIA.

What if you could pay insiders many millions of dollars to inform US authorities about such violations, at almost no cost to the US government? That would likely surface a large number of high quality tips about important violations, which would greatly aid US authorities in detecting, punishing and deterring such violations.

This idea may sound outlandish, but it’s actually possible. In fact, there is a law being discussed in Congress that would accomplish exactly this! But before we get there, let’s take a brief detour to the Securities and Exchange Commission (SEC) and the 2008 financial crisis.

The SEC whistleblower program

In 2008, the US economy was reeling from the housing and mortgage crisis. Late that year, Bernie Madoff sat down with his two sons and admitted to them that the investment business he’d been running for two decades was a giant fraud, a Ponzi scheme to end all Ponzi schemes. Because of these two crises, there was a desire among policymakers to strengthen financial regulation and oversight.

Related to the Madoff scandal in particular, the SEC was under criticism for failing to properly investigate several credible reports about it. An employee at a rival investment firm, Harry Markopolos, had been asked by his employers to figure out how Madoff could post such consistently excellent returns, and soon realized that the returns were impossible with Madoff’s claimed strategy. Markopolos later said in an interview: “I read his strategy statement, and it was so poorly put together. His strategy as depicted would have trouble beating a zero return, and his performance chart went up at a 45-degree line: that line doesn’t exist in finance, it only exists in geometry classes.”

Markopolos sent reports to the SEC detailing Madoff’s fraudulent activities on multiple occasions before the 2008 financial crisis. However, the SEC failed to properly investigate these reports, leaving Madoff free to continue defrauding investors until the financial crisis made its collapse imminent. Lawmakers realized that reports of wrongdoing from the general public could be a valuable tool for detecting and deterring securities laws violations.

One result of this, signed into law in 2010 as part of the Dodd-Frank Act, was the SEC whistleblower program.1

The SEC whistleblower program works like this. First and most importantly, whistleblowers get 10-30% of any penalty resulting from their report. This can be many millions of dollars—the largest reward to date, paid out in 2023, was nearly $280 million. This monetary incentive is paired with protections against retaliation from their employers, confidentiality guarantees, and the ability to make reports to the SEC anonymously through an attorney. To pay out whistleblower rewards, the Dodd-Frank Act also sets up an Investor Protection Fund, which receives penalties from securities violations (previously these would go to the Treasury).

The SEC whistleblower program is widely considered to have been an enormous success. It’s now one of the key ways that securities law is enforced in the US. It has helped generate $7.3 billion to $22 billion in penalties since its inception in 20112, and has received reports from at least 130 countries. Quantitative evaluations are rarer, but existing research suggests it has reduced financial reporting fraud, deterred insider trading, and caused companies to strengthen compliance programs.

The Stop Stealing Our Chips Act

Now the question is, could you adopt the SEC whistleblower program model for the Bureau of Industry and Security (BIS) and export violations? The Stop Stealing Our Chips Act—introduced into the Senate in April 2025 by Senators Rounds (R-SD) and Warner (D-VA) and into the House a month and a half ago by Representatives Kean (R-NJ) and Johnson (D-TX)—would do exactly this.

The Stop Stealing Our Chips Act (henceforth, SSOCA) is closely modeled on the SEC whistleblower program, with some changes to adapt it for the export enforcement situation. It too offers whistleblowers 10-30% of any resulting penalty along with whistleblower protections, including the possibility of making anonymous reports to BIS.

The first thing to note here is the financial incentive. As economists always tell us, financial incentives are incredibly powerful, and the fines for these violations can be enormous:

There have been several news reports of operations involving on the order of 10,000 smuggled AI chips, meaning roughly $400 million worth. BIS can fine up to twice the value of the related transaction, so that could be a penalty of $800 million, for just one smuggler who spoke to the news media. If a whistleblower reports on that, they could get up to 30% or $240 million (leaving $560 million for the US government).
The massive TSMC-Huawei violation—which was only detected when an independent organization did a teardown of a Huawei chip—could reportedly result in a $1 billion fine. This would have been up to $300 million for an informant.

Beyond catching violations, a well-publicized program could have significant deterrent effects. If everyone in a supply chain—sales reps, warehouse workers, freight forwarders, accountants—knows that reporting can yield millions, violators face a much riskier environment. This effect could be realized even before the whistleblower program comes into effect, as the law would allow whistleblowers to report on violations that occurred before it was signed into law.

Would the BIS program actually surface any tips?

All right, hundreds of millions of dollars is a strong incentive. But, you may ask, are there actually people with information about these violations who would be willing to step up and blow the whistle? Why, yes there are!

Take AI chip smuggling operations: these involve lots of people who could potentially file reports, in other words people who have relevant information and would like to get millions of dollars. This includes, for example, people working in sales at exporters with questionable compliance practices; employees at local resellers, freight forwarders, logistics companies, warehouses, or data centers where the chips are temporarily housed; and accountants and lawyers.

In March 2025, Singaporean authorities arrested three people for smuggling $390 million worth of AI servers. These arrests were the result of an “anonymous tip-off”, in other words a whistleblower report! It seems likely that the recent Operation Gatekeeper arrests of an AI chip smuggling ring operating out of Texas and New York were also the result of an insider tip.

The story is similar for the TSMC-Huawei violation, where there were likely many TSMC employees who could’ve known about this problem and informed the US government. The only reason the US ultimately found out about this violation was because an independent party—TechInsights—did a teardown of a Huawei AI chip, and noticed it was TSMC-fabricated. A BIS whistleblower program would likewise incentivize such actors to look for evidence of violations and report those to the US government. This type of information is hugely valuable; it makes no sense to sit around and wait for people to offer it out of the goodness of their hearts.

As with the SEC program, the SSOCA makes foreign nationals eligible for rewards. This is important because many export violations happen in third countries, where goods are diverted via reexport or transshipment. (The SSOCA does however wisely make some exceptions for known terrorists and sanctioned persons, who are not eligible for rewards.) This is similar to how the intelligence community pays foreign informants, who provide the US government with information that benefits US national security.

Would BIS be able to run the program?

At this point, the wise reader will ask, “Isn’t BIS extremely resource constrained? If so, how is it supposed to process and investigate a bunch of incoming tips, determine awards, and carry out outreach on the program?” After all, BIS’s budget for enforcement has been essentially flat when accounting for inflation for at least the past five years (see figure), despite BIS receiving a vastly increased scope of responsibilities due to the AI chip export controls introduced in October 2022 and the Russian invasion of Ukraine and all the diversion related to that conflict.

Good question! The answer is that these activities—investigating whistleblower reports, determining awards, and carrying out outreach—would also be financed through incoming penalties. This is one of the most notable differences between the SSOCA and the SEC program. The SSOCA authorizes BIS to use money from penalties for a few additional purposes and not only for paying out rewards to whistleblowers. (As currently written, the SSOCA would only allow BIS to receive money from penalties that stem from whistleblower reports, but I think this should be expanded to cover all penalties for BIS-related violations.)

Today, any fine levied by BIS goes straight to the Treasury, or in rare cases it is earmarked for some specific fund, such as the Crime Victims Fund. What the SSOCA would do is redirect these to an Export Compliance Accountability Fund. This Fund would be used to pay rewards to whistleblowers; any money left over would go first to core functions of the BIS whistleblower program, and then to export enforcement activities more broadly.

There is a separate but related question of how the program would be funded initially, if it’s mainly intended to be funded through penalties. However, BIS already has a fairly steady stream of enforcement actions, including from likely insider tips, without any whistleblower incentive program (see figure). BIS may also be able to direct some of its appropriated resources to the program in the first one or two years, in order to get it up and running.

It seems likely that this program would pay for itself if implemented. That’s because the program would likely help BIS detect more violations and therefore levy more penalties than it would without the program. It could well end up both reducing the number of violations and also generating additional revenue for the federal government by making it much more likely that violations are detected and enforced. The losers here would be the smugglers and other bad actors who wake up every day trying to figure out ways of harming US national security.

BIS’s entire budget for fiscal year 2025 was about $191 million, likely far smaller than the collective profits of AI chip smugglers alone, which may well have exceeded $1 billion. A single successful enforcement action against a major smuggling operation could pay for BIS’s entire annual budget—for example, last month US authorities arrested three individuals accused of smuggling AI chips worth $160 million to China, which could result in a penalty of $320 million. There are likely dozens of such cases remaining to be discovered. A BIS whistleblower program like the one described in the SSOCA could create a virtuous cycle where more tips and better enforcement lead to more penalties and rewards, which in turn leads both to more tips by publicizing the program and also more resources for enforcement.

There was also a CFTC program established at the same time, but it is less well known so I just discuss the SEC program here.

The SEC has awarded more than $2.2 billion to whistleblowers since the program’s inception. In the extreme case of all those awards being 10% of the related penalty, that would imply $22 billion in penalties. In the other extreme case of all those awards being 30% of the penalty, it would imply $7.3 billion.

For chip exports, quantity is at least as important as quality

Onni Aarne — Tue, 27 Jan 2026 21:28:45 GMT

The Trump administration’s decision to sell NVIDIA H200s to China, as codified in a recent rule we previously wrote about, has received a lot of criticism. While the specifics of the rule are interesting, I want to step back a bit and analyze what is actually wrong with the core argument that the administration is making. Because there is, in fact, a logically valid argument to be made in defense of this policy, which I’ll call the quality-based approach to export controls.1

The structure of the administration’s argument is something like this:

The US needs to dominate China in AI.
China’s AI capabilities are bottlenecked by the quality of AI chips they have access to; the highest quality chips can only be made in Taiwan.
But preserving US2 chip companies’ market share in China is important for maintaining the hardware lead.
Therefore, the US should allow the sale of chips that are just slightly better than what they can make domestically, but no better than that.

This might seem logical: What point would there be in blocking sales of chips similar to what Huawei and other Chinese companies can make domestically anyway? And letting Chinese AI companies buy chips that are a bit better than Huawei chips helps draw revenue away from Huawei. What’s not to like?

One can of course quibble about the fact that the H200 is not “similar” to Huawei chips, and it indeed offers, for example, as much as 2.5 times the inference performance of the best Huawei chips.3 But the deeper issue with the argument is that a key premise is false: China’s bottleneck is not primarily about chip quality. They’re primarily bottlenecked by chip quantity, i.e., total compute.

Quantity can make up for quality

It’s worth understanding what chip “quality” even consists of. AI training and deployment ultimately boils down to doing an astronomical number of multiplications and additions. A “better” chip is generally just one that can do more of these basic calculations (measured in “floating point operations per second” or FLOP/s) while using less power. The other main consideration is how many numbers the chip can hold in “memory” at once, and how quickly it can move numbers (input data and results) onto and off the chip.4

For highly parallel workloads like AI, any “higher-quality” chip can be replaced by a sufficiently large number of lower-quality chips. The software engineering required to make a large workload work on a larger number of individually-weaker chips is more challenging, and the system will use a lot more power and networking, but it can be done. Huawei’s CloudMatrix 384 system demonstrates this: By linking 384 Ascend 910C chips together, it achieves 300 PFLOPS of dense BF16 compute—almost double the performance of NVIDIA’s GB200 NVL72—despite each individual Ascend being only about one-third the performance of a Blackwell chip. The tradeoff is power: CloudMatrix consumes 2.6x more watts per FLOP than the GB200. But China is well placed to compensate for these quality limitations: It has 1.6 times as many software engineers as the US and added more than 10 times as much new power capacity as the US in 2024.5

China’s chip problem is a quantity problem

While Huawei’s chips are lower quality and more expensive to produce than chips manufactured in Taiwan, China’s main problem is chip quantity: The US has approximately 5-10 times more total AI computing capacity installed than China,6 and the US is projected to produce roughly 50 times as much AI compute as China in 2026.7

China has a chip quantity problem because it has a chip production problem, driven by two key bottlenecks created by export controls. First, US-led restrictions on semiconductor manufacturing equipment (SME) have limited China’s ability to produce advanced logic chips domestically. Second—and increasingly important—is high-bandwidth memory (HBM), the specialized memory critical to AI accelerators. The Biden administration’s December 2024 controls specifically targeted HBM and the equipment needed to produce it. Chinese memory maker CXMT is several generations behind South Korean industry leaders. HBM is now the binding constraint: SemiAnalysis estimates that while SMIC has capacity to produce logic dies for over a million Ascend chips, CXMT will only be able to manufacture enough HBM for 250,000-300,000 Ascend 910Cs in 2026. Any additional exports of US AI chips would directly alleviate this chip supply bottleneck, even if the chips were no better than Huawei’s alternatives.

The right move is to minimize the quantity of AI chip exports

Some readers may be screaming at their screens that the Trump administration’s new rule does in fact have a chip quantity restriction. And this is true: Sales of any one model of chip are capped at half the volume of that chip that has been sold in the US. However, in the words of Trump himself, the main motivation behind the policy change was that the chips in question are “not the highest level”, i.e., lower quality. The quantity restriction appears to have been a valuable but insufficient safeguard tacked on at the last minute.

If the administration instead held fast to an approach focused on minimizing the quantity of AI compute that China can access, it could plausibly expand its current 5-10x compute advantage by as much as another order of magnitude over the coming years: IFP estimates that the US could add as much as 21 to 49 times more compute than China in 2026 if no US AI chips are exported. Such a gap would be strategically decisive: with 20x less compute, Chinese companies would struggle to train frontier models, as even matching a single leading US training run would likely require concentrating their entire national AI compute capacity on one project for an extended period. Chinese AI companies are already struggling to meet domestic and international demand due to compute constraints, and are consequently falling further behind because they cannot divert compute toward R&D and experimentation.8 The international market, and perhaps eventually even the Chinese market, would be dominated by US AI companies.

The impact of a quality-based approach depends on Chinese spending

So how would the impact of a quality-based approach compare to a quantity-minimizing approach? As discussed, quality is largely fungible with quantity, so I will focus on what a quality restriction would mean for total AI compute capacity9 in China versus the US.

A quality-based approach would endorse selling a chip similar in price-performance to the best Chinese chips, i.e., the Huawei Ascend 910C. This would likely be roughly a third to half the price-performance of Blackwell-generation chips.10

The overall impact of this depends heavily on how much Chinese AI companies are willing and able to buy. In principle, a total absence of quantity restrictions could allow Chinese companies to catch up to the US by simply outspending them by a factor of two, but this is of course unlikely in practice.

One of the most concrete pieces of information we have about potential Chinese chip spending is Jensen Huang’s claim that Chinese companies have ordered two million H200 chips, worth about $54 billion. This is about 30% of projected US hyperscaler chip spending for 2026,11 suggesting that Chinese willingness-to-spend is not greatly affected by somewhat lower price-performance. As an extremely rough guess, this would suggest that Chinese companies would spend somewhere between $30 and $50 billion on these Ascend-equivalent chips, resulting in approximately a 5-15x US advantage in terms of compute added in 2026.12 This would not be catastrophic, but it is still several times worse than the alternative of 21-49x.

The quality-based approach also leaves the door open to very concerning worst-case outcomes, especially if the US AI industry faces significant headwinds. To sketch an example scenario, it is plausible that Chinese companies will be much more successful at attracting users to their models, perhaps because the regulatory environment in the US turns hostile to AI. This could also coincide with US financial markets becoming disillusioned with AI, as they briefly became disillusioned with internet companies after the dot-com bubble. In such an environment, US companies may also struggle to obtain permits for new data center construction, even if the financing were there. If so, Chinese AI companies supported by state subsidies may be able to outspend and outbuild US companies, plausibly overcoming a 2x cost-effectiveness penalty to take the lead and start competing internationally.13

The CCP will not let Huawei fail

But at least the quality-based approach would suppress Chinese domestic AI chip production, because Chinese companies would stop buying Huawei, right? This might be the case if the CCP were free market absolutists, but alas, they are not: As even the administration’s White House AI & Crypto Czar David Sacks—a driving force behind the H200 policy—has acknowledged, China can and will “outfox” this strategy by mandating that Chinese AI companies also buy domestically made chips. They can calibrate these requirements to precisely match domestic production capacity, ensuring Huawei never lacks for customers regardless of how many US chips are available.

The reality is that semiconductor self-sufficiency has been a core CCP strategic goal at least since the Made in China 2025 roadmap, which was laid out in 2015, long before export controls. That roadmap set a target of 80% domestic market share for “high-performance computers and servers” by 2025.14 The repeated whiplash of US policy has only reinforced Beijing’s conviction that building an indigenous supply chain is a strategic necessity.

There is still time to change course

The administration’s recent rule falls between these two extremes: It puts a cap on the number of chips that can be sold to China while still allowing very significant quantities of exports. Under the current 50% rule, the cap sits at 900,000 Hopper-equivalent chips and rising over time—”over twice what China is expected to produce this year”. This would result in a US compute advantage of approximately 9-10x in 2026. The quality of the chips is also substantially higher than a strict quality-based approach would recommend. But the policy will still be less damaging than exports without any cap on export volume.

Fortunately, few if any chips have yet been shipped to China, and the administration is free to change its mind at any time. If it does, it would likely secure enduring US dominance in AI, plausibly causing a permanent collapse of the Chinese AI industry. But the details of how that would happen will be a post for another day.

Yes, I know, the rule also has a quantity restriction component. I'll get to that.

Throughout this piece I will loosely talk about “US chips”, but more precisely what I’m referring to are the overwhelming majority of AI chips that are designed and sold by US companies like NVIDIA, AMD, and Google, but are manufactured AKA fabbed in Taiwan by TSMC. These chips are subject to US export controls even if they never touch US soil, because of something called a “foreign direct product rule”.

Based on inference performance benchmarks showing the H200 delivers approximately 1.9x H100 performance while the Ascend 910C delivers approximately 0.6x H100 performance. At estimated prices of ~$32,000 for the H200 and ~$26,000 for the Ascend 910C, the H200 provides roughly 2.5x more inference performance per dollar.

This is more formally referred to as memory bandwidth (between the chip and its memory) and interconnect bandwidth (between the memories of different chips).

China’s total installed power generation capacity reached 3.35 TW at end of 2024, up 14.6% year-on-year, implying approximately 427 GW of new capacity added. By comparison, the US added approximately 30 GW of net new generating capacity in 2024.

In Epoch AI’s supercomputer dataset, the US holds approximately 75% of global GPU cluster performance while China holds approximately 15%, a ratio of roughly 5:1. Other estimates, such as Lennart Heim's, suggest a ratio closer to 10:1. The discrepancy may reflect different methodologies and definitions of "AI compute capacity", and limitations in Epoch’s coverage.

According to IFP’s analysis, which draws on SemiAnalysis and other sources, US AI chip production is projected to reach 6,890,000 B300-equivalents in 2026, while Huawei production is estimated at only 62,000-160,000 B300-equivalents—roughly 1-2% of US production. SemiAnalysis projects Huawei could produce ~805,000 Ascend units in 2025, but notes that HBM memory shortages will likely constrain actual output to 250,000-300,000 Ascend 910C units in 2026.

Experiments are estimated to make up a very large fraction, possibly a majority, of US AI companies’ compute use.

Measured in terms of FP8/s, following IFP’s analysis.

IFP analysis estimates the H200 achieves approximately 70% of the B300’s price-performance at FP8. Combined with footnote 3’s finding that the H200 is ~2.5x more cost-effective than the Ascend 910C, this implies the Ascend 910C achieves roughly 28% of B300 price-performance (0.70 / 2.5 ≈ 0.28). Separately, SemiAnalysis notes each Ascend 910C has “only one-third the performance of an NVIDIA Blackwell” chip. Given that Ascend chips are also cheaper (~$26K vs ~$53K for B300), this suggests price-performance of roughly 50% of Blackwell. Taken together, “one-third to half” is a reasonable if slightly generous estimate.

US hyperscaler AI infrastructure capex is projected to exceed $600 billion in 2026, with roughly $180 billion specifically on GPU/accelerator purchases.

Rough calculation assuming quality-based chips have 30-50% of Blackwell price-performance. Domestic production valued at 62-160K B300-equivalents × ~$53K = $3-8B (see footnote 7).

Lower bound ($30B spending, 0.3 price-performance): $30B × 0.30 = $9B Blackwell-equivalent, plus ~$3B domestic production = $12B total; US advantage = $180B / $12B ≈ 15x.
Upper bound ($50B spending 0.5 price-performance): $50B × 0.5 = $25B Blackwell-equivalent, plus ~$8B domestic production = $33B total; US advantage = $180B / $33B ≈ 5x.

Some combination of ingenuity and industrial espionage may also help Chinese AI companies make up for inferior compute with improved algorithmic efficiency.

The self-sufficiency targets come from the Made in China 2025 “Green Book” technology roadmap. See CSET’s English translation: Roadmap of Major Technical Domains for Made in China 2025, p. 8.

This is The Substrate

Erich Grunewald — Tue, 27 Jan 2026 20:41:46 GMT

Welcome to The Substrate. The Substrate is a newsletter about compute and AI hardware, security, semiconductor manufacturing, and the geopolitics of these.

Should the US sell advanced AI chips to China? How can the US better enforce its export controls? How can we design and build highly secure data centers? When will China develop its own extreme ultraviolet photolithography machines? Can we design governance mechanisms into chips securely, for example to verify international agreements? How exactly does compute benefit nations anyway? — these and many others are questions we’re interested in.

The world is not about to run out of AI-related Substacks, so what’s different about this one? Let’s first stake out our fundamental beliefs, such as they are. A lot of people have a lot of opinions about AI, but here are some things we tend to believe:

AI is very likely the most important technology of our time
A world with very powerful AI could be very good or very bad
Good policy can help secure a more positive future with powerful AI
Compute is one of the most important resources of our time, and will likely remain so
Compute is a strategic resource that will see intense geopolitical competition
Compute can be used to better govern AI

These claims, if true, raise a lot of very important and very interesting questions about policy and strategy, and these are the questions we want to explore with this newsletter. We’re calling it The Substrate because compute is the substrate that AI runs on, silicon wafers are the substrate that circuits are etched onto, and (more poetically) the underlying structures are the substrate that surface phenomena emerge from.1

And who are we? The Substrate is run, and mainly written, by the compute policy team at the Institute for AI Policy and Strategy (IAPS). (Caveat: The opinions expressed in this newsletter are solely the authors’ own, and not indicative of any institutional stance of IAPS.) IAPS is a think tank whose mission is “securing a positive future in a world with powerful AI”, and ultimately that is our mission too. But mostly we’ll just write about things we find important and interesting, because we are betting that you also find those things important and interesting.

Most of us have worked on these topics for years. We have previously written reports on, among other things:

Hardware-enabled mechanisms, such as delay-based location verification to combat AI chip smuggling and flexible hardware-enabled guarantees for future treaty verification
AI chip smuggling into China
AI chip making in China
AI data center security

But much of our past research and writing remains unpublished. With The Substrate, we hope to publish more things more quickly, including takes that are more tentative than what we’d put in a long report, but that we think can benefit from discussion and feedback even in that tentative form. We also want to write commentary on ongoing events, summaries of our longer research reports, and more.

Our first substantive post is written by Onni and argues that for chip export controls, quantity is at least as important as quality. After that, we will publish a post by Erich on the Stop Stealing Our Chips Act—which would introduce a whistleblower incentive program for export violations—and a post by Max on what the Bureau of Industry and Security needs money for.

And also because we are not afraid of conflation with semiconductor start-ups. But just to be absolutely clear, we are a not-for-profit newsletter and have no relation to the start-up called Substrate (without the definite article).