BIS should use AI to control AI chips
For BIS to truly stamp out smuggling, it needs to take advantage of the AI capabilities it’s trying to control.
The Bureau of Industry and Security (BIS) needs more enforcement capacity. It recently came to light that several Super Micro employees illegally moved $2.5 billion worth of export-controlled AI servers to China over two years. They routed these servers through a front company in Southeast Asia and constructed hundreds of fake servers to fool physical inspections by the manufacturer and by BIS. BIS is doing the best it can, but an agency with only a few hundred employees and a budget one-tenth the value of those smuggled servers can only do so much.
There are well-known solutions to this problem. I’ve written about how BIS is finally getting funding to hire more agents, and how upgrading BIS’s software and data systems to match private-sector capabilities could be a force multiplier for enforcement. Those steps are good, but incremental. In this post, I propose something more ambitious.
After all, the entire reason BIS is straining to stop China from acquiring AI chips is the prospect of AI revolutionizing military, economic, and political power. So shouldn’t BIS be using AI to revolutionize export enforcement?
What LLMs can’t fix
The most relevant form of AI for BIS today is likely large language models (LLMs) and the agents that are built on them. It’s worth first laying out what these cannot do.
LLMs can’t accelerate processes bottlenecked by human review. When BIS adds companies to the Entity List, an analyst at the BIS Office of Enforcement Analysis must first spend days or weeks writing an Entity List package. That package contains information about the company to be added: its subsidiaries, addresses, products, alleged bad behavior, and any expected economic blowback from the addition. The point of the package is to summarize all relevant information about a company in one place, so that when the End-User Review Committee, with representatives from the Departments of Commerce, State, Energy, and War, votes on the package, it has all the information it needs.
I think LLMs could speed the process of writing Entity List packages from “days to weeks” to “minutes to hours”. To the extent that the task involves searching the internet and relevant internal records for information about the company and summarizing it in a specified format, the deep research features of commercial LLMs are already quite useful. LLMs might even be better than human analysts in some regards, because they are fluent in many languages and don’t get bored reading endless shareholder reports. But if LLMs enable BIS to generate hundreds of packages each month, this will create a bottleneck for the End-User Review Committee, as its members must read through and evaluate thousands of pages.
LLMs can’t analyze data that BIS doesn’t have. BIS has detailed data on, for example, US exports, and it’s plausible that LLMs could help analyze that data in much the same way a human analyst could: by noticing when shipments are going to economically irrational destinations, or when they’re sitting in warehouses for implausibly long periods. If so, LLMs would be like human intelligence analysts, but at much greater speed and scale—rather than being constrained by staffing limitations to vet only the most suspicious transactions, BIS-controlled LLMs could vet every transaction in real time and escalate to human analysts as needed.
However, LLMs cannot analyze data that BIS does not have. BIS’s ability to understand trade flows beyond US borders relies on a mixture of clandestine methods, commercial datasets, and cooperation from foreign governments. If LLMs were to massively increase BIS’s analytical throughput, BIS may need to acquire a lot of additional data to keep those LLMs chewing on something. BIS would also need to set up the IT infrastructure to connect all these data sources.
What LLMs can fix
Having described what LLMs can’t fix, let me lay out what I think they can fix, and why BIS should invest in them.
LLMs are great at software engineering and data science. I’ve written about how BIS should improve its software and use more data science. LLMs could make these projects so much cheaper and easier!
On the data science side, LLMs reduce the need to learn programming languages like SQL and Python for analyzing large datasets. Right now, BIS relies on a tiny number of trained data scientists, many of whom are contractors, to answer data questions about exports and licenses. This limited capacity means that only the most important questions get answered, and some questions never get asked at all. LLMs would be like a trained data scientist sitting at the desk of every enforcement analyst, all the time, ready to query the data in any way they needed.
On the software engineering side, LLMs could fundamentally reimagine how organizations procure software. The idea that the government needs to choose a software solution for an agency, a department, or even the whole government, and then spend a lot of money to buy it, rests on the premise that software is expensive and software engineers are scarce, which is now becoming false. LLMs are probably still not the right choice for systems that need extremely high reliability or security, or for massive agency-wide overhauls, but by dramatically lowering the cost of software, they massively increase the number of applications it can serve. Every office could build custom software workflows for its own needs—converting public comments from PDFs to Word docs, or generating Federal-Register-formatted lists of addresses from Excel spreadsheets—at little more cost to the government than the tokens burned. LLMs can already automate any task that a Python script and a small server could handle today.
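As a sketch of how cheap this kind of one-off tooling could be, here is the spreadsheet-to-Federal-Register example as a short Python script. The input columns and the single-line entry format are simplified stand-ins of my own invention; real Entity List notices carry more fields, such as aliases and license requirements:

```python
import csv
import io

# Toy input standing in for an exported spreadsheet of addresses.
SPREADSHEET = """name,street,city,country
Example Trading Co.,12 Harbor Rd,Port Klang,Malaysia
Sample Electronics Ltd.,88 Nanjing Rd,Shanghai,China
"""

def to_federal_register_entries(csv_text: str) -> list[str]:
    """Collapse each spreadsheet row into a single-line
    'Name, Street, City, Country.' entry (simplified layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f"{row['name']}, {row['street']}, {row['city']}, {row['country']}."
        for row in reader
    ]

for entry in to_federal_register_entries(SPREADSHEET):
    print(entry)
```

A script like this is throwaway by design: an analyst could ask an LLM to generate it, check the output against a few known rows, and discard it when the task is done.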
LLMs could turn every employee at BIS into a software engineer and a data scientist, allowing each office and even each person to build tools suited to their own workflows and needs.
LLMs are great at internet research. LLMs are not great at research taste—that is, knowing which questions are worth asking or which problems are worth working on. Nor are they great at operationalization, like turning a vague instruction (“tell me what’s going on with Huawei these days”) into specific Google searches. But they are fantastic at googling things and writing about them. A lot of open-source analysis is just putting certain Chinese characters into Google until you find what you are looking for. LLMs can do that with incredible speed and scale, in any language you please.
This capability can be useful not only for writing Entity List packages, but also for another common BIS task: “Leadership saw a headline about this thing. Write a two-pager explaining it.” By using LLMs for this task, the role of the analyst would shift from providing mechanical effort (the ability to google lots of things and write about them) to providing judgment in the form of context, taste, and verification.
LLMs are great at answering short-form science and engineering questions. LLMs are fantastic at answering short questions about science and engineering, as illustrated by progress on benchmarks like GPQA, MMLU, and Humanity’s Last Exam. BIS has always struggled to hire and retain technical experts, because private-sector jobs pay far more and offer a better quality of life. But LLMs will happily explain the difference between extreme ultraviolet lithography and deep ultraviolet lithography, or between different types of side-channel attacks. Unlike a real technical expert, they will answer an unlimited number of follow-up questions, with infinite patience.
This capability can help enforcement analysts answer questions like: “What does this electronic component that the intelligence says someone is transporting actually do?” Such answers would help licensing officers and license applicants better understand how items should be classified and what their technical capabilities practically mean. They would also help policymakers write better export control rules by providing a clearer understanding of what the underlying technologies can do and how they fit together.
The administration is already aware that AI can assist with many government tasks. The AI Action Plan urged agencies to “Accelerate AI Adoption in Government” and outlined specific actions that government service providers, such as the Office of Personnel Management and the General Services Administration, could take to enable AI adoption. Similarly, an Office of Management and Budget memo published in April 2025 called on agencies to “remove barriers to innovation”, “empower AI leaders”, and “ensure their use of AI works for the American people”. The next step is for BIS to heed the Action Plan’s call and start realizing the benefits.
BIS should get ready for AI agents
Everything described above can be done with existing commercial capabilities, such as Claude Code or OpenAI Codex. I intend to show this more rigorously by constructing more formal evaluations, but I don’t think it requires any premise beyond frontier models having the capabilities they have today.
But I think BIS should be thinking more ambitiously than that. The time horizon of certain software engineering, machine learning, and cybersecurity tasks that frontier models can complete continues to grow rapidly. I believe that by the end of 2026, frontier models with sufficient scaffolding will be able to complete tasks across many domains that take humans three to four days. (They already can in some laborious domains that they are well suited for, like translation.)
Rather than a working-level analyst writing an intelligence report and asking a model to “write one paragraph about what types of vacuum cleaners this company makes”, an office director could tell an agent: “Write intelligence reports about these five companies, decide which ones pose a threat, and then write Entity List packages about them.” If AI agents are sufficiently trustworthy (another reason it’s important to build highly specific, formal evaluations), they could even automate some of the human-review bottleneck.
For BIS to be ready for more capable AI agents as they arrive, it needs to set guardrails around agent deployments, determine how best to integrate agents into classified systems, and reimagine laws and procedures for an agentic world.
Setting guardrails for AI agent deployments
Guardrails for agent deployments should be built into the software so that agents are unable to do anything they are not authorized to do. The “principle of least privilege” is an old concept in security, but the advent of agents capable of rapidly doing irreversible damage makes it far more urgent. I propose three types of guardrails for AI agents used by BIS.
First, AI agents should have data guardrails. Each deployed agent instance should have a well-defined purpose—for example, analyzing trade data for anomalies—and access only to the data required for that purpose. Trade data analyst agents should not have access to employee email inboxes. Humans should serve as the “air gap” between agent outputs and the ability to, say, publish to the BIS website or email the Secretary of Commerce.
Second, AI agents should have action guardrails. An agent whose purpose is to monitor a political appointee’s email inbox should not be able to execute code. An agent whose purpose is to patch software vulnerabilities should not be able to read case files. One worrying dynamic is that, as AI agents’ cyber and software engineering capabilities improve, their ability to escalate their own privileges may also improve (and this already appears to be happening).
Third, AI agents should have decision guardrails. There may be some actions where, even if the quality of agent decisions is demonstrably higher than that of human decision-makers, it would still be unacceptable, for moral, political, or safety reasons, for AI agents to make decisions without human approval. This guardrail is especially important for BIS, since it is a law enforcement agency that implements part of the US government’s monopoly on legitimate force. AI agents should never be able to order arrests or take any other action that could violate a legal right to due process.
Integrating AI agents with classified systems
Many of the most valuable tasks AI agents can perform would require access to classified systems. Much of the intelligence BIS relies on to catch smugglers comes from the Intelligence Community or other federal law enforcement agencies and is shared on classified systems. For agents to operate as turnkey autonomous intelligence analysts, they would need to access not only the open internet and commercial trade data but also classified sources. They might also need to operate across classification boundaries, as human analysts do—for example, finding the website of a company named in a signals intelligence report.
The upside of getting this right is enormous. An AI agent with access to both classified intelligence and open-source data could do in minutes what currently takes a human analyst days—for example, cross-referencing a tip from a foreign partner about a suspicious shipment with commercial trade data, satellite imagery, corporate registration records, and social media posts in three languages, and then producing a finished assessment ready for human review. Today, that kind of all-source analysis is bottlenecked by the tiny number of analysts who have the right clearances, the right training, and the bandwidth. AI agents wouldn’t replace those analysts, but they could give every enforcement team the kind of all-source reach that today is reserved for the highest-priority cases.
However, classified-agent deployments also carry serious risks that BIS needs to plan for. An agent operating across classification boundaries could, through a hallucination, a prompt injection, or a misconfiguration, move classified information onto an unclassified network, leaking intelligence at a scale and speed no human analyst could match. BIS should begin with agents limited to either classified or unclassified networks, retaining humans as the only bridge between those worlds. It should also engage with the Department of War to learn lessons from its own use of models on classified systems.
Reimagining the law for an agentic world
Some laws and regulations assume the time it takes to act will serve as a functional check on that action. For example, when the US government imposes tariffs under Section 232 of the Trade Expansion Act of 1962, BIS is required to prepare a report, often running to hundreds of pages, describing its analysis of the relevant industry, the national security threat posed by imports, and proposed remedies. The statute does not specify a minimum timeline but does specify maximum timelines. BIS must complete its report within 270 days of the investigation’s initiation, and the President must decide whether to act on the report within 90 days.
The current administration has initiated more Section 232 investigations than any other in recent memory, and has taken extraordinary measures to accelerate them. Based on the Section 232 reports listed on the BIS website, it initiated about 12 investigations in 2025, or one per month on average. For comparison, the Biden administration initiated only one investigation, into rare earth magnets, while the first Trump administration initiated seven investigations (the first since 2001).
In a world of capable AI agents, it is easy to imagine an administration initiating one Section 232 investigation per day. In some ways, this is good. The American people elect presidents to implement their political will. If that will is tariffs on lots and lots of things, and AI agents enable the President to carry it out better, that may be good.
However, it is not clear that when Congress wrote statutes like the Trade Expansion Act (or its now much more notorious companion, the International Emergency Economic Powers Act), it contemplated an administration able to act with incredible speed. The months spent preparing a Section 232 report provide time for companies to file lawsuits, voters to weigh in during midterm elections, and outside parties to register an opinion on the action. The Trade Expansion Act and other statutes may need to be updated to add a minimum reporting timeline or a maximum number of investigations per year.
Section 232 is just one example, but the broader point applies across everything BIS does: AI agents will compress timelines, increase throughput, and strain processes designed around human operators. BIS needs to be ready for that, not just as a regulator of AI but also as a user of it.