What Would an FDA for AI Look Like?
Private incentives built the models; public consequences now demand a regulator
April 8, 2026
The United States created the FDA because the market, left to its own devices, had shown itself far more gifted at invention than at restraint. By the time the harm could be counted, described, and photographed, the sale had already been made, the seller had already moved on, and the public had already been left to absorb the consequence.
This is the part people tend to forget when they talk about regulation as though it were an unfortunate vine winding itself around the clean architecture of innovation. The state did not descend upon some pristine order and spoil it. It arrived after the poisonings, frauds, and dead children. The FDA was not, in that sense, an abstraction; it was a public decision that some products were too intimate with the body, too consequential in their effects, and too shielded by asymmetries of expertise and advertising to be governed by marketing copy and private assurances alone.
Early federal food and drug law was weak, and enforcement weaker. The 1906 Food and Drugs Act prohibited interstate commerce in adulterated and misbranded foods and drugs, but it still left government largely chasing deception after the fact, after the product had entered circulation and after the burden had already shifted onto the public. The deeper change came in 1938, after Elixir Sulfanilamide killed more than a hundred people (many of them children). For the first time, manufacturers were required to demonstrate a drug's safety before marketing it. That was the essential break. The public would no longer be the first large-scale testing ground by default.
The same logic hardened again in 1962, when the thalidomide disaster exposed the inadequacy of a system that asked too little and trusted too much. The result was a stronger regime, one that required evidence not only of safety but of effectiveness, and that gave the law something closer to teeth. What emerged over time was not simply an agency that "approves drugs," but a public apparatus for turning claims into evidence, deciding what must be tested, what counts as proof, what must be disclosed, and who has the authority to say "not yet."
That phrase matters more than it first appears. Markets dislike "not yet" because "not yet" interrupts the smooth sequence by which a product becomes a launch, a launch becomes a norm, and a norm becomes the thing no one can quite remember having chosen. But for the public, "not yet" is often the only protection that arrives before the damage does.
This is why the FDA matters as an analogy for AI, and also why the analogy is better than it sounds. The point is not that models are drugs, or that software can simply be poured into an inherited regulatory mold; the point is that AI has entered the same broad political condition that dangerous products always enter sooner or later: widespread deployment, high stakes, unclear harms, private incentives moving faster than public oversight, and a growing dependence on the very companies selling the systems to explain why those systems are safe enough to trust.
That arrangement is already visible in the knowledge environment around frontier AI. As I pointed out in my last piece, the same companies that build the models increasingly fund, shape, and disseminate much of the research that explains the models, evaluates the models, and reassures the public about the models. This is not necessarily fraud, and it is not even especially exotic; it is the ordinary problem of incentive, familiar from every domain in which those with the most to gain from a conclusion also possess unusual influence over how the evidence is produced, framed, and circulated.
One of the more useful euphemisms in contemporary AI discourse is "access." Independent scrutiny, we are told, is difficult because frontier systems require care. But "care" often means control. Researchers who want to study model internals, interpretability, cybersecurity risk, biological capability, worst-case misuse, or failure modes that do not show up in a polished demo often cannot do so meaningfully without access far beyond an ordinary chat interface. The result is that the research agenda is shaped before it begins by what companies choose to expose and under what conditions. You cannot seriously audit what you cannot see, and you cannot meaningfully stress-test what you are not allowed to touch.
At the same time, the most authoritative-sounding claims about model behavior increasingly arrive from the institutions with the strongest incentive to present that behavior as understandable, governable, and manageable. There is some truth in the argument that the people closest to the system are often best equipped to study it. There is also the older truth, learned repeatedly in medicine, tobacco, chemicals, finance, and elsewhere, that sponsorship does not merely influence answers. It influences questions, framing, method, publication, emphasis, and silence. The issue is not whether any one lab is lying. The issue is whether the public should be asked to treat a seller's knowledge environment as a substitute for independent oversight.
Medicine eventually learned that "trust us" is not a regulatory framework. What changed in pharmaceuticals was not that companies stopped doing research but that private claims became subject to public standards. Firms could still innovate, test, petition, and profit. But they could no longer reserve to themselves the exclusive right to decide what counted as enough evidence, testing, or caution.
AI has no equivalent authority with real gatekeeping power. There are prototypes, and they matter, but prototypes are not regulators. The U.S. AI Safety Institute exists, its British counterpart survives as the rebranded AI Security Institute, and both are serious efforts staffed by serious people. They develop evaluations, cultivate relationships, and sometimes secure access that would have seemed improbable only a short time ago. But voluntary access is not subpoena power, and a memorandum is not a mandate. An institute that can advise is not the same thing as an institution that can delay a release, impose conditions, require disclosure, or say no in a form that cannot be shrugged off in the next product post.
The case for an AI analogue to the FDA begins there, in the gap between being allowed to observe and being authorized to judge. If these systems are becoming infrastructure, as they plainly are, then their safety cannot depend entirely on the goodwill, patience, or rhetorical discipline of the companies distributing them. There has to be some public body whose task is not to ship the model, narrate its significance, or reassure the market about its promise, but to judge it against standards that do not fluctuate with quarterly incentives.
An AI regulator of this kind would not need to supervise every tool with "AI" somewhere in the pitch deck. The point would be risk, not branding. It would identify classes of systems whose capabilities, scale of deployment, or plausible harms justify mandatory review before widespread release. It would require standardized testing rather than the current arrangement, in which safety claims drift across blog posts, benchmark tables, red-team summaries, and promotional prose dressed in the language of sobriety. It would mandate structured access for qualified independent auditors, because many of the most important questions cannot be answered through casual use. It would require incident reporting, ongoing monitoring after deployment, and records that make failures reconstructable rather than deniable. Most important, it would need enforcement power, because a regulator without the authority to delay deployment, impose conditions, require mitigations, or restrict access is not a regulator in the meaningful sense at all.
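To make the shape of such a regime concrete, here is a minimal sketch, in Python, of what a standardized, machine-readable pre-deployment submission and review rule might look like. Every name in it (Submission, EvaluationResult, the thresholds, the decision rule) is a hypothetical illustration of the paragraph above, not a description of any existing framework.

```python
# A minimal sketch of a machine-readable pre-deployment submission, as one
# might standardize under a review regime like the one described above.
# All names here are hypothetical illustrations, not an existing standard.
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    CONDITIONAL = "conditional"   # release permitted with mandated mitigations
    NOT_YET = "not yet"           # the regulator's core power: delay


@dataclass
class EvaluationResult:
    name: str          # e.g. a standardized misuse or cybersecurity eval
    version: str       # evals must be versioned so failures are reconstructable
    risk_score: float  # lower is safer in this sketch
    threshold: float   # pass/fail line set by the regulator, not the vendor
    independent: bool  # run by a qualified outside auditor, not the developer

    @property
    def passed(self) -> bool:
        return self.risk_score <= self.threshold


@dataclass
class Submission:
    system_id: str
    developer: str
    open_weights: bool             # irreversibility flag: no recall is possible
    evaluations: list[EvaluationResult] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)


def review(sub: Submission) -> Decision:
    """Toy decision rule: independent evidence before mass deployment."""
    if not any(e.independent for e in sub.evaluations):
        return Decision.NOT_YET    # vendor self-reports are not enough
    if any(not e.passed for e in sub.evaluations):
        # An irreversible release gets no conditional path in this sketch.
        return Decision.NOT_YET if sub.open_weights else Decision.CONDITIONAL
    return Decision.APPROVED
```

The point of the sketch is the last branch: where open-weight diffusion makes recall impossible, "not yet" becomes the default rather than the exception, which is precisely the authority the current institutes lack.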
This matters all the more because AI is, in some respects, harder to govern than medicine. Models update, fork, reappear as services and as weights, move through APIs and downloads, and in the case of sufficiently capable open-weight systems, escape the ordinary logic of recall altogether. A drug can be pulled from pharmacy shelves. Model weights, once copied into the world, do not return merely because a wiser judgment is reached later. The difficulty of retrieval makes caution before release more necessary, not less.
This is one reason the fixation on distant AGI scenarios can be misleading. The most immediate threat is not a machine spontaneously developing sovereign intent. It is a set of institutions behaving exactly as institutions tend to behave under competitive pressure, which is to say releasing faster, integrating more deeply, lowering the friction of adoption, saturating the market, and then asking the public to mistake diffusion for inevitability. In technology markets, ubiquity often functions as a substitute for superiority. A model does not need to be the best model to become the ambient one. It merely needs to become difficult to avoid.
Competition sharpens this tendency. When firms believe themselves to be in a race over a strategic technology, cooperation deteriorates and safety margins shrink. Corners are cut. Information is withheld. Precautions begin to look like unilateral disarmament. Even the perception of a race can produce the behavior of one. This is how the rhetoric of innovation becomes the practice of hurry.
Then there is the question of open weights, which is where the argument becomes most uncomfortable and most concrete. Open-weight releases are often presented as a simple expression of openness, decentralization, or scientific freedom. At sufficient capability, they are also a security decision. They allow research and reduce dependence on a handful of firms, but they also remove many of the safeguards a centralized provider can impose, including monitoring, rate limits, patching, and access controls, while making systems easier to modify for harmful purposes and impossible to retrieve once widely diffused. The issue is not whether openness is an attractive ideal. The issue is what risks are created when a powerful dual-use system becomes irreversibly downloadable, and who is expected to bear those risks when the answer arrives too late.
That is the question a public regulator would force into the open at the moment companies now prefer to call a launch. Not whether the release is exciting, or useful, or symbolically aligned with the future, but what harms are plausible, what evidence has been produced, what mitigations exist, and what happens when the downside proves irreversible.
The FDA did not emerge because industry invited supervision, and it did not endure because the country lost interest in innovation. It emerged because the public tired of serving as the proof of concept. That was the point then, and it is the point now.
AI is nearing that threshold. Some of the institutional scenery is already in place: institutes, frameworks, evaluations, voluntary agreements, access partnerships, and advisory reports. There is, in other words, no shortage of theater. What remains uncertain is whether the state intends to build something harder, duller, and more necessary than theater: a public authority capable of requiring evidence before mass deployment, evaluating systems independently of the people selling them, and treating irreversibility not as a philosophical flourish but as a governance fact.
The question is whether the country intends to recognize that threshold before the damage becomes the only proof left.