How do AI 'judges' compare to human ones? It's complicated, says a UChicago scholar

In annual Ryerson Lecture, legal scholar Eric Posner examines AI's growing role in legal decision-making—and why human judgment still matters

As AI rapidly works its way into the legal system, Prof. Eric Posner is asking a pointed question: What happens when machines fill the role of a judge?

That question drives much of Posner’s research as the Kirkland & Ellis Distinguished Service Professor of Law and Arthur and Esther Kane Research Chair at the University of Chicago Law School. He was elected by his faculty peers to share his findings in the Nora and Edward Ryerson Lecture, a prestigious annual address at the University. 

Speaking on April 16 in a packed Friedman Hall at the Rubenstein Forum, Posner offered a probing examination of how large language models are already influencing legal decision-making—and why, despite their growing sophistication, they are unlikely to replace human judges anytime soon.

Introduced by University President Paul Alivisatos—who described the Ryerson Lecture as “the ultimate celebration” of UChicago scholarship—Posner began with a note of cautious realism. 

While U.S. Supreme Court Chief Justice John Roberts recently predicted that “human judges will be around for a while,” Posner pointed to growing evidence that AI is gaining a foothold in judicial workflows. Surveys suggest a majority of federal judges report using AI tools in some capacity, and some have openly acknowledged experimenting with them.

In one example, a federal appellate judge consulted an AI model to help interpret whether installing an in-ground trampoline qualified as “landscaping” under an insurance policy. 

“I just want people to know that I did this,” the judge wrote, describing the AI’s answer as helpful, even though the issue ultimately did not determine the outcome of the case. 

Beyond isolated uses, Posner highlighted the emergence of AI-driven arbitration platforms, including one developed by the American Arbitration Association. These systems promise faster and dramatically cheaper dispute resolution, raising the prospect that AI could first gain traction not in courts but in private adjudication.

At the same time, Posner emphasized that AI is already influencing litigation in less welcome ways. Courts are increasingly receiving filings with AI-generated text containing fabricated legal citations, often referred to as “hallucinations,” prompting sanctions and ethical concerns. 

Against this backdrop, Posner turned to the core of his lecture: a series of experiments testing how AI “judges” compare to human ones.

How AI handles legal questions

Drawing on prior scholarship in legal realism, Posner and his collaborators examined whether decision-makers follow legal rules strictly—a “formalist” approach—or are influenced by broader considerations such as fairness or sympathy in a “realist” approach. 

In one study, human judges evaluated a war crimes case involving sympathetic and unsympathetic defendants. The judges were influenced, at the margins, by the defendant’s attributes.

But AI models behaved differently. 

“The AI was a formalist,” Posner explained. “It simply followed the law. It disregarded the degree of sympathy that one might have felt.” 

The pattern held across additional experiments. In a complex “choice of law” scenario, where courts must determine which jurisdiction’s law applies, human judges produced inconsistent outcomes and occasionally made factual or legal errors. By contrast, AI models applied the governing rules with complete consistency and without error.

Yet that apparent strength, Posner suggested, may also be a weakness. Studies have shown that law students, like AI, tend to apply the law in a rigid, formalistic fashion.

“Would you want law students to be judges?” he asked. 

For Posner, the comparison underscores a deeper truth about the legal system—human judging is not, and has never been, purely mechanical. From the legal realist critique of the early 20th century to contemporary debates over originalism, scholars have long recognized that judicial decisions are shaped not only by rules, but by judgment, experience and social context.

AI, by contrast, is trained on the “official story” of law—the formal reasoning found in judicial opinions—without access to the underlying motivations or institutional dynamics that shape real-world decisions.

Limits of AI in judging

Posner identified three main obstacles to replacing human judges with AI.

First, AI systems cannot reliably explain their own reasoning. While they produce plausible legal arguments, “it’s not clear that the reasons are the motivations for their actual decisions,” he said. 

Second, judging is embedded in a complex institutional structure. Courts operate within hierarchies, interact across jurisdictions and respond directly or indirectly to political and social pressures. 

“Human judges are part of this enormously complex institutional structure,” Posner said, one that would be difficult to replicate artificially. 

Finally, he pointed to what he called the “paradox of the official story,” the gap between how judicial decisions are publicly justified and how they are actually made. 

AI systems, trained on formal legal texts, may faithfully reproduce the rhetoric of judging without capturing its reality.

Even so, Posner acknowledged reasons for optimism about AI’s role in the legal system.

AI platforms are remarkably effective at identifying patterns across recurring fact scenarios, a core feature of legal reasoning. They also produce polished, coherent opinions that can be difficult to distinguish from those written by humans.

Despite these advantages, Posner remains skeptical that AI will displace judges. More likely is a quieter transformation—judges will increasingly rely on AI tools behind the scenes, even if they do not always publicly acknowledge it.

Posner finds the prospect of continued human involvement in judicial decision-making both essential and reassuring, especially in the event of a close call that could go either way. 

“I want a human to flip the coin so we can argue about it,” he said. “I don’t want an LLM to do that.”

—This article was originally published on the Law School website.