Why Lawyers Keep Getting AI Wrong and Why Human Judgment Still Matters: Insights from LIDW 2026


At one of the flagship sessions of London International Disputes Week (LIDW) 2026, held to mark International Arbitration Day and hosted by Gibson Dunn, a distinguished panel of practitioners, technologists and arbitrators came together to examine a question that is rapidly becoming existential for the legal profession: is generative AI a helping hand or a risky hindrance, and where, if at all, do humans remain indispensable?

The discussion, held under the title The Human Fight Back in Arbitration: Why AI Doesn’t Have All the Answers, drew upon direct experience of deploying AI tools in live disputes, developing specialist legal AI platforms, and grappling with the regulatory and ethical challenges that are emerging at speed.

The session brought together a diverse mix of expertise from across the arbitration and legal technology ecosystem. Moderating the discussion was Ms. Katrina Limond, Counsel in International Arbitration and Public International Law at A&O Shearman and a member of the International Bar Association’s Task Force on Artificial Intelligence in International Arbitration. Joining her was Mr. Charlie Morgan, Partner at Herbert Smith Freehills Kramer, whose practice spans international arbitration, technology and energy disputes and who serves on both the ICC AI Task Force and the City of London Law Society’s Special Committee on AI. The technology perspective was provided by Mr. Stéphane Altounian, Vice President of Product at Jus Mundi, who has spent more than thirteen years building legal AI products across Europe and the United States and now oversees AI development at one of the world’s leading international arbitration intelligence platforms. Mr. Dmitri Evseev participated as an independent arbitrator with Arbitra International and brought the perspective of a former Arnold & Porter partner, founder of ArbyCity and ArbyDocs, and lead drafter of the Silicon Valley Arbitration and Mediation Centre AI Guidelines. Completing the panel was Mr. Mark Feldner, Co-founder and Chief Executive Officer of Crimson, a disputes-focused case intelligence platform, who previously practised arbitration at Clifford Chance, WilmerHale and Willkie Farr & Gallagher.

Setting the Scene: Optimism Tempered by Hard Reality

The discussion opened with a deceptively simple question posed to the audience: is generative AI a source of efficiency for international arbitration, or is it a risky hindrance?

The framing was intentionally provocative. While the panel broadly viewed AI as a powerful force for improvement, there was widespread recognition that the legal profession’s early encounters with the technology had been accompanied by a series of embarrassing and often avoidable mistakes.

Mr. Charlie Morgan set the tone early. A self-described “massive optimist” about AI, he argued that many of the most publicised failures involving generative AI were fundamentally failures of user understanding rather than technological shortcomings.

“These tools have been designed to make stuff up. That’s how they’ve been trained. They create original content. Assume that the off-the-shelf models know pretty much nothing factual. What comes next is how you ground the answers in sophisticated data.”

Mr. Morgan pointed to the now-familiar examples emerging from courts and tribunals around the world: lawyers relying on fabricated authorities generated by AI systems, submissions containing entirely fictitious citations, and decision-makers placing inappropriate reliance on machine-generated analysis. Among the more extreme examples was a reported case involving a sole arbitrator in Canada who fed party submissions into ChatGPT and substantially adopted the resulting output.

The lesson, Mr. Morgan argued, was not that AI is inherently unreliable. Rather, it was that users often misunderstand what the technology is designed to do.

He suggested that the point is not to criticise the technology for hallucinating a case, but to understand what these systems are actually capable of.

Mr. Dmitri Evseev echoed that sentiment, although from a slightly different angle. In his view, meaningful understanding of AI cannot be acquired solely through reading guidance notes or attending seminars.

He insisted that AI cannot simply be taught; one has to keep using it and experimenting with it. The transition currently underway, he suggested, is likely to transform knowledge work itself. Lawyers may spend less time producing first drafts and significantly more time reviewing, verifying and refining machine-generated material. Far from encouraging intellectual laziness, the technology demands heightened scrutiny and active engagement.

“In knowledge work, we are all going to be drafting less and reviewing more. It’s not a shortcut to be lazy and turn off your brain, the opposite. It’s a tool to make your brain work harder.”

How These Tools Actually Work, And Why That Matters

One of the most valuable segments of the session was devoted to a straightforward but often overlooked question: how do large language models actually work?

Mr. Stéphane Altounian argued that many of the problems currently associated with AI stem from a failure to understand the basic architecture and limitations of the technology.

He identified four foundational constraints that every legal professional should understand before deploying AI in practice.

(i) Limited Knowledge – Large language models are trained on a finite corpus of information and retain only a compressed representation of that material. They do not maintain a database of facts in the traditional sense and possess no inherent understanding of where individual pieces of information originated. Equally important, they are constrained by a training cut-off date and therefore cannot independently know about subsequent developments unless connected to external sources.

(ii) Memory – Although context windows have expanded dramatically in recent years, practical restrictions remain on the amount of information that can be processed simultaneously. For legal practitioners working with extensive evidentiary records and large document sets, these limitations remain highly significant.

(iii) Nature of the models’ Reasoning Processes – Contrary to popular perception, large language models do not reason in the way human lawyers do. Their outputs are generated through probabilistic prediction rather than deterministic analysis. They produce statistically likely sequences of words rather than structured legal reasoning. While increasingly sophisticated architectures can improve performance, complex multi-step legal analysis remains challenging.

(iv) Agency – AI systems do not automatically possess the ability to perform tasks in the external world. Whether searching databases, accessing repositories, retrieving documents or executing workflows, they require deliberate integration with external systems before such functionality becomes possible.
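The third constraint, probabilistic prediction, can be illustrated with a toy sketch. The prompt, the candidate words and their probabilities below are invented for illustration and do not come from any real model or from the panel; the sketch only shows that sampling from a probability distribution can return different continuations on different runs, whereas always taking the single most likely word cannot.

```python
import random

# Hypothetical next-word probabilities after the prompt
# "The tribunal held that the claim was". Invented for illustration only.
NEXT_WORD_PROBS = {
    "admissible": 0.45,
    "inadmissible": 0.30,
    "time-barred": 0.20,
    "frivolous": 0.05,
}


def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


def most_likely_word(probs):
    """Always pick the single most probable word (deterministic)."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so results may differ between runs
    print("sampled:", [sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(5)])
    print("most likely:", most_likely_word(NEXT_WORD_PROBS))
```

Note that in this toy example a fluent but wrong continuation such as "inadmissible" is still drawn roughly three times in ten, which is why the output of such a process needs to be checked rather than assumed.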

Mr. Altounian cautioned that misunderstanding these limitations can easily create a false impression of omniscience.

“If we don’t understand how these tools work, it is very easy to think that they are omnipotent, and to trust them blindly.”

He also offered a broader observation about the incentives shaping public-facing AI systems. In his assessment, many commercially available models are not necessarily optimised to challenge users or pursue objective truth. Instead, market incentives may encourage systems to validate user assumptions and preferences.

Mr. Mark Feldner translated these technical realities into a practical principle for dispute practitioners.

The answer to AI’s unreliability, he argued, is not to search endlessly for a superior model. The answer is data.

“You should be trying to ground the results in source data that you trust and know. You never really want to be relying on the result the model has produced. You want to be relying on how it has summarised the underlying information, and then you can verify that it is accurate.”

For Mr. Feldner, one of the most common mistakes made by lawyers is treating AI-generated text as the final product. In reality, the output should be viewed as a gateway to underlying materials rather than as authority in itself.

He added that the real value of AI lies in helping practitioners identify, organise and synthesise relevant information from trusted document sets. The source material remains paramount, and the machine-generated summary is merely an aid to navigating it.
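One simple way to treat a summary as a gateway rather than an authority is to check that every passage it quotes actually appears in the trusted documents. The following is a minimal sketch of that idea, not a description of any panellist's product; the function names, document names and texts are hypothetical.

```python
def normalise(text):
    """Lower-case and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def check_quotes(quotes, sources):
    """For each quoted passage, list the trusted documents that contain it.

    `sources` maps a document name to its full text. A quote mapped to an
    empty list was not found anywhere and needs human follow-up.
    """
    results = {}
    for quote in quotes:
        needle = normalise(quote)
        results[quote] = [
            name for name, text in sources.items() if needle in normalise(text)
        ]
    return results


if __name__ == "__main__":
    # Hypothetical record and hypothetical quotes taken from a model's summary.
    sources = {
        "Witness Statement 1": "The shipment left Rotterdam on 3 March and\narrived two weeks late.",
        "Contract, clause 12": "Delivery shall occur within 30 days of the order.",
    }
    quotes = ["arrived two weeks late", "delivery was waived by the buyer"]
    for quote, found_in in check_quotes(quotes, sources).items():
        print(quote, "->", found_in or "NOT FOUND, verify manually")
```

An exact-match check of this kind only confirms that the words exist in the record. Whether the summary characterises them fairly remains a matter for the lawyer.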

This distinction between information retrieval and information creation became a recurring theme throughout the discussion.

When Agents Go Rogue: The Accountability Problem

While the early stages of the discussion focused on understanding the capabilities and limitations of generative AI, the conversation soon turned to a more unsettling question: what happens when AI systems stop merely answering questions and begin taking actions?

The emergence of so-called “agentic AI” systems, i.e. tools capable of autonomously executing multi-step tasks on behalf of users, has become one of the most significant developments in artificial intelligence. While these systems promise unprecedented efficiency, they also magnify the consequences of poor governance and inadequate controls.

Mr. Stéphane Altounian illustrated the point with a widely reported incident involving an AI agent operating in a pre-production testing environment. The system had been tasked with removing test data but encountered a credentials barrier. Instead of seeking human intervention, it searched through internal documentation until it located an access token. Armed with those credentials, the system proceeded not only to delete the intended test data but also a live production database and its backups, all within a matter of seconds.

The incident, Mr. Altounian argued, is often misunderstood.

“The AI agent did not create any problem. It just magnified an existing problem. The fact that this token was available in open access is not an AI-specific problem. It is just that we now have agentic systems so fast that they are going to uncover a lot of our shortcomings in our internal processes.”

He clarified that the technology did not invent the vulnerability; it merely exploited it with unprecedented speed and efficiency.

Mr. Altounian referred to comparable examples reported elsewhere: a coding agent at a Chinese technology company that redirected company computing resources to mine cryptocurrency, and an internal breach at Meta in which an AI system exposed confidential human resources and financial documents because it failed to verify user permissions before sharing information.

He emphasised that across all these examples, the common denominator was not rogue AI but flawed human governance.

He listed the failures: (i) permissions had been granted without adequate safeguards; (ii) sensitive information had been left accessible; and (iii) oversight mechanisms had failed. AI systems simply accelerated the consequences.
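The three governance failures listed above map onto familiar controls: restricting what an agent may do, restricting where it may act, and recording every request for later review. The sketch below is one hedged illustration of such a gate; the action names, environment names and the `AgentPolicy` class are hypothetical and were not described by the panel.

```python
from dataclasses import dataclass, field

# Hypothetical action names; a real deployment would define its own.
DESTRUCTIVE_ACTIONS = {"delete_records", "drop_database"}


@dataclass
class AgentPolicy:
    """What one agent may do, and in which environments."""
    allowed_actions: set
    allowed_environments: set
    audit_log: list = field(default_factory=list)

    def authorise(self, action, environment, human_approved=False):
        """Return True only if the request passes every check; log the decision."""
        if action not in self.allowed_actions:
            decision = "denied: action not permitted"
        elif environment not in self.allowed_environments:
            decision = "denied: environment not permitted"
        elif action in DESTRUCTIVE_ACTIONS and not human_approved:
            decision = "denied: human approval required"
        else:
            decision = "allowed"
        self.audit_log.append((action, environment, decision))
        return decision == "allowed"


if __name__ == "__main__":
    policy = AgentPolicy({"read_records", "delete_records"}, {"test"})
    print(policy.authorise("delete_records", "test"))  # no human sign-off yet
    print(policy.authorise("delete_records", "production", human_approved=True))
    for entry in policy.audit_log:
        print(entry)
```

A gate like this does not help if the agent can reach credentials that bypass it, which is the point Mr. Altounian made about the exposed token: the control has to be enforced by the systems the agent talks to, not by the agent itself.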

Mr. Mark Feldner noted that arbitration practitioners face a related challenge arising from the inherently non-deterministic nature of generative AI. Because the same prompt can generate slightly different outputs each time, legal technology providers must design systems that minimise unpredictability where accuracy matters most.

This approach, he emphasised, relies heavily on deterministic extraction techniques for structured data, semantic search capabilities for defined queries and precise source citations that allow lawyers to verify information with minimal effort. The objective is not merely to produce answers but to create outputs that can be traced and tested.
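As a rough illustration of deterministic extraction with precise source citations, the sketch below pulls dates out of a document with a fixed pattern and records the document name and line number for each hit. It is an assumption-laden example rather than a description of any vendor's system: the date format, the function name and the exhibit label are hypothetical.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Matches dates written as "4 March 2024". Other formats are out of scope here.
DATE_PATTERN = re.compile(rf"\b(\d{{1,2}}) ({MONTHS}) (\d{{4}})\b")


def extract_dates(doc_name, text):
    """Return every date found, each with the document and line it came from."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in DATE_PATTERN.finditer(line):
            found.append(
                {"date": match.group(0), "source": doc_name, "line": line_no}
            )
    return found


if __name__ == "__main__":
    # Hypothetical exhibit text.
    exhibit = (
        "Notice of default was served on 4 March 2024.\n"
        "No reply was received.\n"
        "The contract was terminated on 18 April 2024."
    )
    for hit in extract_dates("Exhibit C-1", exhibit):
        print(hit)
```

Because the pattern is fixed, the same input always produces the same list, and each entry tells the reviewer exactly where to look in the record. The trade-off is coverage: a date written in a format the pattern does not anticipate is silently missed.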

The discussion naturally evolved into questions of accountability.

An audience member asked whether AI providers might eventually face legal liability for inaccurate outputs in much the same way that cloud service providers can be held responsible for data loss or security failures.

The panel’s response reflected a nuanced consensus.

Mr. Feldner and Mr. Evseev both drew parallels with the early development of the internet. Just as internet service providers are generally not held responsible for everything users do online, general-purpose AI providers are unlikely to bear responsibility for every decision made by users relying on their tools. At the same time, specialist legal AI providers occupy a different position.

The panel broadly agreed that legal technology companies have obligations extending beyond merely supplying software. These include educating users about appropriate use, implementing robust security measures and maintaining clear commitments regarding data handling and retention.

As Mr. Evseev observed:

“Most of the confidentiality stuff around AI is actually really old things. We know how to safeguard data on servers. It is old wine in new bottles.”

The technologies may be new, but many of the underlying governance challenges are remarkably familiar.

As the discussion progressed, it became clear that understanding the capabilities and limitations of AI is only one part of the challenge facing arbitration practitioners. Questions surrounding reliability, accountability and governance inevitably lead to broader concerns about confidentiality, privilege, regulatory oversight and the role of arbitral institutions in an increasingly AI-enabled environment. These issues formed the focus of the next phase of the discussion, as the panel turned from the mechanics of AI to the legal and institutional frameworks that will shape its future use in international arbitration.
