Introduction
In recent months, the rise of advanced artificial intelligence and natural language processing technologies, such as Large Language Models (LLMs) like ChatGPT, has sparked a debate about their potential impact on various industries, including the legal profession. The million-dollar question inevitably arises: Will LLMs replace lawyers (and perhaps judges also), or at the very least, lead to a massive paradigm shift in law practice? In my personal opinion, the answer is yes. I’ve noticed a lot of criticism, including from lawyers, pointing out that LLMs like ChatGPT cannot think or analyze or formulate specialized arguments like lawyers can. I’ve noticed misguided examples of the wholly incorrect or bogus “answers” that ChatGPT provides when asked legal questions. Maybe our resistance is rooted in job security. Or maybe it’s a hard-wired resistance to embracing technology which comes with the territory in this profession; after all, there is still an incredible and mind-boggling push-back advocating for in-person hearings as opposed to using Zoom here in Ontario (or BlueJeans in Nevada). Many of us still have fax numbers, which, until recently in Ontario (i.e., late 2020), were used for serving legal documents.
Anyway, it doesn’t matter what soon-to-be-extinct dinosaurs think; the legal industry is poised for a transformation on account of these powerful AI-driven tools, which have already demonstrated their ability to streamline legal research, automate document generation, provide advanced legal analysis, and pass law exams. While the advent of LLMs may not render legal professionals entirely obsolete (at least not yet), it will undoubtedly change how they practice law, necessitating new skills and specializations. Furthermore, LLMs hold the potential to significantly improve access to justice by reducing costs and expediting the resolution of disputes, which has long been a pressing concern in my line of work. In Toronto, for example, civil courts are so clogged that there’s currently a year-long wait to schedule a short motion. A failure to properly integrate, use, and develop these LLMs in our industry is akin to continuing to serve legal documents by fax. Go ahead, draft your fax cover sheet and cover letter, and attach your 30-page document. Punch in the recipient’s fax number, wait for your fax machine to scan and encode the data and send it to the recipient’s machine, and wait for the fax transmission confirmation page to confirm the transmission (in case the recipient’s telecopier is out of paper or busy receiving other faxes). That is a 15-minute process that could have taken 15 seconds had you simply emailed your 30-page document to the addressee. Of course, these LLMs could completely replace us, but not right away.
In the following sections, we will delve deeper into the capabilities of LLMs, explore the implications of their widespread adoption for the legal profession, and consider how this technology could reshape the legal system.
How Large Language Models Work – The Technical Workings
Large Language Models like GPT-4 are built on deep learning techniques and natural language processing (NLP) algorithms. They are trained on massive datasets containing vast amounts of human-generated text, allowing them to generate coherent, contextually relevant, and human-like responses to various prompts. The core components of these models are:
- Transformer architecture: The foundation of modern LLMs, like GPT-4, is the Transformer architecture. Introduced by Vaswani et al. in 2017, the Transformer is a type of deep-learning neural network specifically designed to handle sequential data, like text. It has since become the go-to architecture for many NLP tasks.
- Self-attention mechanism: A key feature of the Transformer is the self-attention mechanism, which allows the model to weigh the importance of different words (tokens) in a sequence relative to one another. This process helps the model understand the context and relationships between words. Self-attention is applied at multiple layers within the Transformer, enabling it to capture complex dependencies in text.
- Multi-head attention: Multi-head attention is an extension of the self-attention mechanism, where the input sequence is processed several times in parallel, each time with a different set of learned weight matrices (a "head"). This allows the model to capture different aspects of the relationships between words and generate a richer representation of the input text.
- Positional encoding: Since the Transformer processes tokens in parallel (unlike recurrent neural networks), it requires an explicit representation of the position of each token in the sequence. Positional encoding is added to the input embeddings to provide this information, enabling the model to understand the order of words in a sentence.
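To make the attention and positional-encoding ideas above a little more concrete, here is a minimal sketch in plain Python. It is a simplification, not how GPT-4 is actually implemented: a real Transformer first multiplies each token vector by learned query, key, and value matrices, and runs several heads side by side. Here the same vectors stand in for all three, the function names are my own, and the token numbers are made up.

```python
import math

def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

def positional_encoding(position, d_model):
    """Sinusoidal position vector in the style of Vaswani et al. (2017)."""
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

# Toy 3-token sequence with 4-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
# Add position information, then reuse the same vectors as queries, keys, and values.
x = [[t + p for t, p in zip(tok, positional_encoding(pos, 4))]
     for pos, tok in enumerate(tokens)]
outputs, weights = self_attention(x, x, x)
```

Each row of `weights` says how much one token "looks at" every token in the sequence, itself included, and each row sums to 1. That is all "weighing the importance of different words" means mechanically.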
Pre-training and fine-tuning: LLMs are trained using a two-step process: pre-training and fine-tuning.
- Pre-training: In this phase, the model is trained on a massive dataset containing vast amounts of human-generated text, such as websites, books, and articles. For GPT-style models, the objective during pre-training is to predict the next token in a sequence, given the tokens that precede it (causal, or autoregressive, language modeling). Other models, such as BERT, instead use a masked language modeling (MLM) task, where a portion of the input tokens is randomly masked and the model learns to predict them. Because the training signal comes from the text itself rather than from human-written labels, this self-supervised learning helps the model pick up grammar, syntax, semantics, and even some factual knowledge.
- Fine-tuning: After pre-training, the model is fine-tuned on specific tasks or domains using smaller, labeled datasets. This supervised learning refines the model’s understanding, improves its performance, and adapts it to generate relevant responses for specific applications.
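As a rough illustration of the next-word objective described above, the sketch below shows how a single sentence becomes several training examples, and how a prediction is scored. The helper names and the probabilities are hypothetical; a real training run does this over billions of tokens and adjusts the model's weights to reduce the loss.

```python
import math

def next_token_examples(tokens):
    """Split a token sequence into (context, next token) training pairs."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def cross_entropy(predicted_probs, target):
    """Loss for one prediction: small when the true next token got high probability."""
    return -math.log(predicted_probs[target])

sentence = ["the", "court", "dismissed", "the", "appeal"]
pairs = next_token_examples(sentence)

# A made-up model output for the context ["the", "court"].
probs = {"dismissed": 0.6, "allowed": 0.3, "the": 0.1}
loss = cross_entropy(probs, "dismissed")
```

The more probability the model puts on the word that actually came next, the lower the loss. Pre-training is, in essence, this scoring repeated at enormous scale.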
Tokenization: Tokenization is the process of breaking down text into smaller units called tokens. These tokens can represent words, subwords, or even individual characters, depending on the tokenization approach used. LLMs tokenize input text before processing it within the Transformer architecture.
- Subword tokenization: Techniques like Byte Pair Encoding (BPE) or WordPiece are commonly used in LLMs to tokenize text at the subword level. This approach helps the model handle out-of-vocabulary words, reduces the size of the vocabulary, and improves computational efficiency.
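For the curious, here is a bare-bones sketch of how Byte Pair Encoding learns its subword vocabulary: it repeatedly finds the most frequent pair of adjacent symbols and merges it into one. The three-word corpus and its frequencies are invented, and production tokenizers add many refinements this toy version leaves out.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: each word split into characters, with a made-up frequency.
corpus = {tuple("legal"): 5, tuple("legally"): 2, tuple("illegal"): 3}
for _ in range(3):
    best = most_frequent_pair(corpus)
    corpus = merge_pair(corpus, best)
```

After a few merges, the shared stem of "legal", "legally", and "illegal" has become a single symbol. This is how a model can cope with a word it has never seen: it breaks it into familiar pieces.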
Embeddings: To process text within the Transformer, LLMs convert tokens into continuous numerical representations called embeddings. These embeddings capture semantic and syntactic information about the tokens and serve as the input for the Transformer layers.
Decoding and text generation: Once the input text has been processed by the Transformer, the model outputs a probability distribution over its vocabulary for the next token. A response is generated one token at a time, with each chosen token appended to the input before the next prediction. There are several strategies for choosing each token, such as greedy search, beam search, or sampling. These methods balance the trade-off between generating diverse responses and maintaining coherence.
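Two of the decoding strategies mentioned above, greedy search and sampling, can be sketched in a few lines. The three-word vocabulary and the scores (called logits) are made up, and `temperature` is the usual knob for trading coherence against variety; beam search is left out to keep the example short.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(vocab, logits):
    """Always choose the single highest-scoring token."""
    return vocab[max(range(len(logits)), key=lambda i: logits[i])]

def sample_pick(vocab, logits, temperature=1.0, rng=random):
    """Draw a token at random, in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

vocab = ["dismissed", "allowed", "adjourned"]
logits = [2.0, 1.5, 0.1]  # made-up scores for the next word

greedy = greedy_pick(vocab, logits)
sampled = sample_pick(vocab, logits, temperature=0.8, rng=random.Random(0))
```

Greedy search gives the same answer every time. Sampling is why asking ChatGPT the same question twice can produce two different answers.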
Large Language Models like GPT-4 leverage the Transformer architecture, pre-training and fine-tuning processes, tokenization, embeddings, and decoding strategies to generate coherent, contextually relevant, and human-like text. These models have significantly advanced the field of natural language processing and have a wide range of applications in various industries, including the legal profession, as discussed earlier.
Model size and computational requirements: The performance of LLMs is strongly correlated with their size, measured in the number of parameters or weights in the model. Larger models, with billions of parameters, have been shown to exhibit better generalization capabilities and generate more coherent and contextually appropriate responses. However, larger models also require more computational resources for training and inference, posing challenges in terms of cost, energy consumption, and deployment.
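A quick back-of-the-envelope calculation shows why size matters for cost. The sketch below estimates only the memory needed to hold a model's weights, assuming 2 bytes per parameter (16-bit precision); the model sizes are illustrative, and training needs considerably more memory than this.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory needed just to store the weights, in gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9

small = weight_memory_gb(7_000_000_000)    # a hypothetical 7-billion-parameter model
large = weight_memory_gb(70_000_000_000)   # a hypothetical 70-billion-parameter model
```

Under these assumptions, the smaller model needs about 14 GB for its weights alone and the larger about 140 GB, which is more than a single typical consumer graphics card can hold.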
Transfer learning: One of the strengths of LLMs is their ability to leverage transfer learning, which refers to the process of training a model on one task and then fine-tuning or adapting it to another related task. Pre-training on massive text corpora allows LLMs to learn general language understanding, which can then be fine-tuned for specific tasks or domains. This transfer learning capability enables LLMs to be highly adaptable and efficient in tackling a wide variety of natural language processing tasks.
Handling long-range dependencies: Traditional recurrent neural networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks, have limitations in handling long-range dependencies in text. The Transformer architecture, with its self-attention mechanisms, overcomes these limitations by allowing for more efficient parallel processing of sequences and direct access to distant tokens in the input text. This capability enables LLMs to capture complex relationships and dependencies in text, which is crucial for generating coherent and contextually relevant responses.
Limitations and challenges: Despite their remarkable capabilities, LLMs also have limitations and challenges that need to be addressed:
- Bias and fairness: LLMs can inadvertently learn and perpetuate biases present in the training data, which can lead to biased outputs and raise ethical concerns. Addressing these biases requires careful dataset curation, model design, and evaluation methods.
- Explainability and interpretability: The inner workings of LLMs, especially those with billions of parameters, are difficult to interpret, making it challenging to understand and explain the basis of their generated responses. However, research is ongoing to develop techniques to improve the explainability and interpretability of these models. I expect that research to mature much as the Internet has since its early days in the mid-’80s. Look where the Internet is now.
- Model robustness and adversarial attacks: LLMs can be sensitive to small perturbations in the input text, which can lead to significant changes in their outputs. Ensuring model robustness and resilience to adversarial attacks is an area of active research.
By understanding the workings of Large Language Models, their underlying architecture, training methods, and the challenges they face, we can better appreciate their potential applications and implications for various industries, including the legal sector. As these models continue to evolve, they will play an increasingly important role in shaping the future of natural language processing and AI-driven solutions across domains.
How LLMs Work for Non-Technical Readers
LLMs are designed to understand and generate human language. They are built using a type of artificial intelligence (AI) called machine learning. In machine learning, computer programs learn from examples rather than being explicitly programmed to perform a specific task.
To train an LLM, developers collect a massive amount of text from various sources like books, articles, and websites. This text is called the “training data.” The LLM then “reads” this training data and learns patterns, structures, and relationships within the text. It learns how words, sentences, and paragraphs are put together and the context in which they are used.
When you provide data to an LLM, it uses what it has learned from the training data to analyze and generate a response. It doesn’t have a deep understanding of the information, but it can use patterns it has seen in the training data to make intelligent predictions about what a good response might be.
For example, if you give an LLM information about a legal case, it can do the following:
- Identify relevant laws and regulations because it has seen similar patterns in the training data.
- Find similar cases by matching patterns and facts to those it has seen in other cases during its training.
- Suggest possible outcomes by analyzing patterns in how previous cases were resolved.
LLMs can be trained to analyze data more effectively by providing them with specific examples of how to analyze the data. This is called “fine-tuning.” In fine-tuning, developers provide the LLM with examples of the input data and the desired output, teaching the LLM how to perform the analysis correctly. This can help the LLM generate more accurate and useful responses when given new data.
Addressing Bias and Fairness
Human legal professionals, including judges, can also exhibit biases when analyzing information, and LLMs have the potential to address some of these biases by providing a more consistent and objective analysis. Properly curating and preparing the training dataset can help mitigate some of the biases present in LLMs. Completely removing bias is a challenging task due to the complex and often subtle nature of biases in the data. However, some of these issues can also be addressed at the “fine-tuning” stage.
Here are some strategies to reduce biases when training LLMs for legal tasks:
- Diverse and representative dataset: Ensuring that the training dataset is diverse and representative of various perspectives, case types, and legal scenarios can help mitigate potential biases. This includes incorporating data from different jurisdictions, legal domains, and demographic groups, as well as ensuring a balanced representation of different viewpoints.
- Bias-aware data preprocessing: Techniques such as re-sampling, re-weighting, or generating synthetic examples can be employed to address imbalances or underrepresented groups in the training data. These methods can help the model learn more robust and fair representations of the data.
- Fairness-aware training techniques: There are various fairness-aware training techniques that can be employed to reduce biases in the model. Some of these methods involve adding fairness constraints or modifying the loss function during training to encourage the model to learn more equitable representations of the data.
- Post-hoc bias mitigation: After training the model, it may be possible to identify and mitigate certain biases in its output by analyzing its predictions and comparing them against known fairness metrics. Techniques such as re-calibration or post-hoc adversarial debiasing can be applied to adjust the model’s output and improve its fairness.
- Regular evaluation and monitoring: Continuously evaluating and monitoring the LLM’s performance against fairness metrics and real-world feedback can help identify biases and inform the development of strategies to mitigate them.
- Collaboration with legal experts: Working closely with legal professionals during the development and deployment of LLMs can help ensure that the models are better aligned with the needs of the legal domain and that potential biases and ethical concerns are addressed in a more comprehensive manner.
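Two of the strategies above, re-weighting and evaluation against a fairness metric, can be sketched very simply. The groups "A" and "B", the predictions, and the function names are hypothetical; the metric shown (the gap in favourable-outcome rates between groups, often called demographic parity) is only one of many, and which one is appropriate in a legal setting is itself a judgment call.

```python
from collections import Counter

def inverse_frequency_weights(groups):
    """Give under-represented groups larger training weights."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return {g: n / (k * c) for g, c in counts.items()}

def demographic_parity_gap(predictions, groups):
    """Gap between the highest and lowest favourable-outcome rate across groups."""
    totals, positives = Counter(), Counter()
    for p, g in zip(predictions, groups):
        totals[g] += 1
        positives[g] += p
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())

# Made-up data: group "B" is under-represented in the training set.
groups = ["A", "A", "A", "B"]
weights = inverse_frequency_weights(groups)

# Made-up model outputs: 1 = favourable outcome predicted, 0 = unfavourable.
predictions = [1, 1, 0, 0]
gap = demographic_parity_gap(predictions, groups)
```

A gap near zero means the model predicts favourable outcomes at similar rates for each group. A large gap is a flag for closer inspection, not proof of unfairness by itself.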
While these strategies can help reduce biases in LLMs, completely eliminating bias is extremely challenging. But if we approach the output of LLMs with caution and keep working to improve their fairness and reliability, there is a lot of potential over time.
Use in Litigation
I don’t think there is any question that LLMs can be used to draft agreements, including for complex commercial transactions. I can see their use, even in the negotiation of complex deals. Commercial documents and transactions are very regimented things with very cut-and-dried processes in place used to achieve the final, satisfactory result, whether to draft a complex shareholders’ agreement or a prospectus or to consummate an M&A deal, etc.
But what about litigation, where there are many more moving parts and a unique level of uncertainty?
LLMs have the potential to provide a hypothetical judgment about a specific legal case if given enough background information, such as current legislation, applicable case law, and other relevant information. While it is important to note that the quality of the judgment provided by the LLM will depend on the quality and comprehensiveness of the information it is provided, as well as the model’s understanding of legal concepts and reasoning, I believe this will happen in our lifetime as LLMs continue to improve.
Legal Analysis
Legal analysis is the cornerstone of any litigation file. You have to have a tenable case before initiating a lawsuit, and you have to know how to respond in a tenable way if facing a lawsuit.
LLMs like ChatGPT have the potential to revolutionize the way legal analysis is conducted, making it more efficient, accurate, and comprehensive. Back in the day, you would have to physically visit a law library and search through textbooks for cases and legislation. Legal software changed that by maintaining what is effectively an online law library. Using a Boolean search, one could find the relevant legal precedent online. We’re well past that stage; for example, my software provider, Lexis+, has a multitude of capabilities beyond the “online law library.” Recently, it became the latest adopter of generative artificial intelligence in the legal industry by rolling out a new platform for case research and document drafting. LLMs can perform legal analysis by doing the following:
- Identifying relevant laws and regulations: LLMs could help lawyers quickly identify the most pertinent laws, regulations, and legal principles for a specific case. By inputting a brief description of the case, the LLM could provide a list of relevant statutes, rules, and legal doctrines that may apply, saving time and reducing the chances of overlooking essential legal sources.
- Analyzing case law: LLMs could assist lawyers in researching and analyzing case law, which involves studying previous court decisions to understand how judges have interpreted and applied the law in similar situations. By processing vast amounts of case law data, LLMs could identify patterns, trends, and legal arguments that may be relevant to the case at hand. This would help lawyers craft more persuasive arguments and anticipate potential counterarguments.
- Evaluating legal arguments and evidence: LLMs could be trained to evaluate the strengths and weaknesses of legal arguments and evidence presented in a case. By analyzing the facts, the applicable law, and the arguments presented by both sides, LLMs could provide an objective assessment of the likelihood of success for each argument, helping lawyers prioritize their efforts and refine their strategies.
- Identifying potential legal issues: LLMs could help lawyers spot potential legal issues that might not be immediately apparent. By analyzing the facts of a case and comparing them to similar cases, LLMs could flag potential risks, opportunities, or areas that may require further investigation. This could help lawyers be more proactive and better prepared to address potential challenges.
- Providing comparative analysis: LLMs could assist with comparative legal analysis, which involves comparing the laws and legal systems of different jurisdictions. This could be particularly helpful for lawyers working on international cases or advising clients on cross-border transactions. LLMs could quickly analyze the relevant laws in multiple jurisdictions and provide a comparative overview, highlighting differences, similarities, and potential conflicts.
- Drafting legal memos and facta (in Canada) / briefs (in the U.S.): LLMs could be utilized in drafting legal memos and facta/briefs, which summarize the facts, legal issues, and arguments in a case. By processing the available information and generating a well-structured, coherent, and persuasive document, LLMs could save lawyers time and help them present their arguments more effectively.
- Supporting ethical decision-making: LLMs could be trained to analyze ethical considerations and professional conduct rules that may apply in a given case. This could help lawyers better understand their ethical obligations, identify potential conflicts of interest, and make informed decisions that align with their professional responsibilities.
How LLMs Evaluate Legal Arguments and Evidence
The more murky area is how the LLM can evaluate the legal arguments and evidence. There is a way to go here, but I’m convinced I’ll see significant breakthroughs in my lifetime. Conceptually, the LLMs would do the following:
Step 1: Input and Preprocessing: The user (e.g., a lawyer) provides the LLM with the facts of the case, the legal arguments presented by both parties, and the relevant evidence. This information can be input in the form of natural language text, like a case summary or legal brief. The LLM preprocesses the input by breaking it down into smaller chunks or tokens, which it then translates into numerical representations.
Step 2: Identifying Relevant Legal Concepts: The LLM analyzes the input and identifies the relevant legal concepts, such as the laws, regulations, and legal principles that apply to the case. This involves matching the input text to patterns and structures the LLM has learned from its extensive training dataset, which includes various legal texts, case law, and legal literature.
Step 3: Retrieving Related Case Law and Precedents: The LLM searches its knowledge base for related case law and legal precedents that can help assess the strengths and weaknesses of the legal arguments. It considers factors such as the jurisdiction, the legal domain, and the specific legal issues involved to find the most relevant and recent cases with similar fact patterns and legal questions.
Step 4: Analyzing the Evidence (more details, below): The LLM assesses the evidence presented by both parties, evaluating its relevance, credibility, and admissibility. This involves determining how well the evidence supports each party’s legal arguments, identifying potential weaknesses or inconsistencies, and considering the potential impact of the evidence on the case outcome.
Step 5: Comparing Legal Arguments: The LLM compares the legal arguments of both parties, analyzing their logical structure, the strength of the supporting evidence, and their alignment with the applicable laws and legal principles. It also considers how similar arguments have fared in previous cases and evaluates the persuasiveness of each argument based on these factors.
Step 6: Identifying Counterarguments: The LLM identifies potential counterarguments for each legal argument presented. This involves analyzing the opposing party’s arguments and evidence, as well as considering alternative interpretations of the applicable laws and legal principles. The LLM also retrieves related case law where counterarguments were successful to provide insights into their potential effectiveness.
Step 7: Assessing Likelihood of Success: Based on the analysis of legal arguments, counterarguments, and evidence, the LLM calculates a likelihood of success for each argument. This assessment takes into account factors such as the strength of the evidence, the persuasiveness of the arguments, and the legal precedents and case law that support each party’s position. It may also provide a confidence score, which represents the LLM’s certainty in its assessment.
Step 8: Presenting the Evaluation: Finally, the LLM presents the evaluation of the legal arguments and evidence in a clear, structured format. This can include a summary of the strengths and weaknesses of each argument, the likelihood of success, potential counterarguments, and recommendations for refining the legal strategy. The output can be tailored to the user’s preferences, such as a written summary, a visual representation, or a combination of both.
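To give a feel for Step 3, here is a deliberately crude sketch of retrieving the most similar case by comparing word counts. The case names and summaries are hypothetical. A real system would more plausibly compare learned embeddings and query an actual legal database, so treat this purely as an illustration of "find the closest fact pattern."

```python
import math
from collections import Counter

def bag_of_words(text):
    """Represent a text as a count of its lowercase words."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """1.0 for identical word distributions, 0.0 for no words in common."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_cases(query, cases):
    """Return (similarity, case name) pairs, most similar first."""
    q = bag_of_words(query)
    scored = [(cosine_similarity(q, bag_of_words(summary)), name)
              for name, summary in cases.items()]
    return sorted(scored, reverse=True)

# Hypothetical case summaries, invented for this example.
cases = {
    "Case A (hypothetical)": "tenant withheld rent after landlord failed to repair heating",
    "Case B (hypothetical)": "shareholder dispute over dilution in a private company",
}
ranking = rank_cases("landlord failed to repair heating so tenant stopped paying rent", cases)
```

The landlord-and-tenant query ranks Case A first because the two share most of their vocabulary, while Case B shares none of it. Word matching of this kind is exactly what a Boolean search does; the promise of LLMs is matching on meaning rather than on identical words.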
Currently, LLMs have the potential to assist in evaluating legal arguments and evidence. For now, they are not infallible and should be used as a tool to support, rather than replace, the expertise and judgment of human legal professionals. But a day will come when they will replace legal professionals or, at a minimum, leave them with a very nuanced and highly technical role in the legal system.
Analyzing the Evidence – Detailed Analysis
The even murkier question is how an LLM can be tasked with Step 4 (above) in evaluating the evidence. After all, that seems to be one of the irreplaceable talents unique to humans (and, yes, lawyers ARE humans too). Well, it’s not outside of the LLM’s realm of possibility. To do this, an LLM evaluates the evidence presented by both parties, assessing its relevance, credibility, and admissibility. Here is a detailed breakdown of how the LLM could perform this task:
Step 4.1: Extracting Evidence from the Input: The LLM scans the input text to identify and extract information related to the evidence presented by both parties. This can include testimonies, documents, physical evidence, expert opinions, and any other relevant materials. The LLM can identify these pieces of evidence using patterns and structures it has learned from its training data, which may include legal documents, case summaries, and other relevant texts.
Step 4.2: Categorizing and Structuring the Evidence: Once the evidence is extracted, the LLM categorizes it based on its type (e.g., documentary, testimonial, or physical), source (e.g., witness, expert, or documentary), and relevance to the legal arguments presented. The LLM can structure the evidence in a hierarchical manner, linking each piece of evidence to the specific legal argument it supports or contradicts.
Step 4.3: Assessing Relevance: The LLM evaluates the relevance of each piece of evidence by determining how well it supports or undermines the legal arguments presented. This can involve comparing the evidence to the facts of the case, the applicable laws, and the legal principles at play. The LLM may also consider the weight of the evidence in light of similar cases and legal precedents it has encountered during its training.
Step 4.4: Assessing Credibility: The LLM evaluates the credibility of each piece of evidence by considering factors such as the source, consistency, and reliability. For example, when assessing a witness testimony, the LLM may consider the witness’s relationship to the parties, potential biases, and the consistency of the testimony with other evidence and known facts. In the case of expert opinions, the LLM may consider the expert’s qualifications, experience, and the soundness of their methodology.
Step 4.5: Assessing Admissibility: The LLM evaluates the admissibility of the evidence according to the relevant rules of evidence for the jurisdiction in question. This may involve determining whether the evidence was lawfully obtained, whether it is subject to any exclusionary rules, and whether it satisfies any applicable authentication, foundation, or hearsay requirements. The LLM’s knowledge of evidence rules comes from its training data, which includes legal texts, case law, and other relevant materials.
Step 4.6: Identifying Weaknesses or Inconsistencies: The LLM identifies potential weaknesses or inconsistencies in the evidence by comparing and contrasting different pieces of evidence and considering alternative interpretations. For instance, the LLM may detect discrepancies between a witness’s testimony and a document presented as evidence or inconsistencies between two different witness testimonies. These weaknesses or inconsistencies can help inform the LLM’s evaluation of the legal arguments and counterarguments.
Step 4.7: Synthesizing the Evaluation: Finally, the LLM synthesizes its evaluation of the evidence by considering the relevance, credibility, and admissibility of each piece of evidence, as well as any weaknesses or inconsistencies identified. The LLM may assign a weight or score to each piece of evidence based on these factors, which can then be used to inform its overall evaluation of the legal arguments and likelihood of success in subsequent steps.
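The weighting idea in Step 4.7 can be sketched as a small data structure. Everything here is an assumption made for illustration: the `Evidence` fields, the 0-to-1 scores, and the rule that inadmissible evidence counts for nothing while admissible evidence is weighted by relevance times credibility. The hard part, producing trustworthy scores in the first place, is precisely what Steps 4.3 to 4.5 would have to solve.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    description: str
    relevance: float     # 0.0 to 1.0, assumed to come from Step 4.3
    credibility: float   # 0.0 to 1.0, assumed to come from Step 4.4
    admissible: bool     # assumed to come from Step 4.5

def evidence_weight(item):
    """Inadmissible evidence counts for nothing; otherwise relevance x credibility."""
    return item.relevance * item.credibility if item.admissible else 0.0

def rank_evidence(items):
    """Order the evidence from strongest to weakest."""
    return sorted(items, key=evidence_weight, reverse=True)

# Hypothetical evidence with made-up scores.
items = [
    Evidence("signed lease", relevance=0.9, credibility=0.9, admissible=True),
    Evidence("neighbour's hearsay account", relevance=0.8, credibility=0.4, admissible=False),
    Evidence("repair invoice", relevance=0.6, credibility=0.8, admissible=True),
]
ranked = rank_evidence(items)
```

Multiplying the two scores is only one possible design choice. It has the convenient property that evidence which is highly relevant but barely credible, or the reverse, ends up with a low weight.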
Conclusion
While we are perhaps years away from LLMs being able to conduct such an analysis, it’s not outside the realm of possibility. In fact, never mind possibilities. Consider probabilities. We’ll probably see remarkable advancements in our lifetimes, and we’ll probably see LLMs do a lot more than supplement legal professionals; we’ll see them supplant them altogether. To be sure, lawyering will look a lot different, and the adjudication of cases will look a lot different. You couldn’t stream Netflix on the original Internet, but you can now. You can’t get ChatGPT to replace lawyers now, but ChatGPT is like the original Internet; it will get better, and we’ll be replaced, or our job descriptions will significantly change. Eventually, with the proper use of LLMs, transactional work will become more accessible, and litigation will become more efficient, leading to the increasingly elusive holy grail of justiciability: access to justice.