CLS2022 Conference Booklet 220222 RELEASE
2 – 4 March 2022
Brought to you by
The Centre for Computational Law
About the Conference
The rise of statistical learning methods in academia has permeated law, triggering what has been called
a “computational turn” in legal scholarship. In this emerging field, novel techniques such as network
analytics and natural language processing are being applied to uncover, and quantify, previously hidden
insights about the law. However, the novelty of the computational legal method and its implications
mean that traditional venues for discussing and publishing such work remain scarce.
Against this backdrop, the SMU Centre for Computational Law (“CCLAW”, which we pronounce
/sea-claw/), in collaboration with law.mit.edu, has organized this conference to gather leading scholars
and thinkers in the field. The conference theme, “Past, Present, and Future”, invites scholars to
introspect on where the field is now, and where we may (and should) be headed. Guided by the theme,
abstracts were solicited on all aspects of computational legal studies. A special release of the MIT
Computational Law Report focusing on the conference theme is also in the works.
The final conference programme comprises four keynotes and 22 paper presentations from speakers
and distinguished scholars around the world. Full papers may be found here. Given the international
participation and the pandemic, the conference will be conducted in a hybrid online/offline format.
Conference start and end times have also been shifted slightly to accommodate diverse timezones.
The physical venue is the Yong Pung How School of Law, Singapore Management University, 55
Armenian St, Singapore 179943. Online participation will be through Zoom. Zoom details will be made
available to all registrants and speakers via email.
Please direct any enquiries to CCLAW Deputy Director Asst Prof Jerrold Soh <jerroldsoh@smu.edu.sg>
or CCLAW Centre Manager Lis Kho <lskho@smu.edu.sg>. We wish everyone a wonderful and
productive conference!
Conference Programme
All times expressed in Singapore time (SGT, GMT+8).
All physical talks will be held in the Yong Pung How School of Law, Level 2, Seminar Room 2-16.
2 March 2022

0915 – 0945   Registration (Online & Physical)
0945 – 1000   Conference Opening (Physical)
1000 – 1100   Keynote: AI and Law: from Knowledge-based to Machine Learning (and Back?) (Online)
              Kevin Ashley, University of Pittsburgh
1100 – 1135   Session 1 Talk 1: Public Records Aren’t Public: Systemic Barriers to Quantifying Racism (Online)
              Kat Albrecht, Georgia State University
              Kaitlyn Filip, Northwestern University
1135 – 1210   Session 1 Talk 2: Parsimony - Using Feature Engineering to Critique Private Law Rules (Online)
              Jeffery Atik, Loyola Law School - Los Angeles
1210 – 1340   Mid-conference Break
1340 – 1415   Session 2 Talk 1: Using OpenFisca to Power Expert Systems in the Canadian Public Service: Lessons Learned (Online)
              Jason Morris, Canadian Federal Public Service
1415 – 1450   Session 2 Talk 2: How Could We Perfect the Use of Artificial Intelligence in Litigation (Online)
              Elizavetas Shesterneva, ETPL.Asia / Promsvyazbank PJSC
1450 – 1520   Tea Break
1520 – 1555   Session 3 Talk 1: Legal Judgment Prediction and Availability of Machine Learning in Turkish Jurisdiction (Online)
              Ömer Faruk Erol, Ibn Haldun University
              Ahmet Kaplan, Ibn Haldun University
              Gülnihal Ahter Yakacak, Ibn Haldun University
              Muhammet Talha Kaan, Ibn Haldun University
              Safa Nur Altuncu Kaan, Ibn Haldun University
1555 – 1630   Session 3 Talk 2: Predicting Citations in Dutch Case Law with Natural Language Processing (Online)
              Iris Schepers, University of Groningen
              Martijn Wieling, University of Groningen
              Masha Medvedeva, University of Groningen
              Michel Vols, University of Groningen
              Michelle Bruijn, University of Groningen
1630 – 1700   Tea Break
1700 – 1735   Session 4 Talk 1: Information Retrieval and Structural Complexity of Legal Trees (Online)
              Pierpaolo Vivo, King’s College London
              Alessia Annibale, King’s College London
              Evan Tzanis, King’s College London
              Luca Gamberi, King’s College London
              Yanik-Pascal Förster, King’s College London
1735 – 1835   Keynote: Friend or Foe: Exploring the Dynamics of State Third Party Submissions with Computational Legal Methods (Online)
              Arthur Dyevre, KU Leuven
1835 onwards  Informal Networking
3 March 2022
0930 – 1000   Registration (Online & Physical)
1000 – 1100   Keynote: Some Thoughts on the State of Computational Legal Studies Circa 2022 (Online)
              Daniel Martin Katz, Illinois Tech
1100 – 1135   Session 1 Talk 1: Bridging the Gap between Machine Learning and Logical Rules in Computational Legal Studies (Online)
              L. Thorne McCarty, Rutgers University
1135 – 1210   Session 1 Talk 2: Unsupervised Machine Scoring of Free Response Answers (Validated Against Law School Final Exam Questions) (Online)
              David A. Colarusso, Suffolk University
1210 – 1340   Mid-conference Break
1340 – 1415   Session 2 Talk 1: The Silent Influence of the Chinese Guiding Cases: A Text Reuse Approach (Physical)
              Benjamin Chen, University of Hong Kong
              Elliott Ash, ETH Zurich's Center for Law & Economics
              Zhiyu Li, Durham University
1415 – 1450   Session 2 Talk 2: Lache-ing Onto Change: Object-Oriented Legal Evolution (Online)
              Megan Ma, Stanford Law School
              Dmitriy Podkopaev, Simmons Wavelength Ltd
              Avalon Campbell-Cousins, University of Edinburgh
              Adam Nicholas, University of Cambridge
              Jerrold Soh, Singapore Management University
1700 – 1735   Session 4 Talk 1: The Un-Modeled World (Online)
              Frank Fagan, EDHEC Business School
1735 – 1810   Session 4 Talk 2: The Promises and Pitfalls of Computational Law: A Meta-Analysis of the Existing Literature (Physical)
              Shaun Lim, National University of Singapore
              Daniel Seng, National University of Singapore
1810 – 1845   Session 4 Talk 3: Rethinking the Field of Automatic Prediction of Court Decisions (Online)
              Masha Medvedeva, University of Groningen
              Martijn Wieling, University of Groningen
              Michel Vols, University of Groningen
1845 onwards  Informal Networking and Conference Dinner
4 March 2022
0930 – 1000   Registration (Online & Physical)
1000 – 1100   Keynote: Civil Litigation Outcome Prediction and Access to Justice (Online)
              Charlotte S. Alexander, Georgia State University
1100 – 1135   Session 1 Talk 1: Clause2Game: modeling contract clauses with composable games (Online)
              Joshua Tan, University of Oxford
              Megan Ma, Stanford Law School
              Philips Zahn, University of St. Gallen
1135 – 1210   Session 1 Talk 2: Computational Corpus Linguistics (Online)
              Jonathan H. Choi, University of Minnesota
1210 – 1340   Mid-conference Break
1340 – 1415   Session 2 Talk 1: Sharing and Caring: Creating a Culture of Constructive Criticism in Computational Legal Studies (Physical)
              Dirk Hartung, Bucerius Law School
              Corinna Coupette, Bucerius Law School
1415 – 1450   Session 2 Talk 2: Fait Accompli: Predicting the Outcomes of Investment Treaty Negotiations (Online)
              Malcolm Langford, University of Oslo
              Runar Hilleren Lie, University of Oslo
1450 – 1525   Session 3 Talk 1: From Contract to Smart Legal Contract - A Case Study using the Simple Agreement for Future Equity (Online)
              Ron van der Meyden, University of New South Wales
              Michael J. Maher, Reasoning Research Institute, Canberra
1525 – 1600   Session 3 Talk 2: “Legal Big Data”: From Predictive Justice to Personalised Law? (Online)
              Andrea Stazi, European University of Rome
1600 – 1615   Conference Closing
Keynotes
Speakers are listed in first-name alphabetical order.
Charlotte S. Alexander
Connie and Ken McDaniel WomenLead Chair, Associate Professor of Law and Analytics, Legal Analytics and Innovation Initiative, Georgia State University

Charlotte S. Alexander holds the Connie D. and Ken McDaniel WomenLead Chair as an Associate Professor of Law and Analytics at the Colleges of Business and Law at Georgia State University. She uses computational methods to study legal text, with a particular focus on understanding how courts process and resolve employment disputes and other types of civil lawsuits. She founded and directs the university's Legal Analytics Lab, which works toward a legal system that embraces data to solve intractable problems and create a more just society. Alexander has published in journals including Science, the N.Y.U. Law Review, Texas Law Review, the American Business Law Journal, and the Harvard Civil Rights-Civil Liberties Law Review. Her research has been funded by the National Science Foundation, the U.S. Department of Labor, and private foundations. She received her J.D. from Harvard Law School.

Keynote: Civil Litigation Outcome Prediction and Access to Justice (1000 – 1100, 4th March 2022)
Daniel Martin Katz
Illinois Tech

Professor Daniel Martin Katz is a scientist, technologist, and professor who applies an innovative polytechnic approach to teaching law - to help create lawyers for today's biggest societal challenges. Both his scholarship and teaching integrate science, technology, engineering, and mathematics.

In this talk, I will review the history and current state of both the academic literature and the commercial applications devoted to law + computation. Further, I will provide my point of view regarding the best path(s) forward for the field.

Keynote: Some Thoughts on the State of Computational Legal Studies Circa 2022 (1000 – 1100, 3rd March 2022)
Kevin D. Ashley
Professor of Law and Intelligent Systems, University of Pittsburgh

Kevin D. Ashley, Ph.D., is an expert on computer modeling of legal reasoning. He performs research in the field of legal text analytics and studies how to prepare law students for its effects on legal practice. In 2002 he was selected as a Fellow of the American Association for Artificial Intelligence “for significant contributions in computationally modeling case-based and analogical reasoning in law and practical ethics.” He is co-editor-in-chief of Artificial Intelligence and Law, the journal of record in the field of AI and Law, and has been a principal investigator on a number of National Science Foundation grants. He is the author of Modeling Legal Argument: Reasoning with Cases and Hypotheticals (MIT Press/Bradford Books, 1990) and of Artificial Intelligence and Legal Analytics: New Tools for Law Practice in the Digital Age (Cambridge University Press, 2017). In addition to his appointment at the School of Law, Professor Ashley is a senior scientist at the Learning Research and Development Center, an adjunct professor of computer science, and a faculty member of the Graduate Program in Intelligent Systems of the University of Pittsburgh. A former National Science Foundation Presidential Young Investigator, Professor Ashley has been a visiting scientist at the IBM Thomas J. Watson Research Center, a Senior Visiting Fellow at the Institute for Advanced Studies of the University of Bologna, where he is a frequent visiting professor in the Faculty of Law, and a former President of the International Association for Artificial Intelligence and Law.

Keynote: AI and Law: from Knowledge-based to Machine Learning (and Back?) (1000 – 1100, 2nd March 2022)
Speakers and Authors
Speakers are listed alphabetically alongside paper co-authors.
David A. Colarusso
Director, Legal Innovation and Technology Lab, Suffolk University Law School

David Colarusso is the Director of Suffolk University Law School's Legal Innovation and Technology Lab. An attorney and educator by training, he has worked as a public defender, data scientist, software engineer, and high school physics teacher. He is the author of QnA Markup, a programming language for lawyers, and is an award-winning legal hacker, ABA Legal Rebel, and Fastcase 50 honoree. Additionally, he has been named to the American Bar Association's Web 100 for his Twitter presence.
Corinna Coupette
Bucerius Law School

Corinna studied law at Bucerius Law School and Stanford Law School, completing her First State Exam in Hamburg in 2015. She obtained a PhD in law from Bucerius Law School and a BSc in computer science from LMU Munich, both in 2018, as well as an MSc in computer science from Saarland University in 2020. Her legal dissertation, which introduces legal network science, was awarded the Bucerius Dissertation Award in 2018 and an Otto Hahn Medal in 2020.
Frank Fagan
Associate Professor of Law, EDHEC Business School and Scientific Director, EDHEC Augmented Law Institute

Frank Fagan is an Associate Professor of Law at EDHEC Business School and Scientific Director of the EDHEC Augmented Law Institute. He teaches and writes on corporate law and artificial intelligence. He is co-editor of Research in Law and Economics and the popular AI Law Blawg. Papers can be viewed at https://ssrn.com/author=468606.
Iris Schepers
PhD researcher, Department of Legal Methods, Faculty of Law, University of Groningen

Iris Schepers is a Dutch PhD researcher at the Department of Legal Methods at the Faculty of Law of the University of Groningen. She has a background in information science, and her research focuses on the use of machine learning techniques on legal big data. She is associated with the ERC EVICT project, and her work will specifically focus on empirical research into the meaning of the right to housing and its impact.
Jonathan H. Choi
University of Minnesota

Professor Jonathan H. Choi specializes in tax law, statutory interpretation, and computational analysis of law. His work has appeared in the New York University Law Review, the Stanford Law Review, and the Yale Law Journal, among others.

Philips Zahn
Assistant Professor, University of St. Gallen
Kat Albrecht
Assistant Professor, Department of Criminal Justice and Criminology, Georgia State University

Kat Albrecht is an Assistant Professor in the Department of Criminal Justice and Criminology in the Andrew Young School of Policy Studies at Georgia State University. She holds a PhD in Sociology from Northwestern University and a JD from the Northwestern University Pritzker School of Law.

Kaitlyn Filip
JD-PhD student in Communication Studies, Northwestern University
L. Thorne McCarty
Professor Emeritus of Computer Science and Law, Rutgers University

Professor McCarty has taught both law and computer science since the 1970s. He was a pioneer in the applications of artificial intelligence to the law, starting with the TAXMAN project at Stanford in 1972. In 1987 he was a co-founder and the first Program Chair of the International Conference on Artificial Intelligence and Law (ICAIL), and in 1991 he was elected the first President of the International Association for Artificial Intelligence and Law (IAAIL). He is the author of more than 70 articles and papers in the field, on logic programming, knowledge representation and reasoning, natural language processing, machine learning, and more.

Paper: Bridging the Gap between Machine Learning and Logical Rules in Computational Legal Studies (1100 – 1135, 3rd March 2022)
Runar Hilleren Lie
PhD Fellow, PluriCourts Centre of Excellence, University of Oslo

Runar Hilleren Lie is a PhD Fellow at the PluriCourts Centre of Excellence, University of Oslo. He holds a law degree from the University of Oslo and has previously worked as an entrepreneur and developer. As a core participant in the LEGINVEST research project, Lie uses empirical and computational methods to study the behavior of actors in the international investment law system. He has been awarded the John H. Jackson Prize for Best Article in the Journal of International Economic Law for a paper using computational network analysis, and his publications include ‘Computational stylometry: predicting the authorship of investment arbitration awards’, in R. Whalen, Computational Legal Studies: The Promise and Challenge of Data-Driven Research (Edward Elgar, 2020).

Paper: Fait Accompli: Predicting the Outcomes of Investment Treaty Negotiations (1415 – 1450, 4th March 2022)
Megan Ma
CodeX Residential Fellow; Managing Editor, MIT Computational Law Report; and Research Affiliate, CCLAW

Megan Ma is a residential fellow at CodeX. Her research considers the limits of legal expression, in particular how code could become the next legal language. Megan is also the Managing Editor of the MIT Computational Law Report and a Research Affiliate at the Centre for Computational Law at Singapore Management University. She received her PhD in Law at Sciences Po and was a lecturer there, having taught courses in Artificial Intelligence and Legal Reasoning, Legal Semantics, and Public Health Law and Policy. She has previously been a Visiting PhD at the University of Cambridge and at Harvard Law School.

Papers: Clause2Game: Modeling Contract Clauses with Composable Games (1100 – 1135, 4th March 2022) and Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 – 1450, 3rd March 2022)
Avalon Campbell-Cousins
University of Edinburgh

Previously a mathematician with interests in graph theory, Avalon Campbell-Cousins is now a PhD student in biomedical signal processing at the University of Edinburgh, working on building brain network models for the analysis of Alzheimer’s disease. He also has an interest in building models of natural language for the processing of legal text, aiming to increase the accessibility of the law and analyze its temporal development. He is actively researching in this area with his multi-disciplinary team and in partnership with SMU.
Jerrold Soh
Assistant Professor of Law and Deputy Director, CCLAW

Jerrold is an assistant law professor, legal analytics startup founder, and self-taught programmer. He teaches torts, a legal area of under-appreciated relevance to technology and AI regulation. His research centrally asks how we might best deliver justice at scale to those who need it. It is well-recognised that having more lawyers working harder, faster, and smarter, though important, is not enough. He explores how technology can, will, and should be used to do justice. 'Can' is a technical question on how we might computationally represent law's logic, while empirically analysing law's experience. 'Will' is a predictive question informed by market study of legal technology and innovation trends. 'Should' is a normative question which must be informed by law, policy, and ethics, particularly where algorithmic fairness or autonomous system liability is concerned. Jerrold draws broadly on his backgrounds in law, economics, and programming to answer these questions.
Ahmet Kaplan
Information Technologies Director, Ibn Haldun University

Ahmet Kaplan received his BS degree in Electrical-Electronics Engineering from Bilkent University, and his MSc and PhD from Erciyes University. He worked as an instructor at the Erciyes University Engineering Faculty and at its Civil Aviation School. Between 2000 and 2005, he was a post-doctoral researcher at the Georgia Institute of Technology in the United States, and he worked with international companies as a Software Development Project Manager. From 2005 to 2011, he worked at Turksat as IT Director and Vice President (CIO). With his team, he managed the rewriting of the whole system behind the e-government portal (www.turkiye.gov.tr), which has had a huge impact on Turkey’s transition to an information society. Between 2012 and 2014, he served as Institute Director at the Turkish Academic Network & Information Center (ULAKBİM), TUBİTAK, and as technical coordinator of the FATIH Project at the Ministry of Education. In 2014, he was appointed Vice President at Turksat once again. In 2017, he started work at Ibn Haldun University as Information Technologies Director.
Muhammet Talha Kaan
Ibn Haldun University

Muhammet Talha Kaan graduated from the Fatih Sultan Mehmet Foundation University Faculty of Law in 2017 and from Anadolu University, Faculty of Economics, Department of Public Administration, in 2018. After his legal education, he completed a law internship at the Istanbul Bar Association in 2018. After completing his master's studies with a thesis entitled “Slot Applications with Legal Dimension” in the International and Comparative Law Program of Ibn Haldun University, he started his doctoral studies in the Marmara University Public Law Program in 2021. He is fluent in English and has introductory Arabic. His areas of interest are administrative law, international law, computational legal science, and air and space law.
Pierpaolo Vivo
Director of the Quantitative and Digital Law Lab, Department of Mathematics, King’s College London (UK), and UKRI Future Leaders Fellow

Pierpaolo Vivo studied Physics in Parma (Italy), where he graduated in Theoretical Physics in 2005. He obtained his PhD in Mathematics in 2008 from Brunel University (West London). He spent three years as a postdoctoral fellow in the Condensed Matter and Statistical Physics group at the Abdus Salam ICTP (Trieste), where he worked on applications of Random Matrix Theory and Statistical Mechanics to the stability of financial markets. From 2011 to 2014 he worked as a research scientist at the Laboratoire de Physique Théorique et Modèles Statistiques (LPTMS) in Orsay (France). He has been a permanent member of the Disordered Systems group at King's College London since September 2014. He is an Associate Editor for the Journal of Statistical Physics (Springer), for Scientific Reports (Nature), and for Complex Systems, and has acted as Guest Editor for other guest-edited collections, notably ‘The Physics of the Law’ (Frontiers). He leads the Quantitative and Digital Law Lab in the Department of Mathematics at King’s College London, a team of five people supported by a UKRI Future Leaders Fellowship that promotes a quantitative approach to issues around the complexity of legal systems.
Shaun Lim
Research Assistant, Centre for Technology, Robotics, Artificial Intelligence & the Law, National University of Singapore

Shaun Lim is a Research Assistant with the Centre for Technology, Robotics, Artificial Intelligence & the Law. Shaun graduated from NUS in 2018 with a Bachelor of Laws and was called to the Bar in August 2019. Shaun has interdisciplinary research interests at the intersection of law and fields such as technology, computer science, and probability and statistics, particularly the use of artificial intelligence in legal contexts such as judicial decision-making.
Papers and Abstracts
Papers are listed by order of conference presentation. Full papers may be found here.
Public Records Aren’t Public: Systemic Barriers to Quantifying Racism
Kat Albrecht, Kaitlyn Filip
Day 1, Session 1, Talk 1
In a new era of computational legal scholarship, computational tools exist with the
capacity to quickly and efficiently reveal hidden inequalities in the justice system.
Technically, the laws exist that legally entitle the public to the requisite court records.
However, the opaque bureaucracy of the courts prevents us from connecting the public
to documents they technically own. We exemplify this legal ethical problem by
investigating areas of law where codified protections against inequalities exist and where
computational tools could help us understand if those protections are being enforced. In
general, the computational requirements of such projects needn't be complex, making
them even more attractive as solutions for auditing justice. Using the backdrop of the Cook
County felony courts, the largest criminal court in the United States, we establish the
impossibility of securing the public records needed to quantify the illegal use of racially-
motivated peremptory strikes; access to these records is the bottleneck in the relatively
simple computational process of quantifying racism.
This problem is so egregious that even the number of criminal trials in Cook County is
concealed from the public. The Administrative Office of Illinois Courts Annual Report
states that 7% of defendants went to trial, indicating that the data exist, but the actual
number of trials is not reported. This becomes crucial because of the burden
placed on the public when requesting the court transcripts that are necessary to reveal
the frequency of racially-motivated peremptory strikes. A requester is required to
provide the date of the hearing, the case name, the case number, the courtroom number,
and the name of the judge - for each transcript. This creates a scaling problem when
attempting to perform the type of macro-level computational analysis that can reveal
systemic patterns. The transcripts are also priced to make large scale analysis implausible,
with original transcripts starting at 4.00 per page for documents of unspecified lengths
that can easily be hundreds of pages long. This means that a relatively simple
computational project, requiring only basic natural language processing, is stymied by
court processes that turn accessing public records into an insurmountable barrier. This
article considers the ethical implications of the lack of access to public legal complaints
and both the limits that computational legal scholars are facing and ways that these
scholars can provide solutions.
Parsimony - Using Feature Engineering to Critique Private Law Rules
Jeffery Atik
Day 1, Session 1, Talk 2
Parsimony, as a design principle, is the insight that fewer features may be preferable in
many contexts. Features may be redundant, they may offset the effect of other features
and they may make little effective contribution to the outcome. Parsimony restrains the
impulse to accrete complexity through the addition of features. Adding features increases
the risk of overfitting the model to the data on which it is built. Each additional feature in
a model compounds the effect of variability. And there is an economy to modeling: the
respective contributions of each feature to the overall performance of the model must
outstrip the costs to justify the feature’s inclusion. Black letter law - the law we find in
Restatements and treatises - consists of models of law. These models specify features for
legal consideration (‘factors,’ ‘elements’); the values of those features determine the
output of a legal rule.
This paper explores the principle of parsimony as applied to legal rules. Recasting legal
rules into algorithmic form and subjecting them to testing can reveal the strengths and
weaknesses of current legal formulations. Law accretes novel tests and added
considerations, evolving rules that appear to be more precise, yet which burden the law
with undesirable complexity. Lawmakers can learn to appreciate elegance in design - a
less-than-more notion that praises the achievement of an objective in a way that draws
fewer resources. A parsimonious approach to law queries whether every feature’s
presence within a legal rule can be justified. By selectively eliminating single features from
the model of a rule and then testing the resultant model, we can reveal opportunities for
feature reduction without overly impairing the power of the rule. Parsimony as a general
design principle interacts with feature elimination intended to reduce or eliminate algorithmic bias.
Law, old and new, rejects the introduction of certain features - such as race or gender -
into legal rules on policy grounds. Algorithmic bias analysis reveals the reintroduction of
these attributes indirectly through feature proxies. Eliminating features that function as
conduits of bias can work in parallel with the application of parsimony - although at a
certain point our legal models may become too thin to be functional.
Using OpenFisca to Power Expert Systems in the Canadian Public Service: Lessons Learned
Jason Morris
Day 1, Session 2, Talk 1

Jason Morris is the Director of Rules as Code for Service Canada, a department of the
Canadian federal government. He is participating in a project to develop a prototypical
citizen-facing application to enhance Canadians' understandings of their entitlement to
four different government benefit programs implemented under Canada's Old Age
Security Act. The project is using a "rules as code" encoding of the relevant legislation on
the OpenFisca open source microsimulation platform. The paper will discuss the selection
of OpenFisca, the style of OpenFisca encoding we have used, the challenges of integrating
Rules as Code development with teams not familiar with the methodology, and our future
outlook on the use of OpenFisca as a platform for Rules as Code in the development of
legal expert systems inside public services. The project is anticipated to be the first
deployment of OpenFisca by a national government for use in a citizen-facing legal expert
system. It is likely that it will be possible to demonstrate a pre-release version of the
software as part of the presentation.
How Could We Perfect the Use of Artificial Intelligence in Litigation
Elizavetas Shesterneva
Day 1, Session 2, Talk 2
The goal of the paper is to examine the ways AI can help lawyers, judges, and other legal
professionals in litigation proceedings and to determine the ethical considerations that
come with it. We have already witnessed that AI can perform a wide range of functions
– predicting court case outcomes, analyzing legal documents – with some researchers
suggesting that in the future AI will replace human judges or create a “perfect” precedent
system. For instance, the Supreme Court of Victoria in Australia has welcomed the use
of Technology Assisted Review (TAR) in situations when the manual review of the
documents would be too expensive or would take too much time.
Legal Judgment Prediction and Availability of Machine Learning in Turkish Jurisdiction
Ömer Faruk Erol, Ahmet Kaplan, Gülnihal Ahter Yakacak, Muhammet Talha Kaan, Safa Nur Altuncu Kaan
Day 1, Session 3, Talk 1

The judiciary, one of the three main organs of the state, is administered by independent
courts, according to the Turkish Constitution. The legal system in Turkey is divided into
three branches: judicial, administrative, and constitutional courts. Each branch of the
judiciary makes a distinct kind of decision. In order to ensure judicial consistency and
systematicity, a "Legal Guide" was published in 2013 by the Department of Justice. This guide
was created to establish a minimum standard for the evidence that should be collected
by the judge and to reduce the mistakes that occur frequently in practice. Also, the
Council of State's "Decision Writing Guide", published in 2020, is an important source
that shows judges the principles and formal conditions to be followed when writing a
decision. In this context, both guides provide the formal content in terms of the basis and
content of decisions, and also reveal the common rules for decision writing. Examination
on the basis of these common rules provides a roadmap for legal prediction.
The Turkish Ministry of Justice has digitized the justice services through the National
Judicial Network Information System (UYAP). With UYAP, users and judicial bodies
exchange all kinds of information and documents electronically. However, in this system,
judicial decisions are not open access for the public. Among the courts of the Turkish
judiciary, the Constitutional Court has created a database of 14,922 precedents, the
Council of State one of 85,374, and the Court of Cassation one of 6,137,799. This study
aims to evaluate the accessible court decisions and the decisions submitted by private
third parties, determining the technical and legal requirements for the accessible data set
and the requirements for constructing an appropriate data set for the ML technique to be
employed in the research, using a computational law study methodology.
This article will discuss the process of judgment in the Turkish judiciary and an alternative
systematization of the decision-making process. The alternative systematization would
help to uncover the logic behind the decisions made to date. Moreover, it could also help
to identify which methods and algorithms are able to predict future judgments using
existing Turkish judgment cases. Many researchers have used methods such as text
extraction, classification and clustering (Liu and Hsieh, 2006; Aletras et al., 2016; Sulea et
al., 2017), Natural Language Processing (NLP) (Kim, 2014; Baharudin et al., 2010; Tang et
al., 2015), and Multi-Task Learning (MTL) using both hard and soft parameter sharing. This
research article aims to build a model of the decision-making of the Turkish judiciary in
order to turn Turkish juridical cases into a form appropriate for processing with
machine-learning techniques. The preliminary methodological discussion in this article is
intended as a contribution for researchers who will use the existing data pools in the field
of computational legal studies. In order to analyze linguistic models of juristic data, BERT
model structures will be used.
With the ever-growing accessibility of case law online, it has become challenging to
manually identify case law that is relevant to one’s legal issue. An example of a country in
which this problem is becoming increasingly prevalent is the Netherlands. Between 2016
and 2020, the percentage of published judgments more than doubled, from 3.5% to 7.2%.
This amounts to some 38,000 decisions per year, ranging from lower-level courts, such as
district courts, to higher courts such as the Court of Appeal and the Supreme Court.
The ambition of the Dutch Council for the Judiciary is to implement a system in which
75% of all cases are published. According to scholars, this will lead to problems with
the searchability of the data.
Hence, datasets of this size call for ways to analyse the data automatically, as doing
so manually is too time-consuming. One such method is machine learning. Over the past few
decades machine learning techniques have been used for a variety of tasks in the field of
Artificial Intelligence & Law. In legal judgment forecasting, the outcome of a court case is
predicted from the facts of the case with the help of algorithms and natural language
processing (NLP). Research shows that the texts of legal proceedings hold valuable
information for algorithms. Research has also shown that there is a significant relation
between citations and case authority, which is the extent to which a case is deemed
important for settling other legal disputes.
Because of the Dutch ambitions to eventually publish many more decisions, and because
there is a substantial amount of currently available data, we are using Dutch data. We aim
to gain insights into the most informative features of a case, and discover if there are
certain words, phrases, or characteristics that increase a decision’s citability. As such, this
paper aims to bridge the gap between citation analysis and NLP, and solidify the relation
between the text of a case and its importance, and its citation network.
Information Retrieval and Structural Complexity of Legal Trees
Pierpaolo Vivo, Alessia Annibale, Evan Tzanis, Luca Gamberi, Yanik-Pascal Förster
Day 1, Session 4, Talk 1
We introduce a model for the retrieval of information hidden in legal texts. These are
typically organised in a hierarchical (tree) structure, which a reader interested in a
given provision needs to explore down to the “deepest” level (articles, clauses, and so
on). We assess the structural complexity of legal trees by computing the mean
first-passage time a random reader takes to retrieve information planted in the leaves.
The reader is assumed to skim through the content of a legal text based on their
interests and keywords, and to be drawn towards the sought information by keyword
affinity, i.e. how well the Chapter/Section headers of the hierarchy seem to match the
informational content of the leaves. Using randomly generated keyword patterns, we
investigate the effect of two main features of the text, its horizontal and vertical
coherence, on the searching time, and consider ways to validate our results using real
legal texts. We obtain numerical and analytical results, the latter based on a
mean-field approximation at the level of patterns, which lead to an explicit expression
for the complexity of legal trees as a function of the structural parameters of the
model. Policy implications of our results are briefly discussed.
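The mean first-passage time at the heart of this abstract can be illustrated with a minimal sketch: an unbiased random walker on a toy statute tree. The node names and tree structure below are invented for illustration, and the paper's keyword-affinity bias is omitted (the walker here chooses neighbours uniformly).

```python
import random

def mfpt(adj, start, target, n_walks=2000, seed=0):
    """Estimate the mean first-passage time of an unbiased random
    walker from `start` to `target` on an undirected graph given as
    an adjacency dict (node -> list of neighbours)."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(n_walks):
        node, steps = start, 0
        while node != target:
            node = rng.choice(adj[node])  # uniform choice; no keyword bias
            steps += 1
        total_steps += steps
    return total_steps / n_walks

# Toy legal tree: an Act with two Parts; articles are the leaves.
tree = {
    "act":   ["part1", "part2"],
    "part1": ["act", "art1", "art2"],
    "part2": ["act", "art3"],
    "art1":  ["part1"],
    "art2":  ["part1"],
    "art3":  ["part2"],
}
t = mfpt(tree, "act", "art3")  # first-step analysis gives exactly 16 for this tree
```

A biased walker, weighting `rng.choice` by header-leaf keyword affinity as in the paper, would shorten this time when headers are informative.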
Bridging the Gap between Machine Learning and Logical Rules in Computational Legal Studies
L. Thorne McCarty
Day 2, Session 1, Talk 1
This paper presents a novel method for unsupervised machine scoring of short-answer and
essay question responses, relying solely on a sufficiently large set of responses to a
common prompt, without the need for pre-labeled sample answers, provided the prompt is
of a particular character: that is, for questions where “good” answers look similar and
“wrong” answers are likely to be “wrong” in different ways. Consequently, when a
collection of text embeddings for responses to a common prompt is placed in an
appropriate feature space, the centroid of their placements can stand in for a model
answer, providing a lodestar against which to measure individual responses. This paper
examines the efficacy of this method and discusses potential applications.
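The centroid idea can be sketched in a few lines. The bag-of-words embedding and the sample answers below are illustrative assumptions only; a method of this kind would use proper sentence embeddings, but the centroid-distance ranking is the same.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words embedding (a stand-in for sentence embeddings)."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid_scores(responses):
    """Score each response by its similarity to the centroid of all
    responses; the centroid stands in for a model answer."""
    vocab = sorted({w for r in responses for w in r.lower().split()})
    vecs = [embed(r, vocab) for r in responses]
    centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
    return [cosine(v, centroid) for v in vecs]

answers = [  # invented student responses to one prompt
    "consideration is required to form a contract",
    "a contract requires offer acceptance and consideration",
    "the moon is made of cheese",
]
scores = centroid_scores(answers)  # the outlier gets the lowest score
```

Because the similar “good” answers dominate the centroid, dissimilar answers fall away from it regardless of *how* they are wrong.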
Current methods for the automated scoring of short-answer and essay questions are poorly
suited to spontaneous and idiosyncratic assessments: the time saved in grading must be
balanced against the time required to train a model, including tasks such as the
creation of pre-labeled sample answers. This limits the utility of machine grading for
single classes working with novel assessments. The method described here eliminates the
need to prepare pre-labeled sample answers. It is the author’s hope that such a method
may be leveraged to reduce the time needed to grade free-response questions, promoting
the increased adoption of formative assessment, especially in contexts like law school
instruction, which have traditionally relied almost exclusively on summative
assessments.
The Silent Influence of the Chinese Guiding Cases: A Text Reuse Approach
Benjamin Chen, Elliott Ash, Zhiyu Li
Day 2, Session 2, Talk 1
In 2011, the Chinese Supreme People’s Court officially introduced the Guiding Case
system. Selected from judgments rendered by courts nationwide, Guiding Cases address
a range of legal issues and must be referred to by all judges when deciding similar cases.
It was hoped that the Guiding Cases would unify the application of law and safeguard the
quality and integrity of adjudication.
The Guiding Case system has, however, proved controversial. Judges in a socialist legal
system are supposed to apply the law, not make it, and socialist legality does not recognize
precedent as formally binding. These theoretical commitments have incited both doctrinal
and empirical debates about the legitimacy of the Guiding Cases. On the doctrinal front,
scholars have either tried to assimilate the Guiding Cases into existing sources of law or
rejected their authority qua law. On the empirical side, researchers have documented the
lackluster impact of the Guiding Cases on actual judicial practice. Citations to the Guiding
Cases are rare and many Guiding Cases are never cited at all. The extremely low
incidence of citations has been interpreted as evidencing the futility of the Guiding Cases.
We revisit the existing literature on Guiding Cases by looking beyond citations—or overt
references—to search for echoes of their reasoning—or implicit references—in the
judgments of lower courts. To do so, we employ text reuse methods that identify
portions of these judgments that uniquely repeat the key points of adjudication from the
Guiding Cases. The results suggest that the Guiding Cases are more influential than is
commonly assumed, thereby calling into question conventional narratives about the
hostility of the Chinese legal system to case-based adjudication.
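One common text reuse technique is word n-gram shingling, sketched minimally below. The sample sentences are invented, and the authors' actual method may differ; the sketch only illustrates how verbatim reuse can be detected without any citation.

```python
def shingles(text, n=5):
    """The set of word n-grams ('shingles') of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_score(source, target, n=5):
    """Fraction of the source's n-grams reused verbatim in the target."""
    src, tgt = shingles(source, n), shingles(target, n)
    return len(src & tgt) / len(src) if src else 0.0

# Invented example: a 'key point of adjudication' and a lower-court passage
guiding = "the lessor may not terminate the lease without prior written notice"
judgment = ("the lessor may not terminate the lease "
            "unless prior written notice is given")
score = reuse_score(guiding, judgment)  # partial verbatim reuse, no citation
```

A judgment can thus score highly on reuse of a Guiding Case's key points while never citing it, which is exactly the silent influence the paper looks for.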
Lache-ing Onto Change: Object-Oriented Legal Evolution
Megan Ma, Adam Nicholas, Avalon Campbell-Cousins, Dmitriy Podkopaev, Jerrold Soh
Day 2, Session 2, Talk 2
The growth of local jurisprudence has been a subject of particular interest in Singapore
given its young history as a nation. Although rooted in English law, Singapore law has
branched out significantly – especially in recent years – to establish itself as an increasingly
independent, autochthonous system of law (Andrew Phang et al, 2020). A close
computational-linguistic study of how this phenomenon occurred (as manifested in
changes in Singapore’s case law and legislative corpora over time) is likely to yield
fascinating observations on meaning-making and the evolution of legal concepts. We
hypothesize that, by extending the object-oriented approach outlined previously, a
precise usage-based and empirically defensible analysis of the diachronic changes in
terminology and legal reasoning can be achieved. Harnessing traditional linguistic methods
alongside cutting-edge advancements in NLP allows synergy in the understanding of legal
language evolution.
As discussed in our prior research, object-oriented software design maps neatly onto
concepts in language (Megan Ma et al, 2020). The use of graph databases facilitates the
storage and analysis of such linguistic structures optimally and at scale, thereby allowing
both testing and evaluating the viability of our approach. Subsequently, this fosters the
construction of a framework for deriving meaning from text in a manner that is more
human-readable. In short, this means distilling text into a structure that can then be
traversed, executed, and eventually implemented by an end-user. Our approach contrasts
with existing language models and analytical techniques that have actively concealed these
concepts through black boxes.
As a next phase of our research, we want to consider how far our model can extract the
semantic content of judicial decisions from a larger corpus. A key focus lies on how
techniques from historical linguistics may be useful for this task. Historical linguistics is
the study of language change. By delving into the deepest roots of language, its
contributions vary widely: from a deeper understanding of the human mind and its mental
processes to the reconstruction, or unveiling, of forgotten historic cultures. A central
sub-branch of this line of study is comparative linguistics, which concerns the genetic
interrelations and divergences between linguistic varieties at varying stages in time.
Although primarily focused on the reconstruction of ancestral speech and culture, the
comparative method (amongst other historical linguistic methods) could prove fruitful in
the analysis of legal text and its development. For instance, the quantitative study of
particular constructions has potential to reflect deeper legal and jurisdictional shifts.
In brief, our paper aims to offer an alternative method of unpacking and interpreting
the effects and implications of legal change. This goal may appear somewhat
ambitious given that the concept of legal change is something that has been studied
extensively in legal academia for decades. However, the novelty (and focus) of our
proposed approach lies in exploiting new computational linguistic and computational law
methodologies to shed new light on the subject. That said, because the topic is admittedly
extensive, the paper proposed above will be but a first step towards a larger
collaboration, limited to the subject area of laches and its evolutionary interpretations.
The Statute Data Swamp: NLP, Data Cleansing, Data Interpretation and Legislative Drafting
Matthew Waddington
Day 2, Session 3, Talk 1
What might happen if we treat collections of legislation, “the statute book”, as a data lake
(or swamp)? This paper takes a look, from a perspective informed by legislative drafting,
at the prospects and dead ends of trying to learn from computational analysis of large
quantities of statutory texts.
As with other areas, much will depend on the choice of the data, its quality and the
questions that we expect to be able to investigate with it. Various goals have been
suggested, including weighing regulatory burdens, detecting outdated, duplicated and
overlapping legislation and finding repeated patterns that could help legislative drafters
standardise their output. The paper considers these alongside the prospects for two
other questions: what we could learn about how legislation really works and whether
NLP could help extract the logical structure from existing statute books to facilitate
embedding new legislation drafted using “Rules as Code” into its context.
What statutory data would we want to pour into our data lake, and what data cleansing
might we apply to it? The paper looks at what “the statute book” and related terms mean,
the relationships between primary and secondary legislation and between amending and
consolidated legislation, and at the sorts of collections available. It considers the
significance of the age, and patchwork nature, of the various parts of many countries’
statute books, and the extent to which there is structured data available or that can be
readily produced. The paper uses the example of the implications for word frequency
analysis of the way that drafters have moved from using “shall” to using “must”. It also
considers how drafters believe they create rules (going back to Coode’s “legislative
sentence”) and whether analysis of bodies of legislative texts could help check how far
those beliefs are reflected in practice.
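The drafters' shift from "shall" to "must" can be made concrete with a simple per-year frequency count. The two-snippet corpus below is invented purely for illustration; real word-frequency analysis would run over dated statute texts and normalise per enacted provision.

```python
import re
from collections import Counter

def must_share_by_year(texts_by_year):
    """For each year of enactment, the share of mandatory modals
    that are 'must' rather than 'shall' (0.0 = all shall, 1.0 = all must)."""
    out = {}
    for year, text in texts_by_year.items():
        words = re.findall(r"[a-z]+", text.lower())
        counts = Counter(words)
        total = counts["shall"] + counts["must"]
        out[year] = counts["must"] / total if total else 0.0
    return out

corpus = {  # invented statutory snippets, one per era
    1975: "the licensee shall keep records and shall notify the minister",
    2020: "the licensee must keep records and must notify the minister",
}
freqs = must_share_by_year(corpus)
```

The point for the data lake is that such raw counts are only interpretable once the drafting-convention shift is known: a fall in "shall" reflects house style, not a fall in obligations.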
The paper reviews three recent projects analysing bodies of statutory material: the
“Open Regulation Platform” at the UK’s Department for Business, Energy and Industrial
Strategy; the “Regulatory Genome Project” of Cambridge University; and recent work
by Jason Morris in the Canadian Civil Service looking for the frequency of particular types
of expression in legislation. It particularly analyses the claims made about computational
analysis of regulations in work done by the New South Wales Treasury for their report
“Regulating for NSW’s Future”. It also reflects on the way these projects have used graphs
and networks to represent cross-references within and between items of legislation.
Court rulings are stuck in a dilemma: On the one hand, the resulting legal documents are
a product of jurisprudence and are therefore characterized by the use of technical (legal)
language; on the other hand, the rule of law requires that they be comprehensible to
laypersons.
ambiguity caused by both the conscious and unconscious technical reuse of already
existing words.
Computational law is often written about as the next frontier of legal systems
development, promising to automate the operation of legal rules with “smart” contracts
and statutes, from which laws and justice can be administered digitally with greater
efficiency. However, this also tends to attract naysayers who point out the alleged
inflexibility of computational law in handling the ambiguity inherent in legal language,
arguing that such inflexibility robs the law of its beneficial vagueness, and further
that human reactions to such inflexibility may be detrimental or lead to otherwise
unintended consequences. Part of this divide is due to an imperfect understanding of the
innovation spaces open to computational law, as well as a certain degree of imprecision
in talking about computational law and the various domains in which it can or should be
applied. Nonetheless, such a divide, howsoever arising, is not conducive to a common
understanding of the promise and limits of computational law. To provide a firm and
shared foundation upon which further analysis of the capabilities and limitations of
computational law can take place, the authors propose to perform a meta-analysis on the
existing literature on computational law to understand the positions taken by both
supporters and detractors and to identify general areas of overlap, conflict, and
consensus.
Outcome identification is defined as the task of identifying the verdict within the full text
of the judgement, including (references to) the verdict itself. Given the growing body of
published case law across the world, the automation of this task may be very useful, since
many courts publish case law without any structured information (i.e. metadata) available,
other than the judgements themselves, and often one may require a database where the
judgements are connected to the verdicts in order to conduct research.
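A minimal rule-based sketch of outcome identification is shown below. The verdict cue phrases are invented placeholders; a real system would need court- and language-specific patterns (e.g. Dutch formulae), or a learned classifier.

```python
import re

# Hypothetical verdict cue phrases, for illustration only.
VERDICT_PATTERNS = [
    r"dismisses the appeal",
    r"allows the appeal",
    r"acquits the defendant",
]

def identify_outcome(judgment_text):
    """Return the first verdict phrase found in the judgment, or None.
    This is the 'outcome identification' task: locating the verdict
    within the full text, not predicting it from the facts."""
    lowered = judgment_text.lower()
    for pattern in VERDICT_PATTERNS:
        match = re.search(pattern, lowered)
        if match:
            return match.group(0)
    return None

outcome = identify_outcome("The Court, deciding on the merits, dismisses the appeal.")
```

The contrast with judgment *prediction* is visible in the code: the verdict is assumed to be present in the input text and merely has to be located.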
We discuss appropriate and inappropriate methods for approaching the different tasks
given their respective objectives, and examine why, despite their clearly different
purposes, the above tasks are often confused with each other. We argue that this is
likely due to the cross-disciplinary nature of the field and to confusion over natural
language processing terminology and concepts used by practitioners on a daily basis.
We discuss how important it is to understand the legal data that one works with in order
to determine which task can be performed. Finally, we reflect on the needs of the legal
discipline regarding the analysis of court judgements.
Applying computational corpus linguistics, this Article finds that semantic questions in
statutory interpretation generally lack clear answers in real-world cases. The traditional
corpus linguistics literature treats most cases of statutory interpretation as semantically
determinate, meaning they can be resolved through inquiry into word meaning alone. In
contrast, this Article finds that most statutory disputes fall within a “zone of
indeterminacy” where other evidence of meaning should also be considered, like
legislative history or canons of construction.
Code and data unavailable, available upon reasonable request, or from dead links only.
Little, if any, documentation of underlying assumptions or judgment calls. Lack of
sensitivity analyses, robustness checks, or ablation studies. Limited peer review, or peers
impressed by figures showing results produced by algorithms they do not understand, on
data whose provenance is unclear. Referenced sources behind paywalls or not indexed
by common search engines at all. The list of deficiencies affecting published papers in our
field goes on. How come?
The answer is simple, yet unsettling: The field of computational legal studies is hard.
Things can go wrong. Misspecified models, dirty data, buggy code. No individual
researcher is perfect, but as a community, we can strive to identify our mistakes, correct
them, and learn from them for the future. We can get better, individually and collectively,
and we can make progress. This, however, requires scientific hygiene routines that have
yet to be established. As our research develops at the intersection of law and computer
science, and articles using computational methods make their way into mainstream legal
research, we can no longer ignore the striking mismatch between the publication
procedures familiar from doctrinal scholarship and empirical legal studies on the one
hand, and the requirements of robust, reproducible computational legal research on the
other.
In this article, we argue that for computational legal studies to advance as a community,
the field needs a publication culture designed to meet its unique challenges. We find the
building blocks for such a culture in our parent disciplines. From computer science, we
can adopt the requirements of data availability, code availability, honest assessments of
the methodological and interpretive limitations of our research, and transparent,
constructive criticism of our own work and the work of others. Legal publication culture
offers other advantages: less driven by conference deadlines and less overwhelmed by
mass peer review, legal scholars can make time to focus on big ideas, rather than merely
pushing for incremental improvements. Hence, combining the best of both our worlds
can help us keep our studies both scientifically rigorous and comprehensible for a
heterogeneous audience comprising both computer scientists and lawyers.
We scrutinize the current state of best practices, peer review, and journal policies in our
field, and suggest a set of foundational principles on which we might build the publication
culture of computational legal studies. To operationalize these principles, we propose a
protocol for the quality control of computational legal studies. This protocol may serve
as a checklist for authors, reviewers, and readers alike, and we demonstrate its usefulness
in an application to some of our recent work. We further introduce ideas for fostering a
transparent review culture, and share our vision for a division of labor between
publication venues. By presenting the pain points of our daily research experience along
with potential solutions, we hope to contribute to the development of a community that
shares and cares, creating a culture of constructive criticism in computational legal
studies.
Empirical research has indicated that states with more economic power are more
successful in bilateral negotiations on investment treaty law. Using computational
methods, Alschner and Skougarevskiy (2016) find powerful states’ negotiated treaty texts
are more internally coherent (similar to each other); and with quantitative methods, Allee
and Peinhardt (2014) find that capital exporting states use their bargaining power to
create stronger and more enforceable investor protections. However, Berge’s (2021)
qualitative analysis of negotiations suggests that this dominance may be both a function
of economic power and bureaucratic competence. This study takes these findings a step
further and asks: Is this dominance so strong that it is possible to predict in advance the
textual outcome of negotiations, without any knowledge of the actual negotiations?
Using a comprehensive temporal network of all clauses, mapped through to their original
drafters, we present a computational model that predicts clause-level negotiation
outcomes with an accuracy of up to 96%. It is based on a dataset of over 3,000 treaties
and 4,000 clauses, and a marking of which party obtains “their” clause in any given
agreement. The model, which is trained on treaties from 1950 to 2015, is largely able to
predict the textual outcomes of the real-world treaties of 2016-2020 using only the
names of the two state parties and the clause type as input. The model shows that treaty
negotiations are even more predictable than previously assumed, and that power dynamics
largely pre-determine negotiation outcomes. With these empirical results, we review and
test how current theories and research on negotiations fit with the highly predictable
nature of international investment treaty law.
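As an illustration only, the simplest power-based baseline such a model would have to beat might look as follows. The `power_rank` mapping, the state names, and the scores are all hypothetical; the paper's actual model is trained on the treaty network, not a hand-coded table.

```python
def predict_clause_origin(party_a, party_b, clause_type, power_rank):
    """Baseline: predict that the more powerful party's model clause
    prevails, regardless of clause type. `power_rank` is a hypothetical
    mapping from state to a bargaining-power score."""
    return party_a if power_rank[party_a] >= power_rank[party_b] else party_b

power_rank = {"Germany": 0.9, "Kenya": 0.3}  # illustrative numbers only
pred = predict_clause_origin("Germany", "Kenya", "expropriation", power_rank)
```

That 96% clause-level accuracy is achievable from party names and clause type alone suggests the real model is capturing something close to this power logic, refined per clause type and era.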
From Contract to Smart Legal Contract - A Case Study using the Simple Agreement for Future Equity
Ron van der Meyden, Michael J. Maher
Day 3, Session 3, Talk 1
The talk will give an overview of a case study on the development of smart contracts for
Y Combinator's Simple Agreements for Future Equity (SAFE), a type of legal contract
used in financing startups. SAFE contracts promise a seed investor that, in exchange for
their investment, preferred shares will be issued at the time of a future equity round,
according to a formula that mixes properties of debt and equity.
The precision required in developing smart contract code for these contracts has raised
a significant number of questions of financial analysis, game theory, legal formalisation,
software architecture, and formal verification.
Prima facie, the formula for conversion of a SAFE to shares is straightforward. Naively
applied, however, it has the effect of directly diluting the equity round investor as a
consequence of the equity round. The sophisticated equity round investor responds by
varying the conversion formula by decreasing their pre-money valuation, creating tension
with the SAFE investor that may be resolved using yet another approach to the
conversion calculation.
There is a further complication that the SAFE creates a circularity in the notion of "pre-
money valuation", rendering this term "open-textured" in the sense of having an indefinite
meaning in unexpected situations. In [1], we conduct a financial analysis of the variety of
ways that an issuance of shares can be calculated in the face of these complexities.
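The circularity in "pre-money valuation" can be sketched numerically. Assuming a simplified conversion in which the price per share is the valuation cap divided by the company capitalization (all figures invented, and the real SAFE terms are more involved), the circular variant, where the capitalization includes the SAFE shares themselves, can be solved by fixed-point iteration:

```python
def safe_shares_naive(investment, valuation_cap, company_capitalization):
    """Naive conversion: price per share is the valuation cap divided by
    the capitalization *excluding* the SAFE shares being issued."""
    price = valuation_cap / company_capitalization
    return investment / price

def safe_shares_circular(investment, valuation_cap, company_capitalization,
                         iters=50):
    """Circular variant: the capitalization includes the SAFE shares
    themselves, so shares and price determine each other; solved here
    by fixed-point iteration."""
    safe_shares = 0.0
    for _ in range(iters):
        price = valuation_cap / (company_capitalization + safe_shares)
        safe_shares = investment / price
    return safe_shares

naive = safe_shares_naive(100_000, 5_000_000, 1_000_000)
circular = safe_shares_circular(100_000, 5_000_000, 1_000_000)
```

The circular calculation issues more shares than the naive one, which is one concrete face of the dilution tension described above: whose shares bear the dilution depends on which calculation the contract is read to require.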
There is more indefiniteness in the language of the contract, in expressions such as "bona
fide transaction or sequence of transactions". We argue that this text is in fact necessary
to protect the SAFE investor against a collusive "attack" by the company and the equity
round investor. Paper [3] discusses this and other problems of indefiniteness in the SAFE,
and how they bear on the form of a smart contract for the SAFE contract. One conclusion
is that a smart contract on its own does not suffice to protect the interests of the parties,
but that an associated legal contract is necessary. Our approach allows for a spectrum of
ways to manage indefinite terms in the source legal text, spanning from refinement to a
rigid formalization to the expression of a precise process by which human agents will
resolve indefiniteness during the performance of the contract.
Taken together, the analysis we have conducted on these issues illustrates how the
process of formalization can drive the development of a significantly deeper
understanding of a contract itself. This case study is also informative on the requirements
for smart contract languages and tooling for smart contract developers.
[1] Simple Agreements for Future Equity -- Not So Simple?, Ron van der Meyden and
Michael J. Maher
[2] A Game Theoretic Analysis of Liquidity Events in Convertible Instruments, Ron van
der Meyden
[3] Can SAFE contracts be smart?, Ron van der Meyden and Michael J. Maher
About the Centre
The Centre for Computational Law was founded in early 2020. It is led by Centre Director and Assistant
Professor of Law and Computer Science Lim How Khang. AP Lim also directs the School’s Computing
and Law degree programme. True to our research roots, our physical premises are located in the
basement of the SMU School of Law, alongside the School’s other research Centres.
CCLAW’s flagship and inaugural project is the Research Programme in Computational Law (“RPCL”),
whose primary aim is to “study and develop open source technologies for ‘smart’ contracts and ‘smart’
statutes, starting with the design and implementation of a domain-specific programming language (DSL)
that allows for laws, rules and agreements to be expressed in code”.1 That is to say, “computational
law” in the formal, Love and Genesereth (2008) sense. The RPCL is supported by a S$15.2 million
grant from the National Research Foundation of Singapore, and led by Principal Investigator Wong
Meng Weng as well as Industry Director Alexis Chun who together are co-founders of Legalese.com.
Beyond the RPCL, the Centre is also pursuing and developing projects in other areas at the
intersection of law and technology. No prizes, of course, for guessing that computational legal studies
– and the related fields of natural legal language processing and legal AI – fall within these areas. Other
developing projects relate to legal technology adoption and applications.
We are open to collaboration with the international research and legal community and warmly invite
conference participants to reach out to us on these fronts.
1
SMU and National Research Foundation Joint Press Release (2020), Available at
https://www.nrf.gov.sg/docs/default-source/modules/pressrelease/smu-awarded-15-million-grant-for-
computational-law-research.pdf.
About the Law School2
The Yong Pung How School of Law (“YPHSL”) at Singapore Management University (“SMU”) was
founded as the SMU School of Law in 2007. It is Singapore’s second oldest (and second youngest) law
school. In 2017, we moved into our present campus adjacent to the historic Fort Canning Park. In
2021, we were officially renamed as the Yong Pung How School of Law, in honour of former Chief
Justice Dr Yong Pung How, who was an important benefactor to the school, having served as the
founding Chairman of our advisory board.
Located at the periphery of Singapore’s central business and civic districts, we are within walking
distance of key legal institutions such as the Supreme Court, Parliament, the Law Ministry, and the
Academy of Law. We maintain close partnerships with government and industry that, in turn, inform
our research and teaching.
Under the leadership of our present Dean Professor Goh Yihan (the youngest law dean in Singapore’s
history), law and technology was made one of our core focus areas. In 2018, the Centre for AI and
Data Governance was formed with a generous grant from the Singapore National Research
Foundation. In 2019, we launched a brand new Computing and Law degree, an interdisciplinary
undergraduate course which sees students reading core computer science modules alongside legal
classics such as torts, crime, and contracts. This year, we will also see our first intake for a new PhD
programme in Law, Commerce, and Technology. More broadly, we have been angling our curriculum
towards preparing our students for a more technological future.
See our website for more information on our other research and pedagogical efforts, international
events and collaborations, job openings, visiting academic programmes, and more.
2
Please note that this is a broad introduction to the school written by the conference committee and does not
represent the school’s official views. See the school website for those.