
Computational Legal Studies:

Past, Present, and Future

2 – 4 March 2022

Brought to you by
The Centre for Computational Law



Contents

Recursion
About the Conference
Conference Programme
Keynotes
Speakers and Authors
Papers and Abstracts
About the Centre
About the Law School

About the Conference
The rise of statistical learning methods in academia has permeated law, triggering what has been called
a “computational turn” in legal scholarship. In this emerging field, novel techniques such as network
analytics and natural language processing are being applied to uncover and quantify previously hidden
insights about the law. However, the novelty of the computational legal method and its implications
mean that traditional venues for discussing and publishing such work remain scarce.

Against this backdrop, the SMU Centre for Computational Law (“CCLAW”, which we pronounce
/sea-claw/), in collaboration with law.mit.edu, has organized this conference to gather leading scholars
and thinkers in the field. The conference theme, “Past, Present, and Future”, invites scholars to
reflect on where the field is now, and where we may (and should) be headed. Guided by the theme,
abstracts were solicited on all aspects of computational legal studies. A special release of the MIT
Computational Law Report focusing on the conference theme is also in the works.

The final conference programme comprises four keynotes and 22 paper presentations from distinguished
speakers and scholars around the world. Full papers may be found here. Given the international
participation and the pandemic, the conference will be conducted in a hybrid online/offline format.
Conference start and end times have also been shifted slightly to accommodate diverse time zones.
The physical venue is the Yong Pung How School of Law, Singapore Management University, 55
Armenian St, Singapore 179943. Online participation will be through Zoom. Zoom details will be made
available to all registrants and speakers via email.

Please direct any enquiries to CCLAW Deputy Director Assistant Professor Jerrold Soh
<jerroldsoh@smu.edu.sg> or CCLAW Centre Manager Lis Kho <lskho@smu.edu.sg>. We wish
everyone a wonderful and productive conference!

Conference Programme
All times expressed in Singapore time (SGT, GMT+8).
All physical talks will be held in the Yong Pung How School of Law, Level 2, Seminar Room 2-16.

2 March 2022

0915 – 0945  Registration (Online & Physical)
0945 – 1000  Conference Opening (Physical)

1000 – 1100  Keynote (Online)
AI and Law: from Knowledge-based to Machine Learning (and Back?)
Kevin Ashley, University of Pittsburgh

1100 – 1135  Session 1, Talk 1 (Online)
Public Records Aren’t Public: Systemic Barriers to Quantifying Racism
Kat Albrecht, Georgia State University
Kaitlyn Filip, Northwestern University

1135 – 1210  Session 1, Talk 2 (Online)
Parsimony - Using Feature Engineering to Critique Private Law Rules
Jeffery Atik, Loyola Law School - Los Angeles

1210 – 1340  Mid-conference Break

1340 – 1415  Session 2, Talk 1 (Online)
Using OpenFisca to Power Expert Systems in the Canadian Public Service: Lessons Learned
Jason Morris, Canadian Federal Public Service

1415 – 1450  Session 2, Talk 2 (Online)
How Could We Perfect the Use of Artificial Intelligence in Litigation
Elizaveta Shesterneva, ETPL.Asia / Promsvyazbank PJSC

1450 – 1520  Tea Break

1520 – 1555  Session 3, Talk 1 (Online)
Legal Judgment Prediction and Availability of Machine Learning in Turkish Jurisdiction
Ömer Faruk Erol, Ibn Haldun University
Ahmet Kaplan, Ibn Haldun University
Gülnihal Ahter Yakacak, Ibn Haldun University
Muhammet Talha Kaan, Ibn Haldun University
Safa Nur Altuncu Kaan, Ibn Haldun University

1555 – 1630  Session 3, Talk 2 (Online)
Predicting Citations in Dutch Case Law with Natural Language Processing
Iris Schepers, University of Groningen
Martijn Wieling, University of Groningen
Masha Medvedeva, University of Groningen
Michel Vols, University of Groningen
Michelle Bruijn, University of Groningen

1630 – 1700  Tea Break

1700 – 1735  Session 4, Talk 1 (Online)
Information Retrieval and Structural Complexity of Legal Trees
Pierpaolo Vivo, King’s College London
Alessia Annibale, King’s College London
Evan Tzanis, King’s College London
Luca Gamberi, King’s College London
Yanik-Pascal Förster, King’s College London

1735 – 1835  Keynote (Online)
Friend or Foe: Exploring the Dynamics of State Third Party Submissions with Computational Legal Methods
Arthur Dyevre, KU Leuven

1835 onwards  Informal Networking

3 March 2022

0930 – 1000  Registration (Online & Physical)

1000 – 1100  Keynote (Online)
Some Thoughts on the State of Computational Legal Studies Circa 2022
Daniel Martin Katz, Illinois Tech

1100 – 1135  Session 1, Talk 1 (Online)
Bridging the Gap between Machine Learning and Logical Rules in Computational Legal Studies
L. Thorne McCarty, Rutgers University

1135 – 1210  Session 1, Talk 2 (Online)
Unsupervised Machine Scoring of Free Response Answers (Validated Against Law School Final Exam Questions)
David A. Colarusso, Suffolk University

1210 – 1340  Mid-conference Break

1340 – 1415  Session 2, Talk 1 (Physical)
The Silent Influence of the Chinese Guiding Cases: A Text Reuse Approach
Benjamin Chen, University of Hong Kong
Elliott Ash, ETH Zurich Center for Law & Economics
Zhiyu Li, Durham University

1415 – 1450  Session 2, Talk 2 (Online)
Lache-ing Onto Change: Object-Oriented Legal Evolution
Megan Ma, Stanford Law School
Dmitriy Podkopaev, Simmons Wavelength Ltd
Avalon Campbell-Cousins, University of Edinburgh
Adam Nicholas, University of Cambridge
Jerrold Soh, Singapore Management University

1450 – 1520  Tea Break

1520 – 1555  Session 3, Talk 1 (Online)
The Statute Data Swamp: NLP, Data Cleansing, Data Interpretation and Legislative Drafting
Matthew Waddington, Jersey's Legislative Drafting Office

1555 – 1630  Session 3, Talk 2 (Online)
Quantifying Ambiguities in German Court Decisions – Identifying Problematic Terms of Legal Language
Gregor Behnke, University of Freiburg
Niklas Wais, Leipzig University

1630 – 1700  Tea Break

1700 – 1735  Session 4, Talk 1 (Online)
The Un-Modeled World
Frank Fagan, EDHEC Business School

1735 – 1810  Session 4, Talk 2 (Physical)
The Promises and Pitfalls of Computational Law: A Meta-Analysis of the Existing Literature
Shaun Lim, National University of Singapore
Daniel Seng, National University of Singapore

1810 – 1845  Session 4, Talk 3 (Online)
Rethinking the Field of Automatic Prediction of Court Decisions
Masha Medvedeva, University of Groningen
Martijn Wieling, University of Groningen
Michel Vols, University of Groningen

1845 onwards  Informal Networking and Conference Dinner

4 March 2022

0930 – 1000  Registration (Online & Physical)

1000 – 1100  Keynote (Online)
Civil Litigation Outcome Prediction and Access to Justice
Charlotte S. Alexander, Georgia State University

1100 – 1135  Session 1, Talk 1 (Online)
Clause2Game: Modeling Contract Clauses with Composable Games
Joshua Tan, University of Oxford
Megan Ma, Stanford Law School
Philipp Zahn, University of St. Gallen

1135 – 1210  Session 1, Talk 2 (Online)
Computational Corpus Linguistics
Jonathan H. Choi, University of Minnesota

1210 – 1340  Mid-conference Break

1340 – 1415  Session 2, Talk 1 (Physical)
Sharing and Caring: Creating a Culture of Constructive Criticism in Computational Legal Studies
Dirk Hartung, Bucerius Law School
Corinna Coupette, Bucerius Law School

1415 – 1450  Session 2, Talk 2 (Online)
Fait Accompli: Predicting the Outcomes of Investment Treaty Negotiations
Malcolm Langford, University of Oslo
Runar Hilleren Lie, University of Oslo

1450 – 1525  Session 3, Talk 1 (Online)
From Contract to Smart Legal Contract - A Case Study using the Simple Agreement for Future Equity
Ron van der Meyden, University of New South Wales
Michael J. Maher, Reasoning Research Institute, Canberra

1525 – 1600  Session 3, Talk 2 (Online)
“Legal Big Data”: From Predictive Justice to Personalised Law?
Andrea Stazi, European University of Rome

1600 – 1615  Conference Closing

Keynotes
Speakers are listed in first-name alphabetical order.

Arthur Dyevre
Professor, Leuven Centre for Empirical Jurisprudence, KU Leuven

Arthur Dyevre is Professor at the University of Leuven (KU Leuven), where he conducts research in the
fields of Empirical Legal Studies, Law & Economics and Natural Legal Language Processing. He was
Principal Investigator of the ERC-funded EUTHORITY Project (www.euthority.eu), a large research
project investigating patterns of conflict and cooperation in the EU multilevel legal system. His current
research interests pertain to deceptive persuasion, behavioural comparative law and inter-group biases
in litigation.

Keynote: Friend or Foe: Exploring the Dynamics of State Third Party Submissions with
Computational Legal Methods (1735 – 1835, 2nd March 2022)

Governments regularly file third party submissions in disputes before transnational courts. Applying
computational legal methods, we leverage data on state third party submissions before the European
Court of Justice to answer questions about the role of trade and intercultural affinity in interstate
cooperation.

Charlotte S. Alexander
Connie D. and Ken McDaniel WomenLead Chair, Associate Professor of Law and Analytics, Legal
Analytics and Innovation Initiative, Georgia State University

Charlotte S. Alexander holds the Connie D. and Ken McDaniel WomenLead Chair as an Associate
Professor of Law and Analytics at the Colleges of Business and Law at Georgia State University. She
uses computational methods to study legal text, with a particular focus on understanding how courts
process and resolve employment disputes and other types of civil lawsuits. She founded and directs
the university's Legal Analytics Lab, which works toward a legal system that embraces data to solve
intractable problems and create a more just society. Alexander has published in journals including
Science, the N.Y.U. Law Review, the Texas Law Review, the American Business Law Journal, and the
Harvard Civil Rights-Civil Liberties Law Review. Her research has been funded by the National Science
Foundation, the U.S. Department of Labor, and private foundations. She received her J.D. from
Harvard Law School.

Keynote: Civil Litigation Outcome Prediction and Access to Justice (1000 – 1100, 4th
March 2022)

This chapter reviews the current state of computationally driven civil litigation outcome prediction
tools and maps the ways they might affect the civil justice system, with particular attention to the
United States' unmet civil legal needs crisis. Such tools may have the potential to fill the civil justice
gap by reducing uncertainty, thereby reducing the cost of civil legal services. Yet litigation outcome
prediction has not reached maturity as a field. I catalog the data, methodological, and financial limits
that have impeded further development in general and in relation to access to justice in particular.
Even if such problems were solved, however, there may be unintended consequences. Computationally
driven tools might reify previous patterns, lock out litigants whose claims are novel or boundary-pushing,
and shut down the innovative and flexible nature of common law reasoning. Such tools might also
reduce opportunity for voice and dignity, even for litigants whose claims are accepted. I close with a
set of proposals for policymakers as well as for the tools' producers and consumers.

Daniel Martin Katz
Professor of Law and Director of the Law Lab, Chicago-Kent College of Law, Illinois Tech

Professor Daniel Martin Katz is a scientist, technologist and professor who applies an innovative
polytechnic approach to teaching law, helping to create lawyers for today's biggest societal challenges.
Both his scholarship and teaching integrate science, technology, engineering and mathematics.

Professor Katz teaches at Illinois Tech's Chicago-Kent College of Law and spearheads new initiatives
to teach law students how to leverage technology and entrepreneurship in their future legal careers.
He has published, or has forthcoming, work in a wide variety of academic outlets, including Science,
PLOS ONE, Scientific Reports, Journal of the Royal Society Interface, Journal of Statistical Physics,
Frontiers in Physics, Physica A and Artificial Intelligence & Law. In addition, his work has been
published in legal journals including the Cornell Journal of Law & Public Policy, Emory Law Journal,
Virginia Tax Review, Iowa Law Review, Illinois Law Review, Ohio State Law Journal, Journal of Law &
Politics and Journal of Legal Education.

Professor Katz is currently working on two book projects: an edited volume entitled Legal Informatics
(Cambridge University Press, 2021) and a book on technology + innovation in law. Professor Katz has
an Erdős number of 4. His work has been highlighted in a number of media outlets, including the New
York Times, The Wall Street Journal, Financial Times, BBC Radio, Wired, Vox, National Public Radio,
Slate Magazine, Huffington Post, 538, Bloomberg Businessweek, ABA Journal, Law Technology News
and The American Lawyer.

Professor Katz received his Ph.D. in political science and public policy with a focus on complex
adaptive systems from the University of Michigan. He graduated with a Juris Doctor cum laude from
the University of Michigan Law School and simultaneously obtained a Master of Public Policy from the
Gerald R. Ford School of Public Policy at the University of Michigan. During his graduate studies, he
was a fellow in Empirical Legal Studies at the University of Michigan Law School and a National Science
Foundation IGERT fellow at the University of Michigan Center for the Study of Complex Systems.

Keynote: Some Thoughts on the State of Computational Legal Studies Circa 2022 (1000 –
1100, 3rd March 2022)

In this talk, I will review the history and current state of both the academic literature and the
commercial applications devoted to law + computation. Further, I will offer my point of view on the
best path(s) forward for the field.

Kevin D. Ashley
Professor of Law and Intelligent Systems, University of Pittsburgh

Kevin D. Ashley, Ph.D., is an expert on computer modeling of legal reasoning. He performs research
in the field of legal text analytics and studies how to prepare law students for its effects on legal
practice. In 2002 he was selected as a Fellow of the American Association of Artificial Intelligence “for
significant contributions in computationally modeling case-based and analogical reasoning in law and
practical ethics.” He is co-editor-in-chief of Artificial Intelligence and Law, the journal of record in the
field of AI and Law, and has been a principal investigator on a number of National Science Foundation
grants. He is the author of Modeling Legal Argument: Reasoning with Cases and Hypotheticals (MIT
Press/Bradford Books, 1990) and of Artificial Intelligence and Legal Analytics: New Tools for Law Practice
in the Digital Age (Cambridge University Press, 2017). In addition to his appointment at the School of
Law, Professor Ashley is a senior scientist at the Learning Research and Development Center, an
adjunct professor of computer science, and a faculty member of the Graduate Program in Intelligent
Systems of the University of Pittsburgh. A former National Science Foundation Presidential Young
Investigator, Professor Ashley has been a visiting scientist at the IBM Thomas J. Watson Research
Center, a Senior Visiting Fellow at the Institute for Advanced Studies of the University of Bologna,
where he is a frequent visiting professor in the Faculty of Law, and a former President of the
International Association of Artificial Intelligence and Law.

Keynote: AI and Law: from Knowledge-based to Machine Learning (and Back?) (1000 –
1100, 2nd March 2022)

Traditionally, the field of AI and Law has focused on representing legal knowledge so that computers
can perform legal reasoning, or something like it, with legally intelligible results. Today, the research
paradigm in AI and Law has largely shifted to applying new machine learning and natural language
processing techniques to legal texts. Although for some time ML models have been predicting
outcomes of cases directly from their texts, the models cannot yet explain their predictions or support
them with arguments. This talk considers how text analytic methods could extract legal concepts and
argument structures from case texts. These may enable knowledge-based computational models of
legal argument to assist ML in offering explanations along with predictions, enabling legal professionals
to better assess the predictions' reliability.

Speakers and Authors
Speakers are listed alphabetically alongside paper co-authors.

Andrea Stazi
Associate Professor in Comparative Law and New Technologies Law, Director of the Innovation Law
Laboratory, European University of Rome

Andrea Stazi is an Associate Professor of Comparative Law and New Technologies Law and Director
of the Innovation Law Laboratory at the European University of Rome. He holds the Italian National
Scientific Habilitation as Full Professor in Comparative Law.

Andrea is also Visiting Professor in Biotechnology Law at the National University of Singapore. He has
authored many scientific publications, including the books Smart Contracts and Comparative Law
(Springer, 2021) and Biotechnological Inventions and Patentability of Life (Edward Elgar, 2015).
Previously he was Visiting Professor in Law and Technology at Singapore Management University,
Research Fellow at the National University of Singapore, Research Associate at FLACSO University
of Buenos Aires, the Max Planck Institute for Competition & IP Law in Munich, and the Institute for
Information Law of the University of Amsterdam, Coordinator of the Master in Competition and
Innovation Law at Luiss University of Rome, and Adjunct Professor at the University of Bologna.

Paper: “Legal Big Data”: From Predictive Justice to Personalised Law? (1525 – 1600, 4th
March 2022)

Benjamin Chen
Assistant Professor, University of Hong Kong and Research Affiliate, ETH Zurich Center for Law and
Economics

Benjamin Chen is an interdisciplinary legal researcher interested in regulatory and judicial institutions.
His current research examines the scope for consequentialist reasoning in law, the diffusion of policy
through the courts, and the impact of artificial intelligence on justice and its administration. Benjamin
graduated with a J.D., Order of the Coif, from the University of California, Berkeley in 2017, where
he also received his Ph.D. in Jurisprudence and Social Policy. Before joining the University of Hong
Kong, Benjamin was an assistant professor in public policy at the National University of Singapore. He
was previously a postdoctoral research scholar and lecturer-in-law at Columbia University in the City
of New York and served as a judicial law clerk on the United States Court of Appeals for the Ninth
Circuit.

Paper: The Silent Influence of the Chinese Guiding Cases: A Text Reuse Approach (1340
– 1415, 3rd March 2022)

David A. Colarusso
Practitioner in Residence and Director, Legal Innovation and Technology Lab, Suffolk University Law
School

David Colarusso is the Director of Suffolk University Law School's Legal Innovation and Technology
Lab. An attorney and educator by training, he has worked as a public defender, data scientist, software
engineer, and high school physics teacher. He is the author of QnA Markup, a programming language
for lawyers, and is an award-winning legal hacker, ABA Legal Rebel, and Fastcase 50 honoree.
Additionally, he has been named to the American Bar Association's Web 100 for his Twitter presence.

Paper: Unsupervised Machine Scoring of Free Response Answers (Validated Against Law
School Final Exam Questions) (1135 – 1210, 3rd March 2022)

Dirk Hartung
Executive Director, Center for Legal Technology and Data Science, Bucerius Law School and
Non-Residential Fellow, CodeX

Dirk Hartung is the Executive Director of the Center for Legal Technology and Data Science at
Bucerius Law School in Hamburg, Germany and a Non-Residential Fellow at CodeX – the Stanford
Center for Legal Informatics at Stanford Law School, United States.

Dirk developed and mostly (co-)taught the technology curriculum at Bucerius Law School. This
curriculum comprises the Legal Technology Lecture Series and the Technology Certificate, which
consists of introductory classes in Computer Science, Statistics, Programming, Ethics and Software
Development. Since 2018, Dirk has been Academic Director of the Summer Program in Legal
Technology and Operations. He is the co-host of the global MOOC Bucerius Legal Tech Essentials.
Dirk supervises bachelor’s and master’s theses in computer science and law.

His normative scholarship focuses on Legal Technology in the areas of professional, procedural and
comparative law. His quantitative research interests include computational legal studies, legal
informatics, data science and natural language processing in the legal domain. In a joint effort with the
Center on the Legal Profession and The Boston Consulting Group, Dirk regularly publishes analyses
of the legal market, e.g. on Legal Technology (2016), Legal Operations (2018) and Digital Justice (2022).

Paper: Sharing and Caring: Creating a Culture of Constructive Criticism in
Computational Legal Studies (1340 – 1415, 4th March 2022)

Corinna Coupette
Research Associate, Max Planck Institute for Informatics and Fellow, Bucerius Center for Legal
Technology and Data Science, Bucerius Law School

Corinna studied law at Bucerius Law School and Stanford Law School, completing her First State
Exam in Hamburg in 2015. She obtained a PhD in law from Bucerius Law School and a BSc in computer
science from LMU Munich, both in 2018, as well as an MSc in computer science from Saarland
University in 2020. Her legal dissertation, which introduces legal network science, was awarded the
Bucerius Dissertation Award in 2018 and an Otto Hahn Medal in 2020.

Corinna is currently a research associate at the Max Planck Institute for Informatics, where she
pursues her PhD in computer science, and a fellow at the Bucerius Center for Legal Technology and
Data Science, where she advances legal data science. She regularly teaches computer science and
programming to audiences from other disciplines, including lawyers. Corinna is an open science
enthusiast. Her primary research theme is data, with emphases on exploratory methods for data
analysis, expressive representations of graph data, and transdisciplinary projects. At the intersection
of computer science and law, she strives to create high-quality datasets and develop domain-specific
methods to measure, monitor, and manage legal systems.

Paper: Sharing and Caring: Creating a Culture of Constructive Criticism in
Computational Legal Studies (1340 – 1415, 4th March 2022)

Elizaveta Shesterneva
Contributor, ETPL.Asia and Leading Legal Counsel, Promsvyazbank PJSC

Elizaveta obtained a bachelor's degree from Kutafin Moscow State Law University and later graduated
from Singapore Management University with an LLM. She has carried out legal research work for
different projects, including research for the Asia-Pacific Economic Cooperation (APEC), where she
co-authored one of the annual publications, and for Facebook (Meta), where she became one of the
grant awardees as part of ETPL.Asia, a research entity currently housed under the Centre for
Technology, Robotics, AI and the Law at NUS Law. Elizaveta has also written several articles on
international law and legal technology, published in both English and Russian. She is currently pursuing
a career in international law. Her topics of interest include regulation in the field of artificial
intelligence, personal data protection and cross-border dispute resolution.

Paper: How Could We Perfect the Use of Artificial Intelligence in Litigation (1415 – 1450,
2nd March 2022)

Frank Fagan
Associate Professor of Law, EDHEC Business School and Scientific Director, EDHEC Augmented Law
Institute

Frank Fagan is an Associate Professor of Law at EDHEC Business School and Scientific Director of
the EDHEC Augmented Law Institute. He teaches and writes on corporate law and artificial
intelligence. He is co-editor of Research in Law and Economics and of the popular AI Law Blawg. His
papers can be viewed at https://ssrn.com/author=468606.

Paper: The Un-Modeled World (1700 – 1735, 3rd March 2022)

Gregor Behnke
Chair of Foundations of Artificial Intelligence, Department of Computer Science, University of
Freiburg

Gregor Behnke studied Computer Science at the University of Rostock. He subsequently obtained
his PhD from Ulm University in 2019. During his PhD he was part of the Collaborative Research
Center Transregio 62, "A Companion-Technology for Cognitive Technical Systems", and its industry
transfer project, "Do it yourself, but not alone: Companion-Technology for Home Improvement". His
work mainly focusses on model-based AI, in particular planning. Gregor is currently also studying Law
at the University of Hagen. Since 2020 Gregor has been a post-doctoral researcher at the University
of Freiburg's Foundations of AI group, headed by Prof. Bernhard Nebel. He will be joining the Institute
for Logic, Language and Computation at the University of Amsterdam this summer.

Paper: Quantifying Ambiguities in German Court Decisions – Identifying Problematic
Terms of Legal Language (1555 – 1630, 3rd March 2022)

Niklas Wais
Research Assistant, Leipzig University

Niklas Wais studied law in Freiburg and Istanbul. He was a research assistant at the University of
Freiburg's Institute for Media and Information Law (Department 1: Civil and Commercial Law) from
2019 to 2021, where he founded and supervised a project aimed at computer science education for
law students ("FUTURE"). During this time Niklas was granted a fellowship from "KI Campus", which
is funded, among others, by the German Federal Ministry of Education and Research to strengthen
interdisciplinary university programs on AI. Since 2021, he has been a research assistant and PhD
candidate at the Institute for Media and Data Law and Digitalization at Leipzig University. His current
focus of interest lies in the application of NLP methods to law-related texts.

Paper: Quantifying Ambiguities in German Court Decisions – Identifying Problematic
Terms of Legal Language (1555 – 1630, 3rd March 2022)

Iris Schepers
PhD Researcher, Department of Legal Methods, Faculty of Law, University of Groningen

Iris Schepers is a Dutch PhD researcher at the Department of Legal Methods at the Faculty of Law of
the University of Groningen. She has a background in information science, and her research focusses
on the use of machine learning techniques on legal big data. She is associated with the ERC EVICT
project, and her work will specifically focus on empirical research into the meaning of the right to
housing and its impact.

Paper: Predicting Citations in Dutch Case Law with Natural Language Processing (1555 –
1630, 2nd March 2022)

Jason Morris
Director of Rules as Code, Canadian Federal Public Service

Jason is a proud husband and father of three. He is currently engaged as Director of Rules as Code in
the Canadian federal public service. He holds a Master of Laws in Computational Law from the
University of Alberta. He is the author of Blawx, a user-friendly web-based tool for Rules as Code.
Jason was recently a senior researcher with the Singapore Management University Centre for
Computational Law, and his work there has been published in the proceedings of the International
Conference on Artificial Intelligence and Law. He also taught a course on the open source legal
automation tool Docassemble as a sessional instructor at the University of Alberta Faculty of Law.
His students have won international competitions in developing legal service applications for their
work in that course.

Paper: Using OpenFisca to Power Expert Systems in the Canadian Public Service: Lessons
Learned (1340 – 1415, 2nd March 2022)

Jeffery Atik
Professor of Law and Jacob Becker Fellow, Loyola Law School - Los Angeles

Jeffery Atik is Professor of Law at Loyola Law School in Los Angeles, where he teaches courses in
Artificial Intelligence and the Law, Innovation and Competition Law, and Financial Technology. He is
also Guest Professor of Civil Law at Sweden's Lund University, where he is co-Principal Investigator
of the Quantum Law Project funded by the Wallenberg Foundation. Atik is a graduate of Yale Law
School and holds an economics Ph.D. from the Universidad Autonoma de Madrid. He has practiced
corporate law in New York, Boston and Milan.

Paper: Parsimony - Using Feature Engineering to Critique Private Law Rules (1135 – 1210,
2nd March 2022)

Jonathan H. Choi
Associate Professor, University of Minnesota Law School

Professor Jonathan H. Choi specializes in tax law, statutory interpretation, and computational analysis
of law. His work has appeared in the New York University Law Review, the Stanford Law Review,
and the Yale Law Journal, among others.

Professor Choi earned his B.A. from Dartmouth College, summa cum laude, with a triple major in
Computer Science, Economics, and Philosophy, and a J.D. from Yale Law School, where he was
Executive Bluebook Editor of the Yale Law Journal. Before entering the academy, he practiced tax
law at Wachtell, Lipton, Rosen & Katz.

Paper: Computational Corpus Linguistics (1135 – 1210, 4th March 2022)

Joshua Tan
Practitioner Fellow, Stanford Digital Civil Society Lab; Postdoctoral Researcher, University of Oxford;
Executive Director, Metagov; Executive Editor, Compositionality

Joshua Tan is a mathematician and computer scientist whose research focuses on applications of
higher mathematics to the design of intelligent systems. He is a practitioner fellow at Stanford's Digital
Civil Society Lab, a postdoctoral researcher at the University of Oxford, and the executive director
of Metagov, a nonprofit research collective that builds standards and infrastructure for digital
self-governance. He is also an executive editor of Compositionality, a peer-reviewed journal, and editor
of Applied Category Theory, a book series published by Cambridge University Press.

Paper: Clause2Game: Modeling Contract Clauses with Composable Games (1100 – 1135,
4th March 2022)

Philipp Zahn
Assistant Professor, University of St. Gallen

Philipp holds a PhD in economics. His research is at the intersection of microeconomics and
theoretical computer science. He is currently an assistant professor at the University of St. Gallen,
Switzerland.

Paper: Clause2Game: Modeling Contract Clauses with Composable Games (1100 – 1135,
4th March 2022)

Kat Albrecht
Assistant Professor, Department of Criminal Justice and Criminology, Andrew Young School of Policy
Studies, Georgia State University

Kat Albrecht is an Assistant Professor in the Department of Criminal Justice and Criminology in the
Andrew Young School of Policy Studies at Georgia State University. She holds a PhD in Sociology
from Northwestern University and a JD from the Northwestern University Pritzker School of Law.

Dr. Albrecht’s work sits at the intersection of computational social science and law, where she uses
innovative computational techniques to study fear, violence, and data distortions. She is particularly
interested in the nexus of fear and risk-taking behaviors, digital trace data, and the impact of law on
decision-making. She frequently serves as a computational science expert for the defense on active
legal cases about life without the opportunity of parole, felony murder, gang enhancements, and
empirical data.

Dr. Albrecht’s work has been published in outlets including Law & Policy, Nature Human Behaviour,
the Journal of Criminal Law and Criminology, and Law, Technology, and Humans. She is also a
long-time site organizer for the Summer Institutes in Computational Social Science.

Paper: Public Records Aren’t Public: Systemic Barriers to Quantifying Racism (1100 –
1135, 2nd March 2022)

Kaitlyn Filip
JD-PhD Student in Communication Studies, Northwestern University

Kaitlyn Filip is a JD-PhD student in Communication Studies: Rhetoric and Public Culture at
Northwestern University. Her work focuses on how public discourses around crises and scandals
shape policy-making and legal advocacy, and how access to court data informs and constrains court
argumentation. She is a fellow with the Chicago Appleseed Center for Fair Courts.

Paper: Public Records Aren’t Public: Systemic Barriers to Quantifying Racism (1100 –
1135, 2nd March 2022)

L. Thorne McCarty
Professor Emeritus of Computer Science and Law, Rutgers University

Professor McCarty has taught both law and computer science since the 1970s. He was a pioneer in
the applications of artificial intelligence to the law, starting with the TAXMAN project at Stanford in
1972. In 1987 he was a co-founder and the first Program Chair of the International Conference on
Artificial Intelligence and Law (ICAIL), and in 1991 he was elected the first President of the
International Association for Artificial Intelligence and Law (IAAIL). He is the author of more than 70
articles and papers in the field, on logic programming, knowledge representation and reasoning,
natural language processing, machine learning, and more.

Paper: Bridging the Gap between Machine Learning and Logical Rules in Computational
Legal Studies (1100 – 1135, 3rd March 2022)

Malcolm Langford
Professor of Public Law and Director, Centre of Excellence for Experiential Legal Learning (CELL),
University of Oslo

Malcolm Langford is Professor of Public Law and Director of the Centre of Excellence for Experiential
Legal Learning (CELL), Faculty of Law, University of Oslo. He was Chair of the Academic Forum on
Investor-State Dispute Settlement, which supports the UNCITRAL Working Group III reform
process (2019–2021), and is Co-editor of the Cambridge University Press book series Globalization
and Human Rights. Langford is also an Associate Fellow at the PluriCourts Centre of Excellence,
where he leads the Norwegian Research Council funded project Compliance Politics and International
Investment Disputes (COPIID). He has been awarded the John H. Jackson Prize for Best Article in the
Journal of International Economic Law for a paper using computational network analysis, and his
publications include 'Computational stylometry: predicting the authorship of investment arbitration
awards', in R. Whalen (ed.), Computational Legal Studies: The Promise and Challenge of Data-Driven
Research (Edward Elgar, 2020).

Paper: Fait Accompli: Predicting the Outcomes of Investment Treaty Negotiations (1415
– 1450, 4th March 2022)

Runar Hilleren Lie
PhD Fellow, PluriCourts Centre of Excellence, University of Oslo

Runar Hilleren Lie is a PhD Fellow at the PluriCourts Centre of Excellence, University of Oslo. He
holds a law degree from the University of Oslo and has previously worked as an entrepreneur and
developer. As a core participant in the LEGINVEST research project, Lie uses empirical and
computational methods to study the behavior of actors in the international investment law system.
He has been awarded the John H. Jackson Prize for Best Article in the Journal of International
Economic Law for a paper using computational network analysis, and his publications include
'Computational stylometry: predicting the authorship of investment arbitration awards', in R. Whalen
(ed.), Computational Legal Studies: The Promise and Challenge of Data-Driven Research (Edward Elgar,
2020).

Paper: Fait Accompli: Predicting the Outcomes of Investment Treaty Negotiations (1415
– 1450, 4th March 2022)

Masha Medvedeva
PhD Candidate, Center for Language and Cognition Groningen and Department of Legal Methods,
University of Groningen

Masha Medvedeva is a computational linguist and an interdisciplinary PhD candidate at the Center for
Language and Cognition Groningen and the Department of Legal Methods at the University of
Groningen, where she works on automatically forecasting and categorising judicial decisions. The main
focus of her project is forecasting decisions of the European Court of Human Rights (ECtHR) and
identifying patterns within documents published by the Court. Masha also works with Dutch case law.
She is part of the team of the EVICT project (https://www.eviction.eu/), which focuses on automatic
analysis of eviction litigation in Europe.

Paper: Rethinking the Field of Automatic Prediction of Court Decisions (1810 – 1845, 3rd
March 2022)

Matthew Waddington
Deputy Head, Jersey's Legislative Drafting Office

Matthew Waddington is the deputy head of Jersey's Legislative Drafting Office. He has been a
legislative drafter for over 18 years, and qualified as an English solicitor in 1990 (now non-practising).
After attending the European University Institute's "Law and Logic" summer school in 2018 he became
involved in "Rules as Code". He tries to contribute a legislative drafter's perspective to computational
law while encouraging other legislative drafters to take up the opportunities presented by
developments in encoding and marking up legislation.

Paper: The Statute Data Swamp: NLP, Data Cleansing, Data Interpretation and
Legislative Drafting (1520 – 1555, 3rd March 2022)

Megan Ma
CodeX Residential Fellow; Managing Editor, MIT Computational Law Report; Research Affiliate,
CCLAW

Megan Ma is a residential fellow at CodeX. Her research considers the limits of legal expression, in
particular how code could become the next legal language. Megan is also the Managing Editor of the
MIT Computational Law Report and a Research Affiliate at the Singapore Management University
Centre for Computational Law. She received her PhD in Law from Sciences Po, where she was also
a lecturer, teaching courses in Artificial Intelligence and Legal Reasoning, Legal Semantics, and Public
Health Law and Policy. She has previously been a Visiting PhD at the University of Cambridge and at
Harvard Law School.

Papers: Clause2Game: Modeling Contract Clauses with Composable Games (1100 – 1135,
4th March 2022) and Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 –
1450, 3rd March 2022)

Dmitriy Podkopaev
Senior Data Scientist, Simmons Wavelength Ltd and Research Affiliate, CCLAW

Dmitriy Podkopaev leverages his engineering skills to solve a multitude of challenges as a Senior Data
Scientist at Simmons Wavelength Ltd, the tech arm of the London-based law firm Simmons &
Simmons. With a wide variety of projects under his belt and a background in Compliance and
Anti-Money Laundering, Dmitriy specialises in translating expert legal knowledge into data modelling
and analysis algorithms using Natural Language Processing and other forms of ML/AI.

Paper: Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 – 1450, 3rd March
2022)

Avalon Campbell-Cousins
PhD Student, University of Edinburgh and Research Affiliate, CCLAW

Previously a mathematician with interests in graph theory, Avalon Campbell-Cousins is now a PhD
student in biomedical signal processing at the University of Edinburgh, working on brain network
models for the analysis of Alzheimer’s disease. He also has an interest in building models of natural
language for the processing of legal text, aiming to increase the accessibility of the law and analyze its
temporal development. He is actively researching in this area with his multidisciplinary team and in
partnership with SMU.

Paper: Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 – 1450, 3rd March
2022)

Adam Nicholas
Historical Linguist, University of Cambridge and Research Affiliate, CCLAW

Having completed a degree in Theoretical and Applied Linguistics at the University of Cambridge,
Adam has recently accepted an MPhil offer to pursue research at the same institution. He specialises
in historical linguistics, combining cutting-edge nanosyntactic theory with traditional philological
methods to account for language change. Adam is extending this approach into legal technology,
examining the potential role of core linguistics in analysing legal text.

Paper: Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 – 1450, 3rd March
2022)

Jerrold Soh
Assistant Professor of Law and Deputy Director, CCLAW

Jerrold is an assistant law professor, legal analytics startup founder, and self-taught programmer. He
teaches torts, a legal area of under-appreciated relevance to technology and AI regulation. His
research centrally asks how we might best deliver justice at scale to those who need it. It is
well-recognised that having more lawyers working harder, faster, and smarter, though important, is
not enough. He explores how technology can, will, and should be used to do justice. 'Can' is a technical
question of how we might computationally represent law's logic while empirically analysing law's
experience. 'Will' is a predictive question informed by market study of legal technology and innovation
trends. 'Should' is a normative question which must be informed by law, policy, and ethics, particularly
where algorithmic fairness or autonomous system liability is concerned. Jerrold draws broadly on his
backgrounds in law, economics, and programming to answer these questions.

Paper: Lache-ing Onto Change: Object-Oriented Legal Evolution (1415 – 1450, 3rd March
2022)

Ömer Faruk Erol
Assistant Professor in Administrative Law, School of Law, Ibn Haldun University

Ömer Faruk Erol graduated from Istanbul University Faculty of Law in 2010. He completed his
master's degree in 2012 with a thesis entitled "Environmental Impact Assessment According to the
Environmental Law", and completed his doctorate in 2017 at Marmara University with a thesis titled
"Civil Aviation Activities of Administration". He was a visiting researcher at Harvard University on a
YÖK scholarship for his master's thesis studies, and a graduate research trainee at McGill University
on a TÜBİTAK scholarship for his doctoral thesis studies. Since the year he graduated, he has worked
as a research assistant at Yalova University, Istanbul Medeniyet University and Marmara University,
respectively. He currently serves as an assistant professor at the Faculty of Law at Ibn Haldun
University.

Paper: Legal Judgment Prediction and Availability of Machine Learning in Turkish
Jurisdiction (1520 – 1555, 2nd March 2022)

Ahmet Kaplan
Information Technologies Director, Ibn Haldun University

Ahmet Kaplan received his BS degree in Electrical-Electronic Engineering from Bilkent University, and
his MSc and PhD from Erciyes University. He worked as an instructor at the Erciyes University
Engineering Faculty and its Civil Aviation School. Between 2000 and 2005, he was a post-doctoral
researcher at the Georgia Institute of Technology in the United States and worked with international
companies as a Software Development Project Manager. From 2005 to 2011, he worked at Turksat
as IT Director and Vice President (CIO). With his team, he managed the rewriting of the entire
e-government portal (www.turkiye.gov.tr), which has had a huge impact on Turkey’s transition to an
information society. Between 2012 and 2014, he served as Institute Director at the Turkish Academic
Network & Information Center (ULAKBİM), TÜBİTAK, and as technical coordinator of the FATIH
Project at the Ministry of Education. In 2014, he was appointed vice president at Turksat once again.
In 2017, he joined Ibn Haldun University as Information Technologies Director.

Paper: Legal Judgment Prediction and Availability of Machine Learning in Turkish
Jurisdiction (1520 – 1555, 2nd March 2022)

Gülnihal Ahter Yakacak
Research Assistant, School of Law, Ibn Haldun University

Gülnihal Ahter Yakacak graduated from the Faculty of Law at Selçuk University in 2010. She completed
a double major program at Selçuk University's Faculty of Communication, Department of Journalism,
the same year. In 2014, she completed a master's degree program with a thesis titled "Presidency in
the Government System and Turkey". She is currently writing her Ph.D. dissertation, titled "Electoral
Timing as an Empowerment Tool of Executive Power in Presidential Systems". She was a visiting
research fellow at Harvard Law School for her master's thesis studies and at the Boston University
Center for Latin American Studies for her Ph.D. dissertation. She is currently a research assistant at
Ibn Haldun University's Faculty of Law, having previously worked at Yalova University and Istanbul
University. Her research interests include constitutional law, human rights, government systems, and
electoral law.

Paper: Legal Judgment Prediction and Availability of Machine Learning in Turkish
Jurisdiction (1520 – 1555, 2nd March 2022)

Muhammet Talha Kaan
Postgraduate Student, Ibn Haldun University

Muhammet Talha Kaan graduated from Fatih Sultan Mehmet Foundation University Faculty of Law in
2017 and from Anadolu University, Faculty of Economics, Department of Public Administration in
2018. After his law education, he completed his legal internship at the Istanbul Bar Association in
2018. After completing his master's studies with a thesis entitled "Slot Applications with Legal
Dimension" in the International and Comparative Law Program at Ibn Haldun University, he started
his doctoral studies in the Marmara University Public Law Program in 2021. He is fluent in English and
has introductory Arabic. His areas of interest are administrative law, international law, computational
legal science, and air and space law.

Paper: Legal Judgment Prediction and Availability of Machine Learning in Turkish
Jurisdiction (1520 – 1555, 2nd March 2022)

Safa Nur Altuncu Kaan
Academic Staff, School of Humanities and Social Sciences, Ibn Haldun University

Safa Nur Altuncu Kaan graduated from Boğaziçi University, Department of Sociology, in 2020. In 2019,
she spent a semester at the School of Oriental and African Studies in the UK. Right after her
graduation, she began her MA degree in the Boğaziçi Sociology Program. She has been a research
assistant in the Ibn Haldun University Sociology Department since June 2021. She is fluent in English
and Arabic. Her research interests include national identity, state-based national identity, the
nation-state, religion, religiosity, quantitative methods, data analysis and visualization, and
computational social sciences.

Paper: Legal Judgment Prediction and Availability of Machine Learning in Turkish
Jurisdiction (1520 – 1555, 2nd March 2022)

Pierpaolo Vivo
Director of the Quantitative and Digital Law Lab, Department of Mathematics, King’s College London
(UK), and UKRI Future Leaders Fellow

Pierpaolo Vivo studied Physics in Parma (Italy), where he graduated in Theoretical Physics in 2005. He
obtained his PhD in Mathematics in 2008 from Brunel University (West London). He spent three
years as a postdoctoral fellow in the Condensed Matter and Statistical Physics group at the Abdus
Salam ICTP (Trieste). There he worked on applications of Random Matrix Theory and Statistical
Mechanics to the stability of financial markets. During the period 2011-2014 he worked as a research
scientist at the Laboratoire de Physique Théorique et Modèles Statistiques (LPTMS) in Orsay (France).
He has been a permanent member of the Disordered Systems group at King's College London since
September 2014. He is an Associate Editor for the Journal of Statistical Physics (Springer), for Scientific
Reports (Nature), and for Complex Systems, and has acted as Guest Editor for other guest-edited
collections, notably ‘The Physics of the Law’ (Frontiers). He leads the Quantitative and Digital Law
Lab in the Department of Mathematics at King’s College London, a team of five people supported by
a UKRI Future Leaders Fellowship that promotes a quantitative approach to issues around the
complexity of legal systems.

Paper: Information Retrieval and Structural Complexity of Legal Trees (1700 – 1735, 2nd
March 2022)

Ron van der Meyden
Professor, School of Computer Science and Engineering, University of New South Wales, Sydney

Ron van der Meyden is a Professor in the School of Computer Science and Engineering at UNSW
Sydney. He currently leads the UNSW Interest Group on Blockchain, Smart Contracts and
Cryptocurrency. His research interests include the foundations of distributed and multi-agent systems
and computer security, and his contributions have included work on deontic logic, deductive
databases, foundations of public key infrastructure, model checking and synthesis for epistemic logic,
and information flow security. He received the ACM Distinguished Scientist Award in 2009.

Ron previously held positions at the University of Technology, Sydney, the Weizmann Institute of
Science, and NTT Basic Research Laboratories, Tokyo, and has held visiting positions at NYU and
Stanford. He was a member of the teams that constructed successful bids for, and established, the
Cooperative Research Centre for Smart Internet Technology and National ICT Australia (now part
of Data61). He served in 2001 as a research program leader in the Cooperative Research Centre for
Smart Internet Technology, before establishing the Formal Methods program of National ICT
Australia, which he led from 2002 to 2006. Outcomes from this program include the verification of
the seL4 microkernel, and Red Lizard Software, a static analysis spinout which was acquired by
Synopsys.

Paper: From Contract to Smart Legal Contract - A Case Study using the Simple
Agreement for Future Equity (1450 – 1525, 4th March 2022)

Shaun Lim
Research Assistant, Centre for Technology, Robotics, Artificial Intelligence and the Law (TRAIL),
Faculty of Law, National University of Singapore

Shaun Lim is a Research Assistant with the Centre for Technology, Robotics, Artificial Intelligence &
the Law. Shaun graduated from NUS in 2018 with a Bachelor of Laws and was called to the Bar in
August 2019. Shaun has interdisciplinary research interests at the intersection of law and scientific
fields such as technology, computer science, and probability and statistics, with a particular interest
in the use of artificial intelligence in legal contexts such as judicial decision-making.

Paper: The Promises and Pitfalls of Computational Law: A Meta-Analysis of the Existing
Literature (1735 – 1810, 3rd March 2022)

Daniel Seng
Associate Professor of Law and Director, Centre for Technology, Robotics, Artificial Intelligence and
the Law (TRAIL), Faculty of Law, National University of Singapore

Daniel Seng is an Associate Professor of Law and Director of the Centre for Technology, Robotics,
AI & the Law (TRAIL) at NUS. He teaches and researches information technology and intellectual
property law. He graduated with firsts from NUS and Oxford and won the Rupert Cross Prize in
1994. His doctoral thesis at Stanford University involved the use of machine learning, natural language
processing and data analytics to analyze the effects and limits of automation on the DMCA takedown
process. Dr. Seng is a special consultant to the World Intellectual Property Organization and has
presented and published papers on differential privacy, electronic evidence, information technology,
intellectual property, artificial intelligence and machine learning at various local, regional and
international conferences. He has been a member of various Singapore government committees that
undertook legislative reforms in diverse areas such as electronic commerce, cybercrimes, digital
copyright, online content regulation and data protection.

Paper: The Promises and Pitfalls of Computational Law: A Meta-Analysis of the Existing
Literature (1735 – 1810, 3rd March 2022)

Papers and Abstracts
Papers are listed in order of conference presentation. Full papers may be found here.
Public Records Aren’t Public: Systemic Barriers to Quantifying Racism
Kat Albrecht, Kaitlyn Filip
Day 1, Session 1, Talk 1
In a new era of computational legal scholarship, computational tools exist with the capacity to quickly
and efficiently reveal hidden inequalities in the justice system. Technically, laws exist that legally entitle
the public to the requisite court records. However, the opaque bureaucracy of the courts prevents
us from connecting the public to documents they technically own. We exemplify this legal ethical
problem by investigating areas of law where codified protections against inequalities exist and where
computational tools could help us understand whether those protections are being enforced. In
general, the computational requirements of such projects needn't be complex, making them even
more attractive as solutions for auditing justice. Using the backdrop of the Cook County felony courts,
the largest criminal court in the United States, we establish the impossibility of securing the public
records needed to quantify the illegal use of racially motivated peremptory strikes, records that are
the bottleneck to the relatively simple computational process of quantifying racism.

This problem is so egregious that even the number of criminal trials in Cook County is concealed
from the public. The Administrative Office of Illinois Courts Annual Report states that 7% of
defendants went to trial, indicating that the data exist, but it does not elaborate on the actual number
of trials. This becomes crucial because of the burden placed on the public when requesting the court
transcripts that are necessary to reveal the frequency of racially motivated peremptory strikes. A
requester is required to provide the date of the hearing, the case name, the case number, the
courtroom number, and the name of the judge, for each transcript. This creates a scaling problem
when attempting to perform the type of macro-level computational analysis that can reveal systemic
patterns. The transcripts are also priced to make large-scale analysis implausible, with original
transcripts starting at $4.00 per page for documents of unspecified lengths that can easily run to
hundreds of pages. This means that a relatively simple computational project, requiring only basic
natural language processing, is stymied by court processes that turn accessing public records into an
insurmountable barrier. This article considers the ethical implications of this lack of access to public
legal complaints, examining both the limits that computational legal scholars are facing and the ways
that these scholars can provide solutions.

Parsimony - Using Feature Engineering to Critique Private Law Rules
Jeffery Atik
Day 1, Session 1, Talk 2
Parsimony, as a design principle, is the insight that fewer features may be preferable in
many contexts. Features may be redundant, they may offset the effect of other features
and they may make little effective contribution to the outcome. Parsimony restrains the
impulse to accrete complexity through the addition of features. Adding features increases
the risk of overfitting the model to the data on which it is built. Each additional feature in
a model compounds the effect of variability. And there is an economy to modeling: the
respective contributions of each feature to the overall performance of the model must
outstrip the costs to justify the feature’s inclusion. Black letter law - the law we find in
Restatements and treatises - are models of law. They specify features for legal
consideration (‘factors,’ ‘elements’); the values of those features determine the output of
a legal rule.

This paper explores the principle of parsimony as applied to legal rules. Recasting legal
rules into algorithmic form and subjecting them to testing can reveal the strengths and
weaknesses of current legal formulations. Law accretes novel tests and added
considerations, evolving rules that appear to be more precise, yet which burden the law
with undesirable complexity. Lawmakers can learn to appreciate elegance in design - a
less-than-more notion that praises the achievement of an objective in a way that draws
fewer resources. A parsimonious approach to law queries whether every feature’s
presence within a legal rule can be justified. By selectively eliminating single features from
the model of a rule and then testing the resultant we can reveal opportunities for feature
reduction without overly impairing the power of the rule. Parsimony as a general design
principle and feature elimination intended to reduce or eliminate algorithmic bias interact.
Law, old and new, rejects the introduction of certain features - such as race or gender -
into legal rules on policy grounds. Algorithmic bias analysis reveals the reintroduction of
these attributes indirectly through feature proxies. Eliminating features that function as
conduits of bias can work in parallel with the application of parsimony - although at a
certain point our legal models may become too thin to be functional.
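
For illustration, the kind of drop-one-feature test described above can be sketched in a few lines. This is an illustrative toy example, not the author's code; the rule elements, data, and classifier are all hypothetical assumptions:

```python
# Sketch: drop-one-feature ablation to probe the parsimony of a rule-as-model.
# Feature names and data are hypothetical illustrations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
features = ["reliance", "intent", "damages", "notice"]  # hypothetical rule elements
X = rng.normal(size=(500, len(features)))               # stand-in case encodings
y = (X[:, 0] + 0.8 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

baseline = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
for i, name in enumerate(features):
    X_reduced = np.delete(X, i, axis=1)  # eliminate a single feature
    score = cross_val_score(LogisticRegression(), X_reduced, y, cv=5).mean()
    print(f"without {name!r}: {score:.3f} (baseline {baseline:.3f})")
```

A feature whose removal leaves the score essentially unchanged is a candidate for elimination on parsimony grounds.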

Using OpenFisca to Power Expert Systems in the Canadian Public Day 1,
Service: Lessons Learned Session 2,
Jason Morris Talk 1

Jason Morris is the Director of Rules as Code for Service Canada, a department of the
Canadian federal government. He is participating in a project to develop a prototypical
citizen-facing application to enhance Canadians' understandings of their entitlement to
four different government benefit programs implemented under Canada's Old Age
Security Act. The project is using a "rules as code" encoding of the relevant legislation on
the OpenFisca open source microsimulation platform. The paper will discuss the selection
of OpenFisca, the style of OpenFisca encoding we have used, the challenges of integrating
Rules as Code development with teams not familiar with the methodology, and our future
outlook on the use of OpenFisca as a platform for Rules as Code in the development of
legal expert systems inside public services. The project is anticipated to be the first
deployment of OpenFisca by a national government for use in a citizen-facing legal expert
system. It is likely that it will be possible to demonstrate a pre-release version of the
software as part of the presentation.

An online version of the paper may be found here.
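
For readers unfamiliar with OpenFisca, the sketch below shows the general shape of a "rules as code" variable, following the pattern of OpenFisca's public country template; the benefit, eligibility rule, and parameter names are hypothetical illustrations, not the Service Canada encoding itself:

```python
# Sketch of an OpenFisca-style variable, after the public country-template
# pattern. The benefit, entity variables, and parameter paths are hypothetical.
from openfisca_core.model_api import MONTH, Variable
from openfisca_country_template.entities import Person

class seniors_benefit(Variable):
    value_type = float
    entity = Person
    definition_period = MONTH
    label = "Hypothetical monthly benefit for low-income seniors"

    def formula(person, period, parameters):
        age = person("age", period)
        income = person("income", period)
        p = parameters(period).benefits.seniors_benefit  # hypothetical parameters
        eligible = (age >= p.minimum_age) * (income <= p.income_ceiling)
        return eligible * p.monthly_amount
```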

How Could We Perfect the Use of Artificial Intelligence in Litigation Day 1,
Elizavetas Shesterneva Session 2,
Talk 2
The goal of the paper is to examine the ways AI can help lawyers, judges, and other legal
professionals in litigation proceedings and to determine the ethical considerations that
come with it. We have already witnessed that AI can perform a wide range of functions
– predicting court case outcomes, analyzing legal documents – with some researchers
suggesting that in the future AI will replace human judges or create a “perfect” precedent
system. For instance, the Supreme Court of Victoria in Australia has welcomed the use
of Technology Assisted Review (TAR) in situations when the manual review of the
documents would be too expensive or would take too much time.

Legal Judgment Prediction and Availability of Machine Learning in Day 1,
Turkish Jurisdiction Session 3,
Ömer Faruk Erol, Ahmet Kaplan, Gülnihal Ahter Yakacak, Muhammet Talha Kaan, Safa Nur Talk 1
Altuncu Kaan

The judiciary, one of the three main organs of the state, is administered by independent
courts, according to the Turkish Constitution. The legal system in Turkey is divided into
three branches: judicial, administrative, and constitutional courts. Each branch of the
judiciary makes a distinct kind of decision. In order to ensure judicial consistency and
systematicity, a "Legal Guide" was published in 2013 by the Department of Justice. This guide
was created to establish a minimum standard for the evidence that should be collected
by the judge and to reduce the mistakes that occur frequently in practice. Also, the
Council of State's "Decision Writing Guide" published in 2020, is an important source
that shows judges the principles and form conditions to be followed when writing a
decision. In this context, the two guides both provide the formal content in terms of the
basis and content of the decisions and reveal the common rules for decision writing.
Examination on the basis of these common rules provides a roadmap for legal prediction.

The Turkish Ministry of Justice has digitized the justice services through the National
Judicial Network Information System (UYAP). With UYAP, users and judicial bodies
exchange all kinds of information and documents electronically. However, in this system,
judicial decisions are not open access for the public. Among the courts of the Turkish
judiciary, the Constitutional Court has created a database with 14,922 precedents, the
Council of State with 85,374, and the Court of Cassation with 6,137,799. This study aims
to evaluate the accessible court decisions, together with decisions submitted by private
third parties, and to determine the technical and legal requirements for constructing a
data set appropriate for the ML technique to be employed in research conducted using
the computational legal studies methodology.

This article discusses the process of judgment in the Turkish judiciary and an alternative
systematization of the decision-making process. The alternative systematization would
help to uncover the logic behind the decisions made to date. Moreover, it could also help
to determine which methods and algorithms can help predict future judgments from
existing Turkish cases. Many researchers have used various methods, such as text
extraction, classification and clustering (Liu and Hsieh, 2006; Aletras et al., 2016; Sulea
et al., 2017), Natural Language Processing (NLP) (Kim, 2014; Baharudin et al., 2010;
Tang et al., 2015), and Multi-Task Learning (MTL) using both hard and soft parameter
sharing. This research article aims to build a model of the decision-making of the Turkish
judiciary in order to turn Turkish judicial cases into a form suitable for processing with
machine-learning techniques. The preliminary methodological discussion in this article
will be a contribution for researchers who use the existing data pools in the field of
computational legal studies. To analyze linguistic models of juristic data, BERT model
structures will be used.
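
As a hedged sketch of how such a pipeline might begin (an illustration only; the checkpoint and two-label scheme are assumptions, not the authors' choices), a multilingual BERT encoder can be prepared for fine-tuning on labelled Turkish judgments:

```python
# Sketch: preparing a BERT encoder for judgment classification.
# The checkpoint and the two-label scheme are illustrative assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "bert-base-multilingual-cased"  # a Turkish-specific BERT could be swapped in
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Encode a (hypothetical) judgment excerpt; fine-tuning on labelled cases follows.
inputs = tokenizer("Davanın reddine karar verilmiştir.", truncation=True,
                   max_length=512, return_tensors="pt")
logits = model(**inputs).logits
```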

Predicting Citations in Dutch Case Law with Natural Language Day 1,
Processing Session 3,
Iris Schepers, Martijn Wieling, Masha Medvedeva, Michel Vols, Michelle Bruijn Talk 2

This paper investigates whether it is possible to predict with a reasonable degree of
certainty how often court decisions will be cited by other courts.

With the ever-growing accessibility of case law online, it has become challenging to
manually identify case law that is relevant to one’s legal issue. An example of a country in
which this problem is becoming increasingly prevalent is the Netherlands. Between 2016
and 2020 the percentage of published judgments has more than doubled, from 3.5% in
2016 to 7.2% in 2020. This amounts to 38,000 decisions per year, ranging from lower
level courts, such as district courts, to higher courts such as the Court of Appeal and the
Supreme Court. The ambition of the Dutch council for the judiciary is to implement a
system in which 75% of all cases are published. According to scholars this will lead to
problems regarding the searchability of the data.

Hence, datasets of this size call for ways to automatically analyse data, as doing so
manually is time consuming. One such method is machine learning. Over the past few
decades machine learning techniques have been used for a variety of tasks in the field of
Artificial Intelligence & Law. In legal judgment forecasting, the outcome of a court case is
predicted from the facts of the case with the help of algorithms and natural language
processing (NLP). Research shows that the texts of legal proceedings hold valuable
information for algorithms. Research has also shown that there is a significant relation
between citations and case authority, which is the extent to which a case is deemed
important for settling other legal disputes.

Because of the Dutch ambitions to eventually publish many more decisions, and because
there is a substantial amount of currently available data, we are using Dutch data. We aim
to gain insights into the most informative features of a case, and discover if there are
certain words, phrases, or characteristics that increase a decision’s citability. As such, this
paper aims to bridge the gap between citation analysis and NLP, and solidify the relation
between the text of a case and its importance, and its citation network.

By determining predictors, it will be possible to predict the authority of a decision before
a case has even been published. As such, a prediction could be used to help label cases
based on importance, thus helping legal practitioners and scholars to find the most
important decisions regarding their legal issue more easily and reducing the time spent
on preparation and analysis drastically. Two prediction tasks are performed: a binary task
in which we predict whether or not a decision is cited by other case law at all, and a
regression task in which we predict the exact number of incoming citations. The first
experiments show promising results that indicate a link between the texts of decisions
and the incoming citations, but further research will be presented using a larger, more
up-to-date dataset.
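
To make the two tasks concrete, a minimal sketch (not the authors' pipeline; the TF-IDF features, models, and placeholder data are assumptions) might look like this:

```python
# Sketch: binary "cited at all?" classification and citation-count regression
# from judgment text. Data, features, and model choices are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, Ridge

texts = ["... tekst van de eerste uitspraak ...",
         "... tekst van de tweede uitspraak ..."]  # placeholder judgments
n_citations = [0, 12]                              # placeholder citation counts

X = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)).fit_transform(texts)
clf = LogisticRegression().fit(X, [c > 0 for c in n_citations])  # binary task
reg = Ridge().fit(X, n_citations)                                # regression task
```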

Information Retrieval and Structural Complexity of Legal Trees Day 1,
Pierpaolo Vivo, Alessia Annibale, Evan Tzanis, Luca Gamberi, Yanik-Pascal Förster Session 4,
Talk 1
We introduce a model for the retrieval of information hidden in legal texts. These are
typically organised in a hierarchical (tree) structure, which a reader interested in a given
provision needs to explore down to the “deepest” level (articles, clauses, ...). We assess
the structural complexity of legal trees by computing the mean first-passage time a
random reader takes to retrieve information planted in the leaves. The reader is assumed
to skim through the content of a legal text based on their interests/keywords, and be
drawn towards the sought information based on keywords affinity, i.e. how well the
Chapters/Section headers of the hierarchy seem to match the informational content of
the leaves. Using randomly generated keyword patterns, we investigate the effect of two
main features of the text -- the horizontal and vertical coherence -- on the searching time,
and consider ways to validate our results using real legal texts. We obtain numerical and
analytical results, the latter based on a mean-field approximation on the level of patterns,
which lead to an explicit expression for the complexity of legal trees as a function of the
structural parameters of the model. Policy implications of our results are briefly
discussed.
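
For intuition, the mean first-passage time for a given transition structure can be computed from the fundamental matrix of the absorbing random walk; the sketch below is a toy illustration on a three-node tree, not the paper's model:

```python
# Sketch: mean first-passage time to an absorbing leaf via the fundamental matrix.
# The tree and transition probabilities are a toy illustration.
import numpy as np

# States: 0 = root, 1 = section header; the target leaf is absorbing.
# Q holds transition probabilities among the transient states only.
Q = np.array([[0.0, 0.6],    # root -> header with prob 0.6; else straight to the leaf
              [0.3, 0.0]])   # header -> back to root with prob 0.3; else to the leaf
# Expected steps to absorption: t = (I - Q)^{-1} 1
t = np.linalg.solve(np.eye(2) - Q, np.ones(2))
print(f"mean first-passage time from the root: {t[0]:.2f} steps")
```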

Bridging the Gap between Machine Learning and Logical Rules in Day 2,
Computational Legal Studies Session 1,
L. Thorne McCarty Talk 1

Contemporary research on Computational Legal Studies uses two distinct
methodologies, sometimes described as “data-centric” and “logic-based” models. There
are compelling reasons to combine the two. This talk will analyze an example of both
technologies applied to a single statutory provision, and outline a theoretical synthesis of
machine learning and logical rules that bridges this gap, based on current work by the
speaker.

Unsupervised Machine Scoring of Free Response Answers (Validated Day 2,
Against Law School Final Exam Questions) Session 1,
David A. Colarusso Talk 2

This paper presents a novel method for unsupervised machine scoring of short answer
and essay question responses, relying solely on a sufficiently large set of responses to a
common prompt, absent the need for pre-labeled sample answers—given said prompt is
of a particular character. That is, for questions where “good” answers look similar,
“wrong” answers are likely to be “wrong” in different ways. Consequently, when a
collection of text embeddings for responses to a common prompt are placed in an
appropriate feature space, the centroid of their placements can stand in for a model
answer, providing a lodestar against which to measure individual responses. This paper
examines the efficacy of this method and discusses potential applications.
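
A minimal sketch of the centroid idea (illustrative only; the embedding model and answers are placeholder assumptions):

```python
# Sketch: score free responses by similarity to the centroid of their embeddings.
# The sentence-embedding model and the answers are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

answers = ["Consideration requires a bargained-for exchange ...",
           "The agreement fails for lack of consideration ...",
           "The parol evidence rule bars the testimony ..."]  # placeholder responses

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(answers, normalize_embeddings=True)
centroid = emb.mean(axis=0)      # stands in for a model answer
scores = emb @ centroid          # cosine similarity to the centroid
ranking = np.argsort(-scores)    # best-to-worst ordering of the responses
```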

Current methods for the automated scoring of short answer and essay questions are
poorly suited to spontaneous and idiosyncratic assessments. That is, the time saved in
grading must be balanced against the time required for the training of a model. This
includes tasks such as the creation of pre-labeled sample answers. This limits the utility
of machine grading for single classes working with novel assessments. The method
described here eliminates the need for the preparation of pre-labeled sample answers. It
is the author’s hope that such a method may be leveraged to reduce the time needed to
grade free response questions, promoting the increased adoption of formative
assessment especially in contexts like law school instruction which traditionally have
relied almost exclusively on summative assessments.

Ranking by the algorithm is found to be statistically significant when compared to a
pseudo-random shuffle. To determine how similar a list’s order was to that produced by
a human grader, the lowest number of neighbor swaps needed to transform the ordering
of these lists into that of the human ordering was calculated. For a dataset including more
than eight hundred student answers to a set of thirteen free response questions, drawn
from six Suffolk University Law School final exams, taught by five instructors, the p-value
for a paired t-test of the two populations’ swaps, with the pseudo-random group acting
as the untreated group and the machine-grader acting as treatment, came to 0.00001,
allowing us to reject the null hypothesis that the machine's ordering is equivalent to a
random shuffle. Additionally, the Cohen’s d for the number of swaps between the pseudo-
random ordering and machine ordering is found to be large (i.e., 0.9).
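
The "lowest number of neighbor swaps" between two orderings is the bubble-sort (Kendall tau) distance, i.e. the number of pairwise inversions; an illustrative sketch of the computation (not the paper's code):

```python
# Sketch: minimum adjacent-swap (bubble-sort) distance between an ordering
# and a reference (human) ordering of the same answers.
def neighbor_swaps(order, reference):
    pos = {item: i for i, item in enumerate(reference)}
    seq = [pos[item] for item in order]  # re-express in reference coordinates
    swaps = 0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            swaps += seq[i] > seq[j]     # each inversion costs one adjacent swap
    return swaps

print(neighbor_swaps(["a", "c", "b"], ["a", "b", "c"]))  # -> 1
```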

The Silent Influence of the Chinese Guiding Cases: A Text Reuse Day 2,
Approach Session 2,
Benjamin Chen, Elliott Ash, Zhiyu Li Talk 1

In 2011, the Chinese Supreme People’s Court officially introduced the Guiding Case
system. Selected from judgments rendered by courts nationwide, Guiding Cases address
a range of legal issues and must be referred to by all judges when deciding similar cases.
It was hoped that the Guiding Cases would unify the application of law and safeguard the
quality and integrity of adjudication.

The Guiding Case system has, however, proved controversial. Judges in a socialist legal
system are supposed to apply the law, not make it, and socialist legality does not recognize
precedent as formally binding. These theoretical commitments have incited both doctrinal
and empirical debates about the legitimacy of the Guiding Cases. On the doctrinal front,
scholars have either tried to assimilate the Guiding Cases into existing sources of law or
rejected its authority qua law. On the empirical side, researchers have documented the
lackluster impact of the Guiding Cases on actual judicial practice. Citations to the Guiding
Cases are rare and many Guiding Cases are never cited at all. The extremely low
incidence of citations has been interpreted as evidencing the futility of the Guiding Cases.

We revisit the existing literature on Guiding Cases by looking beyond citations—or overt
references—to search for echoes of their reasoning—or implicit references—in the
judgments of lower courts. To do so, we employ text reuse methods that identify
portions of these judgments that uniquely repeat the key points of adjudication from the
Guiding Cases. The results suggest that the Guiding Cases are more influential than is
commonly assumed, thereby calling into question conventional narratives about the
hostility of the Chinese legal system to case-based adjudication.
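
Text reuse detection of this kind is commonly implemented with n-gram "shingling"; the sketch below (not the authors' method; texts and shingle length are placeholders) flags a lower-court passage that repeats a long character sequence from a Guiding Case:

```python
# Sketch: detect reuse of a Guiding Case's key point via character n-gram
# shingles. The texts and the shingle length are illustrative placeholders.
def shingles(text, n=10):
    return {text[i:i + n] for i in range(len(text) - n + 1)}

guiding_point = "合同当事人恶意串通损害第三人利益的行为无效"        # placeholder key point
judgment = "……恶意串通损害第三人利益的行为无效，故本院认定……"  # placeholder excerpt

overlap = shingles(guiding_point) & shingles(judgment)
score = len(overlap) / len(shingles(guiding_point))  # fraction of shingles reused
print(f"reuse score: {score:.2f}")
```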

Lache-ing Onto Change: Object-Oriented Legal Evolution Day 2,
Megan Ma, Adam Nicholas, Avalon Campbell-Cousins, Dmitriy Podkopaev, Jerrold Soh Session 2,
Talk 2
The growth of local jurisprudence has been a subject of particular interest in Singapore
given its young history as a nation. Although rooted in English law, Singapore law has
branched out significantly – especially in recent years – to establish itself as an increasingly
independent, autochthonous system of law (Andrew Phang et al, 2020). A close,
computational linguistic study of how this phenomenon occurred (as manifested in
changes in Singapore’s case law and legislative corpora over time) is likely to yield
fascinating observations on meaning-making and the evolution of legal concepts. We
hypothesize that, by extending the object-oriented approach outlined previously, a
precise usage-based and empirically defensible analysis of the diachronic changes in
terminology and legal reasoning can be achieved. Harnessing traditional linguistic methods
alongside cutting-edge advancements in NLP allows synergy in the understanding of legal
language evolution.

As discussed in our prior research, object-oriented software design maps neatly onto
concepts in language (Megan Ma et al, 2020). The use of graph databases facilitates the
storage and analysis of such linguistic structures optimally and at scale, thereby allowing
us to test and evaluate the viability of our approach. Subsequently, this fosters the
construction of a framework for deriving meaning from text in a manner that is more
human-readable. In short, this means distilling text into a structure that can then be
traversed, executed, and eventually implemented by an end-user. Our approach contrasts
with existing language models and analytical techniques that have actively concealed these
concepts through black boxes.

As a next phase of our research, we want to consider how far our model can extract the
semantic content of judicial decisions from a larger corpus. A key focus lies on how
techniques from historical linguistics may be useful for this task. Historical linguistics is
the study of language change. By delving into the deepest roots of language, its
contributions vary widely: from a deeper understanding of the human mind and mental
processes to the reconstruction or unveiling of forgotten historic cultures. A central sub-
branch of this line of study is comparative linguistics, which concerns the genetic
interrelations and divergences between linguistic varieties at varying stages in time.
Although primarily focused on the reconstruction of ancestral speech and culture, the
comparative method (amongst other historical linguistic methods) could prove fruitful in
the analysis of legal text and its development. For instance, the quantitative study of
particular constructions has potential to reflect deeper legal and jurisdictional shifts.

In brief, through our paper we aim to offer an alternative method of unpacking and
interpreting the effects and implications of legal change. This goal may appear somewhat
ambitious given that the concept of legal change is something that has been studied
extensively in legal academia for decades. However, the novelty (and focus) of our
proposed approach lies in exploiting new computational linguistic and computational law
methodologies to shed new light on the subject. That said, because the topic is admittedly
extensive, the paper proposed above will be but a first step towards a larger
collaboration, limited to the subject area of laches and its evolutionary interpretations.

The Statute Data Swamp: NLP, Data Cleansing, Data Interpretation Day 2,
and Legislative Drafting Session 3,
Matthew Waddington Talk 1

What might happen if we treat collections of legislation, “the statute book”, as a data lake
(or swamp)? This paper takes a look, from a perspective informed by legislative drafting,
at the prospects and dead ends of trying to learn from computational analysis of large
quantities of statutory texts.

As with other areas, much will depend on the choice of the data, its quality and the
questions that we expect to be able to investigate with it. Various goals have been
suggested, including weighing regulatory burdens, detecting outdated, duplicated and
overlapping legislation and finding repeated patterns that could help legislative drafters
standardise their output. The paper considers these alongside the prospects for two
other questions: what we could learn about how legislation really works and whether
NLP could help extract the logical structure from existing statute books to facilitate
embedding new legislation drafted using “Rules as Code” into its context.

What statutory data would we want to pour into our data lake, and what data cleansing
might we apply to it? The paper looks at what “the statute book” and related terms mean,
the relationships between primary and secondary legislation and between amending and
consolidated legislation, and at the sorts of collections available. It considers the
significance of the age, and patchwork nature, of the various parts of many countries’
statute books, and the extent to which there is structured data available or that can be
readily produced. The paper uses the example of the implications for word frequency
analysis of the way that drafters have moved from using “shall” to using “must”. It also
considers how drafters believe they create rules (going back to Coode’s “legislative
sentence”) and whether analysis of bodies of legislative texts could help check how far
those beliefs are reflected in practice.

The paper reviews three recent projects analysing bodies of statutory material: the
“Open Regulation Platform” at the UK’s Department for Business, Energy and Industrial
Strategy; the “Regulatory Genome Project” of Cambridge University; and recent work
by Jason Morris in the Canadian Civil Service looking for the frequency of particular types
of expression in legislation. It particularly analyses the claims made about computational
analysis of regulations in work done by the New South Wales Treasury for their report
“Regulating for NSW’s Future”. It also reflects on the way these projects have used graphs
and networks to represent cross-references within and between items of legislation.

The paper finishes by considering the implications of work by drafters on “Common
Legislative Solutions” that came out of an interest in a “pattern language” approach to
looking at the body of legislation held by the UK National Archives. It comments on how
that work relates to the opportunities for NLP and similar approaches to be used to find
patterns in legislative wording, particularly focusing on what might or might not be useful
or misleading.

Quantifying Ambiguities in German Court Decisions – Identifying Day 2,
Problematic Terms of Legal Language Session 3,
Gregor Behnke, Niklas Wais Talk 2

Court rulings are stuck in a dilemma: On the one hand, the resulting legal documents are
a product of jurisprudence and are therefore characterized by the use of technical (legal)
language; on the other hand, the rule of law requires that they be comprehensible to
laypersons.

In order to improve comprehensibility, several scientific studies have focused on finding
and eliminating syntactic peculiarities that characterize written court decisions. While
these peculiarities undoubtedly have an effect on intelligibility, this purely syntactical
approach fails to take into account a semantic feature of legal terminology: The systematic
ambiguity caused by both the conscious and unconscious technical reuse of already
existing words.

We aim to provide a method to reveal these ambiguities and potential misunderstandings.
By extracting and comparing word vectors from different text corpora, our study
identifies problematic terms in the legal language used in German court decisions. We
use the word2vec methodology to extract a machine interpretable representation of a
word’s meaning. To allow for later comparison, we base our work on two corpora. First,
we use a general corpus of German texts by Goldhahn et al. Second, we extract more
than half a million publicly available decisions of German courts. After specific
preprocessing (removal of stop words, lemmatization, decomposition of compound
nouns tailored to the characteristics of legal language), we create word vectors for both
corpora.

In terms of subsequent evaluation, it must first be noted that a direct, cross-model
comparison of the generated word vectors is not possible. This is due to the fact that
semantic information in the word2vec scheme is captured by the positions of the words
with respect to each other in a particular model. What we can do, however, is define the
similarity of the semantics of a word between general and "legal" German, e.g. by the
number of common word vectors found close to the word of interest or by the mean or
maximum difference of similar words' vectors in both models. It is also possible to expose
differences by considering pairs of words in both models: If the two words have very
similar vectors in one model but highly different ones in the other, these words form a
contrast pair. Such pairs indicate a difference in meaning of which legal practitioners
should be aware.
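
A hedged sketch of the neighbourhood-overlap comparison (illustrative only; the model files and query word are placeholders):

```python
# Sketch: compare a word's nearest neighbours across two word2vec models,
# one trained on general German and one on court decisions. Paths are placeholders.
from gensim.models import Word2Vec

general = Word2Vec.load("general_german.model")   # hypothetical model files
legal = Word2Vec.load("court_decisions.model")

def neighbour_overlap(word, k=20):
    near_general = {w for w, _ in general.wv.most_similar(word, topn=k)}
    near_legal = {w for w, _ in legal.wv.most_similar(word, topn=k)}
    return len(near_general & near_legal) / k  # low overlap hints at a shifted meaning

print(neighbour_overlap("Besitz"))  # "possession": a classic lay/legal divergence
```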

In order to highlight these differences for legal practitioners, we use 2D embeddings of
word vectors. Given a set of words to display, we find similar (i.e. "defining") words for
each of them. We then compute a 2D embedding of the vectors using t-distributed
stochastic neighbor embedding. Our study applies this approach to contrast pairs as well
as to words with different similarity neighborhoods in both models. With the results
presented in this way, we enable legal practitioners to increase the comprehensibility of
legal texts for laypersons through a more conscious choice of words.

The Un-Modeled World Day 2,
Frank Fagan Session 4,
Talk 1
There is today a pervasive concern that humans will not be able to keep up with
accelerating technological progress in law and will become objects of sheer manipulation.
Those who believe human objectification presents a problem offer solutions that
require humans to retake control, primarily by means of self-awareness and the development
of will. This is not the only way. Technology itself offers the solution on its own terms.
Machines can only learn if they can observe patterns, and those patterns must occur in
sufficiently stable environments. Without learnable regularities and environmental
invariance, machines remain prone to error. Yet humans innovate and things change. This
means that innovation operates as a self-corrective—a built-in feature that limits the
ability of technology to fully objectify human life and law error-free. Fears of complete
technological ascendance in law and elsewhere are exaggerated.

The Promises and Pitfalls of Computational Law: A Meta-Analysis of Day 2,
the Existing Literature Session 4,
Shaun Lim, Daniel Seng Talk 2

Computational law is often written about as the next frontier of legal systems
development, promising to automate the operation of legal rules with “smart” contracts
and statutes, from which laws and justice can be administered digitally with greater
efficiency. However, this also tends to attract naysayers who point out the alleged
inflexibilities of computational law in handling the ambiguity inherent in legal language,
arguing that such inflexibility robs the law of beneficial inherent vagueness, and further
that human reaction to such inflexibility may be detrimental or lead to otherwise
unintended consequences. Part of this divide is due to an imperfect understanding of the
innovation spaces open to computational law, as well as a certain degree of imprecision
in talking about computational law and the various domains in which it can or should be
applied. Nonetheless, such a divide, howsoever arising, is not conducive to a common
understanding of the promise and limits of computational law. To provide a firm and
shared foundation upon which further analysis of the capabilities and limitations of
computational law can take place, the authors propose to perform a meta-analysis on the
existing literature on computational law to understand the positions taken by both
supporters and detractors and to identify general areas of overlap, conflict, and
consensus.

Rethinking the Field of Automatic Prediction of Court Decisions Day 2,
Masha Medvedeva, Martijn Wieling, Michel Vols Session 4,
Talk 3
In this paper, we review previous research in automatic prediction of court decisions.
We demonstrate how various studies in the field actually perform very different tasks
that we define as outcome identification, outcome-based judgement categorisation and
outcome forecasting.

Outcome identification is defined as the task of identifying the verdict within the full text
of the judgement, including (references to) the verdict itself. Given the growing body of
published case law across the world, the automation of this task may be very useful, since
many courts publish case law without any structured information (i.e. metadata) available,
other than the judgements themselves, and often one may require a database where the
judgements are connected to the verdicts in order to conduct research.

Outcome-based judgement categorisation is defined as categorising court judgements
based on their outcome by using textual or any other information published with the final
judgement, but excluding (references to) the verdict in the judgement. Since the
outcomes of such cases are published and no longer need to be ‘predicted', this task is
mainly useful for identifying predictors (facts, arguments, judges, etc) of court decisions
within the text of judgements.

Outcome forecasting is defined as determining the verdict of a court on the basis of
textual information about a court case which was available before the verdict was made
(public). This textual information can, for instance, be submissions by the parties, or
information (including judgements) provided by lower courts to predict the decisions of
a higher court, such as the US Supreme Court. While identification and categorisation
tasks only allow one to extract information and analyse already made court decisions,
forecasting allows one to predict future decisions that have not been made yet.

We discuss appropriate and inappropriate methods to approach different tasks given the
objective of each of them and examine why, while they clearly have such different
purposes, the above tasks are often confused with each other. We argue that this is likely
due to the cross-disciplinary nature of the field and to the confusion of natural language
processing terminology with concepts that people use on a daily basis.

We discuss how important it is to understand the legal data that one works with in order
to determine which task can be performed. Finally, we reflect on the needs of the legal
discipline regarding the analysis of court judgements.

Clause2Game: Modeling Contract Clauses with Composable Games Day 3,
Joshua Tan, Megan Ma, Philipp Zahn Session 1,
Talk 1
Substantial work in contract theory applies game theory to analyze the design of
contracts. This work tends to model the incentive effects of specific clauses or conditions,
e.g. a managerial compensation clause or the condition that contracts are incomplete.
Contract theory typically does not model whole contracts composed of many clauses
due to the practical limitations of hand-built models, nor does it address the institutional
diversity found in real-life contracts in the way that, for example, computational law does.
In this work, we present a software tool, Clause2Game, that extends contract theory to
whole-contract analysis as well as the results of an active implementation of the tool on
top of Lawgood, a widely-available platform for drafting contracts. The core of the tool
is a new data set of clause-level “open games” that can be composed to generate
economic models in the same way that clauses can be composed to generate contracts.
In presenting the tool and data set, we emphasize a systems-theoretic, building-blocks
approach to contract analysis and design, and discuss the opportunities and obstacles to
introducing more ideas from systems theory to both contract theory and the practice of
contract law.

Computational Corpus Linguistics Day 3,
Jonathan H. Choi Session 1,
Talk 2
Scholars and judges increasingly interpret words in legal text by studying their use in real-
world documents, a method known as “corpus linguistics.” But the traditional approach
to corpus linguistics encounters several problems. Traditional corpus linguistics focuses
on word frequencies at the expense of subtler linguistic cues. It also recommends no
single method and presents no clear dividing line between correct and incorrect textual
meanings. This lets interpreters cherry-pick the method that supports their favored
meaning, makes the process of corpus linguistics more burdensome and more subjective,
and can yield contradictory results depending on the method chosen.

This Article proposes a novel computational approach to corpus linguistics. It applies
tools from the burgeoning literature on machine learning and natural language processing
to algorithmically evaluate word meaning. By measuring the semantic similarity between
words, we can answer questions of legal interpretation—for example, by testing whether
cigarette is similar to device, and therefore whether the FDA should have jurisdiction to
regulate cigarettes. Computational approaches produce quantitative, replicable estimates
of semantic similarity in ways that reflect the intuitive semantic relationships between
words. This Article develops a method to translate these quantitative estimates into
qualitative interpretations by benchmarking against a known scale of word similarity,
based on H.L.A. Hart’s famous “vehicles in the park” hypothetical. It also suggests
methods that intuitively explain the differences between words and extract the particular
aspects of word meaning most relevant to legal interpretation.
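
The core quantitative step - asking whether cigarette is semantically close to device - reduces to a similarity query over word embeddings; a hedged sketch (with an assumed pretrained vector set, not the Article's exact method):

```python
# Sketch: quantify the semantic similarity of disputed statutory terms with
# pretrained word vectors. The vector set is an illustrative choice.
import gensim.downloader

vectors = gensim.downloader.load("glove-wiki-gigaword-100")  # pretrained embeddings
print(vectors.similarity("cigarette", "device"))  # the disputed pairing
print(vectors.similarity("car", "vehicle"))       # a benchmark pair, Hart-style
```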

Applying computational corpus linguistics, this Article finds that semantic questions in
statutory interpretation generally lack clear answers in real-world cases. The traditional
corpus linguistics literature treats most cases of statutory interpretation as semantically
determinate, meaning they can be resolved through inquiry into word meaning alone. In
contrast, this Article finds that most statutory disputes fall within a “zone of
indeterminacy” where other evidence of meaning should also be considered, like
legislative history or canons of construction.

Sharing and Caring: Creating a Culture of Constructive Criticism in Day 3,
Computational Legal Studies Session 2,
Dirk Hartung, Corinna Coupette Talk 1

Code and data unavailable, available upon reasonable request, or from dead links only.
Little, if any, documentation of underlying assumptions or judgment calls. Lack of
sensitivity analyses, robustness checks, or ablation studies. Limited peer review, or peers
impressed by figures showing results produced by algorithms they do not understand, on
data whose provenance is unclear. Referenced sources behind paywalls or not indexed
by common search engines at all. The list of deficiencies affecting published papers in our
field goes on. How come?

The answer is simple, yet unsettling: The field of computational legal studies is hard.
Things can go wrong. Misspecified models, dirty data, buggy code. No individual
researcher is perfect, but as a community, we can strive to identify our mistakes, correct
them, and learn from them for the future. We can get better, individually and collectively,
and we can make progress. This, however, requires scientific hygiene routines that have
yet to be established. As our research develops at the intersection of law and computer
science, and articles using computational methods make their way into mainstream legal
research, we can no longer ignore the striking mismatch between the publication
procedures familiar from doctrinal scholarship and empirical legal studies on the one
hand, and the requirements of robust, reproducible computational legal research on the
other.

In this article, we argue that for computational legal studies to advance as a community,
the field needs a publication culture designed to meet its unique challenges. We find the
building blocks for such a culture in our parent disciplines. From computer science, we
can adopt the requirements of data availability, code availability, honest assessments of
the methodological and interpretive limitations of our research, and transparent,
constructive criticism of our own work and the work of others. Legal publication culture
offers other advantages: Less driven by conference deadlines and less overwhelmed by
mass peer-review, legal scholars can make time to focus on big ideas, rather than merely
pushing for incremental improvements. Hence, combining the best of both our worlds
can help us keep our studies both scientifically rigorous and comprehensible for a
heterogeneous audience comprising both computer scientists and lawyers.

We scrutinize the current state of best practices, peer review, and journal policies in our
field, and suggest a set of foundational principles on which we might build the publication
culture of computational legal studies. To operationalize these principles, we propose a
protocol for the quality control of computational legal studies. This protocol may serve
as a checklist for authors, reviewers, and readers alike, and we demonstrate its usefulness
in an application to some of our recent work. We further introduce ideas for fostering a
transparent review culture, and share our vision for a division of labor between
publication venues. By presenting the pain points of our daily research experience along
with potential solutions, we hope to contribute to the development of a community that
shares and cares, creating a culture of constructive criticism in computational legal
studies.

Fait Accompli: Predicting the Outcomes of Investment Treaty Day 3,
Negotiations Session 2,
Malcolm Langford, Runar Hilleren Lie Talk 2

Empirical research has indicated that states with more economic power are more
successful in bilateral negotiations on investment treaty law. Using computational
methods, Alschner and Skougarevskiy (2016) find powerful states’ negotiated treaty texts
are more internally coherent (similar to each other); and with quantitative methods, Allee
and Peinhardt (2014) find that capital exporting states use their bargaining power to
create stronger and more enforceable investor protections. However, Berge’s (2021)
qualitative analysis of negotiations suggests that this dominance may be a function of both
economic power and bureaucratic competence. This study takes these findings a step
further and asks: Is this dominance so strong that it is possible to predict in advance the
textual outcome of negotiations, without any knowledge of the actual negotiations?

Using a comprehensive temporal network of all clauses, mapped through to their original
drafters, we present a computational model that predicts clause-level negotiation
outcomes with an accuracy of up to 96%. It is based on a dataset of over 3000 treaties,
4000 clauses, and a marking of which party obtains “their” clause in any given agreement.
The model, which is trained on treaties from 1950 to 2015, is largely able to predict
textual outcomes that match the real-world treaties of 2016-2020, using only the names
of the two state parties and the clause type as input. The model shows that
treaty negotiations are even more predictable than previously assumed, and that power
dynamics largely pre-determine negotiation outcomes. With these empirical results we
review and test how current theories and research on negotiations fit with the highly
predictable nature of international investment treaty law.
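
A hedged sketch of the prediction setup (not the authors' model; the rows, labels, and classifier are placeholder assumptions): given the two state parties and a clause type, predict whose clause prevails:

```python
# Sketch: clause-level outcome prediction from party names and clause type.
# The example rows, labels, and model choice are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

rows = [{"party_a": "Germany", "party_b": "Pakistan", "clause": "expropriation"},
        {"party_a": "USA", "party_b": "Ecuador", "clause": "dispute_settlement"}]
won_by_a = [1, 0]  # placeholder labels: did party A obtain "their" clause?

X = DictVectorizer().fit_transform(rows)  # one-hot encode parties and clause type
model = RandomForestClassifier().fit(X, won_by_a)
```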

From Contract to Smart Legal Contract - A Case Study using the Day 3,
Simple Agreement for Future Equity Session 3,
Ron van der Meyden, Michael J. Maher Talk 1

The talk will give an overview of a case study on the development of smart contracts for
Y Combinator's Simple Agreements for Future Equity (SAFE), a type of legal contract
used in financing startups. SAFE contracts promise a seed investor that, in exchange for
their investment, preferred shares will be issued at the time of a future equity round,
according to a formula that mixes properties of debt and equity.

The precision required in developing smart contract code for these contracts has raised
a significant number of questions of financial analysis, game theory, legal formalisation,
software architecture, and formal verification.

Prima facie, the formula for conversion of a SAFE to shares is straightforward. Naively
applied, however, it has the effect of directly diluting the equity round investor as a
consequence of the equity round. The sophisticated equity round investor responds by
varying the conversion formula by decreasing their pre-money valuation, creating tension
with the SAFE investor that may be resolved using yet another approach to the
conversion calculation.

There is a further complication that the SAFE creates a circularity in the notion of "pre-
money valuation", rendering this term "open-textured" in the sense of having an indefinite
meaning in unexpected situations. In [1], we conduct a financial analysis of the variety of
ways that an issuance of shares can be calculated in the face of these complexities.
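
To see the circularity concretely: the round's price per share depends on the pre-money share count, which includes the SAFE's conversion shares, which in turn depend on the price. The toy fixed-point sketch below is an editor-style illustration only; the numbers and the discount-based conversion convention are assumptions, not drawn from the cited papers:

```python
# Toy sketch of the pre-money circularity in SAFE conversion: the round price
# fixes the SAFE's shares, which dilute the price. All numbers are illustrative.
pre_money = 10_000_000        # equity round pre-money valuation, in dollars
founder_shares = 1_000_000
safe_investment = 500_000
discount = 0.8                # SAFE converts at a 20% discount to the round price

price = pre_money / founder_shares            # first guess, ignoring the SAFE
for _ in range(50):                           # iterate to the fixed point
    safe_shares = safe_investment / (discount * price)
    price = pre_money / (founder_shares + safe_shares)

print(f"fixed-point price per share: {price:.3f}")  # converges to ~9.375
```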

There is more indefiniteness in the language of the contract, in expressions such as "bona
fide transaction or sequence of transactions". We argue that this text is in fact necessary
to protect the SAFE investor against a collusive "attack" by the company and equity
round investor. Paper [3] discusses this and other problems of indefiniteness in the SAFE,
and how it bears on the form of a smart contract for the SAFE contract. One conclusion
is that a smart contract on its own does not suffice to protect the interests of the parties,
but that an associated legal contract is necessary. Our approach allows for a spectrum of
ways to manage indefinite terms in the source legal text, spanning from refinement to a
rigid formalization to the expression of a precise process by which human agents will
resolve indefiniteness during the performance of the contract.

A game theoretic analysis also proves to be necessary to understand how to implement
liquidity events (e.g., a take-over of the company). The SAFE gives the investor an option
to either "cash out" and recover their initial principal, or to convert the contract to shares
and receive the value of the shares. Although some versions state that the investor
receives the maximum payout of the two options, the situation is actually game theoretic,
with the payout depending on the choices of other investors. Paper [2] characterizes the
Nash equilibria of this game.

Taken together, the analysis we have conducted on these issues illustrates how the
process of formalization can drive the development of a significantly deeper
understanding of a contract itself. This case study is also informative on the requirements
for smart contract languages and tooling for smart contract developers.

The talk will be based on the following papers, available at
http://www.cse.unsw.edu.au/~meyden/research/projects/SAFE.html

[1] Simple Agreements for Future Equity -- Not So Simple?, Ron van der Meyden and
Michael J. Maher
[2] A Game Theoretic Analysis of Liquidity Events in Convertible Instruments, Ron van
der Meyden
[3] Can SAFE contracts be smart?, Ron van der Meyden and Michael J. Maher

“Legal Big Data”: From Predictive Justice to Personalised Law? Day 3,
Andrea Stazi Session 3,
Talk 2
The phenomenon of Big Data intersects with comparative law and justice in several
noteworthy respects. First, the comparative approach helps identify the peculiar
characteristics of data by framing them conceptually from the perspective
of other disciplines, in particular economics and information technology.
Then, in view of the different legal issues posed by Big Data, comparative law can help
develop and provide data management and analysis services across national borders.
Finally, the application of data analysis methods to legal issues can give rise to "Legal Big
Data" through which it might be possible to observe evolutionary patterns and paths of
law, foresee or adopt jurisprudential decisions, and develop and apply laws or regulations
based on solid argumentative and comparative elements.

About the Centre
The Centre for Computational Law was founded in early 2020. It is led by Centre Director and Assistant
Professor of Law and Computer Science Lim How Khang. AP Lim also directs the School’s Computing
and Law degree programme. True to our research roots, our physical premises are located in the
basement of the SMU School of Law, alongside the School’s other research Centres.
CCLAW’s flagship and inaugural project is the Research Programme in Computational Law (“RPCL”),
whose primary aim is to “study and develop open source technologies for ‘smart’ contracts and ‘smart’
statutes, starting with the design and implementation of a domain-specific programming language (DSL)
that allows for laws, rules and agreements to be expressed in code”.1 That is to say, “computational
law” in the formal, Love and Genesereth (2008) sense. The RPCL is supported by a S$15.2 million
grant from the National Research Foundation of Singapore, and led by Principal Investigator Wong
Meng Weng as well as Industry Director Alexis Chun who together are co-founders of Legalese.com.

CCLAW team members at our Kwa Geok Choo Law Library.


From left: CCLAW Director Lim How Khang, RPCL Co-Investigator Lau Kwan Ho, Industry Director
Alexis Chun, YPHSL Dean Goh Yihan, RPCL Principal Investigator Wong Meng Weng

Beyond the RPCL, the Centre is also pursuing and developing projects in other areas at the
intersection of law and technology. No prizes, of course, for guessing that computational legal studies
– and the related fields of natural legal language processing and legal AI – fall within these areas. Other
developing projects relate to legal technology adoption and applications.
We are open to collaboration with the international research and legal community and warmly invite
conference participants to reach out to us on these fronts.

1
SMU and National Research Foundation Joint Press Release (2020), Available at
https://www.nrf.gov.sg/docs/default-source/modules/pressrelease/smu-awarded-15-million-grant-for-
computational-law-research.pdf.

About the Law School2
The Yong Pung How School of Law (“YPHSL”) at Singapore Management University (“SMU”) was
founded as the SMU School of Law in 2007. It is Singapore’s second oldest (and second youngest) law
school. In 2017, we moved into our present campus adjacent to the historic Fort Canning Park. In
2021, we were officially renamed as the Yong Pung How School of Law, in honour of former Chief
Justice Dr Yong Pung How, who was an important benefactor to the school, having served as the
founding Chairman of our advisory board.

Located at the periphery of Singapore’s central business and civic districts, we are within walking
distance of key legal institutions such as the Supreme Court, Parliament, Law Ministry, and Academy
of Law. We maintain close partnerships with government and industry that, in turn, inform our
research and teaching.

Aerial view of the YPHSL campus

Source: Wikimedia Commons (User: Manderiko) CC BY-SA 4.0

Under the leadership of our present Dean Professor Goh Yihan (the youngest law dean in Singapore’s
history), law and technology was made one of our core focus areas. In 2018, the Centre for AI and
Data Governance was formed with a generous grant from the Singapore National Research
Foundation. In 2019, we launched a brand new Computing and Law degree, an interdisciplinary
undergraduate course which sees students reading core computer science modules alongside legal
classics such as torts, crime, and contracts. This year, we will also see our first intake for a new PhD
programme in Law, Commerce, and Technology. More broadly, we have been angling our curriculum
towards preparing our students for a more technological future.

See our website for more information on our other research and pedagogical efforts, international
events and collaborations, job openings, visiting academic programmes, and more.

2
Please note that this is a broad introduction to the school written by the conference committee and does not
represent the school’s official views. See the school website for those.
