Masterclass: Text-as-Data and Network Analysis for Lawyers

Level - Target audience

Open to PhD students doing research within the broad field of Law, Humanities and Social and Behavioural Sciences


Prof. Dirk Heirbaut
Faculty of Law and Criminology - Department: Ghent Legal History Institute.

Organising & scientific committee

Organising committee

Prof. Frederik Dhondt (VUB, UAntwerpen, UGent)

Prof. Diederik Bruloot (UGent) 

Prof. Jan Dumolyn (UGent)

Prof. Christophe Verbruggen (UGent)

Scientific committee

Prof. Tom Ruys (UGent)

Prof. Eva Brems (UGent)

Dr. Annemieke Romein (UGent, HuC - Huygens ING Amsterdam)

Dr. Frederik Peeraer (UAntwerpen/UGent)

Dr. Matthias Van der Haegen (University of Maastricht)


Advances in technology are revolutionising the legal discipline. This masterclass teaches legal scholars basic computational tools for the empirical analysis of law: text-as-data and network analysis. These include relatively low-tech but highly effective information retrieval techniques, such as regular expressions (regex), which make it possible to extract patterns from legal texts. In addition, the class covers network analysis, which helps understand the law’s network structures and effects.


Topic 1: Introduction to theory and practice of computational legal studies
The first part will cover the theoretical and practical (i.e. data acquisition, storage, and publication) underpinnings of computational legal studies.
Topic 2: Text-as-data analysis for lawyers
Regular expressions and other natural language processing techniques look for patterns in text on a (semi-)automatic basis. Document segmentation and information retrieval techniques make it possible, for example, to segment contracts, treaties or cases into their constituent parts (such as articles) and to extract citations.
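As a rough illustration of the segmentation and citation-extraction idea, here is a minimal sketch. The course itself works in R; Python is used here only for illustration. The sample texts, the "Article <number>" heading format and the "Case <number>/<year>" citation format are all invented assumptions, and real legal texts will need patterns adapted to their own conventions.

```python
import re

# Invented sample text (an assumption for illustration, not a real treaty).
treaty = """Article 1
Each Party shall accord fair treatment to investors.
Article 2
Disputes shall be settled by arbitration.
Article 3
This Agreement enters into force upon signature."""

# Segmentation: capture each article number and the text that follows it,
# up to the next "Article <number>" heading or the end of the document.
article_pattern = re.compile(
    r"^Article (\d+)\n(.*?)(?=^Article \d+|\Z)",
    flags=re.MULTILINE | re.DOTALL,
)
articles = {
    int(number): body.strip()
    for number, body in article_pattern.findall(treaty)
}

# Invented judgment excerpt using a hypothetical "Case <number>/<year>" format.
judgment = (
    "As held in Case 12/2001 and confirmed in Case 7/2005, "
    "the obligation is one of result; see also Case 12/2001."
)

# Information retrieval: pull out every citation that matches the pattern.
citations = re.findall(r"Case (\d+/\d{4})", judgment)

for number, body in articles.items():
    print(number, "->", body)
print(citations)
```

The same two patterns could be written almost unchanged in R, since the regex syntax used here is widely shared across languages.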
Topic 3: Network analysis (NAn) for lawyers
Law is full of networks: statutes that refer to each other, cases that cite prior cases, contracts that connect contractors, and so on. NAn helps understand the law’s network structures and effects. It comes with a toolkit for investigating network structures using metrics (e.g. degree centrality). NAn is scalable, makes it possible to analyse large amounts of information quickly, and visualises legal structures in an appealing way.
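To make the idea of a network metric concrete, the following minimal sketch computes degree centrality for a tiny citation network. The course itself works in R; Python is used here only for illustration. The case names and citation links are invented, and degree centrality is taken here as the number of links a case has divided by the number of other cases in the network, which is one common normalisation.

```python
from collections import Counter

# Invented citation links (citing case, cited case); purely illustrative.
citation_links = [
    ("Case C", "Case A"),
    ("Case C", "Case B"),
    ("Case B", "Case A"),
    ("Case D", "Case A"),
]

# Every case that appears on either side of a link is a node in the network.
cases = sorted({case for link in citation_links for case in link})

# In-degree: how often each case is cited by others.
times_cited = Counter(cited for _, cited in citation_links)

# Total degree: links in either direction (citing or being cited).
degree = Counter()
for citing, cited in citation_links:
    degree[citing] += 1
    degree[cited] += 1

# Normalise by the number of other cases, so values fall between 0 and 1.
other_cases = len(cases) - 1
degree_centrality = {case: degree[case] / other_cases for case in cases}

most_cited_case, citation_count = times_cited.most_common(1)[0]
print(most_cited_case, citation_count)
print(degree_centrality)
```

In this toy network "Case A" is cited by every other case, so it has both the highest in-degree and the highest degree centrality.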


Learning outcomes part 1:
Besides the theoretical foundations of computational legal studies, participants will learn where to find legal data, how to store it, and how to load it into R. Attention will be given to best practices for publishing data resulting from empirical research.
Learning outcomes part 2: Participants will learn
1. What regular expressions (regex) are
2. How to write regex and integrate it into R code
3. How to use regex for text segmentation and information retrieval
4. The basics of term-frequency representations of text
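A term-frequency representation, mentioned in the last item above, can be sketched as follows. The course itself works in R; Python is used here only for illustration. The two short documents are invented, and the tokeniser (lower-casing and keeping runs of the letters a to z) is a deliberately simple assumption that ignores stemming, stop words and non-English characters.

```python
import re
from collections import Counter

# Invented mini-corpus; the document names and texts are illustrative only.
documents = {
    "treaty_a": "The Parties shall protect investments. Investments shall be protected.",
    "treaty_b": "The Parties shall promote trade.",
}

def tokenize(text):
    """Lower-case the text and return its words (runs of letters a-z)."""
    return re.findall(r"[a-z]+", text.lower())

# Count how often each term occurs in each document.
term_frequencies = {
    name: Counter(tokenize(text)) for name, text in documents.items()
}

# Shared vocabulary: every term that occurs in at least one document.
vocabulary = sorted(set().union(*term_frequencies.values()))

# Document-term matrix: one row per document, one column per vocabulary term.
document_term_matrix = {
    name: [counts[term] for term in vocabulary]
    for name, counts in term_frequencies.items()
}

print(vocabulary)
for name, row in document_term_matrix.items():
    print(name, row)
```

Each document becomes a vector of counts over the same vocabulary, which is the form that similarity measures typically operate on.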
Learning outcomes part 3: Participants will learn how to
1. Use regexes to find citations
2. Create a citation list and find most cited cases
3. Visualise networks and apply network measures
After completion of the course, participants will be reasonably comfortable with using R and should be able to apply text-as-data and network analysis to the benefit of their own research.


Name: Wolfgang Alschner - Affiliation: University of Ottawa
Contact details:
Wolfgang Alschner is an empirical legal scholar specialized in international economic law and the computational analysis of law. He holds a PhD in International Law and a Master in International Affairs from the Graduate Institute (IHEID, Geneva), a Master of Laws from Stanford Law School, an LLB from the University of London and a BA in International Relations from the University of Dresden. Prior to joining the University of Ottawa, Prof. Alschner worked for several years for UNCTAD’s Section on International Investment Agreements and as a research fellow at both IHEID and the WTI in Bern. His research focuses on using social and computer science methods in order to empirically investigate international law.


14 - 16 June, 2021 

Programme details

  • Monday, 14 June, 2021

14.00-17.30: Introduction to computational legal studies


1) Alschner, Wolfgang. “The Computational Analysis of International Law”, in: Rossana Deplano and Nicholas Tsagourias (eds.), Research Methods in International Law: A Handbook, 2019.
2) Šadl, Urška, and Henrik Palmer Olsen. “Can Quantitative Methods Complement Doctrinal Legal Studies? Using Citation Network and Corpus Linguistic Analysis to Understand International Courts.” Leiden Journal of International Law 30, no. 2 (June 2017): 327–49.

  • Tuesday, 15 June, 2021

14.00-17.30: Text-as-data Analysis


1) Grimmer, Justin, and Brandon M. Stewart. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis, 2013.
2) Baturo, Alexander, Niheer Dasandi, and Slava J. Mikhaylov. “Understanding State Preferences with Text as Data: Introducing the UN General Debate Corpus.” Research & Politics 4, no. 2 (April 1, 2017): 2053168017712821.

  • Wednesday, 16 June, 2021

14.00-17.30: Similarity and Network Analysis


1) Alschner, Wolfgang, and Dmitriy Skougarevskiy. “Mapping the Universe of International Investment Agreements.” Journal of International Economic Law 19, no. 3 (2016).
2) Alschner, Wolfgang, and Damien Charlotin. “The Growing Complexity of the International Court of Justice’s Self-Citation Network.” European Journal of International Law 29, no. 1 (2018): 83–112.

  • Follow-up Session (TBD): Student Presentations

Registration fee

Free of charge.


Register here.

Teaching materials

Participants are asked to complete the free online courses on, in particular classes 1 and 2.
• Grimmer, Justin, and Brandon M. Stewart. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis, 2013.

In addition, the preliminary reading list can be found in the programme.

Evaluation criteria (doctoral training programme)

100% attendance and active participation. An academic pitch and the submission of a written research project (or an outline thereof) before the start of the course are optional but recommended.

Number of participants

The course is limited to 15 participants.