PhD thesis
Analysis of Environmental Treaty Design: A Data Science Approach
Hard copies can be found at the Cambridge University Library and Seeley Library
Download an electronic copy from the Cambridge University repository: https://doi.org/10.17863/CAM.104673
PhD Thesis Reproducibility Package (with docker/podman image): https://doi.org/10.5281/zenodo.10078710
Gitlab repository: https://gitlab.com/martinakunz/phd
For spin-offs, see the Code & data section below
Abstract
There are hundreds if not thousands of international agreements governing all sorts of environmental problems, from endangered species and pollution to stratospheric ozone depletion and climate change. Analysing and describing the provisions of all these treaties using the traditional 'reading and writing' approach has become all but impossible. The main proposals for solving this epistemic challenge involve either time-consuming manual approaches to building datasets, or use statistical natural language processing (NLP) for a different kind of content analysis. This thesis proposes an intermediate approach, leveraging rule-based NLP for dataset construction and employing statistics and machine learning only for downstream analysis. Traditional legal research can thus be supported and complemented while taking advantage of data science and automation. The approach is developed with a set of about 120 open multilateral environmental agreements and about 50 treaty design variables. Regular expression pattern matching is found to be well suited for accurate and precise extraction of information from common treaty provisions such as those on entry into force, amendment, supplementary agreements, treaty organs, withdrawal, termination and dispute settlement. Implementation-related provisions, including national reporting, international verification of compliance, treaty progress review, non-compliance procedures and sanctions are more difficult to capture and compare across treaties, but this difficulty itself is of interest for the analysis of treaty design. The variables, their distribution and associations are described and the speed of entry into force is predicted using various techniques including linear regression and neural networks.
Regarding the larger epistemic challenge, the scalability of the approach is assessed and limitations of existing treaty databases and research practices are identified. Drawing from achievements of the bioinformatics and linked open data communities, I argue that a collaborative, incrementally expanding database, or findable, accessible, interoperable and reusable (FAIR) datasets would make the approach scalable. This relies on a standardised vocabulary or formal ontology for data integration. Accordingly, the thesis builds a proof-of-concept Public International Law Ontology and an NLP pipeline to populate the ontology with data gathered from treaty texts and participation records. Output formats and interfaces are designed for wide accessibility, without requiring programming skills. All software and data accompanying this thesis are available under a free and open source licence.
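As a rough illustration of the rule-based extraction described in the abstract, the following Python sketch uses a regular expression to pull two design variables out of an entry-into-force clause. It is not taken from the thesis code: the pattern, the `ORDINALS` lookup, the function name `extract_entry_into_force` and the sample clause are all hypothetical, written in the style of a typical final clause, and a real pipeline would need to handle many more drafting variants.

```python
import re

# Hypothetical lookup of ordinal words to integers (assumption: only a few
# common values are listed; real treaty texts use many more forms).
ORDINALS = {
    "tenth": 10, "twentieth": 20, "thirtieth": 30,
    "fiftieth": 50, "sixtieth": 60, "ninetieth": 90,
}

# Illustrative pattern for clauses of the form
# "... enter into force on the Nth day after the date of deposit of the Mth instrument ..."
EIF_PATTERN = re.compile(
    r"enter\s+into\s+force\s+on\s+the\s+(?P<delay>\w+)\s+day\s+"
    r"(?:after|following)\s+the\s+date\s+of\s+(?:the\s+)?deposit\s+of\s+the\s+"
    r"(?P<threshold>\w+)\s+instrument",
    re.IGNORECASE,
)


def extract_entry_into_force(text):
    """Return the delay (days) and ratification threshold, or None if no match.

    Ordinal words missing from ORDINALS are reported as None rather than guessed.
    """
    match = EIF_PATTERN.search(text)
    if match is None:
        return None
    return {
        "delay_days": ORDINALS.get(match.group("delay").lower()),
        "ratification_threshold": ORDINALS.get(match.group("threshold").lower()),
    }


# Made-up sample clause for illustration only (not quoted from a specific treaty).
SAMPLE_CLAUSE = (
    "This Convention shall enter into force on the ninetieth day after the date "
    "of deposit of the fiftieth instrument of ratification, acceptance, approval "
    "or accession."
)

if __name__ == "__main__":
    print(extract_entry_into_force(SAMPLE_CLAUSE))
```

Returning `None` for unmatched clauses or unknown ordinals, instead of a default value, keeps gaps visible so that they can be reviewed manually before any downstream statistical analysis.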
Academic journal articles and book chapters
Artificial Intelligence and Robotization
in Robin Geiß and Nils Melzer (eds.), The Oxford Handbook on the International Law of Global Security (Oxford University Press 2021), pp. 624-640 (with S. Ó hÉigeartaigh)
An updated overview of some of the primary sources discussed in this publication can be found on my research website at https://globalaigov.org (work in progress).
Abstract: This chapter provides an overview of international law governing the applications of artificial intelligence (AI) and robotics that affect global security, highlighting challenges arising from technological developments and how international regulators are responding to them. Much of the international law literature thus far has focused on the implications of increasingly autonomous weapons systems. The chapter seeks to cover a broader range of global security risks resulting from large-scale diffuse or concentrated, gradual or sudden, direct or indirect, intentional or unintentional, AI- or robotics-caused harm. Applications of these technologies permeate almost every domain of human activity and thus unsurprisingly have an equally wide range of risk profiles, from a discriminatory algorithmic decision causing financial distress to an AI-sparked nuclear war collapsing global civilization. Hence it is only natural that much of the international regulatory activity takes place in domain-specific fora. Many of these fora coordinate with each other, both within and beyond the United Nations system, spreading insights and best practices on how to deal with common concerns such as cybersecurity, monitoring, and reliability, so as to prevent accidents and misuse.
Environmental Approaches to Nuclear Weapons
in Gro Nystuen, Stuart Casey-Maslen and Annie Golden Bersagel (eds.), Nuclear Weapons under International Law (Cambridge University Press 2014), pp. 269-291 (with J.E. Viñuales)
Principle 11: Environmental Legislation
in Jorge E. Viñuales (ed.), The Rio Declaration on Environment and Development: A Commentary (Oxford University Press 2015), pp. 311-324
Abstract: This chapter analyses the rationale, travaux préparatoires, conceptual scope, and normative and jurisprudential influence of Principle 11 of the Rio Declaration. It then discusses different approaches to the coordination of domestic environmental legislation and regulations.
Ten-Year Assessment of the 100 Priority Questions for Global Biodiversity Conservation
Conservation Biology, 2018, 32: 1457-1463 (with T. Jucker et al.) open access
Abstract: In 2008, a group of conservation scientists compiled a list of 100 priority questions for the conservation of the world's biodiversity. However, now almost a decade later, no one has yet published a study gauging how much progress has been made in addressing these 100 high-priority questions in the peer-reviewed literature. We took a first step toward reexamining the 100 questions to identify key knowledge gaps that remain. Through a combination of a questionnaire and a literature review, we evaluated each question on the basis of 2 criteria: relevance and effort. We defined highly relevant questions as those that – if answered – would have the greatest impact on global biodiversity conservation and quantified effort based on the number of review publications addressing a particular question, which we used as a proxy for research effort. Using this approach, we identified a set of questions that, despite being perceived as highly relevant, have been the focus of relatively few review publications over the past 10 years. These questions covered a broad range of topics but predominantly tackled 3 major themes: conservation and management of freshwater ecosystems, role of societal structures in shaping interactions between people and the environment, and impacts of conservation interventions. We believe these questions represent important knowledge gaps that have received insufficient attention and may need to be prioritized in future research.
Implementation and Impact of Anti-Smoking Interventions in Three Prisons in the Absence of Appropriate Legislation
Preventive Medicine, 2012, 55(5):475-481 (with J-F Etter et al.)
This publication resulted from a part-time job during my undergraduate studies that introduced me to the joys and challenges of empirical research.
Abstract (excerpt):
Methods
A before–after intervention study in A) an open prison for sentenced prisoners, B) a closed prison for sentenced prisoners, and C) a prison for pretrial detainees. Prisoners and staff were surveyed before (2009, n = 417) and after (2010–2011, n = 228) the interventions. Medical staff were trained to address tobacco dependence systematically in prisoners. In prison A, a partial smoking ban was extended. No additional protection against second-hand smoke was feasible in prisons B and C.
Code & data
Data collection tools
AustLII scraper
Downloads treaty texts published by the Australasian Legal Information Institute
IEAdb scraper
Gathers texts from the University of Oregon's International Environmental Agreements db
Data integration tools
Information extraction & ontology population apps
Data visualization
Move the round time slider to see how treaty participation evolved over the years
Click on East or West to rotate the globe
Zoom in or out with the mouse or touch
Hover over countries to see their signature, consent to be bound, and entry into force date of the treaty
Other
I have also published a treaty references project for generating bibliographic databases and styled references from UNTS data that I used for my thesis.
Other code and data outputs of my PhD, such as the machine learning analysis and automated treaty report sample, are currently only in the PhD code repository (and reproducibility package) linked in the introduction. I am working on an online presentation of these components in addition to more traditional academic publications.
Ultimately the aim is to provide a free and open source toolkit for reproducible data science in the area of international law that is robust, secure, efficient and user-friendly.
Pursuing this goal also involves trying to convince database maintainers to publish their data in a more accessible, machine-readable format, e.g. with better bulk export options or application programming interfaces (APIs).
If you are interested in collaborating, advising or otherwise supporting this effort, please get in touch.