“Peace and security” cannot be taken for granted; they require communication. That’s why the diplomats of the Security Council confer almost daily. All meetings are documented, and their transcripts are available to the public. They constitute a treasure trove of data for scholars. “Two years ago, the proceedings of UNSC meetings were made available for the first time as a machine-readable text corpus,” says Stede. “That made it possible to study them with computational linguistic methods.” The idea of analyzing the extensive English-language material was born out of Stede’s long-standing collaboration with political scientist Dr. Ronny Patz. The aim is to make the prepared data easier for political scientists to process. “We want to make the tedious process of reading PDF files easier,” he says. The Potsdam researchers are also cooperating with two colleagues from the University of Dundee in Scotland.
Refined annotations
For computational linguistics, Prof. Stede admits, “There have hardly been any studies on diplomatic language so far.” Conflicts as such are not new territory in his field: In other text types, scholars can identify unambiguous expressions of opinion and situations of dispute quite reliably. But the polite language of diplomacy makes machine analysis difficult. According to Stede, drawing on existing conflict research tools is not enough. The researchers therefore need to develop an extended system for their text annotations that makes machine processing of diplomatic language possible. That is what they are currently working on. Using many examples and counter-examples, they are writing detailed guidelines on which words or phrases indicate conflict. They have to pay equal attention to a nuanced “we should” and a clear “we are against.” Once the guidelines have been refined and tested over several rounds of editing, student assistants make the first annotations in the selected records by marking the passages that are linguistically relevant to the conflicts.
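To make the idea of marking cue phrases concrete, here is a minimal Python sketch of how passages matching a guideline's phrases could be flagged automatically. It is an illustration under stated assumptions, not the project's tooling: the category names "hedged" and "explicit", the `Mark` type, and the function `mark_conflict_cues` are hypothetical, and the cue lists contain only the two phrases quoted above, not the actual guidelines.

```python
import re
from typing import NamedTuple

# Hypothetical cue lists for illustration only. The two phrases are the
# ones quoted in the article; the category names are assumptions.
CUES = {
    "hedged": ["we should"],
    "explicit": ["we are against"],
}


class Mark(NamedTuple):
    start: int   # character offset where the cue begins
    end: int     # character offset just past the cue
    kind: str    # category of the cue ("hedged" or "explicit")
    text: str    # the matched passage as it appears in the sentence


def mark_conflict_cues(sentence: str) -> list[Mark]:
    """Return every cue phrase found in the sentence, ordered by position."""
    marks = []
    for kind, phrases in CUES.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
                marks.append(Mark(m.start(), m.end(), kind, m.group(0)))
    return sorted(marks)


# Invented example sentence, not taken from a UNSC transcript.
marks = mark_conflict_cues("We should act now, but we are against this draft.")
for mark in marks:
    print(mark.kind, repr(mark.text), mark.start, mark.end)
```

A fixed phrase list like this only catches literal wording. That limitation is one reason human annotators and detailed guidelines are needed for the subtler cases.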
The computer uses this manually processed data to independently analyze and label additional material and to reliably recognize patterns. The team then evaluates these patterns. In particular, they want to investigate the theory that diplomatic conflicts are primarily resolved via justifying argumentation patterns. “The various parties justify their voting behavior – what they do and what they don’t do. That can be linguistically distinguished quite clearly.” Other types of argumentation, such as proving the truth of a particular statement, tend to be phrased differently.
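The article does not say which learning method the team uses. Purely as an illustration of the general principle (learning from manually labeled passages in order to label new ones), the following sketch implements a tiny Naive Bayes text classifier using only the Python standard library. The training sentences, the labels "conflict" and "no_conflict", and all function names are invented for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def train(examples):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented, hand-labeled examples standing in for the annotated records.
annotated = [
    ("We are against this draft resolution", "conflict"),
    ("We reject the proposed amendment", "conflict"),
    ("We welcome the report of the Secretary-General", "no_conflict"),
    ("We thank the briefers for their statements", "no_conflict"),
]
word_counts, label_counts = train(annotated)
print(predict("We reject this draft", word_counts, label_counts))
print(predict("We thank the briefers", word_counts, label_counts))
```

A word-count model like this would struggle with the indirect phrasing of diplomatic language, which is exactly the difficulty described above. It only shows how manually annotated data can drive automatic labeling.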
Understanding the course of conflicts
With the help of a text corpus of nearly one million words, which consists of proceedings from over 25 years, the researchers can examine more than just individual moments of conflict. They want to trace the trajectories of issues over the years. “This way, we can explain whether arguments have changed in their clarity or in the given justifications,” Zaczynska explains. The participants hope to provide added value for conflict research.
The researchers look at different types of conflicts in the project: indirect ones, where only the disputed issue is addressed, and direct disputes, in which one state addresses another. “The way conflicts are expressed depends a lot on the subject matter. For example, if there is a military dispute, we expect more direct responses,” says Zaczynska.
The researchers began the annotation work with the women, peace, and security agenda, one of the three thematic areas they are focusing on. They will also analyze debates on climate change and the Ukraine conflict, which has been ongoing since 2014. But which of the many transcripts are relevant to their research? To find out, the team is drawing on metadata already recorded in the UNSC proceedings. Thanks to this additional information, they know which diplomat from which country spoke about which topic. Unfortunately, the metadata do not reveal the nature of the speeches: At UNSC meetings, speeches are typically prepared in advance and read aloud, but it gets particularly interesting for the team when a country takes the floor spontaneously. Whether these two types of speeches can be automatically distinguished from one another is another question the project will address from a computational linguistic perspective.
Analyzing monologues and dialogues
In the subsequent project phase, the annotation data will be evaluated using Rhetorical Structure Theory (RST). “This theory assumes that a coherent text can be represented through a tree structure. The result of an RST analysis is meant to reconstruct the author’s intention from the reader’s point of view,” explains computational linguist Zaczynska. Text segments are identified according to their function and related to each other, using the relations that the theory specifies. This results in tree diagrams with hierarchical structures. The “but” of diplomat Carolyn Schwalger, for example, qualifies the statement made before it. Statistics are then generated from the tree diagrams. Using these, the researchers can see, for example, how often and how extensively reasons are given.
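How statistics can be derived from such a tree is easiest to see in a small example. The Python sketch below is a simplified illustration, not the project's data format: it omits the distinction RST draws between more and less central segments, the `Node` class and both functions are hypothetical, the relation labels "concession" and "reason" are illustrative choices, and the example sentence is invented.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A leaf holds a text segment; an inner node names the relation joining its children."""
    relation: Optional[str] = None
    text: Optional[str] = None
    children: list["Node"] = field(default_factory=list)


def relation_counts(node: Node) -> Counter:
    """Count how often each relation occurs in the tree below (and including) node."""
    counts = Counter()
    if node.children:  # only inner nodes carry a relation
        counts[node.relation] += 1
        for child in node.children:
            counts.update(relation_counts(child))
    return counts


def depth(node: Node) -> int:
    """Number of levels in the tree; a single leaf has depth 1."""
    return 1 + max((depth(child) for child in node.children), default=0)


# Invented example: a concession ("but ...") whose second part is backed by a reason.
tree = Node(relation="concession", children=[
    Node(text="We welcome the progress made so far,"),
    Node(relation="reason", children=[
        Node(text="but we cannot support this draft"),
        Node(text="because it leaves out key provisions."),
    ]),
])

counts = relation_counts(tree)
reason_share = counts["reason"] / sum(counts.values())
print(counts, depth(tree), reason_share)
```

Aggregating such counts over many speeches would give figures of the kind mentioned above, for instance the share of relations that supply a reason.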
Rhetorical Structure Theory is particularly suitable for speeches by individual speakers, in which the Potsdam researchers have specialized. When the discussion partners are engaged in a dialogue, the researchers from Dundee use another theory, Inference Anchoring Theory (IAT), to examine the material of the UNSC sessions. They are focusing on how inferences emerge when, for example, speakers follow certain rules in dialogue. Dialogues and discourse represent a particularly challenging area within language analysis and modeling. IAT supports the analysis of so-called illocutionary acts in discourse. “Illocutionary act” is a technical term in pragmatics for the actions performed through speech, such as asserting, challenging, arguing, promising, or asking. For this purpose, IAT incorporates the contextual information provided by the dialogue into the annotations of speech acts.
Added value for many disciplines
Even though the project was launched with the aim of providing a tool for political science, other disciplines and institutions will also be able to benefit from the results. “The social sciences are happy to have reliable data,” says Prof. Stede. In addition, the Potsdam part of the research group is in contact with the German United Nations Association. “They are very interested in having the language of the UN analyzed on a large scale,” he says. It is conceivable that this will provide insights into Germany’s role when it holds a seat on the United Nations Security Council. Until then, the researchers are laying important groundwork with their project for a holistic understanding of diplomatic language.
The Project
Trajectories of Conflict: The Dynamics of Argumentation in the UN Security Council
Participating researchers: Prof. Dr. Manfred Stede, Karolina Zaczynska, Prof. Dr. Chris Reed (University of Dundee), Dr. Alexandru Marcoci (University of Dundee), Dr. Ronny Patz (external consultant, Hertie School Berlin)
Duration: 2021–2024
Funding: German Research Foundation (DFG), Arts and Humanities Research Council (AHRC)
http://angcl.ling.uni-potsdam.de/projects/trajectories.html
The Researchers
Professor Manfred Stede studied computer science and linguistics at Technische Universität Berlin; in 1996 he earned his PhD in computer science at the University of Toronto. Since 2001 he has been Professor of Applied Computational Linguistics at the University of Potsdam.
Mail: manfred.stede@uni-potsdam.de
Karolina Zaczynska studied computational linguistics and Polish studies at Justus Liebig University in Giessen. She conducts research at the German Research Center for Artificial Intelligence in Berlin; since 2021, she has been working on her doctorate at the University of Potsdam.
Mail: karolina.zaczynska@uni-potsdam.de
This text was published in the university magazine Portal Wissen - Two 2022 “Humans”.