Computing Degree Show 2016

Text Encoding and Argument

Argumentation is a rapidly expanding field that spans Computing and Computing Science and branches into Philosophy and the Social Sciences. From the start, the overall aim of this project was to produce interesting, practical and valid research based on Text Encoding and Argument.

In the end it was decided to first convert and then analyse a particular corpus (collection) of arguments created by two people, Andreas Peldszus and Manfred Stede, from Potsdam University in Germany. Its full name is “An Annotated Corpus of Argumentative Microtexts”. Chris Reed, who works with the Arg Tech group and acted as supervisor on this project, suggested it as an interesting corpus for research, and so it was chosen.

The first task of the project was to convert the corpus from its creators' own format to a format called AIF (Argument Interchange Format), created by Arg Tech. This was done automatically using a set of Python scripts, which take the whole corpus, convert each file one by one and then upload the results to AIFDB (Argument Interchange Format Data Base), a central store for arguments in the AIF format.
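As a rough illustration of the conversion step, the sketch below turns a simple list of text segments and relations into a node-and-edge graph in the style of AIF, where Information Nodes are linked through scheme nodes ("RA" for inference, "CA" for conflict). The input shape, the function name `to_aif` and the JSON field names (`nodeID`, `fromID`, `toID` and so on) are assumptions made for this example; they are not taken from the project's scripts or from the corpus's actual file format. The upload step is left out because the source does not describe how it was done.

```python
import json


def to_aif(segments, relations):
    """Build an AIF-style graph from plain segments and relations.

    segments:  dict mapping a segment id to its text (hypothetical input shape).
    relations: list of (source_id, target_id, kind) tuples, where kind is
               "support" or "attack" (hypothetical input shape).
    """
    # Every text segment becomes an I-node (Information Node).
    nodes = [
        {"nodeID": seg_id, "text": text, "type": "I"}
        for seg_id, text in segments.items()
    ]
    edges = []

    for n, (source, target, kind) in enumerate(relations, start=1):
        scheme_id = f"s{n}"
        if kind == "support":
            scheme = {"nodeID": scheme_id, "text": "Default Inference", "type": "RA"}
        else:
            scheme = {"nodeID": scheme_id, "text": "Default Conflict", "type": "CA"}
        nodes.append(scheme)

        # I-nodes are never linked directly: source -> scheme node -> target.
        edges.append({"edgeID": f"e{2 * n - 1}", "fromID": source, "toID": scheme_id})
        edges.append({"edgeID": f"e{2 * n}", "fromID": scheme_id, "toID": target})

    return {"nodes": nodes, "edges": edges}


if __name__ == "__main__":
    example = to_aif(
        {"1": "The park should stay open.", "2": "Many families use it daily."},
        [("2", "1", "support")],
    )
    print(json.dumps(example, indent=2))
```

Running a function like this over every file in a corpus directory would give one graph per argument, ready to be saved or uploaded.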

After this conversion, a second script was created to analyse the arguments in the AIF format. It first locates the “Linear Argument Order”, which represents the spoken order of the argument. It then takes all of the “I-Nodes” (Information Nodes) from the argument and places them in a tree-like structure based on the supporting relationships between them. The conclusion of the argument goes at the top of the tree, the nodes directly supporting it sit below it, the nodes supporting those sit below them, and so on. The linear order is then compared with this tree to decide the overall order of the argument and of its sub-arguments (smaller sections of a larger argument). If the conclusion comes at the start of the linear order, the argument is said to be “Pre Order”. If the conclusion comes at the end, it is “Post Order”. If the conclusion falls somewhere in the middle, with nodes on either side of it, it is “Neither”.
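The classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: it assumes the supporting relationships have already been extracted as (premise, supported node) pairs and that the linear order is a plain list of node ids. All function names are hypothetical.

```python
from collections import defaultdict


def build_support_tree(supports):
    """Map each supported node to the nodes that directly support it.

    supports: list of (premise_id, supported_id) pairs (assumed input shape).
    """
    children = defaultdict(list)
    for premise, supported in supports:
        children[supported].append(premise)
    return dict(children)


def subtree_nodes(root, children):
    """Return the root and every node that directly or indirectly supports it."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen


def classify_order(root, children, linear_order):
    """Classify the (sub-)argument rooted at `root` against the linear order."""
    position = {node: i for i, node in enumerate(linear_order)}
    positions = [position[node] for node in subtree_nodes(root, children)]
    if position[root] == min(positions):
        return "Pre Order"   # conclusion is stated before all of its support
    if position[root] == max(positions):
        return "Post Order"  # conclusion is stated after all of its support
    return "Neither"         # conclusion sits between supporting nodes


def classify_all(supports, linear_order):
    """Classify the main argument and every sub-argument."""
    children = build_support_tree(supports)
    # Only nodes with at least one supporter form an argument of their own.
    return {root: classify_order(root, children, linear_order) for root in children}


if __name__ == "__main__":
    # "a" is the conclusion; "b" and "c" support it; "d" supports "c".
    supports = [("b", "a"), ("c", "a"), ("d", "c")]
    print(classify_all(supports, ["a", "b", "c", "d"]))
```

In this sketch a sub-argument is any node that has at least one supporter, so the same test is applied at every level of the tree rather than only at the overall conclusion.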

The idea behind generating this data is to help start a conversation and pave the way for future research into how people construct arguments, and why. The results from the particular corpus that I built this for showed twice as many Pre Order arguments as Post Order, which suggests that, in this corpus at least, stating the conclusion first is the more common pattern. Hopefully this information can lead to further developments in Argumentation and Social Science, as it could help explain why people build arguments in a particular way. The analysing script is capable (in theory) of running on any set of arguments as long as they are stored as valid AIF, so it could be used to help with this future work.

 
