Argument mining involves automatically identifying structured arguments contained in natural language text and assessing their quality.
A logical argument normally begins with a set of assumptions, followed by the application of some reasoning steps and ending with a conclusion. Arguments can be represented as trees with the conclusion as the root and the assumptions as the leaves. The nodes can be labelled with various quantitative attributes which help evaluate the quality of the argument. The tree itself can be used to create a concise summary of the document.
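As a rough illustration of the tree representation described above, the sketch below models one node of an argument tree. The class name `ArgumentNode`, its fields, and the role labels are assumptions made for this example only; the project does not prescribe a particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArgumentNode:
    """One node of an argument tree (hypothetical structure for illustration)."""
    text: str
    role: str  # assumed labels: "conclusion", "reasoning" or "assumption"
    attributes: Dict[str, float] = field(default_factory=dict)  # quantitative labels
    children: List["ArgumentNode"] = field(default_factory=list)

    def leaves(self) -> List["ArgumentNode"]:
        """Return the leaf nodes, i.e. the assumptions the argument rests on."""
        if not self.children:
            return [self]
        found: List["ArgumentNode"] = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# The conclusion is the root; the assumptions are the leaves.
tree = ArgumentNode(
    "Socrates is mortal",
    "conclusion",
    children=[
        ArgumentNode("All men are mortal", "assumption"),
        ArgumentNode("Socrates is a man", "assumption"),
    ],
)
leaf_texts = [node.text for node in tree.leaves()]
```

Keeping the quantitative attributes in a plain dictionary leaves the set of labels open, which may suit a project where the features to annotate are still being decided.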
However, finding argument features in natural language documents is highly challenging, even for human readers. Development in this field is hampered by the:
- lack of suitably-labelled data sets ("corpora") for testing
- limited tool support for extracting argument structures from existing documents.
You will help develop a software tool or framework that supports extracting argument trees from documents. Extraction will use at least one of the following methods:
- manual highlighting
- automatically searching for keywords.
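One simple way to approach the keyword method is to look for discourse markers that often signal a premise or a conclusion. The sketch below is only a starting point: the marker lists, the function names, and the naive sentence splitter are all assumptions for illustration, and real text would need far more careful handling.

```python
import re

# Hypothetical marker lists; a real tool would need a much richer lexicon.
CONCLUSION_MARKERS = ("therefore", "thus", "hence", "consequently")
PREMISE_MARKERS = ("because", "since", "given that")


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def contains_marker(sentence, markers):
    """Return True if any marker occurs as a whole word or phrase."""
    lowered = sentence.lower()
    return any(re.search(r"\b" + re.escape(m) + r"\b", lowered) for m in markers)


def label_sentence(sentence):
    """Return "conclusion", "premise" or None, depending on the markers found."""
    if contains_marker(sentence, CONCLUSION_MARKERS):
        return "conclusion"
    if contains_marker(sentence, PREMISE_MARKERS):
        return "premise"
    return None


def find_candidates(text):
    """Return (sentence, label) pairs for sentences that contain a marker."""
    pairs = ((s, label_sentence(s)) for s in split_sentences(text))
    return [(s, label) for s, label in pairs if label is not None]


sample = "The road is wet because it rained. Therefore, driving is risky. I like tea."
candidates = find_candidates(sample)
```

Markers such as "since" are ambiguous (they can be temporal rather than causal), so results of this kind are best treated as suggestions for a human annotator to confirm.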
Appropriate data structures for representing the source documents and the generated argument trees must be defined.
Depending on your interests, the project could also involve annotating the tree's nodes with quantitative features, such as the trustworthiness of statements, or generating a summary of the tree's overall properties, such as the breadth and depth of the argument.
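The breadth and depth of an argument could be computed in several ways. The sketch below assumes that depth means the number of levels from the conclusion down to the deepest assumption, and that breadth means the largest number of nodes on any single level. Both definitions, and the `(label, children)` tuple representation, are assumptions for this example.

```python
def depth_and_breadth(tree):
    """Return (depth, breadth) of a tree given as (label, [children]) tuples.

    Depth is the number of levels; breadth is the widest level (assumed definitions).
    """
    depth = 0
    breadth = 0
    level = [tree]
    while level:
        depth += 1
        breadth = max(breadth, len(level))
        # Collect every child of the current level to form the next level.
        level = [child for _, children in level for child in children]
    return depth, breadth


# Conclusion C rests on reasoning step R1 and assumption A3;
# R1 in turn rests on assumptions A1 and A2.
example = ("C", [("R1", [("A1", []), ("A2", [])]), ("A3", [])])
summary = depth_and_breadth(example)
```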
A particularly useful addition would be the ability to identify relationships between sentences, including whether they refer to the same subject.
The outcome will be a software tool that:
- reads documents (typically in PDF)
- allows the user to highlight certain features
- generates argument trees (as XML structures) suitable for viewing or further analysis.
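To give a feel for the XML output, the sketch below serialises a small argument tree with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`node`, `role`, `text`) and the dictionary-based input are hypothetical; the project does not fix a schema.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a dict with "role", "text" and optional "children" to an Element."""
    element = ET.Element("node", {"role": node["role"]})
    ET.SubElement(element, "text").text = node["text"]
    for child in node.get("children", []):
        element.append(to_xml(child))
    return element


argument = {
    "role": "conclusion",
    "text": "Socrates is mortal",
    "children": [
        {"role": "assumption", "text": "All men are mortal"},
        {"role": "assumption", "text": "Socrates is a man"},
    ],
}

# Serialise to a string, then parse it back to check that it is well-formed.
xml_string = ET.tostring(to_xml(argument), encoding="unicode")
parsed = ET.fromstring(xml_string)
```

Nesting child nodes inside their parent keeps the XML structure identical to the tree structure, which should make the output straightforward to view or to process further.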
Skills and experience
To be considered for this project, you should have strong programming skills. Knowledge of graph theory and algorithms would be an advantage. An interest in linguistics is desirable but not essential.
Contact the supervisor for more information.