Knowledge Extraction From a Small Corpus of Unstructured Safeguarding Reports, Portorož

Aleksandra Edwards, a PhD student at the Crime and Security Research Institute (CSRI), recently contributed to the 2019 ESWC conference in Portorož, Slovenia. ESWC is a major venue for discussing the latest scientific results and technology innovations around semantic technologies. As part of the Poster Exhibition, Aleksandra presented her paper “Knowledge Extraction From a Small Corpus of Unstructured Safeguarding Reports”, which explores methods for extracting entities and sentiment from complex documents.

The paper evaluates the ability of different text analysis tools to extract information from a small corpus of unstructured safeguarding reports. These documents describe the events leading up to a crime, the agencies and practices involved in a case, and recommendations for improvement. Because the reports are lengthy and contain much emotive language, Aleksandra tested whether analysis tools could overcome the time-consuming and bias-prone nature of human inspection.


The results showed a higher overall score for the non-trained human annotators than for the software tools. The software’s poor performance in both entity extraction and sentiment analysis points to the need for domain-specific approaches to knowledge extraction on these kinds of documents, for example approaches tailored to safeguarding issues. In future work, the authors will look to improve such tools by using word and sentence vectors to discover themes that can serve as a basis for creating an ontology.

