Sophisticated artificial intelligence (AI) techniques are becoming widely used in security and policing applications such as monitoring CCTV feeds. However, the current state-of-the-art AI technologies are still "black boxes": it is often unclear how they work and they usually lack explanations for their outputs, making it difficult for users to trust them.
Seeking to address this issue is a new paper by Crime and Security Research Institute (CSRI) PhD student Liam Hiley, recently presented at the International Joint Conference on Artificial Intelligence (IJCAI) in Macau, China.
For this paper, researchers created a new algorithm that allows an AI system processing video, such as CCTV feeds, to generate explanations for the events it detects. The technique separates the temporal and spatial aspects of the video, which are the most important components of the system's decision-making process. In video, unlike still images, quick changes in a scene are often the most important cue for deciding what is happening. In a fight captured on CCTV, for example, the technique can highlight a rapid action such as a punch as being particularly relevant, assuring a user that the AI is focussing on the "right" aspects of a scene.
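To give a flavour of what "temporal importance" means, here is a minimal sketch (not the authors' algorithm, which works on a trained video classifier) that approximates temporal saliency by frame differencing: pixels that change rapidly between frames score high, while static background scores near zero. The function name and toy clip are illustrative assumptions.

```python
import numpy as np

def temporal_saliency(frames):
    """Crude temporal-saliency map: per-pixel mean absolute difference
    between consecutive frames, normalised to [0, 1].

    Fast-changing regions (e.g. a moving ball, a punch) score high;
    static background scores near zero. This is an illustrative proxy,
    not the gradient-based explanation method from the paper."""
    frames = np.asarray(frames, dtype=np.float32)   # shape (T, H, W)
    diffs = np.abs(np.diff(frames, axis=0))         # (T-1, H, W)
    saliency = diffs.mean(axis=0)                   # (H, W)
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency

# Toy greyscale "clip": 5 frames of 3x4 pixels where only pixel (1, 2)
# changes over time -- standing in for a fast-moving object.
clip = np.zeros((5, 3, 4), dtype=np.float32)
for t in range(5):
    clip[t, 1, 2] = t * 10.0

heat = temporal_saliency(clip)
print(heat[1, 2])  # 1.0 -- the fast-changing pixel dominates the map
print(heat[0, 0])  # 0.0 -- static background contributes nothing
```

In a real explanation system the heatmap would be overlaid on the video, colouring the fastest-changing, most decision-relevant regions (such as the ball in the tennis example below) for the user.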
The example images here are from a sports video of a tennis match. The first set of frames shows the original video, while the second set shows the explanation for why the AI system decides it is a tennis match. In this example, the ball is highlighted in red because the AI system considers it to be an important "temporal feature" - in other words, a highly relevant fast-changing feature of the video.
The research was undertaken as part of the DAIS ITA programme funded by Dstl and the US Army Research Laboratory. The paper was presented by Dr Federico Cerutti as part of the IJCAI 2019 Workshop on Explainable Artificial Intelligence (XAI).