Machine learning for thematic categorisation and analysis of Islamic State propaganda pictures

FFI-Report 2023
This publication is only available in Norwegian

About the publication

Report number: 23/02227
ISBN: 978-82-464-3500-8
Format: PDF-document
Size: 9.9 MB
Language: Norwegian

Mathias Bynke, Vidar B. Skretting, Bernt Ivar Utstøl Nødland

For analysts and researchers studying terrorist groups and other non-state actors, propaganda materials are an important source for understanding who these actors are, what they want, and how they operate. Recent technological developments have made it significantly easier for such actors to produce propaganda and publish it online in ever-increasing quantities, which has made it almost impossible for analysts and researchers to go through the material manually. It is therefore imperative to develop new methods for analysing such large amounts of data. This report explores how machine learning can support the analysis of one of the most common forms of propaganda: pictures.

In this report, we take a typical use case, a corpus of over 30,000 propaganda pictures produced by the terrorist organisation the Islamic State (IS), and explore how machine learning methods can be used to gain an overview of its contents. This overview serves as the starting point for an analysis of the propaganda pictures. The aim of the study is twofold: to contribute to method development in machine learning and to provide new insights into IS’ visual propaganda.

We use the machine learning model Contrastive Language-Image Pre-training (CLIP) to analyse the content of each individual image. CLIP translates each image into a vector that represents its content. We then use the clustering algorithm 𝑘-means to divide these vectors into a number of clusters, minimising the internal variation between the vectors in each cluster. This results in clusters of images with similar thematic content.
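
The clustering step described above can be illustrated with a minimal sketch. It does not reproduce the report's pipeline: the vectors are synthetic stand-ins for CLIP image embeddings (real CLIP vectors have several hundred dimensions), the function names are hypothetical, and the deterministic farthest-first initialisation is a choice made here for reproducibility, not something the report states. The CLIP variant, the number of clusters, and any preprocessing used in the report are not given in this summary.

```python
import numpy as np


def farthest_first_init(vectors, k):
    """Pick k starting centroids: the first vector, then repeatedly the
    vector farthest from all centroids chosen so far (deterministic)."""
    chosen = [0]
    # Squared distance from each vector to its nearest chosen centroid.
    nearest = ((vectors - vectors[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        idx = int(nearest.argmax())
        chosen.append(idx)
        nearest = np.minimum(nearest, ((vectors - vectors[idx]) ** 2).sum(axis=1))
    return vectors[chosen].copy()


def assign(vectors, centroids):
    """Index of the nearest centroid (squared Euclidean) for every vector."""
    dists = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(vectors, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update,
    which lowers the within-cluster sum of squared distances."""
    centroids = farthest_first_init(vectors, k)
    for _ in range(n_iter):
        labels = assign(vectors, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(vectors, centroids), centroids


# Synthetic stand-ins for image embeddings: three tight groups in 8 dimensions.
# In a real setting these rows would come from a CLIP image encoder.
rng = np.random.default_rng(0)
group_centres = np.array([[5.0] * 8, [-5.0] * 8, [5.0, -5.0] * 4])
embeddings = np.vstack(
    [centre + 0.1 * rng.standard_normal((20, 8)) for centre in group_centres]
)

labels, centroids = kmeans(embeddings, k=3)
print(labels.reshape(3, 20))  # one row per synthetic group
```

On real data the groups are rarely this well separated, so the resulting clusters would still need to be inspected and named by an analyst.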

We use this thematic division as a starting point for an analysis of IS’ propaganda pictures in the period 2014–2022. Since each of the images contains metadata about the time and place of production, we can also analyse how the themes in IS’ image propaganda have developed over time and between IS’ so-called provinces.
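
As a rough illustration of this kind of breakdown, the sketch below computes the share of each cluster within each year or province. The records, field names (`year`, `province`, `cluster`), and labels are hypothetical placeholders and do not come from the report's data.

```python
from collections import Counter, defaultdict


def theme_shares_by_group(records, group_key):
    """For each value of group_key, return the fraction of images per cluster."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[group_key]][record["cluster"]] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {cluster: n / total for cluster, n in counter.items()}
    return shares


# Hypothetical records: one per image, with metadata and an assigned cluster.
records = [
    {"year": 2015, "province": "province_x", "cluster": "theme_0"},
    {"year": 2015, "province": "province_x", "cluster": "theme_1"},
    {"year": 2015, "province": "province_y", "cluster": "theme_1"},
    {"year": 2015, "province": "province_y", "cluster": "theme_1"},
    {"year": 2020, "province": "province_y", "cluster": "theme_0"},
    {"year": 2020, "province": "province_y", "cluster": "theme_0"},
]

shares_by_year = theme_shares_by_group(records, "year")
shares_by_province = theme_shares_by_group(records, "province")
print(shares_by_year)
print(shares_by_province)
```

Using shares rather than raw counts keeps years or provinces with very different output volumes comparable.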

Our main finding is that IS portrayed itself both as a military organisation and as a civilian state apparatus early in the period (2014–2018), but since 2019 it has almost exclusively presented itself as a military organisation and rebel group. Furthermore, the bulk of the propaganda material has shifted geographically from Iraq and Syria to West and Central Africa. These developments in the propaganda reflect the major changes IS has undergone as an organisation over the same period, and they are especially apparent after IS lost its last pockets of territory in Iraq and Syria in 2018–2019. However, our findings also suggest that IS deliberately downplays the civil-administrative side of its activity in parts of Africa where the group is on the rise. Moreover, the pictures produced by the African IS provinces feature significantly more violent content than those from the other provinces, and less content devoted to martyrs than has been the norm in jihadist propaganda. These developments may give an indication of where IS is headed in the near future.

We find that combining CLIP with a clustering algorithm is a quick way to get an overview of the contents of a large image corpus, which makes it a useful tool for analysts. The method is straightforward to apply and can readily be adapted to other use cases.
