How to improve the accuracy of predictions in international politics? – a literature review
About the publication
Report number: 21/00735
ISBN: 978-82-464-3347-9
Format: PDF document
Size: 1.1 MB
Language: Norwegian
Existing research on the accuracy of predictions in international politics, such as the outcome of the Brexit vote, the number of North Korean nuclear weapons tests and the growth rate of the Chinese economy, is largely based on two research projects conducted in the US: Expert Political Judgment (EPJ) from 2005 and the Good Judgment Project (GJP) from 2011–2015.
On the one hand, the findings from EPJ were depressing. The study measured the accuracy of 284 experts on questions that looked 2, 5, 10 or 20 years ahead. The experts struggled to beat chance once the time perspective approached 3–5 years. Neither level of education nor years of experience correlated with accuracy, and experts predicting inside their own domains of expertise were often less accurate than those predicting outside them.
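Both EPJ and GJP scored accuracy with Brier scores, the squared distance between a probability forecast and the outcome that actually occurred. A minimal sketch (the three-outcome example below is purely illustrative, not a question from either study):

```python
def brier_score(forecast_probs, outcome_index):
    """Squared error between a probability vector and the realized outcome.

    0.0 is a perfect forecast; higher is worse. Guessing uniformly over
    three outcomes scores 2/3, which is the baseline experts struggled
    to beat at longer time perspectives.
    """
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(forecast_probs))

# Forecasting "status quo" at 70% over three possible outcomes; outcome 0 occurs.
score = brier_score([0.7, 0.2, 0.1], 0)  # 0.09 + 0.04 + 0.01 = 0.14
```

A confident forecaster is rewarded when right and penalized heavily when wrong, which is what makes the score a meaningful test of probabilistic judgment.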
On the other hand, the results from GJP were far more encouraging. GJP was one of the teams participating in a four-year forecasting tournament sponsored by US intelligence. To achieve the highest possible accuracy, researchers experimented with various methods for aggregating predictions from thousands of participants. Hundreds of questions were posed, with an average time perspective of around 100 days. After only two years, GJP did so well that the other teams were dropped. The findings from GJP showed that it was possible to predict the outcome of questions of relevance to US intelligence. The winning recipe was a combination of recruiting the right people and implementing measures that improved overall accuracy.
A common finding in both EPJ and GJP was that there are systematic individual differences in accuracy. Better accuracy was associated with higher scores on tests of cognitive abilities, political knowledge and open-minded thinking. The best forecasters were also more motivated by the desire to win, had a higher need for cognition and a more probabilistic approach to future events. At the same time, GJP found that it was possible to improve accuracy through several measures:
1) training in probabilistic thinking, e.g. the use of base rates;
2) interaction between participants, both in the form of cooperation in groups and competition through prediction markets; and
3) algorithms that gave more weight to predictions from participants who had previously been more accurate and who had recently updated their forecasts.
However, these findings are not necessarily valid in a Norwegian defence and security policy context. Participants in both studies were largely US citizens, and it is not a given that the same individual differences in accuracy exist among Norwegian experts and participants. Nor is it a given that the results hold for questions about the actors most important to Norwegian national security. Even though the experts in EPJ struggled to beat chance on questions looking 3–5 years ahead, they were more accurate the shorter the time perspective, so GJP's 100-day horizon was likely easier to forecast within. The purpose of FFI's forecasting tournament (2017–2020) was therefore to test these findings with Norwegian participants, on questions of relevance to Norway, and with time perspectives between 100 days and 3–5 years.