FFI’s forecasting tournament – dataset and preliminary results
About the publication
Report number: 21/00737
ISBN: 978-82-464-3385-1
Format: PDF document
Size: 3.2 MB
Language: Norwegian
The purpose of FFI’s forecasting tournament (2017–2020) was to measure how accurately it is possible to predict political events and developments of relevance to Norwegian national security, and what characterises the people who predict more accurately than others. The participants were given questions such as: Will Russian military aircraft violate Norwegian airspace within the next year? Will Russia conduct live-fire exercises off the Norwegian coast? What share of its GDP will Norway spend on defence? If the outcomes of questions such as these can be predicted, we can also be relatively certain about the future development of the key topics they cover.
FFI’s tournament included 240 such questions about armed conflict, Russia, the US, Europe, the economy and technology. In total, the dataset consists of 465,673 predictions from 1,375 participants. FFI’s tournament was inspired by the Good Judgment Project’s (GJP) tournament (2011–2015), and FFI’s participants were measured on almost all of the same individual variables as GJP’s. FFI’s tournament can therefore be used to re-examine key findings from GJP on a comparably sized dataset with a completely different set of participants and questions.
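This summary does not state how accuracy was scored, but tournaments in the GJP tradition typically use the Brier score: the squared error between a probability forecast and the eventual outcome, averaged over a participant’s resolved questions. The sketch below only illustrates that calculation on a dataset of this general shape; the record layout, field names and toy data are hypothetical, and the binary (0–1) Brier variant is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Prediction:
    participant_id: int
    question_id: int
    probability: float  # forecast probability that the event occurs, in [0, 1]
    outcome: int        # 1 if the event occurred when the question resolved, else 0


def brier_score(probability: float, outcome: int) -> float:
    """Binary Brier score: squared error between forecast and outcome (0 is perfect, 1 is worst)."""
    return (probability - outcome) ** 2


def mean_brier_per_participant(predictions: list[Prediction]) -> dict[int, float]:
    """Average each participant's Brier scores across all of their resolved forecasts."""
    scores: dict[int, list[float]] = defaultdict(list)
    for p in predictions:
        scores[p.participant_id].append(brier_score(p.probability, p.outcome))
    return {pid: sum(s) / len(s) for pid, s in scores.items()}


# Hypothetical toy data: participant 1 is better calibrated than participant 2.
sample = [
    Prediction(1, 101, 0.9, 1),
    Prediction(1, 102, 0.2, 0),
    Prediction(2, 101, 0.4, 1),
    Prediction(2, 102, 0.7, 0),
]
print(mean_brier_per_participant(sample))  # {1: 0.025, 2: 0.425} – lower is better
```

GJP often reported Brier scores on a two-alternative 0–2 scale, which for binary questions simply doubles the values above; the ranking of participants is the same under either convention.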
On the one hand, the results from FFI’s tournament show that the ability to predict international politics correlates with many of the same individual characteristics as in GJP, especially cognitive control, numeracy, knowledge, open-minded thinking and time spent per question, but not with cognitive styles such as the need for cognitive closure or fox- vs. hedgehog-like thinking. However, these findings are nuanced by questionnaires given to FFI’s participants, which show that the specific cognitive styles participants used when they actually made their predictions were still important. In fact, approaches reflecting a need for cognitive closure and hedgehog- rather than fox-like thinking are both associated with lower accuracy in FFI’s tournament, even though the participants’ general scores on tests of these styles are not.
On the other hand, FFI’s participants are significantly less accurate than GJP’s. However, this gap is mainly due to differences in how the tournaments were organised. First, GJP’s participants could update their forecasts every day until a question closed, while FFI’s participants could only make predictions during the first week after a question was published. The former way of forecasting is relevant to intelligence, while the latter is more representative of how predictions are made in defence planning processes. Second, the accuracy of GJP’s participants was improved through training and participation in collaborative teams, whereas FFI’s participants predicted alone and without training. When these two differences are taken into account, the gap is greatly reduced.
Yet, the most important finding is that the best participants («superforecasters») were about equally accurate in both tournaments when compared at the same time of prediction, even though all of GJP’s superforecasters were both trained and part of collaborative teams. This raises the question of whether there is an «upper limit» to how accurately politics can be predicted, and whether this level of precision can be achieved simply by identifying the right people through forecasting tournaments. In fact, FFI’s and GJP’s superforecasters share a set of common characteristics, which makes it possible to identify them in advance.