We conclude by considering some implications of our findings. First, the persistent underrepresentation of artificial intelligence (AI) research in the COVID-19 mission field suggests limits to the generalisability of state-of-the-art algorithms in a new domain where data is fragmented, unreliable and sensitive, where mistakes could cost lives, and where explainability is at a premium. Given this, and perhaps unsurprisingly, AI researchers have focused their efforts on computer vision and biomedical applications closer to their comfort zone, at the risk of neglecting other important domains.
Second, we find some evidence of silos between AI researchers and those in medical and biological science disciplines in tackling the pandemic. AI researchers (and computer scientists more broadly) are sometimes accused of ‘solutionism’, looking for technological fixes for complex societal problems such as predicting and controlling the spread of a pandemic, ensuring the sustainability of public health systems or protecting the mental health of locked-down populations. It will be difficult for them to develop truly effective technologies to tackle these challenges without tapping into the knowledge of other disciplines, something that our analysis suggests is less common than would be desirable.
Third, there are concerns around quality. In recent years, the AI research community has become increasingly concerned with low levels of reproducibility in AI research, as well as the prevalence of irresponsible modes of innovation where AI researchers ignore the repercussions and unintended consequences of the techniques that they develop. The comparatively low levels of citations received by AI researchers in our corpus, the weaker track record (in general) of AI researchers tackling COVID-19, the presence of a large number of research groups from unidentified institutions and the large thematic jumps that some researchers have made into the COVID-19 field all suggest that similar risks may be present in AI research oriented towards COVID-19. Researchers, policymakers and practitioners need to develop strategies to validate contributions from new entrants into the COVID-19 mission field, while ensuring that new voices and ideas can still be heard.
Of course, our analysis is not without limitations: we focus on COVID-19 research published on open preprint sites, which may not be representative of wider research efforts to fight the pandemic. Going forward, it will be important to augment this analysis with similar studies of COVID-19 and AI publications in peer-reviewed sources. Our measures of quality and impact (citations) also have important limitations, not least heterogeneity in citation practices across disciplines and the risk of gaming. We have sought to address the first issue by comparing citation rates within article sources and publication areas, but more remains to be done – in particular, using other measures of influence (e.g. dissemination of research in social media) and impact (e.g. identifying instances where research leads to patents, clinical trials, policy impacts and open-source software applications). It is also worth noting that many of the methods that we have deployed in the paper, such as the topic modelling algorithms we use to segment COVID-19 research into topical clusters, are experimental, so we advise caution in the interpretation of results. This is also why we are releasing all the code and data we have used in the paper, allowing other researchers to review, reproduce and build on our work.
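To make the within-source, within-area citation comparison concrete, the sketch below illustrates one way such a normalisation could be computed in Python with pandas. The dataset, column names and groupings are hypothetical placeholders used purely for illustration; the paper's actual pipeline may differ in its details.

```python
import pandas as pd

# Hypothetical article-level data: each row is a preprint with its source
# (e.g. a preprint site), its publication area (topical cluster), its
# citation count and a flag for whether the authors are AI researchers.
articles = pd.DataFrame({
    "source": ["biorxiv", "biorxiv", "arxiv", "arxiv", "medrxiv", "medrxiv"],
    "area": ["epidemiology", "epidemiology", "imaging",
             "imaging", "public_health", "public_health"],
    "is_ai": [False, True, True, True, False, False],
    "citations": [12, 4, 30, 8, 5, 9],
})

# Normalise each article's citations by the mean citation rate within its
# (source, publication area) cell, so that comparisons between groups of
# papers are not driven by differences in citation practices across
# sources or fields.
articles["norm_citations"] = (
    articles["citations"]
    / articles.groupby(["source", "area"])["citations"].transform("mean")
)

# Compare field-normalised citation rates for AI vs non-AI papers.
print(articles.groupby("is_ai")["norm_citations"].mean())
```

Grouping before normalising is what allows citation counts from sources and fields with very different citation norms to be placed on a comparable scale, which is the intent of the comparison described above.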