What are the skills implications of the data revolution? We reflect on its impact on our day-to-day work, link it to recent research on how data analytics skills are creating value in UK businesses, and consider what this means for education and skills policy.
It has almost become a cliché to use mind-boggling figures to illustrate how data is transforming the economy and society: billions of transactions on popular websites, years of content flowing through the internet, millions of devices generating streams of data day in and day out. But big as they may be, these numbers don't really give a sense of the actual applications of data, or of the effort and skills required to create value from it. They also hide the humans who (often if not always) generate the data and analyse it.
Maybe it's more fruitful to think about the way in which our lives at Nesta have been transformed by data. What are we doing at our jobs today that we couldn't have done, say, five years ago, when there was less data around?
Is it big data? Not really. Up until now, we haven’t had to run any Hadoop jobs (volume), store our data in NoSQL databases (variety) or work with streaming APIs (velocity), although we are getting there.
But that’s the wrong question anyway. Thinking of data in terms of its size, or of the technology infrastructure required to create value from it, can distract us from the truly important question: can we use this data to create value?
We believe so. Our business is to understand the dynamics of innovation and propose actions to support it in a way that creates economic and social value. Data is helping us to do that much more effectively, by measuring innovation phenomena in ways that would have been uneconomical or impossible before: for example, tracking connectivity between participants at innovation events, mapping informal communities of innovators, or identifying companies operating in sectors that are poorly captured by standard industry codes, such as video games. We are even using data to understand innovation in areas of arts and culture that have traditionally seen little quantitative analysis, such as theatre and literature.
Innovation in data outputs helps us communicate research findings in a more compelling and enlightening way, and engage new audiences. Going forward, we want to create “self-service” applications that policymakers and other agents in the ecosystem (investors, entrepreneurs, etc.) can use to answer their burning questions, instead of having to rely on us to do it.
Doing all of the above requires new skills: skills to get online data, clean it and wrangle it; skills to put data into a shape that can be analysed; skills to analyse and visualise it; and skills to understand the limitations and pitfalls of these new datasets.
Without those skills, all that new data would be for nought, and might even be detrimental if, for example, it led us to make the wrong decisions or recommend the wrong actions. As Nate Silver put it in the introduction to The Signal and the Noise, “Big data will produce progress – eventually. How quickly it does, and whether we regress in the meantime, will depend on us.”
Nesta’s new report in partnership with Creative Skillset, Skills of the Datavores, suggests that our personal and organisational experience with data – in terms of its opportunities and skills aspects – reflects the wider situation of other organisations in the UK. This has policy implications that we explore in Analytic Britain, a policy briefing that we have developed jointly with Universities UK.[1]
For starters, Skills of the Datavores shows that there isn’t a “one size fits all” approach to data: different “data active” businesses are taking different approaches to creating value from data. We find:
We also find that 30% of businesses are Dataphobes, which aren’t doing much with their data: apparently they have decided to give the data revolution a pass.
Our analysis of the impact of data on performance suggests that this is a big mistake: data active companies, and particularly Datavores and Data Builders, are significantly more productive than the Dataphobes, even after we control for other important firm-level factors such as their sector, their age, their size and their self-reported levels of innovation.
Another striking difference between data active companies and Dataphobes is that the former don't just use data to save costs, but also to discover new opportunities and develop new products and services. This is consistent with our experience at Nesta: data doesn’t just create value by allowing us to do the same things better, but also by allowing us to do completely new things.
What methods are data active companies using to do this?
As the figure below shows, they use a mix of data management and analytics methodologies drawn from a variety of disciplines, such as statistics, computer science and software engineering. As one would expect, data active companies tend to rely on more innovative and sophisticated methods, ranging from advanced statistics (e.g. non-parametric methods and time-series analysis) to unstructured data analysis (e.g. social network analysis and text mining) and machine learning.
Applying these methods requires talent with the right skills (i.e. “data scientists”). The data active companies in our sample are much more likely to have sought to hire such people than Dataphobes in the 12 months before we surveyed them. 59% of Datavores tried to recruit at least one analyst over that period, compared to around a quarter of Dataphobes.
Worryingly, data active companies are struggling to find analytical talent to create value from their data: for example, two thirds of the Datavores that sought to recruit had difficulties filling at least one analytical vacancy.
The three hardest-to-find skills were:
We were very interested to find that the businesses we surveyed are using innovative types of training to keep the skills of their analysts up to date. Leaving aside internal and external training, they were also:
This innovation reflects the speed with which the data field is moving, in terms of flows of knowledge and talent, and the opportunities for collaboration and networking between data analysts working in different industries.
Our findings show that data is enhancing innovation and productivity in UK businesses, and can help close the UK's productivity gap with other G7 countries, a priority for the government. Removing analytical skills shortages, such as those identified in Skills of the Datavores and in allied reports by the British Academy, the Tech Partnership and Universities UK, is therefore vital.
Analytic Britain, a policy briefing that we have developed together with Universities UK, makes recommendations on how to do this. These recommendations cover the whole “analytic talent pipeline”: schools, universities, the labour market and industry networking. We need to:
Analytic Britain sets out in detail who we think should be doing what, covering a “broad church” of stakeholders which mirrors the disciplines and industries being transformed by the data revolution.
We believe that acting on our recommendations will greatly strengthen the supply of analytical talent in the UK, and our ability to create value from data, regardless of whether it is big, messy, fast...or simply data.
(Voronoi treemap of the Mammals via Anders Sandberg)
[1] Skills of the Datavores is based on a telephone survey of 404 medium and large businesses for which data plays some role in their operations, working in six sectors (Creative Media, Finance, ICT, Manufacturing, Pharmaceuticals and Retail).