Data analytics generalist. I publish notes, lessons, and tools for data analytics and investing.
In the fall of 2012, my mother told me about an article that called data scientist the new, sexy profession. The moment stuck with me because nobody wants to hear their parents utter the word “sexy”. Unbeknownst to me at the time, that Harvard Business Review article has been cited as the catalyst for the huge onslaught of students entering the data science field. This wave of “data enthusiasm” would come to have a heavy influence on my own career trajectory.
Over the next eight years, the terminology used to describe data-related topics changed dramatically. In 2012, the top three search terms were “Statistics”, “Artificial Intelligence/AI”, and “Big Data”, in that order. As of July 2020, the top three were “Machine Learning”, “Data Science”, and “Artificial Intelligence/AI”. Excluding the recent COVID bump (e.g. “COVID statistics”), the term “statistics” has seen a sharp decline in usage over this period. If you had known in 2012 that enthusiasm for data would explode over the next decade, you would never have guessed that Google searches for “statistics” could decline.
Published on Tableau Public
After seeing those trends, you might think data science is a threat to statistics and statisticians. A quick search for data scientist job openings on Indeed yielded 8,076 results in the United States, while there were only 1,526 job postings for statisticians. Given the expected growth in demand for data-related skill sets, I think the more likely outcome is a drift away from employers looking for the unicorn data scientist. For those outside of this field, the unicorn data scientist is someone who can do it all: a full-stack programmer, statistician, and machine learning engineer all in one. There will be a need for specialization across the spectrum, allowing statisticians to find their place, too. This change in hiring, job responsibilities, and titles will emerge for the following reasons:
- Machine learning is being added to computer science undergraduate curriculums
- APIs will make it easier to automate machine learning and integrate it into existing applications
- Data-related technology keeps expanding in breadth, making it impossible for a data scientist to learn everything
- Automation will free up time for more creative thinkers
I believe these trends will lead to a bifurcation of data scientists into two paths: software engineers with a focus on machine learning, and decision scientists. Let’s take a look at the differences.
Because of their differing strengths, they will be responsible for different types of problems. The software engineer will be assigned projects that have a clearly defined scope and access to large, high-quality data sets. They will build applications that integrate with machine learning APIs. These APIs will automate the standard process of data ingestion, training, and prediction.
Examples might include building business intelligence applications that incorporate sales forecasts or designing a knowledge discovery system that uses natural language processing. Meanwhile, the decision scientist will be assigned projects that have an undefined scope. They will be responsible for framing the appropriate question to answer. These problems will typically have incomplete information, forcing the decision scientist to be more comfortable with uncertainty. They will need to draw conclusions that reach safely beyond the data analyzed. This will require a firm understanding of the domain.
Because of the nature of their work, I think decision scientists will have a wider variety of backgrounds than software engineers. The software engineer will typically follow a standard computer science track, attend a coding bootcamp, or be a former data engineer. Meanwhile, decision scientists will likely be former statisticians, or folks with analytical backgrounds from the social or physical sciences. With competencies across multiple domains, this person could be seen as a generalist. Maybe a “data generalist”.
With technology changing faster than ever, data professionals need to be cognizant of long-term trends. In its 2020 emerging jobs report, LinkedIn listed data scientist as the #3 job, with an annual growth rate of 37 percent. The high demand for data skills will drive a need to further refine the specific positions within data science. It will be interesting to see how this field unfolds over the next decade.
Previously published at https://thedatageneralist.com/the-future-of-data-scientists/