CrowdFlower 2015 Data Scientist Report Highlights Helps and Hindrances

A few weeks ago I posted a piece on San Francisco, CA-based CrowdFlower, whose technology platform takes large, data-intensive projects and divides them into small tasks that are distributed to a multi-million-person, on-demand global workforce. That article covered CrowdFlower’s views on data science trends for 2015, and topping the list was the emerging importance of Chief Data Scientists.

In fact, while employees have for years dealt with various seemingly mundane but important data cleansing and security challenges, aka “data wrangling,” both the nature and the importance of the data scientist role have changed markedly. The position is now considered a “must have” for enterprises around the world and carries a level of cachet that was not evident even several months ago.

The reasons for this are quite evident. As we all are aware, it is hard to read any tech publication these days without running into the words “Big Data” and “sophisticated analytics.” These are seen as the keys to creating “actionable insights” that will enable organizations to do a host of things, which range from better profiling all of us (including anticipating our needs before we even know them) to transforming business processes and practices to address an increasingly diverse array of operational challenges.

Where the data scientists, who now must be part mathematical genius and part business wizard, fit is in making all of that data—structured and unstructured, siloed and hopefully increasingly shared—literally and figuratively come to life.

The question that arises is where we are in terms of what current data scientists do, what they like to do and what tools they need to better perform what are becoming invaluable functions. That is the subject of the recently released CrowdFlower 2015 Data Scientist Report. As Lukas Biewald, co-founder and CEO of CrowdFlower, told TechZone360, “We had no agenda for the report except to provide more information on the critical importance of data scientists to their organizations as well as context as to why.  We hope that this will enable them to have more meaningful discussions across their organizations and in particular with C-levels.”

Minding the reality versus needs gaps

Data scientists who fit the most recent job description profile are a rare breed. Even so, CrowdFlower was able to survey 153 General Population respondents from its online research panel, all of whom work for companies of varied sizes and sectors, mostly in the U.S., and have "data scientist" in their job title or job description on LinkedIn.

What was fascinating in the survey was how many of the respondents were satisfied with their jobs (79 percent), including 30.1 percent who said the job was “totally awesome.” The diversity of the roles outlined by respondents also holds some clues about their value and their skills now and going forward. The roles were:

  • Researcher - 54.3 percent
  • Computer Scientist - 52.3 percent
  • Business Intelligence Analyst - 36 percent
  • Educator - 18.3 percent
  • Entrepreneur - 12.4 percent

A nice infographic highlights the survey responses. Among them:

"Data science" is a new term for something that's been around for a while. In fact, as noted, while the term "data science" seems new, 16 percent of data scientists reported that they have worked in this field for 10 years or more.

Messy, disorganized data is the number one obstacle holding data scientists back. Two-thirds of respondents say cleaning and organizing data was the least interesting and most time-consuming task, taking time away from more preferred tasks, such as predictive analysis and data mining.

With regard to the last point, three graphics illustrate a gap between what data scientists do and their wish list of what they would like to do. It starts with their challenges. As can be seen, they believe they are spending too much time cleaning dirty data and doing so with limited tools, human as well as non-human.

Source:  CrowdFlower 2015 Data Scientist Report

This compares with their wish list.

And the chart on what they are happiest doing speaks to the same gap.

There were also a couple of other findings of interest that are more than food for thought. The first is that data scientists use a diverse toolkit dominated by open source. The survey found that although Excel is still the most commonly used tool (by 55.6 percent of respondents), data scientists also use at least 47 other tools and languages to do their jobs. Nearly all data scientists (98 percent) use open source software, and tried-and-true open source languages such as R remain major parts of the data scientist's toolbox.

The second, not surprisingly, is that the most in-demand data science skill set is programming and coding. Beyond the survey results, CrowdFlower used its own data enrichment platform to collect and analyze 1,024 LinkedIn data scientist job postings and found that the top two skills companies are looking for are programming and coding (seen in 55.3 percent of job postings) and statistical tools (seen in 52.1 percent of job postings).

"We know that data scientists are valuable for their companies, but there's still a disconnect between what they actually do and what they want to do," said Biewald. "At the end of the day, the time they invest in cleaning data is time that could be better spent doing strategic, creative work like predictive analysis or data mining. If companies can give data scientists some of that data cleaning time back, they'll have happier teams that can focus on really exciting things."

If a data scientist is not in your present organization, there is a very strong likelihood one will be in your future. Providing data scientists with the support they need to better help organizations succeed will be key, and that includes the tools to free up time now spent on data cleaning. As Biewald noted, “It will be interesting to see when we do this again next year how much the responses change in terms of closing the gap.”




Edited by Maurice Nagle