By now, everyone knows that the “Sexiest Job of the 21st Century” is the data scientist. The Harvard Business Review made that declaration in 2012, and the race to become (or rebrand oneself as) a data scientist was on at a fever pitch. Bringing in data scientists with strong educational and professional backgrounds is important, of course, but training the business to use those data scientists properly is just as critical. Otherwise, it would be like owning a fancy new Tesla without taking the time to learn how to charge it.
So how does an organization get the most out of its data scientists? How do we alleviate the shortage of data scientists that threatens to stymie the business and societal benefits these unique folks can bring? How do we ensure that data scientists are focused on helping the organization drive financial or business value (as opposed to publishing articles or speaking at conferences, a complaint that I have heard more than once)?
Let’s start that discussion with some basic but important definitions.
What Is Data Science?
To get more value from our data scientists, we first must understand “What is data science?” The best definition of data science comes from the book Moneyball: “Data science is about identifying variables and metrics that might be better predictors of business performance.”
That’s a very simple description, but let’s deconstruct it anyway.
- Identifying variables and metrics. The data science process must be driven by a creative and curious mind in order to identify and brainstorm the variables and metrics (data sources) upon which to focus the data alignment, transformation, enrichment, and visualization efforts.
- Better predictors. The focus of the data scientist is on predicting what is likely to happen and prescribing what actions to take (versus reporting on what has already happened). This requires a thorough understanding of the decisions that must be made in support of the organization’s business initiatives.
- Business performance. The key deliverable must improve business or financial performance in order for the data science work to be relevant and meaningful to the organization.
As detailed in Moneyball, the Oakland A’s discovered several variables that were better predictors of the value of a baseball player – for example, that on-base percentage was a better predictor of a hitter’s value than batting average.
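To make the comparison concrete, here is a minimal sketch of the two metrics, using the standard baseball definitions of batting average (hits divided by at-bats) and on-base percentage (times on base divided by plate appearances that count toward it). The `BattingLine` type and both players' numbers are made up for illustration and do not come from the book.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season totals for one hitter (hypothetical container for this sketch)."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def batting_average(line: BattingLine) -> float:
    # Batting average only rewards hits: H / AB.
    return line.hits / line.at_bats if line.at_bats else 0.0


def on_base_percentage(line: BattingLine) -> float:
    # OBP also credits walks and hit-by-pitch: (H + BB + HBP) / (AB + BB + HBP + SF).
    reached = line.hits + line.walks + line.hit_by_pitch
    chances = line.at_bats + line.walks + line.hit_by_pitch + line.sacrifice_flies
    return reached / chances if chances else 0.0


# Made-up stat lines: one hitter swings at everything, the other draws many walks.
free_swinger = BattingLine(at_bats=500, hits=150, walks=20, hit_by_pitch=0, sacrifice_flies=5)
patient_hitter = BattingLine(at_bats=500, hits=135, walks=90, hit_by_pitch=5, sacrifice_flies=5)

for name, line in [("free swinger", free_swinger), ("patient hitter", patient_hitter)]:
    print(f"{name}: BA={batting_average(line):.3f} OBP={on_base_percentage(line):.3f}")
```

In this toy example the free swinger has the higher batting average, while the patient hitter reaches base more often. That is the kind of gap between a familiar metric and a more predictive one that the Moneyball story describes.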
Finding The Next On-base Percentage
Organizations must help their data science teams find the next on-base-percentage kind of variable: one that predicts performance better than the metrics in use today. And the key to doing this actually lies with the business users, not the data scientists.
Table 1 highlights the roles that business users and data scientists play in collaborating to fully exploit the power and potential of data science.
| Data Science Objectives | Business User Responsibilities | Data Scientist Responsibilities |
|---|---|---|
| Identifying Variables and Metrics | Business users are best positioned to brainstorm variables and metrics (data sources) that might yield better predictors of business performance because they live the job every day and probably have a good idea of the different variables that they would like to test. | Data scientists are responsible for gathering and ingesting the data from wherever it may be located, using the most effective data-acquisition techniques and then applying different data transformation and enrichment techniques to prepare the data for analysis. |
| Better Predictors | Business users are responsible for determining which of the analytic results coming from the analytic modeling processes pass the Strategic-Actionable-Material (SAM) test. | Data scientists use data visualization techniques to better understand the interplay of the data, such as identifying variables that tend to move together under certain situations, or outliers in the data that may be indicative of something useful. |
| Business Performance | Business users own the identification of the decisions that the business is trying to make with the data science or analytic results in support of the organization’s business initiatives. Ultimately, the business users will tell you what is working and what is not working. | Data scientists build the analytic models that quantify cause-and-effect using a wide variety of analytic modeling algorithms. They then determine the quality of fit for those models. This process typically requires many iterations and provides many opportunities to learn from failure. |
Table 1. Roles of business users and data scientists.
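As a rough illustration of that division of labor, the sketch below assumes business users have proposed a few candidate variables and the data scientist wants a first-pass ranking of how strongly each one tracks the outcome. It uses a plain Pearson correlation, which is only one of many possible screening techniques and says nothing about cause and effect. The variable names and numbers are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no predictive signal
    return cov / sqrt(var_x * var_y)


def rank_predictors(candidates, outcome):
    """Sort candidate variables by the strength (absolute value) of their correlation."""
    scored = [(name, pearson(values, outcome)) for name, values in candidates.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Invented monthly figures: the outcome the business cares about...
monthly_revenue = [10, 12, 14, 16, 18]
# ...and candidate variables brainstormed by business users (hypothetical names).
candidates = {
    "repeat_visits": [1, 2, 3, 4, 5],
    "ad_impressions": [3, 1, 4, 1, 5],
}

for name, r in rank_predictors(candidates, monthly_revenue):
    print(f"{name}: r={r:.2f}")
```

A ranking like this is only a starting point. The business users still decide whether the top candidates pass the SAM test, and the data scientists still have to build and validate proper models before anyone acts on the result.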
The Power of Business Decisions
To ensure that the organization is getting the most value out of its data science operation, focus on the business decisions. These decisions are the linkage points between business users and data scientists (ensuring that everyone is focused on the same objectives), and it is around these decisions that the collaboration between business users and data scientists can deliver the most business value (Figure 1).
Figure 1. Decisions provide linkage between business and data science.
The decisions are key because:
- From a top-down perspective, decisions provide the framework around brainstorming the necessary variables and metrics (data sources), and also dictate your architecture and technology requirements.
- From a bottom-up perspective, the analytic models built from the different variables and metrics (data) create the analytic results (e.g., scores, recommendations, business rules) that will be applied to optimize the decisions that support the organization’s business initiatives.
No data science initiative should exist in a vacuum. The collaboration between business users and data scientists is central to optimizing business processes, uncovering new monetization opportunities, and realizing the most value from your big data analytics investment.
Authored by: Mr. Amit Mehta, Country Manager, Isilon Storage Division at EMC India & SAARC
(The views expressed in this article are those of Amit Mehta. Technuter.com does not accept any responsibility for them.)
@Technuter.com News Service