Data Science: What the World Needs is Answers, Not Just Insights Part 2 (of 3)
October 8, 2012 at 10:57 am Mary Ludloff 5 comments
By Marilyn Craig, Managing Director, Insight Voices
As you may or may not know, we are in the midst of a 3-part series on data science, covering roles, skills, and more: generally, what you should think about as well as what’s not as important (no matter what the latest articles say!). For Part 2, we have a guest poster: Marilyn Craig of Insight Voices. Marilyn is what I like to call a “classic quant.” She has been at the forefront of big data and data science since before most people knew these terms (and spaces) existed, and she has been my go-to person whenever I had an analytics question (see title) that I needed an answer to. In this post, Marilyn looks at insights and makes the case for why we should all care far more about answers. Take it away, Marilyn!
Here’s an interesting question for this new world order of Big Data Analytics: what’s an Insight and what’s an Answer? Sometimes they are the same, sometimes not. An insight is a piece of information or understanding. It may or may not be useful. It may or may not help your business improve, solve world hunger, or even make sense. An answer is always useful. It is the result of asking a question. And the best kinds of answers are those that solve the questions that you really care about.
A lot of the excitement about Big Data is the “insights” we can get now where we couldn’t before (often due to the quality of our tools). But just having someone wander through a ton o’ data looking for interesting “insights” is potentially a huge waste of time, because an insight is not necessarily the answer to a question. There have been zillions of questions over the last couple of decades that I’ve never been able to answer for the businesses I’ve worked on. The insights I really want are the answers to those questions.
That’s why I completely agree with Mary’s assessment of where we should start looking for “data science” resources – inside our own organizations. Those folks are much more likely to know the important questions that need answers. The risk in bringing in a bright, new, shiny Data Scientist from outside is that they might spend time digging through your data looking for insights that aren’t really important. Now, I’m not saying that all data scientists are undisciplined practitioners who don’t listen to the questions and hypotheses presented by business decision makers. Good DS’s know to ask the right questions first and then approach the data and analysis. My point is this: asking the right questions is the most important part of the data science process. This has always been the case and will always be the case. Just wandering around in the data may prove useful, but finding answers to the wrong questions or unhelpful questions is a waste of time and money. (This is also why I shudder when vendors come in and try to convince me that the petabytes/second that they can process should be the most important thing to me – obviously, they’ve never walked in my shoes!)
A good example of asking the right question (or wrong, depending on your reaction to the “creepy quotient”) is the Target pregnancy prediction kerfuffle. On one hand, the marketers at Target were absolutely asking the right question: How can I communicate with new parents before anyone else does to get a jump on selling them appropriate products? However, any experienced marketer (or data scientist with the appropriate mindset) who was focused on the customer experience would have said, “Stop!” when presented with the outcome of the first round of research. Target learned very quickly that it is not enough to just reach these prospects first. They need to know how to reach them first without freaking them out!! You have to have the right customer-focused people involved in your analytics process, asking the right questions, AND thinking through how you are going to use the answer.
I will guarantee you that you have individuals (not just statisticians or analysts, but marketers, sales people, accountants) in your organizations who are perfectly capable and very willing to fulfill the data scientist role, provided that you give them a toolset that is designed for them and not for some über geek programmer. (No offense to the geeks out there, as I am married to one, which makes for some very interesting conversations about things like Bloom filters.) These folks have the passion for figuring out what drives your business and have been repeatedly frustrated by the inability of the tools available to them (traditional BI and Big Data 1.0 tools being the worst) to help them get the answers they seek. Give them the right tools, set them loose, and Eureka! You will have achieved two important objectives: answers about your business and employee satisfaction in your ranks.
This is what I know: Data science is not only the purview of specialists. Listen, math and statistics can be learned (and in the big data age, all of us should become better at both disciplines) but very, very few of us will need to get PhDs. A company is always best served by first enhancing the skills of folks who already understand its business and then supplementing them with “specialists.”
I’ve been in and around the world of data for almost 25 years. IMHO, big data is only kind of new. Many industries, like financial services, retail, medicine, and logistics, have been generating “big data” forever. And those of us working in those businesses as marketers, product developers, financial analysts, and decision makers have always been frustrated by not being able to harness all that data to drive our businesses, not to mention the dreams that folks in other industries had. We have always been grappling with these two issues:
- We are saddled with tools that don’t respect our time. You can’t do big data with Excel but you can do amazingly complex analysis on small data with it. So don’t tell me I have to become an advanced Java programmer to solve my big data problems (Hadoop vendors, are you listening?).
- We didn’t collect and store the data we had access to. To make our IT partners happy, we aggregated our data within an inch of its life, thereby losing a lot of interesting variation that we needed. Thankfully, storage costs are now within a normal person’s (i.e. company’s) budget.
As Cosma Shalizi, my favorite statistics professor, puts it: Our theories (and desires for answers) ran way ahead of what we were able to do with the data and tools available. He goes on to point out that statisticians have always had the skills now lauded as what is needed for a data scientist – statistics, computer science, data visualization, and the social sciences.
We are at a crossroads. Today, for many companies, the data is being collected and with Big Data 2.0 companies (like PatternBuilders) we finally have the tools we need to utilize it. But the discipline of statistics is not new, and there have always been talented, articulate practitioners willing, and certainly able, to tackle any analysis we could throw at them. As we move forward into the big data age, keep this in mind: focus on the answers, not just the “insights!”
Do you agree or disagree? Let me know in the comments! Next up: Part 3 of 3—Mary’s take on the data science team and some other roles.
Entry filed under: big data, General Analytics, Technology. Tags: analytics, big data, business intelligence, data science, data visualization.
1.
Big Data and Science: Focus on the Business and Team, Not the Data (Part 3 of 3) « Big Data Big Analytics | October 20, 2012 at 4:51 am
[…] Did you notice that technology and data science are not reflected in any of the characteristics? Some of you may consider this sacrilege—after all, we are operating in a world where technology (and I happily work for one of those companies) has changed the data collection, usage, and analysis game. Colleges and universities are now offering master degrees in analytics. The role of the data scientist has been pretty much deified (I refer you to Part 1 of this series). And we all need to be very worried about the “talent shortage” and our ability to recruit the “right analytical team” (I refer you to Part 2 of this series). […]
2.
A Big Data Showdown: How many V’s do we really need? Three! « Big Data Big Analytics | January 17, 2013 at 7:06 pm
[…] all things to do with big data as Marilyn and I pointed out in a series of blog posts (Part 1, Part 2, and Part 3) that sought to bring some pragmatism to the ever looming, hyperbolic data scientist […]
3.
Big Data Project: Let’s Start at the Very Beginning—The Big Data Playbook « Big Data Big Analytics | February 12, 2013 at 5:13 pm
[…] your best programmers, your best technologists (see our series on big data talent, Part 1, Part 2, and Part 3). What’s missing from this group? Business users—you know, the people responsible […]
4.
Big Data Project: Let’s Start at the Very Beginning—The Big Data Playbook | Big Data Big Analytics | March 21, 2013 at 10:50 am
[…] your best programmers, your best technologists (see our series on big data talent, Part 1, Part 2, and Part 3). What’s missing from this group? Business users—you know, the people responsible […]
5.
Big Data Project: Start with a Question that You Want to Answer | Big Data Big Analytics | April 3, 2013 at 5:39 pm
[…] question that needs to be answered which is something I spent a great deal of time writing about in our series on the data science team. However, Harvard Business Review is much more succinct in identifying the […]