Overcoming Data Mining Challenges:
Explaining Data Mining to Others
Copyright (c) 2011 Rexer Analytics, All rights reserved
In each of the four annual data miner surveys, "Explaining Data Mining to Others"
has been identified as one of the most frequent challenges data miners face.
While completing the 4th Annual Survey (2010), 65 of the 735 data miners shared
their experiences in overcoming this challenge. Below is the full text of the "best
practices" they shared. A summary of data miners' experiences overcoming the
top three data mining challenges is also available.
Challenge: Explaining Data Mining to Others
- Leveraging "Competing on Analytics" and case studies from other
organizations helps build the power of the possible. Taking small
impactful projects internally and then promoting those projects
throughout the organization helps adoption. Finally, serving the data
up in a meaningful application - BI tool - shows our stakeholders what
data mining is capable of delivering.
- Initiate Knowledge Sharing Sessions about DM basics and purposes.
- Graphical representations are very helpful (e.g., gain or lift charts).
- The problem is in getting enough time to lay out the problem and
show the solution. Most upper management wants short
presentations but doesn't have the background to just get the results.
They often don't buy into the solutions because they don't want to see
the background. Thus we try to work with their more ambitious direct
reports who are more willing to see the whole presentation and, if they
buy into it, will defend the solution with their immediate superiors.
- Focus on dollars, overall benefit of model application to the Balance
Sheet and P&L.
- Measuring results compared to control groups is the best way to
convince people about data mining results.
- I've brought product managers (clients) to my desk and had them work
with me on what analyses were important to them. That way I was able
to manipulate the data on the fly based on their expertise to analyze
different aspects that were interesting to them.
- Explaining results and their business impact with visual & graphical
presentation, explaining historical trends & variance analysis, logically
helps explain business trends in data to business users.
- One View of the Truth philosophy. Definitions of variables are
consistent across business functions.
- Visualize and explain models and model spaces. Explain and interpret
results. Show and explain evaluation and significance of results.
- It all depends: how you explain data mining or the techniques
you utilized depends completely on the audience. Example: For sales
it is how you can help them, what insights you can give them to better
understand their consumer base or how to make better decisions...
keeping it simple and getting them on-board that this will not diminish
- Find analogies in the sports or gambling domain that everyone can
- Our organization is in a grass roots mode at this point in time and the
management teams are not as familiar with some of the DM and TA
concepts, so we are beginning a Lunch and Learn training program to
bring the organization up to speed with respect to CRISP-DM and the
- Using Portrait Customer Analytics, which has a great visual interface,
on the fly in front of clients while presenting analytic deliverables has
been great...delivers more value than static PowerPoints, and gets our
clients to collaborate with us on the analytics...also ends up generating
more projects because we discover interesting segments during
- Lots of graphics including standard control charting of root cause
variables helps. Connect standard process control charting to illustrate
data mining results.
- We try to explain data mining by giving successful business examples.
This is always challenging if you have not worked on a particular
subject or project.
- We're funneling all BI-related needs through a formal "Decision Support
Committee" meeting comprised of Accounting, Finance, IT, and
Strategic Financial Reporting through which needs are prioritized,
funded, and communicated. Also, all-hands meetings (monthly) are
being used to evangelize the needs, benefits, and requirements that go
into data mining/BI.
- Graphs and Charts. Also, I have a 2-dimensional example I created
that illustrates how a "rule" slices the picture into "puzzle-pieces" and
assigns each piece a confidence value of a point being "red" (the target
value is "red" for true, and "white" for false). Ever since I started
sending this example out, and when necessary explaining it (usually in
a larger talk), the questions about why what I do can be trusted have
literally dried up.
- Education of DM-interested parties (DM consumers).
- I don't explain what the model looks like, but I show how much money
they lose if they don't use the model well.
- I rely on visuals reflecting a model's potential impact on operations.
- It seems to work best to show others a simple data mining exercise so
that you can take them from beginning to end - from input data to
results. This usually makes sense to most people, and they are more
ready to attempt their own complex data mining projects.
- Really spending time to simplify what the data mining is doing and
creating PowerPoint slides which show what is happening at a
- A simple example that worked well; use metaphors and business
language for non-data miners.
- Use of simple data sets, e.g. with 10 cases and 1 target and 5
explanatory factors and demonstrating chi square calculations on 2x2
tables is successful.
- Usually use graphics that are intuitive to the customer.
- Visualisation tools that allow users to explore, such as Vortex from
- We are not trying to explain data mining, just using ROI all the time, no
need to explain anything else.
- We don't explain data mining as such, but we explain the results of data
mining through graphical statistical & models exports into MS Office
- We occasionally hold seminars with business users to explain to them
how data mining exercises are accomplished and the value they can
- We use campaign results (take rates/revenue) to explain the
effectiveness of models on business outcomes.
- While our management is very hands-on and quantitative we don't
typically get push back on black-box models as long as we can show
relative variable importance and partial dependence charts that make
sense in business terms. For example, any at-risk model that doesn't
have a recency predictor as one of the top variables is suspect.
- Developed a dataflow diagram for the overall Data Mining processes.
- Where do I start? In order to obtain funding for my research (I am
only a graduate student at UW Madison), I had to provide evidence and
an explanation as to why what I was going to do at least had the
potential to provide some basic insight. On a different note, I have
found that MindMapping (I like NovaMind 5), is a very effective way of
not only communicating the potential benefits of data mining, but also
provides a very effective 'path' for the researcher to follow to organize
his/her thoughts and plan the data mining process/techniques, etc.
- Short courses and finding partners with "built-in" credibility in the group
we are trying to get the tools to.
- Simplify and explain in business terms.
- There is a lack of basic understanding: the client believes that, being
analysts, we can tell them what data we need. As we work on multiple
domains, it's not possible to rely on past experience. We have created
some demo models and show the data insights to guide clients.
- Through past projects / references.
- Actually, it's easier to explain to Engineers than Statisticians; Engineers
understand algorithms and methods to identify patterns.
- Using graphical representations and simple examples.
- I like to explain DM as a process of exploration of emergent patterns in
a data set. Let the data tell us what's happening (just as a media
company we have to listen to the consumers).
- Compiled PowerPoint presentation to compare a model's results built
internally vs. a model built by an external vendor. Created a story of the
model and showed a side by side comparison.
- Use of "decision trees in slow motion" PPT to explain what we do.
- Be careful whom you hire. Applicants with a Master's degree fare much
better than ones with a PhD degree.
- Short presentation of data mining at the yearly meeting, showing the
results of last year and their application.
- Using YouTube videos as a medium for teaching simple algorithms.
- We have started to develop "easy-to-understand" presentation
methodologies, which attempt to explain data mining in non-
mathematical terminology. This gives our business users an intuitive
explanation of what we're doing.
- Start projects with a presentation containing a very simple example
(less than 10 data elements) to explain what DM does.
- Presenting case studies and challenging the audience.
- Dashboards, Visualization tools, Show and Tell presentations.
- Using graphical demonstrations.
- Use examples from real life to explain analytics - if you were presented
with a set of 40 pictures of people you don't know across your company,
how would you group them into buckets (clusters)? If you baked two
cakes with same ingredients in different proportions, how would you
know what changes made one cake better than the other?
- Giving (a) practical examples of the results of a project e.g., 87%
sensitivity and 70% specificity on a particular project. (b) examples of
automatically induced rules so that laymen can see that they are more
comprehensible than are regression coefficients, odds ratios, residuals,
- Using simple words rather than statistical jargon will help others
understand data mining and its implications.
- Keep explanations at a simplified level. Do not attempt to explain
- The most difficult is to explain that a real test-set is absolutely required
to benchmark predictive models. Nearly all so-called "data miners" still
compare the performances of data mining tools based on the AUC of
the models on the LEARNING dataset! Aaaargh!
- Graphics and a lot of reporting.
- Formally teaching applications to interested parties.
- PowerPoint presentations. Simplifying statistical concepts.
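
One respondent above reports success with very small data sets (about 10
cases) and chi-square calculations on 2x2 tables. The Python sketch below
shows what such a demonstration might look like. It is only an illustration:
the ten cases are invented, and the names cases, two_by_two, and
chi_square_2x2 are hypothetical helpers, not anything taken from the survey.
It computes the plain Pearson chi-square statistic (no continuity
correction) for one binary explanatory factor against one binary target.

```python
from collections import Counter

# Hypothetical toy data (invented for illustration): 10 cases, each a pair
# (factor, target) where both values are binary.
cases = [
    (1, 1), (1, 1), (1, 1), (1, 1), (1, 0),
    (0, 1), (0, 0), (0, 0), (0, 0), (0, 0),
]


def two_by_two(rows):
    """Count the four cells of the 2x2 table.

    Returns (a, b, c, d) where
      a = factor present, target present
      b = factor present, target absent
      c = factor absent,  target present
      d = factor absent,  target absent
    """
    counts = Counter(rows)
    return counts[(1, 1)], counts[(1, 0)], counts[(0, 1)], counts[(0, 0)]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, without continuity correction.

    Uses the shortcut formula n * (ad - bc)^2 / (product of the four
    row and column totals).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / denominator


a, b, c, d = two_by_two(cases)
chi2 = chi_square_2x2(a, b, c, d)

print("            target=1  target=0")
print(f"factor=1    {a:8d}  {b:8d}")
print(f"factor=0    {c:8d}  {d:8d}")
print(f"chi-square = {chi2:.2f} (1 degree of freedom)")
```

With these invented cases the table is 4, 1, 1, 4 and the statistic is 3.6,
just under the 3.841 critical value for one degree of freedom at the 5%
level. With only 10 cases the expected count in each cell is 2.5, below the
usual rule of thumb of 5, so the chi-square approximation is rough. That
makes the example suitable for walking an audience through the arithmetic,
not for drawing conclusions.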