Overcoming Data Mining Challenges:
Explaining Data Mining to Others
Copyright (c) 2011 Rexer Analytics, All rights reserved

In each of the four annual data miner surveys, "Explaining Data Mining to Others"
has been identified as one of the most frequent challenges data miners face.

While completing the 4th Annual Survey (2010), 65 of the 735 data miners shared
their experiences in overcoming this challenge.  Below is the full text of the "best
practices" they shared.  A summary of data miners' experiences overcoming the
top three data mining challenges is also available.

Challenge:  Explaining Data Mining to Others

  • Leveraging "Competing on Analytics" and case studies from other
    organizations help build the power of the possible.  Taking small
    impactful projects internally and then promoting those projects
    throughout the organization helps adoption.  Finally, serving the data
    up in a meaningful application - BI tool - shows our stakeholders what
    data mining is capable of delivering.

  • Initiate Knowledge Sharing Sessions about DM basics and purposes.

  • Graphical representations are very helpful (e.g., gain or lift charts).

  • The problem is in getting enough time to lay out the problem and
    show the solution.  Most upper managers want short
    presentations but don't have the background to just get the results.
    They often don't buy into the solutions because they don't want to see
    the background.  Thus we try to work with their more ambitious direct
    reports who are more willing to see the whole presentation and, if they
    buy into it, will defend the solution with their immediate superiors.

  • Focus on dollars, overall benefit of model application to the Balance
    Sheet and P&L.

  • Measuring results compared to control groups is the best to convince
    people about data mining results.  

  • I've brought product managers (clients) to my desk and had them work
    with me on which analyses were important to them.  That way I was able
    to manipulate the data on the fly based on their expertise to analyze
    different aspects that were interesting to them.

  • Explaining results and their business impact with visual & graphical
    presentation, explaining historical trends & variance analysis, logically
    helps explain business trends in data to business users.

  • One View of the Truth philosophy.  Definitions of variables are
    consistent across business functions.

  • Visualize and explain models and model spaces.  Explain and interpret
    results.  Show and explain evaluation and significance of results.

  • It all depends: how you explain data mining or the techniques you
    utilized depends completely on the audience.  Example: For sales
    it is how you can help them, what insights you can give them to better
    understand their consumer base or how to make better decisions...
    keeping it simple and getting them on board that this will not diminish
    their role.

  • Find analogies in the sports or gambling domain that everyone can

  • Our organization is in a grass roots mode at this point in time and the
    management teams are not as familiar with some of the DM and TA
    concepts, so we are beginning a Lunch and Learn training program to
    bring the organization up to speed with respect to CRISP-DM and the
    software outputs.

  • Using Portrait Customer Analytics, which has a great visual interface,
    on the fly in front of clients while presenting analytic deliverables has
    been great...delivers more value than static PowerPoints, and gets our
    clients to collaborate with us on the analytics...also ends up generating
    more projects because we discover interesting segments during

  • Lots of graphics including standard control charting of root cause
    variables helps.  Connect standard process control charting to illustrate
    data mining results.

  • We try to explain data mining by giving successful business examples.  
    This is always challenging if you have not worked on a particular     
    subject or project.

  • We're funneling all BI-related needs through a formal "Decision Support
    Committee" meeting comprised of Accounting, Finance, IT, and
    Strategic Financial Reporting through which needs are prioritized,
    funded, and communicated.  Also, all-hands meetings (monthly) are
    being used to evangelize the needs, benefits, and requirements that go
    into data mining/BI.

  • Graphs and Charts.  Also, I have a 2-dimensional example I created
    that illustrates how a "rule" slices the picture into "puzzle-pieces" and
    assigns each piece a confidence value of a point being "red" (the target
    value is "red" for true, and "white" for false).  Ever since I started
    sending this example out, and when necessary explaining it (usually in
    a larger talk), the questions about why what I do can be trusted have
    literally dried up.  

  • Education of DM-interested parties (DM consumers).

  • I don't explain what the model looks like, but I show how much money
    they lose if they don't use the model well.

  • I rely on visuals reflecting a model's potential impact on operations.

  • It seems to work best to show others a simple data mining exercise so
    that you can take them from beginning to end - from input data to
    results.  This usually makes sense to most people, and they are more
    ready to attempt their own complex data mining projects.   

  • Really spending time to simplify what the data mining is doing and
    creating PowerPoint slides which show what is happening at a        
    simplistic level.

  • Simple examples worked well; use metaphors and business
    language for non-data miners.

  • Use of simple data sets, e.g. with 10 cases and 1 target and 5
    explanatory factors and demonstrating chi square calculations on 2x2
    tables is successful.

  • Usually use graphics that are intuitive to the customer.  

  • Visualisation tools that allow users to explore, such as Vortex from

  • We are not trying to explain data mining, just using ROI all the time, no
    need to explain anything else.

  • We don't explain data mining as such, but we explain the results of data
    mining through graphical statistical & models exports into MS Office

  • We occasionally hold seminars with business users to explain to them
    how data mining exercises are accomplished and the value they can

  • We use campaign results (take rates/revenue) to explain the
    effectiveness of models on business outcomes.

  • While our management is very hands-on and quantitative we don't
    typically get push back on black-box models as long as we can show
    relative variable importance and partial dependence charts that make
    sense in business terms.  For example, any at-risk model that doesn't
    have a recency predictor as one of the top variables is suspect.  

  • Developed a dataflow diagram for the overall Data Mining processes.

  • Where do I start?  In order to obtain funding for my research (I am     
    only a graduate student at UW Madison), I had to provide evidence and
    an explanation as to why what I was going to do at least had the    
    potential to provide some basic insight.  On a different note, I have
    found that MindMapping (I like NovaMind 5), is a very effective way of
    not only communicating the potential benefits of data mining, but also
    provides a very effective 'path' for the researcher to follow to organize
    his/her thoughts and plan the data mining process/techniques, etc.

  • Short courses and finding partners with "built-in" credibility in the group
    we are trying to get the tools to.  

  • Simplify and explain in business terms.   

  • There is a lack of basic understanding: the client believes that, being
    analysts, we can tell them what data we need.  As we work on multiple
    domains, it's not possible to rely on past experience.  We have created
    some demo models and show the data insights to guide clients.

  • Through past projects / references.  

  • Actually, it's easier to explain to Engineers than Statisticians; Engineers
    understand algorithms and methods to identify patterns.

  • Using graphical representations and simple examples.

  • I like to explain DM as a process of exploration of emergent patterns in
    a data set.  Let the data tell us what's happening (just as a media
    company we have to listen to the consumers).

  • Compiled PowerPoint presentation to compare a model's results built
    internally vs. a model built by an external vendor.  Created a story of the
    model and showed a side by side comparison.

  • Using ODMr GUI.  

  • Use of "decision trees in slow motion" PPT to explain what we do.

  • Be careful whom you hire. Applicants with a Master's degree fare much
    better than ones with a PhD degree.

  • Short presentation of data mining at the yearly meeting, showing the
    results of last year and their application.

  • Beginner tutorials.

  • Using YouTube videos as a medium for teaching simple algorithms.

  • We have started to develop "easy-to-understand" presentation
    methodologies, which attempt to explain data mining in non-
    mathematical terminology.  This gives our business users an intuitive
    explanation of what we're doing.   

  • Start projects with a presentation containing a very simple example
    (less than 10 data elements) to explain what DM does.

  • Presenting case studies and challenging the audience.

  • Dashboards, Visualization tools, Show and Tell presentations.

  • Using graphical demonstrations.

  • Use examples from real life to explain analytics - if you were presented
    with a set of 40 pictures of people you don't know across your company,
    how would you group them into buckets (clusters)?  If you baked two
    cakes with same ingredients in different proportions, how would you
    know what changes made one cake better than the other?   

  • Giving (a) practical examples of the results of a project, e.g., 87%
    sensitivity and 70% specificity on a particular project.  (b) examples of
    automatically induced rules so that laymen can see that they are more
    comprehensible than are regression coefficients, odds ratios, residuals,

  • Making graphics.

  • Using simple words rather than statistical jargon will help others in
    understanding data mining and its implications.

  • Keep explanations at a simplified level.  Do not attempt to explain
    algorithms used.

  • The most difficult thing is to explain that a real test-set is absolutely
    required to benchmark predictive models. Nearly all so-called "data miners" still
    compare the performances of data mining tools based on the AUC of
    the models on the LEARNING dataset! Aaaargh!

  • Graphics and a lot of reporting.

  • Formally teaching applications to interested parties.

  • PowerPoint presentations.  Simplifying statistical concepts.  

  • Analogies.  

  • Keep it at their level.
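
One respondent above suggested demonstrating chi square calculations on 2x2
tables with a data set of about 10 cases.  The sketch below is one possible
way to set up such a demonstration.  The 10 cases, the function name
chi_square_2x2, and the choice of the plain Pearson statistic (no continuity
correction) are all assumptions made for illustration; no respondent
described their actual data or calculation.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                  target=1   target=0
        factor=1      a          b
        factor=0      c          d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero; statistic undefined")
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins)
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical toy data: 10 cases, each a (factor, target) pair of 0/1 values.
cases = [
    (1, 1), (1, 1), (1, 1), (1, 1), (1, 0),
    (0, 1), (0, 0), (0, 0), (0, 0), (0, 0),
]

# Count the four cells of the 2x2 table.
a = sum(1 for f, t in cases if f == 1 and t == 1)
b = sum(1 for f, t in cases if f == 1 and t == 0)
c = sum(1 for f, t in cases if f == 0 and t == 1)
d = sum(1 for f, t in cases if f == 0 and t == 0)

chi2 = chi_square_2x2(a, b, c, d)
print(f"table: [[{a}, {b}], [{c}, {d}]]")
print(f"chi-square = {chi2:.2f}")
```

With these made-up counts the statistic is 3.6, just under the 5% critical
value of 3.841 for one degree of freedom, which can itself be a useful
talking point about how little 10 cases can prove.  With expected counts this
small the chi-square approximation is rough, so in real work an exact test
would usually be preferred.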