Overcoming Data Mining Challenges:
Explaining Data Mining to Others
In each of the four annual data miner surveys, "Explaining Data Mining to Others" has been identified as one of the most frequent challenges data miners face. While completing the 4th Annual Survey (2010), 65 of the 735 data miners shared their experiences in overcoming this challenge. Below is the full text of the "best practices" they shared. A summary of data miners' experiences overcoming the top three data mining challenges is also available.
Challenge: Explaining Data Mining to Others
- Leveraging "Competing on Analytics" and case studies from other organizations helps build a sense of what is possible. Taking small, impactful projects internally and then promoting those projects throughout the organization helps adoption. Finally, serving the data up in a meaningful application (a BI tool) shows our stakeholders what data mining is capable of delivering.
- Initiate Knowledge Sharing Sessions about DM basics and purposes.
- Graphical representations are very helpful (e.g., gain or lift charts).
- The problem is in getting enough time to lay out the problem and show the solution. Most upper managers want short presentations but don't have the background to just get the results. They often don't buy into the solutions because they don't want to see the background. Thus we try to work with their more ambitious direct reports, who are more willing to see the whole presentation and, if they buy into it, will defend the solution with their immediate superiors.
- Focus on dollars, overall benefit of model application to the Balance Sheet and P&L.
- Measuring results against control groups is the best way to convince people of data mining results.
- I've brought product managers (clients) to my desk and had them work with me on which analyses were important to them. That way I was able to manipulate the data on the fly based on their expertise to analyze different aspects that were interesting to them.
- Explaining results and their business impact with visual & graphical presentation, explaining historical trends & variance analysis, logically helps explain business trends in data to business users.
- One View of the Truth philosophy. Definitions of variables are consistent across business functions.
- Visualize and explain models and model spaces. Explain and interpret results. Show and explain evaluation and significance of results.
- It all depends: how you explain data mining or the techniques you used depends completely on the audience. Example: For sales, it is how you can help them, what insights you can give them to better understand their consumer base, or how to make better decisions...
- Keeping it simple and getting them on board that this will not diminish their role.
- Find analogies in the sports or gambling domain that everyone can understand.
- Our organization is in a grass roots mode at this point in time and the management teams are not as familiar with some of the DM and TA concepts, so we are beginning a Lunch and Learn training program to bring the organization up to speed with respect to CRISP-DM and the software outputs.
- Using Portrait Customer Analytics, which has a great visual interface, on the fly in front of clients while presenting analytic deliverables has been great...delivers more value than static PowerPoints, and gets our clients to collaborate with us on the analytics...also ends up generating more projects because we discover interesting segments during presentations.
- Lots of graphics including standard control charting of root cause variables helps. Connect standard process control charting to illustrate data mining results.
- We try to explain data mining by giving successful business examples. This is always challenging if you have not worked on a particular subject or project.
- We're funneling all BI-related needs through a formal "Decision Support Committee" meeting comprised of Accounting, Finance, IT, and Strategic Financial Reporting through which needs are prioritized, funded, and communicated. Also, all-hands meetings (monthly) are being used to evangelize the needs, benefits, and requirements that go into data mining/BI.
- Graphs and Charts. Also, I have a 2-dimensional example I created that illustrates how a "rule" slices the picture into "puzzle-pieces" and assigns each piece a confidence value of a point being "red" (the target value is "red" for true, and "white" for false). Ever since I started sending this example out, and when necessary explaining it (usually in a larger talk), the questions about why what I do can be trusted have literally dried up.
- Education of DM-interested parties (DM consumers).
- I don't explain what the model looks like, but I show how much money they lose if they don't use the model well.
- I rely on visuals reflecting a model's potential impact on operations.
- It seems to work best to show others a simple data mining exercise so that you can take them from beginning to end - from input data to results. This usually makes sense to most people, and they are more ready to attempt their own complex data mining projects.
- Really spending time to simplify what the data mining is doing and creating PowerPoint slides which show what is happening at a simplistic level.
- Use a simple example that worked well, use metaphors, and use business language for non-data miners.
- Use of simple data sets, e.g. with 10 cases and 1 target and 5 explanatory factors and demonstrating chi square calculations on 2x2 tables is successful.
- Usually use graphics that are intuitive to the customer.
- Visualisation tools that allow users to explore, such as Vortex from Dotmatics.
- We are not trying to explain data mining, just using ROI all the time, no need to explain anything else.
- We don't explain data mining as such, but we explain the results of data mining through graphical statistical & models exports into MS Office applications.
- We occasionally hold seminars with business users to explain to them how data mining exercises are accomplished and the value they can provide.
- We use campaign results (take rates/revenue) to explain the effectiveness of models on business outcomes.
- While our management is very hands-on and quantitative we don't typically get push back on black-box models as long as we can show relative variable importance and partial dependence charts that make sense in business terms. For example, any at-risk model that doesn't have a recency predictor as one of the top variables is suspect.
- Developed a dataflow diagram for the overall Data Mining processes.
- Where do I start? In order to obtain funding for my research (I am only a graduate student at UW Madison), I had to provide evidence and an explanation as to why what I was going to do at least had the potential to provide some basic insight. On a different note, I have found that MindMapping (I like NovaMind 5), is a very effective way of not only communicating the potential benefits of data mining, but also provides a very effective 'path' for the researcher to follow to organize his/her thoughts and plan the data mining process/techniques, etc.
- Short courses and finding partners with "built-in" credibility in the group we are trying to get the tools to.
- Simplify and explain in business terms.
- There is a lack of basic understanding: clients believe that, being analysts, we can tell them what data we need. As we work across multiple domains, it's not possible to rely on past experience. We have created some demo models and show the data insights to guide clients.
- Through past projects / references.
- Actually, it's easier to explain to Engineers than Statisticians; Engineers understand algorithms and methods to identify patterns.
- Using graphical representations and simple examples.
- I like to explain DM as a process of exploration of emergent patterns in a data set. Let the data tell us what's happening (just as a media company we have to listen to the consumers).
- Compiled PowerPoint presentation to compare a model's results built internally vs. a model built by an external vendor. Created a story of the model and showed a side by side comparison.
- Using ODMr GUI.
- Use of "decision trees in slow motion" PPT to explain what we do.
- Be careful whom you hire. Applicants with a Masters degree fare much better than ones with a PhD degree.
- Short presentation of data mining at the yearly meeting, showing the results of last year and their application.
- Beginner tutorials.
- Using YouTube videos as a medium for teaching simple algorithms.
- We have started to develop "easy-to-understand" presentation methodologies, which attempt to explain data mining in non-mathematical terminology. This gives our business users an intuitive explanation of what we're doing.
- Start projects with a presentation containing a very simple example (less than 10 data elements) to explain what DM does.
- Presenting case studies and challenging the audience.
- Dashboards, Visualization tools, Show and Tell presentations.
- Using graphical demonstrations.
- Use examples from real life to explain analytics - if you were presented with a set of 40 pictures of people you don't know across your company, how would you group them into buckets (clusters)? If you baked two cakes with same ingredients in different proportions, how would you know what changes made one cake better than the other?
- Giving (a) practical examples of the results of a project, e.g., 87% sensitivity and 70% specificity on a particular project; (b) examples of automatically induced rules, so that laymen can see that they are more comprehensible than regression coefficients, odds ratios, residuals, etc.
- Making graphics.
- Using simple words rather than statistical jargon helps others understand data mining and its implications.
- Keep explanations at a simplified level. Do not attempt to explain algorithms used.
- The most difficult thing is to explain that a real test set is absolutely required to benchmark predictive models. Nearly all so-called "data miners" still compare the performance of data mining tools based on the AUC of the models on the LEARNING dataset! Aaaargh!
- Graphics and a lot of reporting.
- Formally teaching applications to interested parties.
- PowerPoint presentations. Simplifying statistical concepts.
- Keep it at their level.
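Several respondents above mention walking an audience through a small 2x2 table, either to demonstrate a chi-square calculation or to report sensitivity and specificity. The following is a minimal sketch of that kind of walkthrough, not any respondent's actual material. The function name `two_by_two_summary` and the counts used are hypothetical; the counts were chosen only so that the rates match the 87% sensitivity and 70% specificity quoted in the list.

```python
def two_by_two_summary(tp, fp, fn, tn):
    """Summarise a 2x2 table of predicted vs. actual outcomes.

    tp: predicted positive, actually positive
    fp: predicted positive, actually negative
    fn: predicted negative, actually positive
    tn: predicted negative, actually negative
    """
    n = tp + fp + fn + tn
    # Row and column totals of the table.
    predicted_pos, predicted_neg = tp + fp, fn + tn
    actual_pos, actual_neg = tp + fn, fp + tn
    if 0 in (predicted_pos, predicted_neg, actual_pos, actual_neg):
        raise ValueError("every row and column total must be non-zero")

    # Share of actual positives the model caught.
    sensitivity = tp / actual_pos
    # Share of actual negatives the model correctly left alone.
    specificity = tn / actual_neg
    # Pearson chi-square statistic for a 2x2 table
    # (shortcut formula, no continuity correction).
    chi_square = (n * (tp * tn - fp * fn) ** 2) / (
        predicted_pos * predicted_neg * actual_pos * actual_neg
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "chi_square": chi_square,
    }


# Hypothetical counts for illustration only: 100 actual positives
# and 100 actual negatives.
summary = two_by_two_summary(tp=87, fp=30, fn=13, tn=70)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With only four numbers on screen, an audience can check each ratio by hand, which is the appeal of the "simple data set" approach the respondents describe.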
© 2017 Rexer Analytics. All Rights Reserved.