Copyright (c) 2011 Rexer Analytics All Rights Reserved
2010 Data Miner Survey:

Thank you for your interest in the 4th Annual Rexer Analytics Data Miner Survey.  

Each year this research examines the analytic behaviors, needs, preferences, and
views of data mining professionals.  It is conducted as a service to the data mining
community.  It is not conducted for, or sponsored by, any third party.  Rexer Analytics
is committed to freely disseminating our research findings through report
summaries, conference presentations, and personal contact.  If you would like a
copy of this year's FREE 37-page summary report, or summary reports from
previous years, please contact us at
DataMinerSurvey@RexerAnalytics.com.  
Please also contact us if you have questions about this research program or if you
have suggestions for topics that should be included in future data miner surveys.

Rexer Analytics has been conducting the Data Miner Survey since 2007.  Summary
reports (PDFs of about 40 pages) of each of the six surveys are available FREE to
everyone -- simply email your request to
DataMinerSurvey@RexerAnalytics.com.  
Also, highlights of each Data Miner Survey are available online, including best
practices shared by respondents on analytic success measurement, overcoming
data mining challenges, and other topics.

2010 SURVEY HIGHLIGHTS:

  • SURVEY & PARTICIPANTS:  50-item survey of data miners, conducted online
    in early 2010.  735 participants from 60 countries.

  • FIELDS & GOALS:  Data miners work in a diverse set of fields.  CRM /
    Marketing has been the #1 field in each of the past four years.  Fittingly,
    “improving the understanding of customers”, “retaining customers” and other
    CRM goals are also the goals identified by the most data miners surveyed.

  • ALGORITHMS:  Decision trees, regression, and cluster analysis continue to
    form a triad of core algorithms for most data miners.  However, a wide variety
    of algorithms are being used.  This year, for the first time, the survey asked
    about Ensemble Models, and 22% of data miners report using them.

  • MODELS:  About one-third of data miners typically build final models with 10
    or fewer variables, while about 28% generally construct models with more
    than 45 variables.

  • TOOLS:  After a steady rise across the past few years, the open source data
    mining software R overtook other tools to become the tool used by more data
    miners (43%) than any other.  STATISTICA, which has also been climbing in
    the rankings, is selected as the primary data mining tool by the most data
    miners (18%).  Data miners report using an average of 4.6 software tools
    overall.  STATISTICA, IBM SPSS Modeler, and R received the strongest
    satisfaction ratings in both 2010 and 2009.

  • TECHNOLOGY:  Data Mining most often occurs on a desktop or laptop
    computer, and frequently the data is stored locally.  Model scoring typically
    happens using the same software used to develop models.  STATISTICA
    users are more likely than other tool users to deploy models using PMML.

  • CHALLENGES: As in previous years, dirty data, explaining data mining to
    others, and difficult access to data are the top challenges data miners face.  
    This year, data miners also shared best practices for overcoming these
    challenges.

  • FUTURE:  Data miners are optimistic about continued growth in the number
    of projects they will be conducting, and growth in data mining adoption is the
    number one “future trend” identified.  There is room to improve:  only 13% of
    data miners rate their company’s analytic capabilities as “excellent” and only
    8% rate their data quality as “very strong”.