Best Practices in Measuring Analytic
Project Performance / Success:
Other Measures

In the 5th Annual Survey (2011) data miners shared their best practices in how
they measure analytics project performance / success.  The previous web page
summarizes the most frequently mentioned measures.  In addition to the measures
described on the other pages, data miners described best practice methodologies
that included measures of feedback from users / clients / management,
cross-validation and a wide variety of other measures.

Below is the full text of the additional best practices the data miners shared.  Some
of the richest descriptions included measurements that cross several of these
categories.  If a data miner's multi-method best practice verbatim is already
included in one of the other verbatim lists (model performance, financial
performance, and outside group performance), it is not repeated here.

  • Always build (at least) one metric which measures the experience from
    the most granular (boots on the ground) user.  You have to have a
    metric that connects to what your user's experience (read: complaint) is.

  • Positive feedback and changed habits.

  • We continuously monitor cross-validation results, feeding new
    incoming data for which the outcome is known into the validation loop.
    On decrease of performance, the model is automatically rebuilt and
    optimized.

  • Diagnostics, cross-validation, post market evaluation.

  • We track the performance of the direct response models on a daily
    basis as campaigns are in the field and look for segments where the
    model seems to be under-performing.  Using this data, we sometimes
    re-train or build secondary models to compensate.

  • Conviction rate

  • Comparative analysis of manufacturing results against laboratory

  • Defects/1000; by months-in-service.

  • Best feedback is given by in-house project people, because they are
    specialists, and mostly an analytic project has the aim to automate or
    speed up their work.  This feedback is the measure.

  • Real user of the results defined?  Satisfaction of user (how does he talk
    about it, use results in his own decisions as proof).

  • Client acceptance/understanding.  Practical recommendations flowing
    from the analysis.

  • Cross-validation, user feedback.

  • Client feedback, site visits.

  • Satisfaction from clients (evaluated in terms of knowledge gained from
    data); comparison to existing methods with applicable metrics.

  • Based on business objectives determined by stakeholder (project
    manager). This sometimes has to be translated into a data mining
    objective, but we always report back in terms and with visuals that the
    program manager can understand.

  • Business people do that using their criteria.

  • Business people say if it works from the business perspective.

  • Client opinion

  • Client satisfaction

  • Clients web activities analysis.

  • Feedback from internal clients.

  • Feedback through end user questionnaires.

  • I think they use a very informal method. They do "listen to the customer"
    but I think in a very haphazard manner.

  • Impact on customer's business.

  • Most of our work is for clients who review results presentations.

  • Peer review

  • Peer review

  • Satisfaction of the primary requestor.

  • Support from management.

  • We usually ask our customer for info about model usage and

  • Estimate model performance by cross validation etc. before
    deployment, install routines to frequently check performance after

  • X-Validation, spot tests.

  • Cross validation

  • Resampling methods to get an idea of generality of conclusions.

  • (a) empirical or predictive validity, (as *formally* defined in
    analytic/measurement theory), relative to explicitly documented "a
    priori" expectations, and (b) comparison of risk metrics that are
    calculated/estimated *prior to* deployment and with results (*if* analytic
    results are deployed).

  • Comparing analytics outputs to "known" numbers or indicators.  
    Comparing various analytic outputs to each other to make sure that
    outputs that "show up" on various analytic projects are not wildly
    different from each other.

  • By the underlying...  Claims by % savings.  Customers by % cross- or
    up-sell increment.

  • Compare marketing effectiveness results to industry norms. Compare
    demographic results to census data.

  • Implementation leads to change in care process associated with
    desirable changes in clinical and functional outcomes.

  • In market testing.

  • Academic publications, adoption rates by industry.

  • We use our own written performance measuring algorithms, in terms of
    financial time series modeling and forecasting.   e.g. rate_of_return,

  • Applications are so varied that there is no consistent method, but value
    is always an important component of a project.

  • Information gained towards competitive advantage, reduce costs,
    observe trends.  Primarily using text mining to obtain real time
    information.  I don't wish to divulge more.

  • Recovery and subrogation.

  • Straightforward; number of defects; warranty claims.

  • Measuring constituents that were in the non-target pool at the time data
    was pulled for the model, but have since moved into the target pool, and
    seeing what score those constituents originally had in the model.

  • Mainly, "is it efficient?" and "does it work?" Different from statisticians,
    we don't really care if it is the theoretical best model etc. We care that it
    improves upon existing technology.

  • Do a "lessons learnt" part at the end of every project.

  • Forecasting of sales.

  • To compare indicators (sales/cost/churn) before and after model

  • A/B testing

  • Comparison with experimental data.

  • A/B testing is not recommended when the costs for answering the
    question of whether marketing worked are substantially higher than the
    risks of not knowing if it worked.

  • Monthly KPI dashboard.  Ad-hoc campaign performance measurement.

  • Multi-disciplinary metric development iterative measurements and
    reviews, inclusive strategy to promote transparency and develop robust
    insights about metric variances.

  • Typically measured in terms of failure rates in final product testing;
    rework rate.

  • We have detailed metrics on failure rates (per 1000).

  • Do they get published or lead to a discovery.

  • DOE Test & Learn.

  • Sales upturn

  • Simple to do; get fined/charge-backs for warranty claims.

  • Straightforward in our business; it is about the number of chargebacks.

  • We measure by collecting data from the real-time POS systems where
    we have implemented SAS analytics.  We have a full loop back to
    capture responses from our models and campaigns and tests, and we
    utilize SAS Enterprise Data Management Server, and DataFlux to
    automate the data collection of the results.  We are looking to
    implement MDM in the future.

  • Time to production

  • Expected monetary benefits of the marketing campaigns subsequent to
    the analyses.

  • Warranty claims per 1000 parts against months in service.

  • A suite of model monitoring reports.

  • Acceptance of peer-reviewed manuscripts in scientific publications.

  • Actionable results

  • Always define measurement requirements and routines at the earliest
    stage of the project. Define project objectives in line with feasibility of
    outcome measurement.

  • Analysis of group performance; analysis of peer evaluation within group.

  • Apply results of project in terms of improvement compared to doing
    nothing or compared to other alternatives.

  • Before each project a measure is defined, if it is increased it is ok.

  • Benchmarking and updating is vital

  • Careful project planning with deliverables is key; careful project
    management and tracking.

  • Check if it works

  • Consulting, so it varies by client.

  • Conventional metric

  • Create a performance scorecard report.

  • Customer segmenting

  • Depending on the project - no one answer.

  • Depends on project, but should be something like "outcome per
    working hour".

  • Driven by agreed-upon objectives set by stake-holders and the
    modeling team.

  • Hard to say; in particular in fraud detection.

  • I have had only 1 case that met with very hard opposition, yet it was
    proven with no doubt that the data mined solution worked better than
    letting others "who knew better" operate the controls, with little success.

  • Importance of my group's member within project team.

  • Marketing campaigns are always tracked.

  • Measuring key indicators with and without applying analytic project.

  • Model tracking (track whether the results of model deployment are as
    expected).

  • No best practices.  Depends completely on business-driven KPIs.

  • Number of citations.

  • Part of 6 sigma projects.

  • Peer-reviewed publications.

  • Performance estimation of models, combining with expected costs.

  • Performance Management

  • Performance of production models is reviewed on a scheduled basis.

  • Prototyping first.

  • Since we focus primarily on customer intelligence, having a very good
    customer contact history is critically important, particularly feedback via
    the new customer "touchpoints".

  • Speed of implementation, ability to interact with the program fast and

  • Standardize model performance evaluation - procedure and

  • Test cases

  • Testing and Model Simulation.

  • The best results and benefits are seen in upsell/cross sell models
    deployed for credit card CRM department.

  • The most important factor in success is planning ahead:  asking
    yourself -- before the analysis is undertaken, "What will be different   
    once this analysis is complete."  If you can answer that question, then
    you can measure whether what you expected actually occurs.

  • Ticketing System

  • Training people

  • Use of software and other tools.

  • Use SAS Model Manager Software to evaluate performance of score

  • Useful results

  • vs results

  • We process images so best practice is to visualize the result.

  • We use analytic performance management of our analytic activities to
    measure the quality and nature of these activities.

  • Wiki pages

  • We aren't statisticians, we don't really know how.  We just know our
    clients love reports with tables and graphs, so that's what we give them.

  • Data cleaning is an important practice. Sometimes it is considered
    useless and a time consuming task. However, if data is dirty, we can't
    expect good results.

  • Most of the time I am doing academic research and I am just looking for
    patterns of behavior.  I don't have any kind of ROI to measure success.
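
Several of the responses above describe tracking defects or warranty claims
per 1000 parts by months in service.  A minimal sketch of that calculation in
Python might look like the following.  The function name, the record layout
and the sample numbers are hypothetical illustrations, not taken from any
respondent's system.

```python
from collections import defaultdict


def claims_per_1000(records):
    """Return {months_in_service: claims per 1000 parts}.

    `records` is an iterable of (months_in_service, parts_at_risk, claims)
    tuples. This layout is an assumption made for the example.
    """
    parts = defaultdict(int)
    claims = defaultdict(int)
    for months, n_parts, n_claims in records:
        parts[months] += n_parts
        claims[months] += n_claims
    # Skip any bucket with no parts at risk to avoid dividing by zero.
    return {
        months: 1000.0 * claims[months] / parts[months]
        for months in sorted(parts)
        if parts[months] > 0
    }


# Hypothetical sample data: two production batches observed at month 1,
# one batch each at months 2 and 3.
sample = [(1, 5000, 10), (1, 5000, 5), (2, 8000, 12), (3, 4000, 10)]
rates = claims_per_1000(sample)

for months, rate in rates.items():
    print(f"{months:>2} months in service: {rate:.2f} claims per 1000 parts")
```

Normalizing to a rate per 1000 parts lets batches of different sizes be
compared at the same months-in-service point, which is presumably why
respondents report the measure in that form.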
Copyright (c) 2012 Rexer Analytics, All rights reserved