Best Practices in Measuring Analytic
Project Performance / Success
In the 5th Annual Survey (2011), 236 data miners shared their best practices for
measuring analytics project performance / success. Their measurement
methodologies are highly diverse, and many data miners described using multiple
measurement techniques. The five methodologies mentioned by the most data
miners were:
- Model performance (accuracy, F-measure, ROC, AUC, lift)
- Financial performance (ROI and other financial measures)
- Performance in a control or other group
- Feedback from users, clients, or management
- Cross-validation
Below are examples of the best practices they shared. As you can see, some of
the richest descriptions include measurements that cross several of these
categories. Complete lists of the data miners' verbatim best practice descriptions
are also available by following the links below.
Model Performance:
Fifty-three data miners participating in the 5th Annual Survey described best
practice methodologies that included measures of model performance. (All 53
responses can be seen here.)
Selected examples:
- Cross-validation and sliding-window validation during model training
and data mining process and parameter optimization. Metrics:
accuracy, recall, precision, ROC, AUC, lift, confidence, support,
conversion rates, churn rates, ROI, increase in sales volume and profit,
cost savings, run times, etc. Continuous monitoring of model
performance metrics. Use of control groups and independent test sets.
- Standard statistical measurements (KS, ROC, R-square etc.),
profitability metrics, loss impact etc.
- Two phases: 1. The AUC or K-S test must reach the expected results
for the model to be accepted by the supervisor (in credit scoring, this is
the banking supervisor). 2. Once implemented, we recommend
conducting stress testing and back testing at least once a month, and
we've developed tools to alert the users of potential disruption in the
original patterns of the model.
- For classification models, I use AUC, for regression models I use
RMSE. Everything is cross-validated during model building, and
performance is then assessed on a hold-out sample.
- Model quality: standard performance measures such as precision,
recall, accuracy, etc. Model complexity: memory usage & computation
time.
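As a rough illustration of two of the metrics named in the responses above (AUC for classification models, RMSE for regression models), here is a minimal sketch in plain Python. The function names and the toy data are assumptions made for illustration; they are not taken from the survey, and in practice most data miners would rely on a statistics or machine learning library rather than hand-written versions.

```python
import math


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy hold-out sample (hypothetical data, for illustration only).
holdout_labels = [0, 0, 1, 1]
holdout_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(holdout_labels, holdout_scores))  # 0.75

print(rmse([3.0, 5.0], [1.0, 5.0]))
```

Computing these on a hold-out sample, as one respondent describes, keeps the reported figure separate from the data used for model building.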
Financial Performance:
Forty-three data miners participating in the 5th Annual Survey described best
practice methodologies that included measures of financial performance. (All 43
responses can be seen here.)
Selected examples:
- We measure ROI, cost, gain, model accuracy, precision, recall, ROC,
AUC, lift charts, and customized metrics. The focus is on the benefit for
the business and for the customer.
- Longitudinal validation based on hard, objective outcomes, preferably
financial where sensible and achievable.
- We now know approximately how much it will cost to develop a
workable solution. We factor that cost into the feasibility. We also
continuously refine our cost/benefit analysis throughout the evolution of
the project. In the end, success is what the sponsor and team says it is.
It does not always end up as increased revenue or lowered costs.
- I work in Lean Six Sigma, so we routinely quantify the financial benefits
of analytic projects and opportunities.
- Real-world analysis of results, almost always tied to a financial measure
(i.e., something that can be expressed in dollars or readily converted to
dollars).
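For readers who want the arithmetic spelled out, the ROI measure mentioned in several of these responses can be sketched as follows. This uses the common textbook definition, (gain - cost) / cost; the survey does not say which formula respondents used, and the function name and figures below are hypothetical.

```python
def roi(gain, cost):
    """Return on investment as a fraction: (gain - cost) / cost.

    `gain` is the financial benefit attributed to the project and `cost`
    is what it took to develop and run it, in the same currency.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical project: 100,000 spent, 150,000 in attributed benefit.
print(f"ROI: {roi(150_000, 100_000):.0%}")  # ROI: 50%
```

As the responses note, the hard part is usually attributing the gain to the project, not the division itself.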
Performance in Control or Other Group:
Thirty-five data miners participating in the 5th Annual Survey described best
practice methodologies that included measures of performance in control or other
groups. (All 35 responses can be seen here.)
Selected examples:
- Evaluate model accuracy using cross-validation, or out-of-bag samples,
or hold out data (if data set truly large). Once happy with method,
conduct pilot study to measure accuracy to make sure model works in
real environment.
- Model performance: (1) overall accuracy on a validation data set,
(2) sensitivity and specificity, (3) ROC curve. Analytic project success:
(1) significant increase in rates of marketing returns, (2) adoption of the
model by the pertinent business unit.
- Out of sample performance. Ease of implementation. Understanding &
buy-in from organization.
- Always against a hold out control group and tracked over time and
multiple campaigns.
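A comparison against a hold-out control group, as described in the last example above, might look like the following sketch. It compares response rates between a model-targeted group and a control group using a pooled two-proportion z-test. The choice of test, the function name, and the campaign numbers are all assumptions for illustration; the survey does not specify how respondents made the comparison.

```python
import math


def compare_to_control(treated_n, treated_resp, control_n, control_resp):
    """Compare response rates of a treated group and a hold-out control group.

    Returns the absolute and relative difference in response rate, plus the
    z statistic and two-sided p-value of a pooled two-proportion z-test.
    """
    p_t = treated_resp / treated_n
    p_c = control_resp / control_n
    diff = p_t - p_c
    pooled = (treated_resp + control_resp) / (treated_n + control_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / treated_n + 1 / control_n))
    z = diff / se if se > 0 else 0.0
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "absolute_lift": diff,
        "relative_lift": diff / p_c if p_c > 0 else float("inf"),
        "z": z,
        "p_value": p_value,
    }


# Hypothetical campaign: 3% response when targeted by the model, 2% in control.
result = compare_to_control(10_000, 300, 10_000, 200)
print(result)
```

Tracking the same comparison over time and across multiple campaigns, as the respondent suggests, guards against reading too much into a single favorable result.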
Other Measures of Analytic Success:
In the 5th Annual Survey, twenty-nine data miners described best practice
methodologies that included feedback from users, clients, or management, and
fourteen described methodologies that included cross-validation. In addition,
131 data miners described best practice methodologies with other measures,
providing insights into a diverse set of best practices. (Verbatim responses
can be seen here.)
Selected examples:
- Always build (at least) one metric which measures the experience from
the most granular (boots on the ground) user. You have to have a
metric that connects to what your user's experience (read: complaint) is.
- Positive feedback and changed habits.
- We continuously monitor cross-validation results, feeding new
incoming data for which the outcome is known into the validation loop.
When performance decreases, the model is automatically rebuilt and optimized.
- Diagnostics, cross-validation, post market evaluation.
- We track the performance of the direct response models on a daily
basis as campaigns are in the field and look for segments where the
model seems to be under-performing. Using this data, we sometimes
re-train or build secondary models to compensate.
- Comparative analysis of manufacturing results against laboratory
results.
- Defects/1000; by months-in-service.
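Several of the examples above describe monitoring a deployed model and rebuilding it when performance drops. One simple way to express that trigger is sketched below. The class name, the rolling-mean rule, and the threshold values are hypothetical choices made for illustration; the respondents do not describe their actual alerting logic.

```python
from collections import deque


class PerformanceMonitor:
    """Flags a model for rebuilding when the rolling mean of a performance
    metric (higher is better, e.g. AUC) falls below baseline - tolerance."""

    def __init__(self, baseline, tolerance=0.05, window=3):
        self.baseline = baseline
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)

    def record(self, score):
        """Store the latest score; return True if a rebuild looks warranted."""
        self.recent.append(score)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough observations yet
        rolling_mean = sum(self.recent) / len(self.recent)
        return rolling_mean < self.baseline - self.tolerance


# Hypothetical scores on newly labeled data arriving over time.
monitor = PerformanceMonitor(baseline=0.80, tolerance=0.05, window=3)
for score in [0.79, 0.78, 0.77, 0.60]:
    if monitor.record(score):
        print(f"score {score}: rolling mean dropped, consider rebuilding")
```

Averaging over a window rather than reacting to a single score is one way to avoid rebuilding on noise; the window length and tolerance would need tuning for a real deployment.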



Copyright (c) 2012 Rexer Analytics, All rights reserved