Best Practices in Measuring Analytic
Project Performance / Success:
In the 5th Annual Survey (2011), data miners shared their best practices for
measuring analytic project performance / success. The previous web page
summarizes the most frequently mentioned measures. In addition to the measures
described on the other pages, data miners described best-practice methodologies
that included measures of feedback from users / clients / management, cross-
validation, and a wide variety of other measures.
Below is the full text of the additional best practices the data miners shared. Some
of the richest descriptions included measurements that cross several of these
categories. If a data miner's multi-method best practice verbatim is already
included in one of the other verbatim lists (model performance, financial
performance, and outside group performance), it is not repeated here.
- Always build (at least) one metric which measures the experience from
the most granular (boots on the ground) user. You have to have a
metric that connects to what your user's experience (read: complaint) is.
- Positive feedback and changed habits.
- We continuously monitor cross-validation results, feeding new
incoming data for which the outcome is known into the validation loop. On
a decrease in performance, the model is automatically rebuilt and optimized.
- Diagnostics, cross-validation, post market evaluation.
- We track the performance of the direct response models on a daily
basis as campaigns are in the field and look for segments where the
model seems to be under-performing. Using this data, we sometimes
re-train or build secondary models to compensate.
- Comparative analysis of manufacturing results against laboratory
- Defects/1000; by months-in-service.
- Best feedback is given by in-house project people, because they are
specialists, and mostly an analytic project has the aim to automate or
speed up their work. This feedback is the measure.
- Real user of the results defined? Satisfaction of user (how does he talk
about it, use results in his own decisions as proof).
- Client acceptance/understanding. Practical recommendations flowing
from the analysis.
- Cross-validation, user feedback.
- Client feedback, site visits.
- Satisfaction from clients (evaluated in terms of knowledge gained from
data); comparison to existing methods with applicable metrics.
- Based on business objectives determined by stakeholder (project
manager). This sometimes has to be translated into a data mining
objective, but we always report back in terms and with visuals that the
program manager can understand.
- Business people do that using their criteria.
- Business people say if it works from the business perspective.
- Clients web activities analysis.
- Feedback from internal clients.
- Feedback through end user questionnaires.
- I think they use a very informal method. They do "listen to the customer"
but I think in a very haphazard manner.
- Impact on customer's business.
- Most of our work is for clients who review results presentations.
- Satisfaction of the primary requestor.
- We usually ask our customer for info about model usage and
- Estimate model performance by cross validation etc. before
deployment, install routines to frequently check performance after
- X-Validation, spot tests.
- Resampling methods to get an idea of generality of conclusions.
- (a) empirical or predictive validity, (as *formally* defined in
analytic/measurement theory), relative to explicitly documented "a
priori" expectations, and (b) comparison of risk metrics that are
calculated/estimated *prior to* deployment and with results (*if* analytic
results are deployed).
- Comparing analytics outputs to "known" numbers or indicators.
Comparing various analytic outputs to each other to make sure that
outputs that "show up" on various analytic projects are not wildly
different from each other.
- By the underlying... Claims by % savings. Customers by % cross- or
- Compare marketing effectiveness results to industry norms. Compare
demographic results to census data.
- Implementation leads to change in care process associated with
desirable changes in clinical and functional outcomes.
- Academic publications, adoption rates by industry.
- We use our own written performance measuring algorithms, in terms of
financial time series modeling and forecasting. e.g. rate_of_return,
- Applications are so varied that there is no consistent method, but value
is always an important component of a project.
- Information gained towards competitive advantage, reduce costs,
observe trends. Primarily using text mining to obtain real time
information. I don't wish to divulge more.
- Recovery and subrogation.
- Straightforward; number of defects; warranty claims.
- Measuring constituents that were in the non-target pool at the time data
was pulled for the model, but have since moved into the target pool, and
seeing what score those constituents originally had in the model.
- Mainly, "is it efficient?" and "does it work?" Different from statisticians,
we don't really care if it is the theoretical best model etc. We care that it
improves upon existing technology.
- Do a "lessons learnt" part at the end of every project.
- To compare indicators (sales/cost/churn) before and after model
- Comparison with experimental data.
- A/B testing is not recommended when the costs for answering the
question of whether marketing worked are substantially higher than the
risks of not knowing if it worked.
- Monthly KPI dashboard. Ad-hoc campaign performance measurement.
- Multi-disciplinary metric development iterative measurements and
reviews, inclusive strategy to promote transparency and develop robust
insights about metric variances.
- Typically measured in terms of failure rates in final product testing;
- We have detailed metrics on failure rates (per 1000).
- Do they get published or lead to a discovery.
- Simple to do; get fined/charge-backs for warranty claims.
- Straightforward in our business; it is about the number of chargebacks.
- We measure by collecting data from the real-time POS systems where
we have implemented SAS analytics. We have a full loop back to
capture responses from our models and campaigns and tests, and we
utilize SAS Enterprise Data Management Server, and DataFlux to
automate the data collection of the results. We are looking to
implement MDM in the future.
- Expected monetary benefits of the marketing campaigns subsequent to
- Warranty claims per 1000 parts against months in service.
- A suite of model monitoring reports.
- Acceptance of peer-reviewed manuscripts in scientific publications.
- Always define measurement requirements and routines at the earliest
stage of the project. Define project objectives in line with feasibility of
- Analysis of group performance; analysis of peer evaluation within group.
- Apply results of the project in terms of improvement compared to doing
nothing or compared to other alternatives.
- Before each project a measure is defined; if it is increased, it is ok.
- Benchmarking and updating is vital.
- Careful project planning with deliverables is key; careful project
management and tracking.
- Consulting so varies by client.
- Create a performance scorecard report.
- Depending on the project - no one answer.
- Depends on project, but should be something like "outcome per
- Driven by agreed-upon objectives set by stake-holders and the
- Hard to say; in particular in fraud detection.
- I have had only 1 case that met with very hard opposition, yet it was
proven with no doubt that the data-mined solution worked better than
letting others "who knew better" operate the controls, with little success.
- Importance of my group's member within project team.
- Marketing campaigns are always tracked.
- Measuring key indicators with and without applying analytic project.
- Model tracking (track that the results of model deployment are as expected).
- No best practices. Depends completely on business-driven KPI's.
- Part of 6 sigma projects.
- Peer-reviewed publications.
- Performance estimation of models, combining with expected costs.
- Performance of production models is reviewed on a scheduled basis.
- Since we focus primarily on customer intelligence, having a very good
customer contact history is critically important, particularly feedback via
the new customer "touchpoints".
- Speed of implementation, ability to interact with the program fast and
- Standardize model performance evaluation - procedure and
- Testing and Model Simulation.
- The best results and benefits are seen in upsell/cross sell models
deployed for credit card CRM department.
- The most important factor in success is planning ahead: asking
yourself, before the analysis is undertaken, "What will be different
once this analysis is complete?" If you can answer that question, then
you can measure whether what you expected actually occurs.
- Use of software and other tools.
- Use SAS Model Manager Software to evaluate performance of score
- We process images so best practice is to visualize the result.
- We use analytic performance management of our analytic activities to
measure the quality and nature of these activities.
- We aren't statisticians, we don't really know how. We just know our
clients love reports with tables and graphs, so that's what we give them.
- Data cleaning is an important practice. Sometimes it is considered a
useless and time-consuming task; however, if the data is dirty, we can't
expect good results.
- Most of the time I am doing academic research and I am just looking for
patterns of behavior. I don't have any kind of ROI to measure success.
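Several of the responses above describe the same monitoring loop: score newly arrived records whose outcomes are now known, and automatically rebuild the model when performance decays. A minimal sketch of that loop follows; the accuracy metric, the 0.7 threshold, and the majority-class "retraining" step are illustrative assumptions, not details from the survey.

```python
# Sketch of the "monitor and rebuild" practice several respondents describe:
# feed newly labeled batches into validation, and rebuild the model when
# performance drops below a chosen threshold.

def accuracy(model, batch):
    """Fraction of (features, label) records the model predicts correctly."""
    return sum(model(x) == y for x, y in batch) / len(batch)

def rebuild(history):
    """Toy retraining step (assumption): predict the majority label seen so far."""
    labels = [y for _, y in history]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def monitor(model, batches, threshold=0.7):
    """Run the monitoring loop over successive batches of labeled outcomes.

    Returns the (possibly rebuilt) model and the number of rebuilds triggered.
    """
    history, rebuilds = [], 0
    for batch in batches:
        score = accuracy(model, batch)   # validate on new known outcomes
        history.extend(batch)
        if score < threshold:            # performance decay detected
            model = rebuild(history)     # automatic rebuild, as in the survey
            rebuilds += 1
    return model, rebuilds
```

In practice the toy `rebuild` step would be replaced by a real re-training and optimization run, and the validation score would typically come from cross-validation rather than a single holdout batch, but the control flow is the same.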
Copyright (c) 2012 Rexer Analytics, All rights reserved