Ten Common Analytic Mistakes: 1. Optimizing Around The Wrong Metric
Collecting data, analyzing it, and making decisions from it are at the heart of customer
analytics. But whether you’re new to data analysis or have been doing it a while, ten
common mistakes can affect the quality of your results. You should be on the lookout
for them. They follow, along with some ideas on how to avoid them.
Metrics exist for just about anything in an organization, and most are probably
collected for a good reason. Be sure the metric you want to optimize will achieve not
that pulls away from the gate and sits on the tarmac is a metric success even though
the customers feel the experience is disappointing as they arrive at their destination an
hour late. If you optimize around the number of calls answered in one hour at a call
center, you are placing quantity over quality. While customers generally want to get
Be sure your metrics are meaningful to your customer and that optimizing those
Mining customer transactions can reveal a lot of patterns in things like what products
customers purchase together or the average time between purchases. But this
behavioral data doesn’t necessarily help you understand the attitudes and motivations
behind why customers purchase things together. This attitudinal data can more easily
customer attitudes, and you’re measuring a sample of customers or data, be sure your
sample size is large enough to detect that difference. Use the sample size tables in this
book or consult a statistician to know what sample size you’ll need ahead of time.
A lot of cost and effort are wasted on looking for very small differences in customer
attitudes after making very small changes to products or websites with too small a
sample size.
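To make the link between the size of the difference and the sample size concrete, here is a minimal sketch in Python. It assumes you are comparing two proportions (for example, the share of satisfied customers before and after a change) with a two-sided test. The function name, the 5% significance level, and the 80% power are illustrative assumptions, not values from this book’s tables.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size PER GROUP needed to detect p1 vs. p2.

    Uses the standard normal-approximation formula for a two-sided
    comparison of two independent proportions. alpha and power are
    assumed defaults, not recommendations from the source text.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # average of the two proportions
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical scenarios: a 10-point difference versus a 1-point difference.
large_gap = sample_size_two_proportions(0.50, 0.60)
small_gap = sample_size_two_proportions(0.50, 0.51)
print(f"Detect 50% vs. 60%: about {large_gap} customers per group")
print(f"Detect 50% vs. 51%: about {small_gap} customers per group")
```

Under these assumptions, the 10-point difference needs a few hundred customers per group, while the 1-point difference needs tens of thousands. That is why hunting for very small differences with a small sample tends to waste effort.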
I call it “eyeballing statistics.” It’s the tendency to think you can detect patterns from
data by examining it without any statistics. Very large patterns are easy to see
without any computation, but such obvious patterns rarely show up.
To minimize the chance that you’re being fooled by randomness in data, use statistics
With a large sample size, you’ll be able to detect very small differences and patterns
that are statistically significant. Statistical significance just means that the pattern or
difference is unlikely to be due to random noise alone. But that doesn’t mean that what’s
detected will have much practical importance. Analytics programs will flag different
rates results will have a major or negligible impact. This depends on the context but
means you’ll need to exercise judgment and not blindly follow the software. Don’t
the business implications of the result carefully. See the appendix for more of a
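The gap between statistical and practical significance can be illustrated with a small sketch in Python. It assumes a pooled two-proportion z-test on made-up conversion counts. The function name, the visitor counts, and the threshold for a "meaningful" difference are all hypothetical and would come from your own business context.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (difference in proportions, two-sided p-value).

    Pooled two-proportion z-test using the normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1 - p2, p_value


# Hypothetical data: one million visitors per design,
# converting at 5.1% versus 5.0%.
diff, p_value = two_proportion_z_test(51_000, 1_000_000, 50_000, 1_000_000)

# Assumed business rule: differences under half a percentage point
# are not worth acting on. This threshold is illustrative only.
min_meaningful_diff = 0.005
statistically_significant = p_value < 0.05
practically_important = abs(diff) >= min_meaningful_diff

print(f"difference = {diff:.4f}, p-value = {p_value:.4f}")
print(f"statistically significant: {statistically_significant}")
print(f"practically important:     {practically_important}")
```

With a sample this large, a difference of one tenth of a percentage point is statistically significant, yet it falls well below the assumed threshold for practical importance. The software can report the first; only judgment about the business context can settle the second.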
If you have a stats PhD crunching numbers in your company basement, it may
generate the right insights; but if sales, marketing, service, or product teams aren’t
involved, it’s going to be difficult to get buy-in and implement the insights. Get the
right people and teams involved in your initiative early and look to have
product experience.
Garbage In, Garbage Out (GIGO) is a common phrase data junkies like to use to
explain that data that has problems before analysis will have problems after analysis.
This can be anything from mismatched data pulled from databases (customer names
don’t match transactions) to missing values. If the data is bad going in, you’ll have
bad insights coming out. Before running any analysis, do a quality check on your data
by selecting a sample of data and auditing it for quality. Corroborate it with other
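A quality check of this kind can be sketched in a few lines of Python. The example assumes transactions are stored as dictionaries with a `customer_id` field and that a separate set of known customer IDs is available. The field names, the function name, and the sample data are hypothetical.

```python
import random


def audit_sample(transactions, known_customer_ids, sample_size=100, seed=0):
    """Audit a random sample of transactions for two common problems:
    missing values and customer IDs that match no known customer."""
    rng = random.Random(seed)  # fixed seed so the audit is repeatable
    sample = rng.sample(transactions, min(sample_size, len(transactions)))

    # A record counts as incomplete if any field is None or an empty string.
    missing = [
        t for t in sample
        if any(value is None or value == "" for value in t.values())
    ]
    # A record counts as unmatched if its customer ID is not in the customer list.
    unmatched = [
        t for t in sample
        if t.get("customer_id") not in known_customer_ids
    ]
    return {
        "sampled": len(sample),
        "missing_values": len(missing),
        "unmatched_customers": len(unmatched),
    }


# Hypothetical data: one clean record, one missing amount, one unknown customer.
customers = {"C1", "C2"}
transactions = [
    {"customer_id": "C1", "amount": 20.0},
    {"customer_id": "C2", "amount": None},
    {"customer_id": "C9", "amount": 15.5},
]
report = audit_sample(transactions, customers, sample_size=3)
print(report)
```

If the counts of missing or unmatched records in the sample are high, that is a signal to fix the data before running any analysis on the full set.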
When you analyze your data, at least half of the effort is spent formatting the data so
your software can properly analyze it. This often involves disaggregating and getting
Skimping on proper formatting usually means a lot of rework later, so be sure your
Sometimes it’s fine to have a fishing expedition and examine patterns in data. But
don’t stop with the fishing expedition; use what you find to form hypotheses about
customer behavior and look to confirm, refine, or reject these hypotheses with
additional data.
Every data set tends to have a problem of some sort. Some are minor, like a few
missing fields; others are major, with lots of missing fields and mismatched data. For
survey data, there always seems to be a concern about how a question was asked and
to whom it was asked. That said, expect some imperfection in all your data sets and
surveys. But don’t let it stop you from working with what you have. Just be cautious