What Is Cohort Analysis


2/12/2020 What is cohort analysis?

Cohort analysis

A cohort is a group of people who share a common characteristic over a certain period of time.

For example, let's look at a group of students. All of these students graduated in 2010. This group
of students is a cohort. All of the students graduated in the same year, and this is their
commonality.

https://www.stitchdata.com/cohort-analysis/ 1/10

Cohort analysis is a study that focuses on the activities of a particular cohort. If we were to
calculate the average income of these students over the course of a five-year period following
their graduation, we would be conducting a cohort analysis.


Cohort analysis gets more interesting when we compare cohorts over a period of time. Imagine
another cohort of students who graduated in 2011.

Cohort analysis allows us to identify relationships between the characteristics of a population and
that population's behavior. Comparing the average income of the 2010 graduates over the five years
after graduation with the income of the 2011 graduates over the same interval allows for an
apples-to-apples comparison of the two groups. In this case, there appears to be a relationship
between a student's year of graduation and their income.

Here, we can see that both graduating classes increase in their average income per year. However,
by the third year out, the 2011 grads make more on average than their 2010 counterparts (by an
increasing margin).
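
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the chart: the income figures are invented, and the record layout (graduation year, years since graduation, income) is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical records: (graduation_year, years_since_graduation, income).
# All figures are invented for illustration only.
records = [
    (2010, 1, 40000), (2010, 1, 44000),
    (2010, 2, 45000), (2010, 2, 47000),
    (2011, 1, 41000), (2011, 1, 43000),
    (2011, 2, 48000), (2011, 2, 50000),
]

def average_income_by_cohort(rows):
    """Return {cohort: {years_since_graduation: average income}}."""
    incomes = defaultdict(lambda: defaultdict(list))
    for cohort, year_offset, income in rows:
        incomes[cohort][year_offset].append(income)
    return {
        cohort: {offset: sum(vals) / len(vals) for offset, vals in sorted(by_offset.items())}
        for cohort, by_offset in sorted(incomes.items())
    }

averages = average_income_by_cohort(records)
for cohort, by_offset in averages.items():
    print(cohort, by_offset)
```

Because both cohorts are indexed by years since graduation rather than by calendar year, the averages line up at the same point in each group's history.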

Cohort analysis for business


Imagine that instead of graduating students, we were studying your customers. We could group
them by how they were originally referred to your business and track how much money they spent
over time.


Here we see that customers referred by the blog deliver strong, consistent long-term spending.
Search engines and other channels, however, refer customers who spend a decreasing amount over
time.


Perhaps the most popular cohort analysis is one that groups customers based on their "join date,"
or the date when they made their first purchase. Studying the spending trends of cohorts from
different periods in time can indicate whether the quality of the average customer being acquired
is increasing or decreasing over time.

In the chart above, the average customer in newer cohorts is spending less as time goes on. This
would be a red flag for many investors or acquirers because it implies that the value of recently
acquired customers is less than that of those acquired in the past.

Perform your own cohort analysis


Tip: Most professionals use tools like Stitch to consolidate their data for cohort analysis.

Step 1: Pull the raw data


Typically, the data required to conduct cohort analysis lives inside a database of some kind and
needs to be exported into spreadsheet software. In this example, we use MySQL and Microsoft
Excel.

If you're studying customer purchase behavior, you want to end up with a table of data that
includes one record per customer purchase. Each record contains the customer's ID (typically either
a unique number or an email address), the date and time of the purchase, the amount of the
purchase, and the customer's "cohort date" (typically the date of the customer's first purchase). In a
typical "orders" database table, the MySQL query to pull such information might look like this:

    -- One row per purchase, each tagged with the customer's cohort date
    SELECT orders.customerid,
           orders.transactiondate,
           orders.transactionamount,
           cohorts.cohortdate
    FROM orders
    JOIN (SELECT customerid,
                 MIN(transactiondate) AS cohortdate  -- date of first purchase
          FROM orders
          GROUP BY customerid) AS cohorts
      ON orders.customerid = cohorts.customerid;

Ideally, however, you would want to include additional attributes such as the customer's referral
source, the first product they purchased, geographic and demographic information, and more. The
more information about the customer you have, the more ways you'll be able to segment your
cohorts. Each of these additional attributes may require additional database joins. Tools like Stitch
make all attributes accessible in the same database for you automatically.
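
To see what the Step 1 query returns without setting up MySQL, you can run it against a throwaway in-memory database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL. The three sample orders are invented, the column types are assumptions, and an `ORDER BY` has been added only to make the printed output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "customerid INTEGER, transactiondate TEXT, transactionamount REAL)"
)
# Hypothetical sample orders; ISO-formatted dates sort correctly as text.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2012-01-10 09:30:00", 20.0),
        (1, "2012-03-15 14:00:00", 35.0),
        (2, "2012-02-02 11:15:00", 50.0),
    ],
)

# Same shape as the Step 1 query: join each order to its customer's first order date.
query = """
SELECT orders.customerid,
       orders.transactiondate,
       orders.transactionamount,
       cohorts.cohortdate
FROM orders
JOIN (SELECT customerid,
             MIN(transactiondate) AS cohortdate
      FROM orders
      GROUP BY customerid) AS cohorts
  ON orders.customerid = cohorts.customerid
ORDER BY orders.customerid, orders.transactiondate
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Every row for customer 1 carries the same cohort date (the January purchase), which is what the later steps rely on.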

Step 2: Create cohort identifiers


Open the data you've pulled into Excel. Since we pulled the "cohort date" attribute in the example
above, we'll conduct the popular cohort analysis in which we compare groups of customers based
on when they made their first purchase. Assuming we want to group our cohorts based on the
month in which they made their first purchase, we'll need to translate each "cohort date" value into
a "bucket" that represents the year and month of their first purchase. Assuming cohort date is in
column D, the following Excel formula does the trick:

=YEAR(D2) & "-" & TEXT(MONTH(D2),"00")
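
If you prefer to do this step outside Excel, the same bucketing might look like the following in Python. The function name `cohort_bucket` is made up for this example. The month is zero-padded so that the labels sort chronologically when treated as text.

```python
from datetime import date

def cohort_bucket(cohort_date):
    """Return a 'YYYY-MM' label for the month of a customer's first purchase."""
    return f"{cohort_date.year}-{cohort_date.month:02d}"

print(cohort_bucket(date(2012, 1, 10)))
print(cohort_bucket(date(2012, 11, 3)))
```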

Step 3: Calculate lifecycle stages


Once we know the cohort that each customer belongs to, we also need to determine the "lifecycle
stage" at which each event happened for that cohort member. For example, if a customer made
their first purchase on January 10, 2012, and their second purchase on March 15, 2012, they would
be in the "January 2012" cohort, their first purchase would be in the "Month 1" lifecycle stage, and
their second purchase would be in their "Month 3" lifecycle stage, because it happened in their
third month after becoming a customer. To calculate lifecycle stage, we need to determine the
amount of time between the customer's first purchase and the purchase in question. Assuming
transaction date is in column C and cohort date is in column D, a function like the one below will
do the trick:

=ROUND((C2-D2)/30,0)+1
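
As a rough Python equivalent, the sketch below mirrors the 30-day rounding approach and checks it against the January 10 and March 15 example. The function names are hypothetical. The second function, which counts calendar months instead of 30-day blocks, is an alternative that does not appear in the original walkthrough. It is included only to show that the two definitions can differ for some dates.

```python
import math
from datetime import date

def lifecycle_stage(transaction_date, cohort_date, days_per_month=30):
    """Stage by 30-day blocks, rounding half up (as Excel's ROUND does for positive values)."""
    days = (transaction_date - cohort_date).days
    return math.floor(days / days_per_month + 0.5) + 1

def lifecycle_stage_calendar(transaction_date, cohort_date):
    """Alternative: stage by calendar-month difference (not from the original article)."""
    months = (transaction_date.year - cohort_date.year) * 12
    months += transaction_date.month - cohort_date.month
    return months + 1

first = date(2012, 1, 10)
second = date(2012, 3, 15)
print(lifecycle_stage(first, first))    # first purchase
print(lifecycle_stage(second, first))   # second purchase, 65 days later
print(lifecycle_stage_calendar(second, first))
```

With 30-day rounding, a purchase made 15 or more days after the first one already lands in "Month 2," so pick whichever definition matches how you want to read the chart and apply it consistently.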

When you're done, you should have a table in Excel that looks like this.


Step 4: Create a pivot table and graph


Pivot tables allow you to calculate an aggregation such as a sum or average across multiple
dimensions of your data. The pivot table we'd like to create is one that conducts a sum of
transaction amount, and shows one row per cohort and one column per relative time period. Its
data can be visualized on a basic Excel line graph.
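
For readers who want to check the pivot logic by hand, here is a minimal sketch of the same aggregation in Python: a sum of transaction amount with one row per cohort and one column per lifecycle stage. The input rows are invented, and `pivot_sum` is a name chosen for this example.

```python
from collections import defaultdict

# Hypothetical rows as produced by Steps 1-3: (cohort, lifecycle stage, amount).
rows = [
    ("2012-01", 1, 20.0),
    ("2012-01", 1, 30.0),
    ("2012-01", 3, 35.0),
    ("2012-02", 1, 50.0),
    ("2012-02", 2, 10.0),
]

def pivot_sum(records):
    """Sum amounts into {cohort: {stage: total}}."""
    table = defaultdict(lambda: defaultdict(float))
    for cohort, stage, amount in records:
        table[cohort][stage] += amount
    return {cohort: dict(stages) for cohort, stages in table.items()}

pivot = pivot_sum(rows)
stages = sorted({stage for _, stage, _ in rows})

# Print one row per cohort and one column per lifecycle stage.
print("cohort", *[f"M{s}" for s in stages])
for cohort in sorted(pivot):
    print(cohort, *[pivot[cohort].get(s, 0.0) for s in stages])
```

Stages with no purchases are shown as 0.0 here, which is what an empty pivot cell means for a sum.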


There you have it: an extremely basic cohort analysis built from the ground up. There are hundreds
of variations on cohort analysis that you can run based on your needs.

Bonus step: data perspectives


The chart we created is a cohort analysis, but it isn't easy to interpret in this format. Another way
to look at this chart would be to view each cohort's spending as a cumulative value over time. This
effectively builds a curve that allows you to watch total customer lifetime spending grow over time
per cohort.
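
The cumulative view is a running total across each cohort's row. A small sketch, assuming the pivot output has already been arranged as one list of per-stage totals per cohort (the numbers are invented):

```python
from itertools import accumulate

# Hypothetical pivot output: cohort -> spending in Month 1, Month 2, Month 3.
spend_by_stage = {
    "2012-01": [50.0, 0.0, 35.0],
    "2012-02": [50.0, 10.0, 5.0],
}

# Running total per cohort: each value is lifetime spending up to that stage.
cumulative = {
    cohort: list(accumulate(values)) for cohort, values in spend_by_stage.items()
}

for cohort, totals in cumulative.items():
    print(cohort, totals)
```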

Even more helpful is to normalize this data by the size of the cohort. To do this, you must divide
each data point for a cohort by the number of members in that cohort. That way, you can view the
average value per cohort member side by side without a bias from the size of the cohort. To do
this, you'll have to create a second pivot table to calculate cohort size and then divide one by the
other.
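
The normalization can be sketched as follows. Cohort size is taken here to be the number of distinct customer IDs in the cohort, which is an assumption about how "members" are counted. The rows and the function name are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (customer id, cohort, lifecycle stage, amount).
rows = [
    (1, "2012-01", 1, 20.0),
    (2, "2012-01", 1, 30.0),
    (1, "2012-01", 3, 35.0),
    (3, "2012-02", 1, 50.0),
]

def spend_per_member(records):
    """Return {cohort: {stage: total spend / number of distinct customers in cohort}}."""
    totals = defaultdict(lambda: defaultdict(float))
    members = defaultdict(set)
    for customer, cohort, stage, amount in records:
        totals[cohort][stage] += amount
        members[cohort].add(customer)
    return {
        cohort: {stage: total / len(members[cohort]) for stage, total in stages.items()}
        for cohort, stages in totals.items()
    }

normalized = spend_per_member(rows)
for cohort in sorted(normalized):
    print(cohort, normalized[cohort])
```

Note that the divisor is the whole cohort, not only the members who purchased in a given stage, so a stage in which few members buy shows up as a low average.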


Try it out using Stitch


Stitch offers a free 14-day trial, during which you can import your historical data to a data
warehouse and build and explore your cohorts in SQL or using a business intelligence tool. Give it
a try today!
