
Determine the Measurement Approach

In this course, we will explore how to formulate a hypothesis and how to select a measurement
approach that suits the needs of your clients.

Test Variables
This lesson prepares you to formulate a hypothesis based on your business goal and identify the
dependent and independent variables to include in your test or analysis.

Hypotheses
Your hypothesis should relate to your primary business goal. The more data or insights you have
available, the stronger your hypothesis will be. A strong hypothesis answers the questions of who, what,
where, when and why.
In media measurement, this would be:

• Who: Audience
• What: Behavior of the audience
• Where: Location
• When: Ad schedule
• Why: Rationale for the anticipated audience behavior or perspective

Without a well-formed hypothesis, any experiment or data analysis lacks the necessary focus to be useful
and actionable.

Formulate a strong hypothesis


To develop a strong hypothesis, start with a basic one, build on it until it's better, then refine it further until it's strong.

Let’s explore this with an example from Axion Electronics, a fictional electronics manufacturer with a
corporate office in London. Axion makes televisions and other home entertainment devices, and its
patented TV technology is arguably the best in the market. However, the price point is high and brand
awareness is low. The Axion team has a business goal to increase sales of their revolutionary TV by 20%
before the end of the second quarter.

Basic: We believe men ages 24–45 are more likely to respond to our ad.
Better: Men ages 24–45 in the UK are more likely to respond to our promotional ad in the second quarter.
Strong: Men ages 24–45 in the UK are more likely to respond to our second-quarter promotional sale ad, because our creative highlights the innovative features of the product.
A basic hypothesis includes "who" and "what." We create it from an assumption and perhaps a few data
points. A better hypothesis builds on this and adds "where" and "when." A strong hypothesis includes
"why," which is the rationale for the anticipated audience behavior or perspective.
Create a test hypothesis
Once you have a strong hypothesis, you can develop your test hypothesis. While a hypothesis could be a
subjective assumption, a test hypothesis includes a null and an alternative hypothesis. The objective is to
test if you can reject the null hypothesis and accept the alternative.

To formulate your test hypothesis, choose a variable to test. In other words, isolate the independent
variable and dependent variable that your test will feature. The independent variable should have a
meaningful effect on the dependent variable. The dependent variable, which is also known as the outcome
variable, should align most closely with your business goal. When they align, your outcomes are more
likely to give you the most relevant, useful information.

DEPENDENT VARIABLES: Sales, Ad recall, CTR, CPIC, Leads, App installs
INDEPENDENT VARIABLES: Audience, Creative, Placement, Delivery optimization, Ad structure insights, Campaign objective

Axion test hypothesis


Let’s see another example from Axion. Now that the Axion team has their strong hypothesis, they
want to further test their theory.

To do this, they first develop their test hypothesis, which states "If video creative that highlights
sports is used in the campaign, then it will lead to more sales than general brand video creative." In
this case, the independent variable in this test is the creative and the dependent variable is the sales.

Now that they have their test hypothesis, the team can recommend a measurement approach to
prove or disprove this hypothesis.
Creative A: video creative that highlights sports
Creative B: general brand video creative
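To make the test hypothesis concrete, comparing the two creatives amounts to testing the null hypothesis that both convert at the same rate against the alternative that they don't. Here is a minimal sketch of a two-proportion z-test in Python; the function and all conversion counts are hypothetical and illustrative, not part of any Facebook tool.

```python
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under the null
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided p-value
    return z, p_value

# Hypothetical results: sports creative (A) vs. general brand creative (B)
z, p = two_proportion_z_test(conv_a=540, n_a=10_000, conv_b=460, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would let the team reject the null hypothesis of equal conversion rates and accept the alternative that the sports creative performs differently.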
Knowledge check
A TV advertiser needs to attract more sports fans to watch their upcoming live events. According to a global index survey, 90% of sports fans online reported that they use another device while they watch TV. More than 50% of sports fans reported that their top activities are using social media and chatting with friends.
To maximize ad reach, which hypothesis should the advertiser test?

The placement of ads on popular social media platforms is the most cost-efficient way to
generate conversions.
The placement of ads on top sports channels is the most cost-efficient way to generate reach.
Messaging apps provide incremental conversions.
Social media platforms provide incremental reach to TV.
Key takeaways
 Without a well-formed hypothesis, any experiment or data analysis lacks the necessary focus to
be useful and actionable.
 Your hypothesis should relate to your primary business goal. A strong hypothesis answers the
questions of who, what, where, when and why.
 A test hypothesis is an if/then statement that considers how a change in a variable can influence
an outcome.

Choose a Measurement Approach


This lesson prepares you to determine which measurement approach best addresses your
hypothesis.

Choose which measurement approach is most suitable


Once you have a test hypothesis, you can then determine the measurement approach that best
addresses it. There are many measurement approaches you can use to prove or disprove your
hypothesis. Some are Facebook solutions, while others are not.

As we go through this lesson, we’ll explore how the team from Axion Electronics uses different
measurement approaches to solve different business challenges and test different hypotheses.

Axion is a fictional electronics manufacturer that makes televisions and other home entertainment
devices, and its patented TV technology is arguably the best in the market.
Cross-channel reach reporting
The Axion team had a goal to reach a broad audience at an efficient cost per person. Their
hypothesis stated that TV delivers more efficient reach relative to the digital channels they'd already
been using. They ran brand awareness ads on TV and digital channels during sporting events. To
test their hypothesis, they used cross-channel reach reporting as their approach to measurement.
Cross-channel reach reporting measures how different channels work together to generate business
outcomes. Channels include but are not limited to email, TV, direct mail, Facebook and paid search.

There are some limitations to cross-channel reach reporting. First, metrics can often vary by
channel. Additionally, not all channels share reach data, so it can be difficult to get accurate results.
Lastly, reach reporting doesn’t necessarily correlate with brand or conversion business outcomes.

Marketing mix modeling


The Axion team wanted to optimize spend across channels, which included TV, email, direct mail,
Facebook and other digital channels. They wanted to test their hypothesis that TV achieved a higher
ROI than direct mail, and that they should allocate more budget to TV in the year that followed.
They used marketing mix modeling to understand the effect each channel had on sales outcomes
from the previous year.
Marketing mix modeling, also called media mix modeling or MMM, is a data-driven statistical
analysis that quantifies the incremental sales impact and return on investment (ROI) of marketing
activities. It's an established solution for holistic, cross-channel sales measurement both online and
offline.

Marketing mix modeling isolates the impact of marketing and non-marketing factors on a KPI with
statistical models. As a result, you can quantify the incremental effects of your media investment on
sales both online and offline. Once you have this information, you can calculate key metrics, such
as ROI or cost per outcome.
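As a simple illustration of those key metrics, the arithmetic behind ROI and cost per outcome can be sketched as follows; the dollar figures are hypothetical.

```python
def marketing_roi(incremental_revenue, spend):
    """Return on investment: incremental revenue per unit of spend."""
    return incremental_revenue / spend

def cost_per_outcome(spend, incremental_outcomes):
    """Cost of each incremental outcome (e.g. a sale) the media drove."""
    return spend / incremental_outcomes

# Hypothetical example: $200,000 of spend driving $500,000 in incremental sales
print(marketing_roi(500_000, 200_000))    # 2.5x return
print(cost_per_outcome(200_000, 4_000))   # $50 per incremental sale
```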

MMM doesn't just report ROI or cost per outcome. You can use the model output to plan different
scenarios and perform forward-looking simulations and optimizations. Those can then help you
make informed strategic decisions. Here are a few examples of questions MMM can help you
answer:

 How much did my online advertising increase my performance versus my offline advertising?
 What’s the optimal allocation of budget spend across media channels?
 Should we raise our prices?
 Do promotions increase incremental business value?

How does MMM work?


MMM uses historical data to determine how different variables impact the KPI. The modeler first
identifies all of the different variables that may have an effect. Some of these variables may relate to
marketing, while others may not.

The models use econometrics (more specifically, a hypothesis-driven regression analysis) to quantify the impact of relevant factors on a KPI. MMM is a mathematical equation that shows the statistical relationship between variables and a KPI:

KPI_t = β0 + β2·Search_t + … (all other factors) + error_t

KPI_t: the KPI at the time (t) you want to model
β0: the base performance, or what performance would be if all other factors were at their minimum
β: the coefficients, or what a change in a variable means for the KPI
error_t: the part of the KPI that the factors in the model don't explain

To give you the closest prediction of the KPI, this model finds the coefficients (β) that minimize squared errors based on the independent variables included in the model.
Marketing mix modeling most commonly uses linear regression, a form of predictive analysis that
helps estimate all the values in the above equation. Once you have the model, you can predict your
KPI value based on the coefficients in the model. The most common way to assess model
performance is to compare actual sales to the predicted sales in that model. The closer the model
predicts real-world performance, the more effective it is. MMM is never perfect, but it is useful.
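To illustrate the regression idea, here is a minimal least-squares fit with a single variable and made-up weekly data. Real marketing mix models include many more variables and more sophisticated estimation; this sketch only shows how coefficients that minimize squared errors are found and how predicted values are compared to actuals.

```python
def fit_simple_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares (single-variable sketch)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
         / sum((xi - mean_x) ** 2 for xi in x)
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical weekly data: TV rating points vs. sales units
tvrs = [10, 20, 30, 40, 50]
sales = [15_000, 20_000, 24_800, 30_200, 35_000]
b0, b1 = fit_simple_ols(tvrs, sales)

# Compare predicted to actual sales to judge how well the model fits
predicted = [b0 + b1 * x for x in tvrs]
residuals = [actual - pred for actual, pred in zip(sales, predicted)]
```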

Axion MMM example


As the Axion team works on the budget, they'd like to estimate sales units as a function of three variables: price, TV rating points (TVRs) and Facebook advertising impressions. Based on data from last year, they create a marketing mix model. The coefficients in the equation show what a change in each variable means for their KPI, which is sales units.

Sales units = 10,000 - (5,000*Price) + (500*TVRs) + (4,000*Impressions(m))

Now that they have this model, they hypothetically know that:

 If prices increase by $1, sales units go down by 5,000.
 If TVRs increase by 1, sales units go up by 500.
 1,000,000 more Facebook impressions means 4,000 more sales.

The team can use this information to optimize their budget for next year. Of course, this is a simplified model; in reality, there would be more variables that affect sales.
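The Axion model above can be written directly as a function, which makes the three effects easy to verify. The coefficients come from the example; the baseline scenario values (price, TVRs, impressions) are hypothetical.

```python
def sales_units(price, tvrs, impressions_m):
    """Axion's simplified marketing mix model (impressions in millions)."""
    return 10_000 - 5_000 * price + 500 * tvrs + 4_000 * impressions_m

# Hypothetical baseline scenario
baseline = sales_units(price=2, tvrs=10, impressions_m=3)

# Each scenario changes one input while holding the others fixed:
print(sales_units(3, 10, 3) - baseline)   # +$1 price -> -5000 units
print(sales_units(2, 11, 3) - baseline)   # +1 TVR -> +500 units
print(sales_units(2, 10, 4) - baseline)   # +1M impressions -> +4000 units
```

The same function supports the forward-looking simulations mentioned earlier: plug in candidate budget allocations and compare the predicted sales units.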

MMM considerations
While MMM is a popular and widely used tool, it has some limitations.

It may struggle to capture incremental sales if the increase is minimal. It requires collaboration between modelers and the advertiser, and it often requires the advertiser to collaborate across their own departments, such as pricing or distribution.

Any analysis is only as good as its data inputs, but data collection can be a challenge given the
holistic nature of the models. MMM is also based on correlation rather than causation, and it can be
time-intensive to implement.

A/B tests
In July, the Axion team wanted to reduce their average CPA. At the time, they only ran ads on
News Feed, and they needed to confirm their hypothesis quickly. They hypothesized that the
incorporation of Instagram Stories as an additional placement would reduce their average CPA.
They therefore used an A/B test to understand how the additional placement affected their average
CPA.

A/B tests enable you to test different versions of your ads against each other, so you can determine
which elements achieve the best performance based on your current attribution settings. This type
of testing is ideal for businesses who just want to learn which tactics work best, so they can
optimize their future campaigns. In general, an A/B test works best for everyday tactical
optimizations. You can test variables such as audience, creative, placement, delivery optimizations,
budget structures or campaign objectives.

However, some A/B tests don't include control groups, only randomized test groups. In this case,
they don't measure causality or the incremental value of a strategy. Such A/B tests therefore only
identify correlation, not causation.
Randomized control trials
Randomized control trials, or RCTs, are experiments designed to measure causality. They include
randomization of participants into mutually exclusive groups. One or more of these groups receive
the treatment and are therefore called the test groups. Meanwhile, one or more don't receive the
treatment, and we call these the control groups. The control provides a standard of comparison and
can represent a standard practice, a placebo or no treatment. Facebook offers two types of self-serve
RCTs: Brand Lift and Conversion Lift tests.

The Axion team ran ads on multiple channels, including Facebook. To assess the accuracy of their
current attribution model, they wanted to measure the sales that their Facebook advertising
generated. The team hypothesized that their Facebook advertising achieved $500,000 in incremental
sales and a $50 cost per incremental purchase in the month of June. A Conversion Lift test helped them measure the sales their Facebook advertising generated.

If your business needs are more advanced, you can use a managed lift solution.

Managed lift solution

Managed lifts offer a more customizable setup (such as nested tests), multi-cell tests, flexible and custom questions, PowerPoint deliverables and centralized coordination of test setup from a Facebook sales team.
Partner solution

You can enhance and optimize your Facebook ads and use a Facebook marketing solutions
provider to conduct your tests.

Statistical power
The results of your RCT rely on its statistical power. Statistical power is the ability of a test to
detect an effect, if the effect exists. Here’s what can affect statistical power:

 Reach and holdout percentage: It's generally most effective to have a holdout of at least 20% of
your audience.
 For Conversion Lift, the conversion rate: How many people do you expect to convert out of all of
the people you reach?
 Budget: Do you have sufficient overall spend to reach a high percentage of your target audience?
We recommend that you budget to reach 50% or more of your target audience.
 Lift percentage: How big of an effect can we expect to see?

Altogether, these factors help determine whether your test is likely to detect significant lift, and they help you make adjustments if you need stronger evidence.
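As a rough illustration of how these factors interact, the sketch below estimates how many people you'd need per group to detect a given relative lift, using a standard two-proportion normal approximation. The base rate, lift and thresholds are hypothetical, and this is not the calculation Facebook's tools perform; it only shows why small expected lifts and low conversion rates demand much larger audiences and budgets.

```python
from statistics import NormalDist

def sample_size_per_group(base_rate, lift_pct, alpha=0.10, power=0.80):
    """Approximate people needed per group to detect a relative lift in
    conversion rate (two-proportion test, normal approximation)."""
    p1 = base_rate
    p2 = base_rate * (1 + lift_pct)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # significance threshold
    z_beta = NormalDist().inv_cdf(power)            # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ((z_alpha + z_beta) ** 2 * variance) / (p1 - p2) ** 2

# A 2% base conversion rate with a small expected lift needs far more
# people per group than the same base rate with a large expected lift:
print(sample_size_per_group(0.02, 0.10))   # 10% lift: tens of thousands
print(sample_size_per_group(0.02, 0.30))   # 30% lift: far fewer
```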
RCT: single-cell, multi-cell and nested test
There are three ways to conduct a lift test: single-cell, multi-cell and nested. Each serves its own
purpose.

Single-cell
A single-cell test is simple, whereas multi-cell and nested tests are more complex. Single-cell
tests seek to answer the question, "What is the incremental effect of the campaign?"

The difference in results between the test and control group, whether you measure brand or
conversion outcomes, is the incremental effectiveness of the campaign. This gives you a baseline
understanding of how your campaign has performed.

Multi-cell
Multi-cell lift tests seek to answer the question, "Which campaign strategy leads to greater
incremental impact?"

Multi-cell tests are more complicated, but in their simplest form, you set up your campaigns with
different treatment variables you want to test. The same way single-cell lift tests measure
incremental results, multi-cell tests tell you which campaign generated the greatest incremental
results.

Nested test

A nested test uses a test design where the test group for one test is divided into a control and test
group for another test. In other words, it's a test within a test.

In a nested test, we distinguish between the "parent" level (also called the master test), which is the outer or broader test, and the "child" level (also called the nested test), which is the inner test contained within the parent.

In some cases, researchers opt for more than one child-level test, which can quickly result in very
complex test designs. A nested design helps you get a baseline understanding of incremental brand
or conversion outcomes for your total investment while simultaneously measuring the incremental
impact of a subset of that investment.
You should consider nested tests as a last resort, because:
 The nested structure isn’t obvious in reporting.
 The results are subject to a greater risk of misinterpretation.
 Planning and setup are more complex.
Knowledge check
A tour operator advertises on Facebook, Instagram and YouTube. Its brand team is evaluating
performance of these channels in Ads Manager and Google Analytics to determine how to allocate
their increased budget for the upcoming season. They suspect the significance of Facebook
platforms is over-represented in Ads Manager and under-represented in Google Analytics.

How should the company measure the causal contribution of the Facebook platforms accurately?
Adopt multi-touch attribution models for all platforms

Use Ads Manager results to evaluate Facebook platforms

Use Google Analytics to evaluate Facebook platforms


Run an ad account-level randomized control trial on Facebook

Key takeaways
 With a test hypothesis in place, your next step is to determine the measurement approach that
proves or disproves that hypothesis.
 Cross-channel reach reporting, attribution, A/B testing, MMM and randomized control trials
are some of the measurement approaches that can prove or disprove your hypothesis.

Design a Test
This lesson prepares you to design tests to effectively answer your business questions.

Types of tests
When you design a test, identify the key performance indicators (KPIs) first to confirm that
they align with your business goal. Let's look at the three different types of tests.
A/B test
An A/B test can compare, for example, the effectiveness of a product image to an image of
people as they use that product. The KPI for an A/B test is the cost per result based on the
last touch from Ads Manager with a click precedence model.
Conversion Lift test
A Conversion Lift test measures the incremental conversions that your ads generate, either
at the account level or at the campaign level. The KPI is the cost per Conversion Lift based
on the conversion difference between the test and control groups. In other words, how many
more conversions did your campaign lead to, and what was the cost of each additional
conversion.
Brand Lift test
A brand lift test measures outcomes at a brand level, for example, ad recall, awareness or
favorability. The KPI for a Brand Lift test is the cost per brand lift based on the
conversion difference between the test and control groups.

A/B test
A/B tests help you understand which campaign strategies work best, so you can make
tactical optimizations. During an A/B test, you divide your audience into random, non-
overlapping groups. This randomization ensures the test is fair and reduces the likelihood
that a confounding variable might skew the results.
Each group sees ads that are identical, except for the variable you want to test. This variable
could be delivery optimization, placement, creative or even the audience itself.

The first group sees version A of your ad, while the second group sees version B. Then you
measure how each ad or ad set performs and measure that against your campaign objective.
The ad or ad set that performs best wins.
A/B tests work for the video views, reach, traffic, app installs, lead generation and
conversions ad objectives. They’re more affordable than lift tests and their recommended
duration is 3–14 days. This means they're well-suited for quick, tactical optimizations.

A/B test setup checklist


Before you start A/B testing, make sure you’re ready. Use this checklist to prepare for an
A/B test.

Duration: Run the test for 1–2 conversion cycles or at least two weeks.

Budget: Allocate equal budgets to both campaigns, and make sure these budgets are high enough to exit the learning phase. Set the minimum weekly ad set budget to at least 50 times the one-day-click cost per action (CPA); a more conservative minimum is 100 times the average CPA.

Creative: Include a prominent call to action as well as branding throughout each ad.

Audience: To reduce the possibility of inflating the baseline, minimize audience overlap
with other campaigns or ad accounts.

Dark period: Consider a pre- or post-study media dark period to reduce the likelihood of
contamination.
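The two budget floors from the checklist above reduce to a quick calculation; the CPA values here are hypothetical.

```python
def min_weekly_adset_budget(one_day_click_cpa, average_cpa):
    """Budget floors from the A/B test checklist: 50x the one-day-click CPA,
    or a conservative 100x the average CPA."""
    minimum = 50 * one_day_click_cpa
    conservative = 100 * average_cpa
    return minimum, conservative

# Hypothetical CPAs of $8 (one-day click) and $10 (average)
minimum, conservative = min_weekly_adset_budget(one_day_click_cpa=8.0, average_cpa=10.0)
print(minimum)       # 400.0 per week
print(conservative)  # 1000.0 per week
```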

Potential outcomes
There are two possible outcomes of the A/B test. If the test doesn't declare a winner,
evaluate your campaigns and decide whether to run the test again.

A few things to remember:


 Wait until the end of the study to evaluate your results.
 A confidence level of 90% or greater indicates a reliable result.
 Combined A/B test groups need at least 100 people to convert before the tool can show your
results.
If there's a winner
 Proceed with the winning strategy.
 Explore other variables to test.
If there's no winner
 Adjust your campaigns and rerun the test. Consider optimizing the creative.
 Reference the test setup checklist to optimize your campaign and measurement practices.

Conversion Lift test


With a Conversion Lift test, you compare sales or conversion outcomes among people
who’ve seen your ads with people who haven’t. This comparison conveys the incremental
online, offline and mobile app business that your campaign has generated.
A Conversion Lift test first randomly assigns people from a target audience to either a test
group or a control group. People in the test group have the opportunity to see an ad, whereas
people in the control group don't.

After the campaign ends, you can evaluate whether the group that was able to see the ad took action, and converted, in greater numbers than the group that wasn't. The difference in conversions between the two groups measures how much the ad influenced conversions.
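The comparison just described reduces to a small calculation: scale the control group's conversion rate to the test group's size, and the surplus is the incremental effect. The group sizes and conversion counts below are hypothetical.

```python
def conversion_lift(test_conversions, test_size, control_conversions, control_size):
    """Incremental conversions and lift percent from a Conversion Lift test."""
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    # Conversions the test group would have produced with no ads at all
    expected = control_rate * test_size
    incremental = test_conversions - expected
    lift_pct = (test_rate - control_rate) / control_rate * 100
    return incremental, lift_pct

# Hypothetical equal-sized groups
incremental, lift_pct = conversion_lift(1_200, 100_000, 1_000, 100_000)
print(incremental)   # ~200 incremental conversions
print(lift_pct)      # ~20% conversion lift
```

Dividing ad spend by the incremental conversions then gives the cost per incremental conversion.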

Conversion Lift setup checklist


Before you get started with a conversion lift test, make sure you’re ready. Use this
checklist to prepare for a Conversion Lift test.
Duration: 1–2 conversion cycles or at least two weeks.

Budget: Allocate enough to exit the learning phase.

Creative: Include a prominent call to action as well as branding throughout each ad.

Audience: To reduce the possibility of inflating the baseline, minimize audience overlap
with other campaigns or ad accounts.

Dark period: Consider a pre- or post-study media dark period to reduce the likelihood of
contamination.

Conversion events: Select events that reflect your primary business goals along the
marketing funnel.

Edits: If possible, don’t edit campaigns mid-test or end tests early.

Components of the Conversion Lift test results

 The measurement period and overall lift results show the conversion lift percent, the conversion lift number and the sales lift. The conversion lift percent shows the difference in conversions between people who did and didn't see your ads during the test.
 Lift in efficiency shows the lifts in cost per click (CPC) and return on ad spend (ROAS). The CPC lift is the cost of each additional conversion that the ads in this test caused. The ROAS lift is the additional revenue generated for each dollar that you spent on ads in this test.
 Your test and campaign details include the age and gender distribution. The test details section provides information about the behaviors and actions of people in the test and control groups.
 The breakdown by demographic section shows the Conversion Lift results by age and gender, which gives you more information about which segments provide the most lift.

Potential outcomes
Let’s examine how results affect the outcomes.
Positive results
If your test results are both positive and statistically significant, you may want to continue
with that strategy, or you may want to increase the budget and expand the campaign. This is
also an opportune time to optimize your strategy and further A/B test different variables.
Negative results
If your test results are flat or not statistically significant, you may want to adjust the
strategy and rerun the test. Reference the test setup checklist to confirm that you
followed the best practices for Conversion Lift tests.

Brand Lift Test


With Brand Lift, you can measure brand impact by your account or campaign strategy. A
Brand Lift test randomly assigns people from a target audience to either a test group or a
control group. People in the test group have the opportunity to see an ad, whereas people in
the control group don't. After the campaign ends, the system delivers polls to both groups.
After people respond, it analyzes the results to deliver key brand metrics.

Brand Lift setup checklist


Use this checklist to prepare for a Brand Lift test.

Duration: 4–6 weeks

Budget: Allocate a budget that's high enough to allow for an average of two impressions
per person per week and also high enough to reach required minimums (for example,
$30,000 in the US).

Creative: Include a prominent call to action as well as branding throughout each ad.

Audience: To reduce the possibility of inflating the baseline, minimize audience overlap
with other campaigns or ad accounts and achieve a minimum reach of 2 million people.

Dark period: Consider a pre- or post-study media dark period to reduce the likelihood of
contamination.

Polling questions: Create questions for different stages of the funnel to get a full picture
of your brand effect.

Edits: If possible, don’t edit campaigns mid-test or end tests early.


Questions based on Campaign Objectives
Category Type Question
Upper-funnel Standard ad recall Do you recall seeing an ad for [X]?
Upper-funnel Brand awareness Have you heard of [X]?
Mid-funnel Familiarity How familiar are you with [X]?
Mid-funnel Abstract favorability How would you describe your overall opinion of [X]?
Lower-funnel Recommendation Will you recommend [X] to a friend?
Lower-funnel Action intent How likely are you to consider [X]?
Just as we shouldn't use proxies such as clicks or likes for brand metrics, we also
shouldn't use proxies for sales metrics.
Brand intent is a proxy for sales, and it can be a misleading metric to use. Research shows
that there's no strong correlation between intent to purchase and actual sales. If your client
asks about sales, it may therefore be more effective to measure through our sales solutions.
Remember, though, that lower-funnel metrics can be harder to shift, and it might take more
time to make a difference.
UPPER-FUNNEL: Ad memorability and recall, Brand memorability and recall, Awareness
MID-FUNNEL: Familiarity, Favorability
LOWER-FUNNEL: Consideration, Intent

Components of Brand Lift test results


 The measurement period
 The overall results and benchmarks for campaigns in your region and vertical
 Lift results organized by polling question
 Campaign results (the campaign details) and creative overview
 The results for the desired responses organized by demographic results and respective test
audience size
At the top, you have the measurement period for the test.
Then you have the overall results, which include the brand lift percentage and cost per brand
lift for this test. This also includes benchmarks for campaigns in your region and vertical.
Remember that the benchmark results are specific to either the country or the vertical, but not both. For example, a report for the technology vertical in Asia Pacific compares the results to all tests in Asia Pacific and to tests for the technology vertical worldwide, not to technology tests in Asia Pacific specifically.

BRAND LIFT is the number of people you reached with your campaign multiplied by
your brand lift percentage.
In this case, 2.65 million times 11% equals 291K more people who would give the desired
response in the test group compared to the control group.

COST PER BRAND LIFT shows the estimated cost of each additional person who gave
the desired response in the test. The brand lift percentage is the difference in the percent of
people who submitted a desired response in the test group as compared to the control group.
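Both metrics reduce to simple arithmetic. The sketch below reuses the 2.65 million reach and 11% brand lift from the example above; the spend figure is hypothetical.

```python
def brand_lift_people(reach, lift_pct):
    """Additional people who gave the desired response because of the campaign."""
    return reach * lift_pct

def cost_per_brand_lift(spend, reach, lift_pct):
    """Estimated cost of each additional person who gave the desired response."""
    return spend / brand_lift_people(reach, lift_pct)

# The example from the report: 2.65M people reached, 11% brand lift
lifted = brand_lift_people(2_650_000, 0.11)   # ~291,500 people (the report's ~291K)

# With a hypothetical spend of $150,000:
print(cost_per_brand_lift(150_000, 2_650_000, 0.11))
```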

The breakdown also includes campaign results (the campaign details) and a creative overview. The results for desired responses are broken down by demographic and by the respective test audience size.

Potential outcomes
Positive results
 Continue with your strategy.
 Increase your budget and retest if you want to expand your campaign.
Negative results
 Adjust your strategy, consider optimizing creative and rerun the test.
 Refer back to the test setup checklist for ways to optimize your campaign and
measurement practices.
A few things to remember:
 Wait until the end of the test to evaluate results.
 A confidence level of 90% or greater indicates a reliable result.
 Each polling question needs at least 250 responses before we can show you your lift results.

To promote a new product, a large sneaker brand plans to launch a national TV campaign along with an online campaign across a few digital platforms, including Facebook. The TV campaign runs through April, then the Facebook campaign starts the second week of April and ends the first week of May.

The brand team wants to determine the effectiveness of Facebook in addition to other
media. They also want to know whether the national TV exposure might contaminate a
Brand Lift test on Facebook, particularly because both campaigns target people ages 18–
34.

How could the national TV exposure interact with the Brand Lift test on Facebook?

Some people in the control group might see the ads on TV, which would lead to an
underestimation of the incremental effect of the Facebook campaign.
TV exposure can't contaminate the test, because the Brand Lift test uses randomized
control trial methodology.
Some people in the control group might see the ads on TV, which would lead to an
overestimation of the incremental effect on the Facebook campaign.
TV exposure can’t contaminate the test, because the audiences of TV and digital
platforms are different.

A large quick service restaurant brand launches a new holiday-themed dessert. Its
marketing team runs an always-on campaign on Facebook to maintain brand awareness.

The marketing team launches a two-week campaign to direct traffic to a landing page
that offers a discount on the dessert during the holiday season. The discount campaign
runs during the two weeks prior to the holiday.

The team runs Brand Lift tests every quarter to measure the effectiveness of their
always-on campaign. They also plan to run a Conversion Lift test to determine how
many incremental coupon downloads the discount offer campaign generated. They set
up the Conversion Lift as a nested test within the Brand Lift test setup.

Which rationale should this team use to document that the conversion lift test results are
unbiased?

The nested setup accounts for possible contamination across studies.


The discount offer campaign is unlikely to reach the same audience as the always-on
campaign.
The Conversion Lift test measures a different KPI than the Brand Lift test.
The Brand Lift test only polls 1,000 respondents, so the discount offer campaign
likely won’t affect it.

Key takeaways
 Without a well-formed hypothesis, any experiment or data analysis lacks the necessary focus to be useful and actionable.
 Once you have a hypothesis, you can determine the measurement approach that proves
or disproves it.
 When you design a test, identify the KPIs for each test and ensure that they align with
your business goals.

Sources
 About Experiments
 Best Practices When Getting Started With Experiments
 About A/B Testing
 About Facebook Conversion Lift Tests
 About Brand Survey Tests
