3.1. Data Gathering & Analysis​
3.2. Defining the hypothesis
3.3. Prioritizing hypotheses
3.4. Creating variant
3.5. Running the A/B test
3.6. Measuring the results
3.7. Implementing the variant
4. CASE STUDY: Where to start A/B testing

A/B/n test
The scientific approach to comparing two or more page versions to identify the
best-performing one.

Sample size
The number of users required to conduct the A/B test to reach statistical significance.

Hypothesis
An assumption that page changes will result in its performance improvement.

Statistical significance
Statistical proof that A/B test results are not due to random chance.

Control
The existing version of your website's page that you want to improve.

Heatmap
A visual representation of user interactions with the website.

Variant
An improved version of the website's page based on the hypothesis.

Heuristic analysis
Review of the website by CRO analysts to identify problems users might face.

Over the last decade, successful eCommerce businesses have been perfecting their
websites using lean process optimization as part of their company strategy. For many of
them, A/B testing has proven to be a trusted go-to technique.

77% of companies perform A/B tests on their websites

71% of companies run two or more A/B tests per month

60% of companies believe A/B testing is highly valuable for conversion rate optimization

Source: Invesp​​

In order to maintain a competitive edge, most online stores and apps constantly find
themselves looking for ways to improve their customer experience. At the end of the
day, this is essential for enabling conversions and, ultimately, increasing sales.
In this guide, we dive into the benefits of website optimization using A/B testing, and
go over the main steps necessary to set up an effective testing procedure. Gather
insights on A/B/n testing to make sure you stay ahead of the competition.

"Our success at Amazon is a function of
how many experiments we do per year,
per month, per week, per day…"
Jeff Bezos, CEO at Amazon​​

"If it can be a test, test it. If we can’t test it,
we probably don’t do it."
Stuart Frisby,​

A/B testing is a scientific approach to comparing two or more page versions in order to
identify the best-performing one.
Control (Variant A) is the current version of a webpage you are looking to improve.​
Variant (Variant B) is the improved page based on the hypothesis defined earlier.​​
The variant includes any changes to the page you need to test – for example, changes
in the copy or functionality alterations. In the best-case scenario, you should be able to
attribute KPI increase or decrease to a specific change on the page. Therefore, it is
recommended to test different versions of the same element: one headline option vs.
the other, video vs. image, two locations of one element on the page, etc.​

The basic concept of A/B testing​:

When the A/B test is set up, your website shows two versions of the same page
simultaneously. Site visitors randomly see either control or variant on their device, and
user interaction with both variants gets recorded. Later this data can be analyzed in
order to identify which version performed better.

As soon as visitors open the page, they enter the experiment. Their browser cookies are
registered in the system along with the information of which page version was shown to
them. This ensures that during the experiment users always see the same page version
as they saw the first time they visited the website.
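
As an illustration only, the sticky assignment described above can be sketched in a few
lines. This is not how any particular testing tool works internally: instead of storing
the shown version in a cookie, the sketch derives it from a hash of a visitor identifier,
which gives the same effect (one visitor, one version). The names `assign_variant`,
`visitor_id`, and `experiment` are hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str,
                   variants=("control", "variant")) -> str:
    """Deterministically map a visitor to one version of the page.

    The same (experiment, visitor_id) pair always yields the same version,
    so a returning visitor keeps seeing what they saw on the first visit.
    """
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the hash as a large integer and pick a bucket from it.
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Hypothetical usage: the visitor ID would normally come from a first-party cookie.
first_visit = assign_variant("visitor-123", "checkout-redesign")
return_visit = assign_variant("visitor-123", "checkout-redesign")
```

Because the hash spreads identifiers evenly, traffic is split roughly in half between
control and variant without any shared state.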

Businesses often find themselves asking the same question: “Why should I
invest in A/B testing?” In fact, there are several benefits of introducing A/B
testing to your optimization program:

Reduce risks
The introduction of radical changes to your website can sometimes cause more harm
than good. A/B testing helps ensure that changes which cause a drastic KPI decrease
are not implemented.

Gain knowledge
Each new A/B test brings additional data about your website visitors. This helps
understand how users respond to certain page elements and makes it easier to come
up with more robust and data-driven optimization hypotheses.

Exclude HiPPO effect

A common trend among businesses is to make decisions based on the highest-paid
person’s opinion (HiPPO effect). Even though on many occasions this may be justified,
human beings are vastly influenced by irrational factors such as biases, previous
experience, and intuition. This, in turn, can affect the ability to make the best
choice on the spot. With A/B testing, decisions are more likely to be based on
science rather than gut feeling.

Motivate co-workers
Company employees may grow reluctant to generate and share ideas if they don’t see
them implemented over time. A/B testing allows you to safely test even the craziest
ideas, and let the data determine which ones will be applied. Thus, everyone feels
involved and heard.

Further on, we'll walk you through the most important steps that you need to take in
order to run successful A/B tests.
Continual eCommerce business growth relies heavily on how well your processes are
aligned. Testing multiple random ideas at once will hardly result in continuous
improvement of your website and growth of your business. At the same time, a clear
and well-considered optimization workflow can and often does bring a significant KPI

Representation of A/B testing process:

The A/B testing process can be divided into 6 cyclical steps:

1. Gathering & analysing the data,
2. Defining the hypothesis,
3. Prioritizing hypotheses,
4. Creating variant,
5. Running the A/B test,
6. Measuring the results.
Step #7, implementing the variant, is taken based on the results of each A/B test you
perform. This means that the changes are only implemented if they are approved and
proven to be effective.
In the following chapters, we'll dig deeper into each step.

STEP 1: Data Gathering & Analysis
Before proceeding with any optimization measures, you need to make sure that you
have the data to guide your efforts. This includes understanding who your users are,
how they interact with the website, and what the main optimization opportunities are
(i.e., where you’re losing money).

You can achieve this through quantitative and qualitative data gathering and analysis.
Quantitative data allows you to identify various bottlenecks across your website. To
get your hands on this information, you first need to set up a digital analytics tool,
such as Google Analytics, Adobe Analytics, Yandex Metrica, or any other tool that fits
your needs. Next, you need to analyze the data collected in your current digital
analytics tool.
In general, to achieve accurate results, we recommend performing analysis on the
following data set:
Page reports – what are the most popular pages on the website?
Demographic reports – who are your users?
Technology reports – what technologies are they using? What devices should you
account for?
Top events – if you have custom events configured, analyzing them can give you a great
insight into how prospects use your website.
Our personal favourite report to analyze is the shopping behaviour report. It shows the
shopping stage in which most of your visitors leave the website.

Shopping behaviour report:

TIP: Analyse data using different segments, splitting your users into smaller groups to
identify specific patterns, e.g., separate data for each device type using segments.
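
The drop-off figures that a shopping behaviour report surfaces can also be computed by
hand from exported step counts, which is handy when you want to repeat the calculation
per segment. The sketch below assumes you already have the number of sessions that
reached each funnel stage; the stage names and all numbers are made up for illustration.

```python
def funnel_drop_off(steps):
    """Compute the share of sessions lost between consecutive funnel stages.

    steps: list of (stage_name, sessions) ordered from the first to the last stage.
    Returns a list of (from_stage, to_stage, drop_off_rate) tuples.
    """
    results = []
    for (name_a, count_a), (name_b, count_b) in zip(steps, steps[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        rate = 1 - count_b / count_a if count_a else 0.0
        results.append((name_a, name_b, round(rate, 4)))
    return results


# Hypothetical counts for one segment (e.g. mobile sessions only).
mobile_funnel = [
    ("All sessions", 10000),
    ("Product views", 6000),
    ("Add to cart", 1500),
    ("Checkout", 600),
    ("Transactions", 300),
]

drop_offs = funnel_drop_off(mobile_funnel)
# The transition with the highest drop-off is the first candidate for a closer look.
worst_transition = max(drop_offs, key=lambda row: row[2])
```

Running the same function on the counts for each device type makes it easy to see
whether a bottleneck is specific to one segment.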

After completing the quantitative analysis and identifying the main areas of
improvement, it is time to proceed with the qualitative data research.
Qualitative data helps you see what kind of problems visitors face on the website.
This research will depend on resource availability. Some of the most efficient ways to
gather this data are as follows:​
Watching session recordings
Analysing heatmaps
Conducting user surveys
Conducting user tests
Conducting interviews with customer care center workers
Performing heuristic analysis.

STEP 2: Defining the hypothesis

The data that you have gathered and analyzed in the previous step should give you an
idea of what is stopping your website visitors from making or closing the purchase.
These assumptions should be turned into hypotheses that describe possible solutions.
Hypotheses are commonly defined using the following formula:​
By implementing change A, we expect metric X to increase/decrease.​
​For example, the hypothesis could be:
Due to the low cart-to-checkout rate, we assume that if users see a prominent notification
that the product was added to the cart, they will proceed to complete the purchase, hence
cart-to-checkout rate and conversion rate could increase. ​
The hypotheses you have come up with will be A/B tested at a later stage. Given that
you have multiple hypotheses it is important to decide which ones are more likely to
bring higher ROI. These should be prioritized and tested during the first iteration.

STEP 3: Prioritizing hypotheses
Here comes the interesting part. Let’s assume you have come up with 10 hypotheses
for your website optimization. In order to prioritize them and decide which ones to test
first, you need to rank them 1 to 10.
Note: You need to verify that your hypothesis is valid for A/B testing, i.e. it can generate
300 – 400 conversions per variant, per segment that interests you during the
experiment. If this threshold isn’t met, running the test might be ineffective, as there is
a high chance of getting false-positive results.
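
A rough way to check this threshold before committing to a test is to estimate how many
conversions each variant can collect in the planned time frame. The sketch below is a
back-of-the-envelope calculation, not a substitute for a sample size calculator; the
function names and the example traffic figures are hypothetical, and the default
threshold uses the lower bound of 300 conversions mentioned above.

```python
def expected_conversions_per_variant(weekly_visitors, conversion_rate,
                                     weeks, n_variants=2):
    """Estimate conversions per variant, assuming traffic is split evenly."""
    return weekly_visitors * conversion_rate * weeks / n_variants


def is_testable(weekly_visitors, conversion_rate, weeks,
                n_variants=2, min_conversions=300):
    """Return True if each variant is expected to reach the conversion threshold."""
    expected = expected_conversions_per_variant(
        weekly_visitors, conversion_rate, weeks, n_variants)
    return expected >= min_conversions


# Hypothetical page: 20,000 visitors per week, 2% conversion rate, 4-week test.
busy_page = expected_conversions_per_variant(20000, 0.02, 4)
# Hypothetical low-traffic page: 2,000 visitors per week, same rate and duration.
quiet_page_ok = is_testable(2000, 0.02, 4)
```

If the estimate falls short, the options are a longer test, a page with more traffic,
or a bolder change that needs fewer conversions to show a difference.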
While there are multiple prioritization methods online, below we will focus on one
particular approach that has proven to work reliably, including for our team.
The approach consists of 2 steps:
•​​Funnel-based prioritization
•​​Evidence-based prioritization.

Funnel-based prioritization

First, you need to determine which of the shopping funnel steps each hypothesis is
related to. eCommerce websites usually comprise the following 5 steps:​
1. Landing page (most likely Homepage)
2. Category page
3. Product details page
4. Shopping cart
5. Checkout.
Hypotheses that are the closest to the end of the funnel are of the highest priority.

Funnel-based prioritization is significant in identifying the highest potential ROI of the
test hypothesis. The deeper your prospects are in the shopping behavior funnel, the
more interested they are in a product, and the more likely they are to make a purchase.

For example, you might have a hypothesis about the Homepage (1st step in the
shopping behavior funnel) – it'll have the fifth priority, whereas a hypothesis related to
Checkout (the last step in the funnel) will have the first priority.​​

Funnel-based prioritization table example:

Hypothesis: By making the Checkout form enclosed, i.e., isolated, more users will
complete the purchase, hence conversion rate will increase.
Funnel step: Checkout - 5th step
Priority: 1

Hypothesis: By changing CTA "Proceed to Checkout" into a more noticeable color, more
prospects will proceed to Checkout, hence cart-to-checkout rate will increase.
Funnel step: Shopping cart - 4th step
Priority: 2
Evidence-based prioritization

One shopping funnel step can have several related hypotheses. Thus, additional
prioritization may be necessary to rank hypotheses within the same funnel step. This is
where the so-called evidence-based prioritization comes in handy.

Evidence-based prioritization is inspired by PXL methodology, and the idea is to
determine and prioritize the hypotheses that have the most data to back them up. By
contrast, hypotheses that are based mainly on intuition should be tested last.

Evidence-based prioritization consists of 7 questions:

1. Is the hypothesis confirmed by quantitative data analysis?
2. Is it confirmed by heuristic analysis?
3. Is it confirmed by user tests?
4. Is it confirmed by user surveys/polls/feedback?
5. Is it confirmed by session recordings?
6. Is it confirmed by heatmap analysis?
7. How many hours does it take to develop the test?
The first 6 are Y/N questions, and the hypothesis receives 1 point for every ‘Yes’ and 0
points for every ‘No’. The 7th question is answered by calculating hours: up to 4h – 3
points, up to 8h – 2 points, up to 16h – 1 point, more than 16h – 0 points. The
hypothesis that receives the highest score should be tested first.
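
The two prioritization steps can be combined into a small scoring script. The sketch
below follows the rules described above (funnel position first, then the 7-question
evidence score), but the data layout, the function names, and two of the three example
hypotheses are assumptions made for illustration.

```python
FUNNEL_STEPS = [
    "Landing page",
    "Category page",
    "Product details page",
    "Shopping cart",
    "Checkout",
]


def funnel_priority(step):
    """Checkout (last step) gets priority 1, Landing page gets priority 5."""
    return len(FUNNEL_STEPS) - FUNNEL_STEPS.index(step)


def effort_points(dev_hours):
    """Question 7: fewer development hours earn more points."""
    if dev_hours <= 4:
        return 3
    if dev_hours <= 8:
        return 2
    if dev_hours <= 16:
        return 1
    return 0


def evidence_score(confirmations, dev_hours):
    """Questions 1-6 score 1 point per 'Yes'; question 7 adds the effort points."""
    return sum(1 for confirmed in confirmations if confirmed) + effort_points(dev_hours)


def prioritize(hypotheses):
    """Sort by funnel priority first, then by evidence score (highest first)."""
    return sorted(
        hypotheses,
        key=lambda h: (funnel_priority(h["step"]),
                       -evidence_score(h["confirmations"], h["dev_hours"])),
    )


# Hypothetical backlog; each 'confirmations' list answers questions 1-6 in order.
backlog = [
    {"name": "Homepage banner copy", "step": "Landing page",
     "confirmations": [True] * 6, "dev_hours": 2},
    {"name": "Enclosed checkout form", "step": "Checkout",
     "confirmations": [True, True, False, False, True, False], "dev_hours": 10},
    {"name": "Checkout trust badges", "step": "Checkout",
     "confirmations": [True, True, True, False, True, True], "dev_hours": 3},
]

ranked_names = [h["name"] for h in prioritize(backlog)]
```

Note that the homepage hypothesis ends up last despite having the highest evidence
score, because funnel position is applied before the evidence ranking.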

STEP 4: Creating variant
Now that you’ve ranked and prioritized your hypotheses, you are ready to start
preparations for the test. Test the hypotheses with the highest scores first.

Create variant

As you are developing the variant, make sure the design changes are communicated
clearly enough. All parties involved – stakeholders, developers, etc. – should be on the
same page as to what the new design looks like.

Implement the test

When the design is ready and approved, you can proceed with implementing the test.
Most tests are executed using ready-to-use A/B testing tools, and there are multiple
testing tools available on the market. Overall, the implementation of the test will
depend, among other things, on its difficulty, the resources available, and the tools
you are using to run the test.

Top A/B testing tool platforms used across the Internet:

MVT example:

STEP 5: Running the A/B test

At this point, you should have identified both the type of test you are planning to run,
as well as the testing tool you are going to use. The only two things left to do before
you launch the test are calculating test duration and sample size.

A/B test duration is the timeframe recommended for running the test in order to collect
the data about each version’s performance. The requirements it has to meet are as
follows:
1. no less than one business cycle long (for most eCommerce stores it is one week)
2. last long enough to collect the necessary sample size.

A/B test sample size is the number of participants required to make valid decisions
about the results of the experiment.​

Calculating both of these values is fairly straightforward and can be done using any of
the free A/B test calculators. One of the better options is CXL’s sample size calculator,
but there are multiple others available online.
Calculating this will let you know exactly when you have enough data to end the test
and start analyzing the results. Now, feel free to launch the test and wait for the results.
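
To give a feel for what such a calculator does, here is a minimal sketch based on the
standard sample size formula for comparing two conversion rates with a two-sided test.
It is an approximation under stated assumptions (95% confidence, 80% power, even
traffic split); individual online calculators may use slightly different formulas and
return somewhat different numbers. The function names and the example inputs are
hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_mde, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect a relative uplift.

    baseline_rate: current conversion rate of the control, e.g. 0.03 for 3%.
    relative_mde: smallest relative change worth detecting, e.g. 0.10 for +10%.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_in_weeks(n_per_variant, weekly_visitors, n_variants=2, min_weeks=1):
    """Weeks needed to collect the sample, never shorter than min_weeks."""
    weeks = math.ceil(n_per_variant * n_variants / weekly_visitors)
    return max(weeks, min_weeks)


# Hypothetical inputs: 3% baseline conversion rate, +10% relative uplift,
# 30,000 visitors per week on the tested page.
needed = sample_size_per_variant(0.03, 0.10)
weeks = duration_in_weeks(needed, 30000)
```

The `min_weeks` parameter encodes the business cycle requirement: even if the sample
is collected sooner, the test should not be stopped before the minimum duration.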

STEP 6: Measuring the results
After the test has run for at least two business cycles and reached the required sample
size, it is time to close the test and evaluate the results.
Most A/B testing tools have built-in reporting capabilities. However, these standard
reports often lack the capacity of in-depth and segmented analysis. This makes it
difficult to evaluate test results correctly and make the right conclusions.
An important side note is that A/B test result evaluation is rooted in statistical analysis.
While there are multiple tools that allow automating parts of the calculation, having an
understanding of at least the basics of statistics definitely helps. Let’s take a look at
some of these core concepts.

Statistical significance

Statistical significance is the statistical evidence that A/B test results are valid and
not the result of random chance. The confidence level reflects how confident you want
to be in your test results. In A/B testing it is common to set the confidence level to
95%, which corresponds to a 5% significance level, i.e., accepting a 5% chance of
declaring a difference that is actually due to chance.

​Statistical significance depends on two factors:

Sample size
As covered before, this is the number of users who participated in the experiment.
The bigger the sample size is, the more confident you can be about the results.
Minimum detectable effect
This is the minimum difference between each variant’s performance within the
observed experiment. A greater difference means that you can be more confident
about your observations. Additionally, it requires a smaller sample size to estimate
the results.

Calculating statistical significance is a two-fold process. On the surface level, there are
multiple ready-to-use online calculators that will evaluate test results and calculate
whether statistical significance has been reached. Nevertheless, it takes additional
expertise to see correlations and draw conclusions from numerical data. As a rule, in
the Web/eCom context, this is a responsibility of a dedicated CRO team.
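
For readers curious about what a significance calculator might compute, the sketch
below uses a pooled two-proportion z-test, one common way to compare two conversion
rates. It is an illustration, not the method of any specific tool, and the visitor and
conversion counts in the example are made up.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, p_value) for the difference between two conversion rates.

    conv_a, n_a: conversions and visitors for the control.
    conv_b, n_b: conversions and visitors for the variant.
    Assumes both groups have visitors and the pooled rate is not 0 or 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value: chance of a gap at least this large if there is no real effect.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def is_significant(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """True if the result passes the 95% confidence threshold (alpha = 0.05)."""
    _, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    return p_value < alpha


# Hypothetical results: 10,000 visitors per version.
clear_winner = is_significant(500, 10000, 600, 10000)   # 5.0% vs 6.0%
too_close = is_significant(500, 10000, 510, 10000)      # 5.0% vs 5.1%
```

With the same sample size, the larger difference passes the threshold while the smaller
one does not, which is the relationship between sample size and minimum detectable
effect in practice.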

Usually, within the test, several objectives are tracked. Let's look at a real example.
We ran an A/B test recently for the 2-step checkout, with the following hypothesis:
“Many users assume that the upsell block shows unexpected products in their cart,
therefore by removing the upsell block from the Checkout more users could proceed to
the 2nd step (Review) and therefore more users might complete a purchase.”

Control and variant of the experiment:

For this hypothesis, we tracked the 3 following objectives:

1st-to-2nd step
Results in Google Optimize showed that in terms of transactions the variant performed
better, but it wasn't enough to make the right decision. There can be a case when the
number of transactions increases due to removing upsell, but the revenue actually
decreases, as there's a risk of decreasing avg. product quantity in a transaction, when
removing the upsell.​
Hence, each objective should be evaluated thoroughly before drawing any conclusions.
However, evaluating only the objectives you set before the experiment is still not
enough to draw correct conclusions.

It's important to complete the segmented analysis. There might be several specific
segments that your test should be analyzed under. In addition to those, you should
always analyze A/B test results for each of the device categories separately, since what
worked brilliantly on desktop could look bad or not work at all on mobile, and vice versa.

To sum up, it's essential to always evaluate all objectives based on device type and any
other test-specific segments.
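
A segmented evaluation can be as simple as repeating the same comparison for each
device category. The sketch below assumes the results have already been exported as
conversions and visitors per segment and version; the data structure, the function
name, and all numbers are hypothetical.

```python
def uplift_by_segment(results):
    """Compute conversion rates and relative uplift of the variant per segment.

    results: {segment: {"control": (conversions, visitors),
                        "variant": (conversions, visitors)}}
    """
    report = {}
    for segment, groups in results.items():
        control_conv, control_n = groups["control"]
        variant_conv, variant_n = groups["variant"]
        control_rate = control_conv / control_n
        variant_rate = variant_conv / variant_n
        report[segment] = {
            "control_rate": control_rate,
            "variant_rate": variant_rate,
            # Positive means the variant converts better than the control.
            "relative_uplift": (variant_rate - control_rate) / control_rate,
        }
    return report


# Made-up numbers in which the variant wins on desktop but loses on mobile.
device_results = {
    "desktop": {"control": (200, 5000), "variant": (250, 5000)},
    "mobile": {"control": (150, 5000), "variant": (120, 5000)},
}

device_report = uplift_by_segment(device_results)
```

In this made-up case the overall result would look like a small win, while the
per-device view shows that rolling the variant out on mobile would hurt conversions.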

Result visualization
At this step, you should have completed your test, evaluated the data, discovered the
results of the experiment, and drawn conclusions. But before you can proceed with the
implementation, you need to make sure that the changes are validated and accepted by
various parties, such as stakeholders, developers, and other teams involved in the
process. This means that the results of the experiment have to be clearly
communicated. This is done best with the help of data visualization tools.
At first glance, showing multiple rows of data might look credible and acknowledge the
amount of analysis you did, but it will be very hard to understand what this data
actually means.
Consider the two report examples below:
While table data representation is commonly used and looks credible, it lacks
readability, making it difficult to make sense of on the fly.

Result visualization using Excel:

The second example presents the same information but does so in a much more
comprehensible manner. Another benefit of such a report is that it can serve the
purpose of test documentation over time. This makes it easy for all parties involved to
go back to it and see what and when was tested and what outcomes were achieved.

This particular example was created using Google Data Studio. We have prepared a
detailed guide on how to create live-mode dashboards for A/B test results.

Evaluate monetary gains for the variant and each of the segments during every test
analysis, as it will clearly show how much the variant earned or lost. Additional
revenue is the strongest argument there can be in persuading management to implement
the changes tested with the variant.
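
One simple way to express the monetary gain, assuming revenue and visitor counts are
available per version, is to compare revenue per visitor and scale the difference by
the variant's traffic. This is one possible definition chosen for illustration, not
necessarily the one used in the reports shown here; the function name and the figures
are hypothetical.

```python
def additional_revenue(control_revenue, control_visitors,
                       variant_revenue, variant_visitors):
    """Estimate extra revenue earned (or lost) by the variant during the test.

    Compares revenue per visitor (RPV) so that an uneven traffic split
    does not distort the result. A negative value means the variant lost money.
    """
    rpv_control = control_revenue / control_visitors
    rpv_variant = variant_revenue / variant_visitors
    return (rpv_variant - rpv_control) * variant_visitors


# Hypothetical test: both versions received 10,000 visitors.
gain = additional_revenue(50000.0, 10000, 54000.0, 10000)
```

The same function can be applied to each segment separately to show where the gain or
loss comes from.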

After the test is evaluated and decisions are made, it is time to go back to the first step
of the iterative A/B testing process, i.e., analyze new data, review the hypothesis and
kick off a new test.

STEP 7: Implementing the variant

How can you tell whether the changes tested in the variant are ready to be
implemented permanently?
The answer lies on the surface: you are good to proceed if the test results clearly
suggest that the variant has outperformed the control, and an agreement has been
reached to implement the changes. However, keep in mind that the experiment doesn’t
end here. You still have to make sure you monitor the objectives set during the
experiment, even after the changes are permanently implemented. This will help ensure
that the improvements are real and sustainable, rather than influenced by external
factors during the experiment, such as seasonal demand changes.

There are plenty of lists with 'go-to things to A/B test' out there on the Internet, but be
careful with those. They're good for brainstorming sessions and finding the direction
you'd like to move forward in, but remember to tailor those suggestions to your

To make sure you run valuable tests and to increase the possibility of them being
winning tests, each tested hypothesis should be based on prior data analysis. ​

Hence, to get you inspired, we would like to share case studies of the A/B tests​we've

A/B test #1: Removing upsell block in the Checkout


During quantitative analysis for one of our clients, we noticed a very high drop-off from
the Shopping cart. Based on the session recordings and heatmap analysis, we
concluded that drop-off this high could be caused by the upsell block in the cart, as we
saw many users trying to remove products from there, assuming these are unexpected
and unwanted products in their Shopping Cart. ​​


Due to the high cart abandonment rate and heatmap analysis, we can conclude that
many users assume that upsell block shows unexpected products in their cart, therefore
by removing the upsell block and adding a progress bar to the Checkout, more users
could proceed to the 2nd step (Shipping) and therefore more users might complete a
purchase. ​


- Removed cart-like upsell block

- Added progress bar
- Hid promo code under expandable field
- Added “Proceed to Payment” button.


Cart-to-Checkout rate, Transactions, Revenue.​


The new Checkout design had a 7.25% higher conversion rate, and this resulted in
USD 31,798.26 higher revenue in 3 weeks. You can see a chart with the number of
transactions per variant below.

Transactions by date and variant:

Control and variant of the A/B test:​

A/B test #2: Add-to-cart notification

During quantitative analysis for one of our clients, we noticed a very high cart
abandonment rate on mobile devices. Based on heuristic analysis results, we concluded
that prospects do not receive a prominent confirmation that a product was successfully
added to the cart when shopping on mobile devices.


Due to the high cart abandonment rate on mobile devices, we expect that displaying
the pop-up on mobile devices will inform customers of the product being successfully
added to the shopping cart. We expect customer motivation to purchase the product
to increase, therefore Cart-to-Checkout rates on mobile devices could increase.


After users add the product to the cart on tablet or mobile, they receive a pop-up
confirmation that the product was added to the cart.


Cart-to-Checkout rate, Transactions, Revenue.


​The pop-up notification about products being successfully added to the cart had an
8.37% higher Cart-to-Checkout rate.

Cart-to-Checkout rate by date and variant:

Control and variant of the A/B test:​​

When introducing the iterative A/B testing process to your optimization program, you
should be prepared for 8 out of 10 tests to lose or make no difference. But this
definitely should not stop you: the more tests you run (even if some are not
successful), the more valuable knowledge you will gain about your users, and eventually
you will be able to turn it into valid hypotheses that will most often win.

We've covered the topic and process of A/B testing, now let's answer the questions
we are frequently asked:
1. What websites qualify for A/B testing?
Generally, any website can qualify for A/B testing. Before running the test, you
need to calculate whether your website will be able to reach the required
sample size in a reasonable time frame (you don't want the experiment running for
3 months). Hence, if your website doesn't receive high volumes of traffic yet,
you may test the pages with the most traffic and test big changes. As
discussed previously, the bigger the difference in the test objective, the smaller
the required sample size.
If your website has extremely low traffic and there aren't at least 350 - 400
conversions made on your website per variant, you should consider re-evaluating
and investing in your traffic acquisition campaigns. Check out our
SEO program offering ways to increase organic traffic that converts.
2. What is the cost of running A/B tests?
​The cost of running A/B tests consists of 4 variables:
TOOL - the A/B testing tool (can be free, e.g., Google Optimize)
CONVERSION STRATEGIST - hours required ​for prioritization, pre & post-test
analysis, results evaluation, A/B test management;
DESIGNER - hours required ​for creating variant designs (not always needed)
DEVELOPER - hours required for implementing the test by a developer (not
always needed)​.​
Nevertheless, you should be ready for hidden costs that can appear if your
variant performs worse than the control. You can attribute this cost to
insurance that negative changes will not be permanently implemented. ​

​3. Can I run several tests at once?​​
It's not recommended to run multiple tests on different pages simultaneously,
as traffic can overlap, which can skew the results. You can run multiple tests at
once only if you can ensure that the participants of each experiment won't
overlap and you'll be able to properly analyze the results.
4. Are A/B tests exclusive to websites?
Not necessarily, you can run an A/B test wherever you want. You can A/B test
email campaigns, PPC campaigns, social media campaigns, and most likely any
other marketing campaign you have in mind.
5. Can A/B testing hurt SEO?
It's quite a popular concern that A/B tests can hurt your SEO due to duplicate
content, but let us assure you that it is only a myth! When running a split URL
test, just add "noindex" and rel="canonical" to the variant to avoid Google
assuming you have duplicate content on the website.
6. How to get started with A/B testing?
​You'll need to put into practice the A/B testing process we described in this
eBook. Identify the phase you're at and begin continuously running different
experiments. If you need any advice or assistance, feel free to get in touch with
us at​
to get inspired for your first A/B test.

