Distribution of Returns PDF Guide

DISTRIBUTION OF RETURNS
QUICK REFERENCE GUIDE
Institute of Trading & Portfolio Management

Introduction
The following PDF provides a quick reference guide to the steps required to complete a Distribution of
Returns analysis on any asset you can find the underlying price data for. This guide will focus on the more
practical elements of obtaining data from web sources and then building out the analysis within
Microsoft Excel. For some statistical background, please refer to the Basic_Statistics_Guide PDF in Video
2. The steps provided in this document are to remind you of the overall procedure required, and are not
a comprehensive guide. For more detail on these steps and how to analyse and interpret the data, make
sure to watch the accompanying video in full.
Obtaining the Data

The first step in a Distribution of Returns (DoR) analysis is always to obtain the data. One of the easiest
places to (currently) obtain free US Stock data from is Yahoo Finance and so that is the example used
both here and in the accompanying video. However, you should note that there are many different data
sources you could use, both free and paid for, and the methodology you subsequently use will be almost
identical once you have downloaded the data and exported it to Excel.
For Yahoo Finance the steps are as follows:
1. Navigate to www.Finance.Yahoo.com on your web browser.
2. Enter the asset name (or ticker) you are searching for in the Search bar at the top of the
homepage, and click on the search icon or press enter on the keyboard. This takes you to the
summary page for that asset.
3. From that page, select the Historical Data tab and choose the parameters relevant to your
analysis, including the time period you want to obtain data for and the
frequency of the price data (e.g. Daily, Weekly, Monthly).
4. After you have chosen the options you wish, just click Apply followed by Download.
5. With the data downloaded, you will be able to open the file in Excel and then start your analysis.
Cleaning the Data
When you open the downloaded file (from Yahoo Finance) in Excel, you will see 7 columns of data –
columns A to G. Generally, there will be a few things to do straight away before working on any
calculations:
1. Adjust any column widths so that all the data is visible and doesn’t appear as #### symbols.
• To do this, just click and drag the intersection between columns to adjust them to the
desired width.
2. Delete the Volume column if applicable, as it is unnecessary for a DoR Analysis.
• To do this, select the entire volume column by right-clicking the column letter and then
selecting delete from the drop-down menu.
3. Arrange the data in descending order (meaning the most recent price data is at the top of the
spreadsheet).
• To do this, a filter needs to be added to the column headings by clicking on any one of
the headings and navigating to Data on the ribbon, followed by Filter. Once the filters
are added across all columns, the Date column filter can be used to sort the data from
Newest to Oldest.
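For readers who would rather script the clean-up, the same steps can be sketched in Python using only the standard library. The sample rows are made up, and the column names (Date, Open, High, Low, Close, Adj Close, Volume) are an assumption about the layout of the downloaded file rather than something this guide specifies, so check them against your own download.

```python
import csv
import io
from datetime import date

# Hypothetical sample standing in for the downloaded file; the prices are invented.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2024-01-02,100.0,102.0,99.0,101.0,101.0,1000
2024-01-03,101.0,103.0,100.5,102.5,102.5,1200
2024-01-04,102.5,103.5,101.0,101.5,101.5,900
"""


def clean_rows(csv_text):
    """Drop the Volume column and sort the rows from newest to oldest."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        raw.pop("Volume", None)  # not needed for a DoR analysis
        row = {"Date": date.fromisoformat(raw.pop("Date"))}
        # Every remaining column is a price, so convert it to a number.
        row.update({name: float(value) for name, value in raw.items()})
        rows.append(row)
    rows.sort(key=lambda r: r["Date"], reverse=True)  # newest first
    return rows


cleaned = clean_rows(SAMPLE_CSV)
print([row["Date"].isoformat() for row in cleaned])
```

Column widths have no equivalent here, since they only affect how Excel displays the values.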
Calculating Returns
There are various ways you can calculate percentage returns for the purposes of a Distribution of Returns
analysis:
• Close-to-Close returns [daily, weekly, monthly, quarterly data]
o Formula = (Close_period[x+1] - Close_period[x]) / Close_period[x]
• High-to-Low returns [daily, weekly, monthly, quarterly data]
o Formula = (High_period[x] - Low_period[x]) / Low_period[x]
• Open-to-Close returns [daily data]
o Formula = (Close_period[x] - Open_period[x]) / Open_period[x]
All of these methods yield valid “returns” to analyse, despite being structurally different. The main idea in
a DoR analysis is to get an understanding of how volatile the asset has been in the past, both in absolute
terms and relative to other assets. We can then use that as a short-term assumption for asset volatility
going forward. If just one calculation method were to be used it should be the Close-to-Close technique
since it is flexible in its use, can be applied across all assets and timeframes, and yields an unconstrained
distribution (as opposed to High-to-Low returns, which by definition cannot be negative).
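As a rough illustration, the three return formulas above can be written as short Python functions. This is only a sketch: the function names and sample prices are invented for the example, and the lists are assumed to be ordered oldest to newest (the reverse of the sort applied in the spreadsheet) so that period x+1 follows period x.

```python
def close_to_close(closes):
    """Return from each close to the next one; closes run oldest to newest."""
    return [(later - earlier) / earlier for earlier, later in zip(closes, closes[1:])]


def high_to_low(highs, lows):
    """Range of each period relative to its low; never negative."""
    return [(high - low) / low for high, low in zip(highs, lows)]


def open_to_close(opens, closes):
    """Return from the open to the close of the same period."""
    return [(close - open_) / open_ for open_, close in zip(opens, closes)]


# Hypothetical daily prices, oldest first.
opens = [100.0, 101.0, 102.5]
highs = [102.0, 103.0, 103.5]
lows = [99.0, 100.5, 101.0]
closes = [101.0, 102.5, 101.5]

c2c = close_to_close(closes)
h2l = high_to_low(highs, lows)
o2c = open_to_close(opens, closes)
print([f"{r:.3%}" for r in c2c])
```

Note that Close-to-Close produces one fewer value than there are prices, because the oldest period has no earlier close to compare against.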
Assuming you are using daily data and you want to analyse the DoR for all 3 types of returns (as in the
video) then the next steps are as follows:
1. Calculate the Close-to-Close, High-to-Low and Open-to-Close returns in the adjacent columns to
your price data, using the formulas above. Make sure to give the columns appropriate headings
as well so you know which calculations are in which columns.
2. Format the returns so that they display as percentages.
• To do this, select the returns columns and right-click within the selection. From the
drop-down menu, go to Format Cells and choose the Percentage category followed by
typing in the number of decimal places you would like the values to be rounded to
(usually 2 or 3 is fine).
Histogram Table and Chart

This is the stage at which the analysis really starts, since we are working out the distribution of the
returns we have calculated. What that means is that we split the returns into various ranges known as
“bins” and count the number of data points that lie within those ranges. For example, if you were to just
choose two “bins” for Close-to-Close returns you might define 1 range as anything above 0% and another
as anything below 0%. You could then quickly understand how many observations in the data are positive
and negative. By creating many “bins” you can get an even deeper understanding of the data you are
looking at and plot a distribution of that data which allows you to visually interpret the frequency with
which observations have been within certain ranges in the past.
From this point onwards we will just assume that we want to analyse the DoR for Close-to-Close returns.
The procedure can be replicated for other forms of returns using the same methodology.
The first goal at this stage is to create a Histogram Table, and from there you can create a Chart to
visualize that data. Creating a histogram table requires a number of steps in Excel:
1. Choose intervals that define the Bin Ranges you want.
• For example, do you want to count all the data points that lie within 1% intervals from
-10% to +10%? If so, your intervals need to be -0.1, -0.09, -0.08, …, +0.08, +0.09, +0.1.
• The intervals (and hence bin ranges) you choose will need to reflect the volatility of the
asset you are looking at and might require a little bit of trial and error.
2. Once the intervals are defined, you can create a histogram table based on those intervals from
the relevant set of returns. A simple way to do this is by using the Data Analysis tool in Excel,
which can be found in the Data tab on the ribbon. Within the Data Analysis menu you can choose
the Histogram option and then fill in the form in the following way:
• Input Range – This is the cell range that covers the returns you want to analyse.
• Bin Range – The cell range that covers the intervals you defined.
• Output Range – Choose the cell in the sheet you wish to output the histogram data to.
3. You will see this creates a small two-column table, where the left-hand column holds the intervals
that define the various bin ranges and the right-hand column counts the number of returns
observations that lie within those ranges. At this stage it is important to understand how Excel
actually counts those data points, and this is more easily described with the help of the
screenshot below (taken from the accompanying video):
• You can see that the intervals were originally defined in cells K3:K12 and ranged
from -2% to +2% in steps of 0.5%. From there, the Histogram Data Analysis tool was
used to create the two columns of data adjacent to the intervals. Column M counts the
number of observations that lie within the various ranges defined by our intervals
and Bin ranges. To be clear, this is how it works:
o In cell M4, Excel has counted the number of observations that have a value
of less than or equal to -2%. In cell M5, Excel has counted the number of observations
that have a value of greater than -2% and less than or equal to -1.5%. This
same logic is applied down the rest of column M until you get to the last
count in cell M13, where Excel counts the number of observations that
are greater than +2%.
At this stage if you want to make things clearer it is advisable to create an adjacent column where you
can manually define the ranges that these counts refer to.
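The counting step can also be sketched outside Excel. The Python below is an illustration rather than a reproduction of the Data Analysis tool: it assumes upper-inclusive bins (a value equal to an interval is counted in that interval's row, which is how the Histogram tool is documented to behave) plus a final overflow row for anything above the last interval. The function names, intervals and returns are all made up for the example.

```python
from bisect import bisect_left


def histogram_counts(returns, intervals):
    """Count returns per bin; the last count is the overflow row above the top interval."""
    edges = sorted(intervals)
    counts = [0] * (len(edges) + 1)
    for value in returns:
        # bisect_left finds the first edge >= value, so a value equal to an
        # edge is counted in that edge's bin (upper-inclusive).
        counts[bisect_left(edges, value)] += 1
    return counts


def bin_labels(intervals):
    """Readable range labels, one per count, for the adjacent labels column."""
    edges = sorted(intervals)
    labels = [f"<= {edges[0]:.1%}"]
    labels += [f"> {low:.1%} to <= {high:.1%}" for low, high in zip(edges, edges[1:])]
    labels.append(f"> {edges[-1]:.1%}")
    return labels


# Hypothetical intervals and returns.
intervals = [-0.02, -0.01, 0.0, 0.01, 0.02]
returns = [-0.03, -0.02, -0.015, 0.0, 0.004, 0.01, 0.025]

counts = histogram_counts(returns, intervals)
labels = bin_labels(intervals)
for label, count in zip(labels, counts):
    print(f"{label:>22}  {count}")
```

The labels list plays the role of the manually defined ranges column, so each count can be read against the range it covers.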
Before moving on to expanding the histogram table further by converting the absolute counts to
percentages, it is a good idea to create a summary statistics table in the following way:
1. Select the Descriptive Statistics option from the Data Analysis tool we just used to create the
histogram.
2. Following that, fill in the form by making sure the input range refers to the same set of returns
you are working with for the histogram and then tick the checkbox for Summary Statistics and
choose the desired cell for your output range.
3. Format the descriptive statistics table by adjusting column widths where necessary to make the
data readable. Also, it is a good idea to format the table as follows:
• Re-title the table appropriately (e.g. Descriptive Statistics)
• Mean, Standard Error, Median, Standard Deviation, Sample Variance, Range, Minimum,
Maximum: Percentage to 2 or 3 decimal places
• Kurtosis, Skewness, Sum: Number to 2 or 3 decimal places
• Delete the redundant 0 value at the bottom of the table
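For comparison, most of the figures in the descriptive statistics table can be approximated with Python's standard library. Treat this as a sketch under assumptions: the skewness and kurtosis lines use the sample-adjusted formulas that Excel's SKEW and KURT functions are documented to use, the function name and sample data are invented, and at least four observations are needed for the kurtosis term.

```python
import math
import statistics


def describe(returns):
    """Summary statistics for a list of returns (needs at least 4 values)."""
    n = len(returns)
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation (n - 1)
    z = [(r - mean) / stdev for r in returns]
    # Sample-adjusted skewness and excess kurtosis (assumed to match SKEW and KURT).
    skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    kurt = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )
    return {
        "Mean": mean,
        "Standard Error": stdev / math.sqrt(n),
        "Median": statistics.median(returns),
        "Standard Deviation": stdev,
        "Sample Variance": statistics.variance(returns),
        "Kurtosis": kurt,
        "Skewness": skew,
        "Range": max(returns) - min(returns),
        "Minimum": min(returns),
        "Maximum": max(returns),
        "Sum": sum(returns),
        "Count": n,
    }


# Hypothetical returns, symmetric around zero.
sample = [-0.02, -0.01, 0.0, 0.01, 0.02]
stats = describe(sample)
for name, value in stats.items():
    print(f"{name:>20}: {value:.5f}")
```

If the numbers differ from the Excel table, check first that both are using the same set of returns before suspecting the formulas.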
Once you have run the descriptive statistics analysis tool you can then move back to the histogram table
to calculate percentage frequencies. So, using the example above, we know that there are 1762
observations that lie between -1% and -0.5%, but it would be useful to know what percentage that is of the
entire data set, so that is what should be calculated next. The percentage frequency for each range of
returns can be referred to as the empirical probability of that range of returns occurring. In other words, if we
were to assume that the future distribution of returns for the asset we are looking at is the same as the
historic distribution, then there is an X% probability of observations falling within a given range. To
calculate the probability distribution:
1. In an adjacent column to the Frequency column, calculate probabilities by taking the Frequency
divided by the Total Count (from the descriptive statistics table) for each bin range.
2. Format the values as percentages to 2 or 3 decimal places.
Another way to view these probabilities is in their cumulative form:
1. In an adjacent column to the Probabilities column, calculate the Cumulative Probabilities by
summing the probabilities from the current Bin Range and all the lower Bin Ranges, for every Bin
Range.
2. Format the values as percentages to 2 or 3 decimal places.
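The probability and cumulative probability columns are simple enough to sketch in a few lines of Python. The counts below are hypothetical, and the total is taken as the sum of the counts, which equals the Total Count from the descriptive statistics table as long as every observation lands in one of the bins.

```python
from itertools import accumulate


def probabilities(counts):
    """Empirical probability of each bin: its count divided by the total count."""
    total = sum(counts)
    return [count / total for count in counts]


def cumulative(probs):
    """Running total of the probabilities from the lowest bin upwards."""
    return list(accumulate(probs))


# Hypothetical frequency column, lowest bin first.
counts = [2, 1, 1, 2, 0, 1]
probs = probabilities(counts)
cum_probs = cumulative(probs)

for count, p, c in zip(counts, probs, cum_probs):
    print(f"{count:>3}  {p:7.2%}  {c:7.2%}")
```

A quick sanity check is that the last cumulative value comes out at 100%, allowing for rounding.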
With the Histogram Table and Descriptive Statistics Table complete, now you can move onto charting the
data distribution in a Histogram Chart:
1. Insert a 2D column chart by navigating to Insert on the ribbon and choosing the 2-D Column
Chart option.
2. Move and resize the chart area appropriately.
• Move = Click and Drag the centre of the chart
• Resize = Click and Drag the edges of the chart
3. Add the histogram data to the chart.
• Right-click within the chart area and go to Select Data
• Add Series and within the Series Values box, select the cells that represent the
Frequency data from the histogram table you just created.
• Edit the Horizontal Axis by selecting the Text Range defined previously.
• Delete the chart title.
• Add Axis Labels as appropriate.
o To do this, go to Design on the ribbon, followed by Add Chart Elements, Axis
Titles, Primary Horizontal and Vertical Axis Titles. Write over the default axis
title boxes as desired.
• Various other formatting options can be considered at this stage – see video for some
examples.
Positive and Negative Returns Analysis Table

If you want to analyse things further you can add a positive and negative returns analysis table after
looking at the Histogram data. This table is used to understand the empirical probability of positive and
negative returns observations and also the probability-adjusted average negative/positive returns.
For both Positive and Negative Data Points, we are going to calculate the Average Return, the Count, the
Frequency Percentage (or Probability), and the Frequency (or Probability) adjusted Return:
1. To find the average of a subset of data points, you can use the AVERAGEIF function. In this case we
want to find the average positive and average negative returns in the data set.
• Average positive =AVERAGEIF(Returns Range, ">0") where the Returns Range refers to
the cell range of our returns dataset.
• Average negative =AVERAGEIF(Returns Range, "<0") where the Returns Range refers to
the cell range of our returns dataset.
2. To find the count of a subset of data points, you can use the COUNTIF function. In this case we want
to find the positive and negative return counts in the data set.
• Count positive =COUNTIF(Returns Range, ">0") where the Returns Range refers to the
cell range of our returns dataset.
• Count negative =COUNTIF(Returns Range, "<0") where the Returns Range refers to the
cell range of our returns dataset.
3. Calculate the Frequency % of positive and negative returns.
• For the positive Frequency % divide the positive count by the total count.
• For the negative Frequency % divide the negative count by the total count.
4. Calculate the Frequency Adjusted Return of positive and negative returns.
• For the positive Frequency Adjusted Return multiply the Positive Frequency % by the
Average Positive Return
• For the negative Frequency Adjusted Return multiply the Negative Frequency % by the
Average Negative Return
You can also repeat the process for returns that are equal to 0% for completeness.
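The four columns of this table can be sketched in Python as follows. The function name and sample returns are invented for illustration, and one choice differs from Excel: where no returns match a condition, the sketch reports an average of 0.0 instead of the error AVERAGEIF would give.

```python
def sign_table(returns):
    """Average, count, frequency and frequency-adjusted return for each sign."""
    total = len(returns)
    subsets = {
        "Positive": [r for r in returns if r > 0],
        "Negative": [r for r in returns if r < 0],
    }
    table = {}
    for label, subset in subsets.items():
        count = len(subset)
        average = sum(subset) / count if count else 0.0  # assumption: 0.0 when empty
        frequency = count / total
        table[label] = {
            "Average Return": average,
            "Count": count,
            "Frequency %": frequency,
            "Frequency Adjusted Return": frequency * average,
        }
    return table


# Hypothetical returns, including one that is exactly 0%.
returns = [0.02, -0.01, 0.0, 0.01, -0.03]
table = sign_table(returns)
for label, row in table.items():
    print(label, {name: round(value, 4) for name, value in row.items()})
```

Because zero returns belong to neither subset, the two frequencies only add up to 100% when the data contains no 0% observations.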
Standard Deviation Analysis Table

The final analysis table to create is the Standard Deviation Analysis Table. This is where we count the
number of returns observations that lie within certain standard deviation ranges. For more information
about standard deviations and normal distributions please refer to the Basic Statistics Guide and Video 4
of the IPLT.
The idea of this table is to firstly find the upper and lower bounds that represent 1, 2 and 3 standard
deviations from the mean of the returns data set. Following that, the COUNTIFS function can be used to
calculate the number of observations that lie within those bounds. Finally, the number of observations
can then be compared to that which we would expect to see in a normally distributed data set.
1. To calculate Upper and Lower Bounds for 1, 2 and 3 standard deviations from the mean, the
following formula can be used:
• Upper Bound = Mean + X × (Standard Deviation)
• Lower Bound = Mean - X × (Standard Deviation)
o Where X = 1, 2 or 3
• Actual Count = COUNTIFS(Data Set, ">"& Lower Bound, Data Set, "<"& Upper Bound)
• Actual Count Percentage = Actual Count / Total Count
• Normal % Count = 68.27% (1 std dev), 95.45% (2 std dev), 99.73% (3 std dev)
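The same table can be sketched in Python. This is an illustration under assumptions: the function name and sample data are hypothetical, the standard deviation is the sample version (as in the descriptive statistics table), and the bounds are treated as strict inequalities to mirror the COUNTIFS formula above.

```python
import statistics

# Share of observations expected within X standard deviations in a normal distribution.
NORMAL_SHARE = {1: 0.6827, 2: 0.9545, 3: 0.9973}


def std_dev_table(returns):
    """Actual versus normal share of returns within 1, 2 and 3 standard deviations."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    total = len(returns)
    rows = []
    for x in (1, 2, 3):
        lower = mean - x * stdev
        upper = mean + x * stdev
        actual = sum(1 for r in returns if lower < r < upper)  # strict, as in COUNTIFS
        rows.append({
            "Std Devs": x,
            "Lower Bound": lower,
            "Upper Bound": upper,
            "Actual Count": actual,
            "Actual %": actual / total,
            "Normal %": NORMAL_SHARE[x],
        })
    return rows


# Hypothetical returns.
returns = [-0.02, -0.01, 0.0, 0.01, 0.02]
rows = std_dev_table(returns)
for row in rows:
    print(row["Std Devs"], row["Actual Count"], f'{row["Actual %"]:.2%}', f'{row["Normal %"]:.2%}')
```

With real data, an Actual % noticeably below the Normal % at 2 or 3 standard deviations suggests that more observations sit in the tails than a normal distribution would predict.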
