Chap 1 PZ

1.
Summarizing Data
1.1 Graphical summaries of the data
Dot plot and histogram
The time series plot
1.2 Numerical descriptive measures
1.3 Measures of central tendency
The sample mean
The median
Mean versus median
1.4 Measures of dispersion
The sample variance
The sample standard deviation
1.5 The empirical rule
1.6 How to relate two things
1.7 Linearly related variables
Linear functions
Mean and variance of a linear function
Linear combinations
Mean and variance of a linear combination
1.1 Graphical Summaries of the Data
Two key ideas
• Exploratory (descriptive) issues: Look at the data (sample). Understand its structure without generalizing
• Inference issues: Use data (sample) to generalize results to a larger population of interest
© Imperial College Business School

Example:
Problem: How many of 100,000 voters (population) prefer A over B?
We can’t ask them all!
Solution: Ask a sample of 500 voters.
Summarize, describe the data: 300 voters for A (A = 1), 200 for B (B = 0).
We will learn how to generalize to the population.
For now, we just learn how to analyze (describe) the data.
[Figure: bar chart of C1 with Frequency on the vertical axis: 200 at 0 and 300 at 1]
• Data is the statistician’s raw material, the numbers that we use to interpret reality
• All statistical problems involve either the collection, description and analysis of data, or thinking about the collection, description and analysis of data
• There are many aspects of data, e.g. data may be univariate (one variable per case) or multivariate (more than one variable per case). Let us look at some data…

The Canadian Return Data
Here is a specific data set (or sample). We have 107 monthly
returns on a broad based portfolio of Canadian assets (more on
portfolios later).
Canada
0.07 0.05 0.02 -0.04 0.08 -0.02 -0.05 0.02 0.03
0.00 0.03 0.08 -0.03 0.01 0.03 0.01 0.02 0.08
0.02 -0.02 0.00 0.01 0.02 -0.09 0.00 0.01 -0.07
0.07 0.00 0.02 -0.05 -0.04 -0.03 0.03 0.04 0.00
0.07 0.00 0.01 0.04 -0.02 0.02 0.01 -0.03 0.05
-0.02 0.00 0.01 -0.01 -0.05 -0.01 0.01 0.00 0.02
-0.02 -0.07 0.03 -0.04 0.03 -0.02 0.06 0.03 0.04
0.01 -0.01 -0.01 0.01 -0.05 0.09 -0.02 0.05 0.06
-0.05 -0.04 -0.01 0.01 -0.06 0.05 0.06 0.02 -0.01
-0.06 0.02 -0.05 0.06 0.04 0.02 0.04 0.02 0.02
0.00 0.00 -0.01 0.04 0.01 0.05 -0.01 0.02 0.04
0.02 -0.03 -0.03 0.05 0.04 0.08 0.07 -0.03
Interpret: Each number corresponds to a month. They are given in time order (read across each row first, then move down to the next row).
Our first observation is .07. In the first month, the return was .07; in the 11th, .03.
A little finance: what are returns?
• The return on an asset is the percentage increase in wealth invested in the asset over a given time period
• If you invest B at the beginning of the time period you get E = (1+r)B at the end of the time period, where r is the return
• (1+r) is the factor by which your wealth increases

Example:
Given E and B we can calculate r (the return): r = (E-B)/B
E = 110, B = 100, r = .1 or 10%
E = (1+.1)B = (1.1)B
For an investment in a stock, E comprises the amount you would get from selling the stock and any dividends paid.
B is the price you pay at the beginning of the time period to acquire the stock.
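The return calculation above can be sketched in a few lines of MATLAB. The variable names (B, E, r, growth) are illustrative choices, and the numbers are the ones from the example.

```matlab
% Return on an asset from beginning and ending wealth.
B = 100;             % wealth invested at the beginning of the period
E = 110;             % wealth at the end (sale proceeds plus any dividends)
r = (E - B) / B;     % return over the period
growth = 1 + r;      % factor by which wealth increases
fprintf('r = %.2f, 1 + r = %.2f\n', r, growth);
```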

Histograms
We are interested in ways to summarize or “see” the data.
The previous table of raw numbers is hard to read.
To display the returns we can use a simple graphical tool: the histogram (made with the histc command in MATLAB).
Above each value on the number line we draw a bar whose height is the number of observations with that value.
[Figure: dot plot for canada, horizontal axis from -0.05 to 0.05, annotated with the center (location) of the data and the variation (spread) about the center]
Interpret:
The returns are centered or located at about .01.
The spread or variation in the returns is large relative to that center.

Notice that the data has a nice mound or bell shape.
There is a central peak and right and left “tails” that
die off roughly symmetrically.
Some data does not have the mound shape.
[Figure: dot plot for Volume, horizontal axis from 0 to 6000]
It is skewed to the left.
We also have data on countries other than Canada.
Let us compare Canada with Japan.
It really helps to get things on the same scale.
How is Japan different from Canada?
Mutual fund data
• Let us use histograms to compare returns on some other kinds of assets
• We will look at returns on different mutual funds such as the equally weighted market and treasury bills (T-bills)
• The equally weighted market represents returns on a portfolio where you spread your money out equally over a wide variety of stocks
Data on 4 different kinds of returns:
• Dreyfus growth fund
• Putnam income fund
• Equally weighted market
• T-bills
The beer data:
nbeerm: the number of beers male MBA students claim they can drink without getting drunk
nbeerf: same for females
[Figure: histograms of nbeerm and nbeerf, with one isolated point marked]
We call a point like this an outlier.
Generally the males claim they can drink more: their numbers are centered or located at larger values.
The number of bars you use affects how “smooth” the picture looks.

The time series plot:
We just looked at two kinds of data:

1. the return data
2. the number of beers
• For the return data, each number corresponds to a month
• For the beer data, each number corresponds to a person
• The return data has an important feature that the beer data does not have
• It has an order!
• There is a first one, a second one, and ....

• A sequence of observations taken over time is often called a time series
• We could have daily data (temperature), annual data (inflation), quarterly data (inflation, GDP) and so on
• For time series data, the time series plot is an important way to look at the data

Time series plot of the Canadian returns:
[Figure: time series plot with returns on the vertical axis and “time” on the horizontal axis]
Do you see a pattern?

Monthly US beer production
[Figure: time series plot of monthly US beer production]
Now do you see a pattern?

1.2 Numerical Descriptive Measures
• We have looked at graphs. Suppose we are now interested in having numerical summaries of the data rather than graphical representations
• We have seen that two important features of any data set are how spread out the data is, and the central or typical value of the data set
• In this part of the notes we will describe methods to summarize a data set numerically
• First, we will introduce measures of central tendency to determine the “center” of a distribution of data values, or possibly the “most typical” data value
• Measures of central tendency include: the mean and the median
• Second, we will discuss measures of dispersion, such as the sample standard deviation and the sample variance
1.3 Measures of Central Tendency
The sample mean
Suppose we collect n pieces of data. We need some way of describing the data. We write:
$$x_1, x_2, x_3, \ldots, x_n$$
Here $x_1$ is the first number and $x_n$ is the last number. n is the number of numbers, or the “number of observations.” You may also hear it referred to as the “sample size.”
They are the values that we observe.

Here, x is just a name for the set of numbers, we could just as easily use y (or Buddy).
Example with n = 5: $x_1 = 5$, $x_2 = 2$, $x_3 = 8$, $x_4 = 6$, $x_5 = 2$.
Sometimes the order of the observations means something. In our return data the first observation corresponds to the first time period.
Sometimes it does not. In our beer data we just have a list of numbers, each of which corresponds to a student.
The sample mean is just the average of the numbers “x”:
$$\bar{x} = \frac{\text{sum}}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
We often use the symbol $\bar{x}$ to denote the mean of the numbers x.
We call it “x bar”.
Here is a more compact way to write the same thing…
Consider $x_1 + x_2 + \cdots + x_n$.
We use a shorthand for it (it is just notation):
$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$
This is summation notation.

Using summation notation we have:
The sample mean
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Graphical interpretation of the sample mean
Let us go back to our standard histogram.
In some sense, the men claim to drink more.
To summarize this we can compute the average value for both men and women (I deleted the outlier, I do not believe him!).

A bit of fuss because there are NaN (Not-a-Number) entries; I explain on the next page.
“On average women claim they can drink 4.2 beers. Men claim they can drink 7.9 beers”
In the picture, I think of the mean as the “center” of the data.
The first four lines below deal with the NaN entries: they count the non-NaN observations, and the means are then taken over the first Tm (or Tf) entries, which assumes the NaN entries sit at the end of each column.
>> bm = isnan(nbeerm);
>> Tm = size(nbeerm,1) - sum(bm);
>> bf = isnan(nbeerf);
>> Tf = size(nbeerf,1) - sum(bf);
>> mean(nbeerm(1:Tm))
ans = 7.862500000000000
>> mean(nbeerf(1:Tf))
ans = 4.222222222222222
• Let us compare the means of the Canadian
and Japanese returns
>> mean(canada)
ans = 0.009065420560748
>> mean(japan)
ans = 0.002336448598131
• This is a big difference

• It was hard to see this difference in the dot plots
because the difference is small compared to the
variation
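Returning briefly to the NaN entries in the beer data: taking the first Tm entries works when the missing values sit at the end of the column. A minimal sketch of a mask-based alternative that does not rely on their position follows. The vector beers and its values are made up for illustration; they are not the course data.

```matlab
% Hypothetical answers; NaN marks a missing response (values are made up).
beers = [6; NaN; 8; 10; NaN; 4];
keep = ~isnan(beers);          % logical mask of the observed entries
T = sum(keep);                 % number of observed entries
m = mean(beers(keep));         % mean over the observed entries only
fprintf('%d observed values, mean = %g\n', T, m);
```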
More on summation notation
(take this as an aside)
Let us look at summation in more detail.
$$\sum_{i=1}^{n} x_i$$
means that for each value of i, from 1 to n, we add to the sum the value indicated, in this case $x_i$ (add in this value for each i).

To understand how it works let us consider some examples.
Think of each row as an observation on both x and y.
To make things concrete, think of each row as corresponding to a year and let x and y be annual returns on two different assets.

year   x      y
1      0.07   0.11
2      0.06   0.05
3      0.04   0.09
4      0.03   0.03

In year 1 asset “x” had return 7%.
In year 4 asset “y” had return 3%.

• Summing x over all four years and dividing by 4 computes x bar.
• Summing y over all four years and dividing by 4 computes y bar.
• A sum does not have to run over all observations: summing from i = 2 to i = 3 adds only the second and third.

For each value of i, we can add in anything we want.
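The summation examples can be checked numerically. This is a small sketch using the x and y returns from the table; the names xbar, ybar, partial and sumsq are illustrative.

```matlab
x = [0.07 0.06 0.04 0.03];     % annual returns on asset x, years 1 to 4
y = [0.11 0.05 0.09 0.03];     % annual returns on asset y, years 1 to 4
n = numel(x);

xbar = sum(x) / n;             % x bar: sum over all i, divided by n
ybar = sum(y) / n;             % y bar
partial = sum(x(2:3));         % sum over i = 2, 3 only
sumsq = sum(x.^2);             % "anything we want": here the squares of x

fprintf('xbar = %.2f, ybar = %.2f, x2 + x3 = %.2f\n', xbar, ybar, partial);
```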

The median
• After ordering the data, the median is the middle value of the data
• If there is an even number of data points, the median is the average of the two middle values
Example
1,2,3,4,5 Median = 3
1,1,2,3,4,5 Median = (2+3)/2 = 2.5

Mean versus median
• Although both the mean and the median are good measures of the center of a distribution of measurements, the median is less sensitive to extreme values
• The median is not affected by how extreme the largest and smallest values are, since only the ordering of the measurements and the middle value(s) enter its computation
Example:
1,2,3,4,5 Mean: 3 Median: 3
1,2,3,4,100 Mean: 22 Median: 3
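The same comparison can be run directly with MATLAB's built-in mean and median functions. A minimal sketch, where a and b are just names for the two small data sets above:

```matlab
a = [1 2 3 4 5];
b = [1 2 3 4 100];             % same data with one extreme value

mean_a = mean(a);   med_a = median(a);
mean_b = mean(b);   med_b = median(b);

% The extreme value moves the mean a lot and leaves the median alone.
fprintf('a: mean %g, median %g\n', mean_a, med_a);
fprintf('b: mean %g, median %g\n', mean_b, med_b);
```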

We call extreme values in a data set “outliers”. We
used to call them funny points but outliers sounds
more scientific. Outliers are sometimes the most
interesting aspect of a data set, and sometimes they
are just coding errors.
The sex survey: how many partners?
“The median number of sex partners over a lifetime is 6

for males and 2 for females. One quarter of men
reported only one lifetime partner, but the range varied
markedly. One man reported 1,016 and one woman
reported 1,009” (Likely outliers, am I wrong?)
1.4 Measures of Dispersion
The mean and the median give us information about the central tendency of a set of observations, but they shed no light on the dispersion, or spread of the data.
Example: Which data set is more variable?
5,5,5,5,5 Mean: 5
1,3,5,8,8 Mean: 5
Do you only care about the average return on a mutual fund, or do you need a measure of risk, too? Here is one …

The Sample Variance
[Figure: dot plots of the x data and the y data on a common axis from 0.030 to 0.105]
The y numbers are more spread out than the x numbers.
We want a numerical measure of variation or spread.
The basic idea is to view variability in terms of distance between each measurement and the mean:
$$x_i - \bar{x}$$
[Figure: the same dot plots with the distances from the mean marked]
Overall, the distances for x are smaller than the distances for y.

• We cannot just look at the distance between each measurement and the mean. We need an overall measure of how big the differences are (i.e., just one number like in the case of the mean)
• Also, we cannot just sum the individual distances because the negative distances cancel out with the positive ones giving zero always (Why?)
• We average the squared distances and define
$$\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
So, the sample variance of the x data is defined to be:
Sample variance:
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
• We use n - 1 instead of n for technical reasons that will be discussed later (the intuition does not change, though)
• Think of it as the average squared distance of the observations from the mean
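A short sketch of the definition, using the data set 1, 3, 5, 8, 8 from the earlier example. It shows that the raw deviations sum to zero and that the hand-computed sample variance agrees with MATLAB's var, which also divides by n - 1 by default. The names d and s2 are illustrative.

```matlab
x = [1 3 5 8 8];
n = numel(x);

d = x - mean(x);               % deviations from the mean: -4 -2 0 3 3
total_dev = sum(d);            % cancels out to zero
s2 = sum(d.^2) / (n - 1);      % sample variance: 38 / 4

fprintf('sum of deviations = %g, s2 = %g, var(x) = %g\n', total_dev, s2, var(x));
```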

Questions
1. What is the smallest value a variance can be?
2. What are the units of the variance?
It is helpful to have a measure of spread which is in the original units. The sample variance is not in the original units. We now introduce a measure of dispersion that solves this problem: the sample standard deviation

The sample standard deviation
It is defined as the square root of the sample variance (easy)
The sample standard deviation:
$$s_x = \sqrt{s_x^2}$$
The units of the standard deviation are the same as those of the original data

Example 1 (numerical)
Assume as before:
YY = .04, -.02, .02, -.04
XX = .02, .01, -.01, -.02
The sample standard deviation for the y data is bigger than that for the x data. This numerically captures the fact that y has “more variation” about its mean than x.
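As a hedged numerical check, the sketch below computes both standard deviations from the four-observation x and y returns used in the summation example (0.07, 0.06, 0.04, 0.03 and 0.11, 0.05, 0.09, 0.03). Subtracting the mean from every observation does not change the spread, so the raw returns and their deviations give the same answer.

```matlab
x = [0.07 0.06 0.04 0.03];
y = [0.11 0.05 0.09 0.03];

% Square root of the sample variance (n - 1 in the denominator).
sx = sqrt(sum((x - mean(x)).^2) / (numel(x) - 1));   % about 0.0183
sy = sqrt(sum((y - mean(y)).^2) / (numel(y) - 1));   % about 0.0365

fprintf('sx = %.4f, sy = %.4f\n', sx, sy);
```

The built-in std(x) and std(y) return the same values.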
Example 2 (graphical)
The standard deviations measure the fact that there is more spread in the Japanese returns.
Variable   N     Mean      StDev
canada     107   0.00907   0.03833
japan      107   0.00234   0.07368

1.5 The Empirical Rule
We now have two numerical summaries for the data:
$\bar{x}$: where the data is
$s_x$: how spread out, how variable the data is
• The mean is pretty easy to interpret (some sort of “center” of the data)
• We know that the bigger $s_x$ is, the more variable the data is, but how do we really interpret this number?
• What is a big $s_x$? What is a small one?

Empirical Rule
For “mound shaped data”:
Approximately 68% of the data is in the interval
$$(\bar{x} - s_x,\ \bar{x} + s_x) = \bar{x} \pm s_x$$
Approximately 95% of the data is in the interval
$$(\bar{x} - 2s_x,\ \bar{x} + 2s_x) = \bar{x} \pm 2s_x$$
The empirical rule will help us understand $s_x$ and relate the summaries back to the histogram
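One way to check the rule on real numbers is to count how many observations fall inside each interval. The sketch below types in the 107 Canadian returns from the table earlier in these notes; the names within1, within2 and outside2 are illustrative.

```matlab
canada = [ ...
  0.07  0.05  0.02 -0.04  0.08 -0.02 -0.05  0.02  0.03 ...
  0.00  0.03  0.08 -0.03  0.01  0.03  0.01  0.02  0.08 ...
  0.02 -0.02  0.00  0.01  0.02 -0.09  0.00  0.01 -0.07 ...
  0.07  0.00  0.02 -0.05 -0.04 -0.03  0.03  0.04  0.00 ...
  0.07  0.00  0.01  0.04 -0.02  0.02  0.01 -0.03  0.05 ...
 -0.02  0.00  0.01 -0.01 -0.05 -0.01  0.01  0.00  0.02 ...
 -0.02 -0.07  0.03 -0.04  0.03 -0.02  0.06  0.03  0.04 ...
  0.01 -0.01 -0.01  0.01 -0.05  0.09 -0.02  0.05  0.06 ...
 -0.05 -0.04 -0.01  0.01 -0.06  0.05  0.06  0.02 -0.01 ...
 -0.06  0.02 -0.05  0.06  0.04  0.02  0.04  0.02  0.02 ...
  0.00  0.00 -0.01  0.04  0.01  0.05 -0.01  0.02  0.04 ...
  0.02 -0.03 -0.03  0.05  0.04  0.08  0.07 -0.03];

xbar = mean(canada);
sx = std(canada);
dist = abs(canada - xbar);            % distance of each return from the mean

within1 = mean(dist <= sx);           % fraction inside xbar +/- sx
within2 = mean(dist <= 2 * sx);       % fraction inside xbar +/- 2*sx
outside2 = sum(dist > 2 * sx);        % number of points outside xbar +/- 2*sx

fprintf('within 1 sd: %.1f%%, within 2 sd: %.1f%%, outside 2 sd: %d points\n', ...
        100 * within1, 100 * within2, outside2);
```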

Let us see this with the Canadian returns
$\bar{x}$ = .00907
$s_x$ = .03833
[Figure: histogram (density) of canada with dashed lines at $\bar{x} \pm 2s_x$ and dotted lines at $\bar{x} \pm s_x$]
The empirical rule says that roughly 95% of the observations are between the dashed lines and roughly 68% between the dotted lines. Looks reasonable.
Same thing viewed from the perspective of the time series plot.
[Figure: time series plot of canada with horizontal lines at $\bar{x}$, $\bar{x} - 2s_x$ and $\bar{x} + 2s_x$]
5% outside would be about 5 points. There are 4 points outside, which is pretty close.
A little finance: comparing mutual funds
Let us use the means and standard deviations to compare mutual funds.
For 9 different assets we compute the means and standard deviations.
Then, we plot the means versus the standard deviations.
The assets are:
Variable   N     Mean      StDev
drefus     180   0.00677   0.04724
fidel      180   0.00470   0.05659
keystne    180   0.00654   0.08424
Putnminc   180   0.00552   0.03008
scudinc    180   0.00443   0.03597
windsor    180   0.01002   0.04864
eqmrkt     180   0.01082   0.06856
valmrkt    180   0.00681   0.04800
tbill      180   0.00598   0.00252
It is considered good to have a large mean return and a small standard deviation.
[Figure: scatter plot of Mean (vertical axis) against StDev (horizontal axis), one labeled point per asset]
Let us compare some countries
[Figure: scatter plot of Mean (vertical axis) against StDev (horizontal axis) for honkong, usa, singapor, france, belgium, germany, australi, finalnd, canada, italy and japan, based on monthly returns from ‘88 to ‘96]

1.6 How to Relate Two Things
• The mean and standard deviation help us summarize a bunch of numbers which are measurements of just one thing (one variable)
• A fundamental and totally different question is how one thing relates to another
• In this section of the notes we look at scatter plots and how covariance and correlation can be used to summarize them
• When examining two things (variables) at a time, the scatter plot will be our main graphical tool whereas covariance and correlation will be our main numerical summaries

Example: Is the number of beers you can drink related to your weight?
nbeer   weight   i
12.0    192      1
12.0    160      2
5.0     155      3
5.0     120      4
7.0     150      5
13.0    175      6
4.0     100      7
12.0    165      8
12.0    165      9
12.0    150      10
.       .        .
.       .        .
.       .        .
[Figure: scatter plot of nbeer (vertical axis) against weight (horizontal axis)]
Now we think of each pair of numbers as an observation.
Each pair corresponds to a person.
Each person has two numbers associated with him/her, # beers and weight.
Each pair corresponds to a point on the plot.
Example:
Are returns on a mutual fund related to market returns?
[Figure: scatter plot of windsor (vertical axis) against valmrkt (horizontal axis)]
Each point corresponds to a month.

In general we have observations and each point on the plot corresponds to an observation.
Our data looks like:

x      y     i
12.0   192   1
12.0   160   2
5.0    155   3
5.0    120   4
7.0    150   5

The ith observation $(x_i, y_i)$ is a pair of numbers.
The plot enables us to see the relationship between x and y.
• In both examples it does look like there is a relationship
• Even more, the relationship looks linear in that it looks
like we could draw a line through the plot to capture the
pattern
• Covariance and correlation summarize how strong a
linear relationship there is between two variables
• In our first example weight and # beers were two
variables. In our second example our two variables
were two kinds of returns
• In general, we think of the two variables as x and y

The sample covariance between x and y:
$$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
The sample correlation between x and y:
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
So, the correlation is just the covariance divided by the two standard deviations.
We will get some intuition about these formulae, but first let us
see them in action. How do they summarize data for us? Let us
start with the correlation.
Correlation, the fact of life:
$$-1 \le r_{xy} \le 1$$
The closer r is to 1 the stronger the linear relationship is with a positive slope.
When one goes up, the other tends to go up.
The closer r is to -1 the stronger the linear relationship is with a negative slope.
When one goes up, the other tends to go down.

The correlations corresponding to the two scatter plots
we looked at are:
Correlation of valmrkt and windsor = 0.923
Correlation of nbeer and weight = 0.692
The larger correlation between valmrkt and windsor

indicates that the linear relationship is stronger.
Let us look at some more examples.

[Figures: five scatter plots, each of y against x]
Correlation of y1 and x1 = 0.019
Correlation of y2 and x2 = 0.995
Correlation of y3 and x3 = 0.586
Correlation of y4 and x4 = -0.982
Correlation of y5 and x5 = 0.210
The correlation only measures linear relationships (here the value is small but there is a strong nonlinear relationship between y5 and x5.)
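A small constructed illustration of this point (the data here are made up, not the y5 and x5 from the plot): y is an exact function of x, yet the correlation is zero because the relationship is not linear.

```matlab
x = -3:3;                      % symmetric around zero
y = x.^2;                      % exact, but nonlinear, relationship

R = corrcoef(x, y);            % 2-by-2 correlation matrix
r = R(1, 2);                   % correlation between x and y

% Positive and negative x contribute equal and opposite products,
% so the covariance (and hence the correlation) cancels to zero.
fprintf('correlation of x and x^2 = %.3f\n', r);
```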
Example: The country data
Which countries go up and down together?

I have data on 23 countries. That would be a lot of plots!
>> scatter(usa, canada)
[Figure: scatter plot of canada (vertical axis) against usa (horizontal axis)]

To summarize, we can compute all pair-wise correlations:
>> list = [australia belgium ... singapore]
>> corrcoef(list)

Why is the upper part of the table blank?
           australi  belgium  canada  finalnd  france  germany  honkong  italy  japan  usa
belgium    0.189
canada     0.507     0.357
finalnd    0.387     0.183    0.386
france     0.275     0.734    0.342   0.176
germany    0.226     0.691    0.302   0.304    0.709
honkong    0.334     0.301    0.558   0.355    0.359   0.339
italy      0.159     0.367    0.334   0.389    0.352   0.465    0.261
japan      0.251     0.418    0.271   0.307    0.421   0.318    0.219    0.426
usa        0.360     0.429    0.651   0.264    0.501   0.372    0.429    0.240  0.246
singapor   0.409     0.355    0.478   0.391    0.408   0.467    0.647    0.416  0.407  0.473

Understanding the covariance and correlation formulae
• How do these weird looking formulae for covariance and correlation capture the relationship?
• To get a feeling for this, let us go back to the simple example and compute covariance and correlation

x      y
0.07   0.11
0.06   0.05
0.04   0.09
0.03   0.03

First, let us compute the covariance (which is a necessary ingredient to compute the correlation):
$$\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$$
$$= \frac{1}{3}\big((.07-.05)(.11-.07) + (.06-.05)(.05-.07) + (.04-.05)(.09-.07) + (.03-.05)(.03-.07)\big)$$
$$= \frac{1}{3}\big(.02 \times .04 + .01 \times (-.02) + (-.01) \times .02 + (-.02) \times (-.04)\big)$$
$$= \frac{1}{3}(.0008 - .0002 - .0002 + .0008) = \frac{1}{3}(.0012) = .0004$$
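Each of the 4 points makes a contribution to the sum.
Let us see which point does what.
$(x_1 - \bar{x})(y_1 - \bar{y}) = .02 \times .04 = .0008$ (region I)
$(x_2 - \bar{x})(y_2 - \bar{y}) = .01 \times (-.02) = -.0002$ (region IV)
$(x_3 - \bar{x})(y_3 - \bar{y}) = (-.01) \times .02 = -.0002$ (region III)
$(x_4 - \bar{x})(y_4 - \bar{y}) = (-.02) \times (-.04) = .0008$ (region II)
[Figure: scatter plot of y against x with lines at $\bar{x}$ and $\bar{y}$ dividing the plane into regions (I) upper right, (II) lower left, (III) upper left and (IV) lower right]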
Points in (I) have both x and y bigger than their means so we get a
positive contribution to the covariance.
Points in (II) have both x and y less than their means so we get a
positive contribution to the covariance.
In (III) and (IV) one of x and y is less than its mean and the other is
greater so we get a negative contribution. The further out the point is,
the bigger the contribution.
[Figure: scatter plot annotated “Lots of positive contributions” and “just a few relatively small contributions”]

So,
• A positive covariance means that when a variable is above its average the other one tends to be above as well. They move up and down together
• A negative covariance means that when one is up the other tends to be down. They move in opposite directions
• A covariance that is small relative to the two standard deviations means that their movements are almost (linearly) unrelated
Let us now compute the correlation.
We just finish the example:
$$r_{xy} = \frac{.0004}{(.0365)(.0183)} = .6$$
The division by the standard deviations standardizes the covariance so that the correlation is always between +/- 1
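The worked example can be reproduced in MATLAB. This sketch computes the covariance and correlation from the formulas and compares them with the built-in cov and corrcoef, which both return 2-by-2 matrices when given two vectors. The names sxy, rxy, C and R are illustrative.

```matlab
x = [0.07 0.06 0.04 0.03];
y = [0.11 0.05 0.09 0.03];
n = numel(x);

% Covariance from the definition: average product of deviations (n - 1).
sxy = sum((x - mean(x)) .* (y - mean(y))) / (n - 1);   % about 0.0004

% Correlation: covariance divided by the two standard deviations.
rxy = sxy / (std(x) * std(y));                         % about 0.6

C = cov(x, y);          % C(1,2) is the covariance, the diagonal holds variances
R = corrcoef(x, y);     % R(1,2) is the correlation

fprintf('sxy = %.4f, rxy = %.2f\n', sxy, rxy);
```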

The sign of the correlation contains the same information as the sign of the covariance (in fact, they have the same sign, because the standard deviations are always positive)
Positive sign: positive relationship
Negative sign: negative relationship
The correlation is more informative, though, because it is unit-less and, by construction, always between –1 and 1.
Hence, it is a better measure of the strength of the relationship.
Close to 1: strong positive relationship
Close to -1: strong negative relationship
1.7 Linearly Related Variables
• We have studied data sets that display some kind of relation with each other (the mutual fund returns and the market returns, for instance)
• Sometimes there is an exact linear relation between variables: $y = c_0 + c_1 x$
• Can we say something about the sample mean of y if all we know is the sample mean of x (and vice versa)?
• Can we say something about the sample standard deviation of y if all we know is the sample standard deviation of x (and vice versa)?
• We will answer these questions in what follows
Example:
Suppose we have these temps in Celsius and Fahrenheit.
cel   fahr
10    50
15    59
20    68
25    77
40    104
30    86
50    122
70    158
How are the F values related to the C values?
F = 32 + (9/5)C

Note: if we plot F versus C, what do we see?
Correlation of cel and fahr = 1.000

In general, we like to use the symbols y and x for the two variables
The variable y is a linear function of the variable x if:
$$y = c_0 + c_1 x$$
$c_0$: the intercept
$c_1$: the slope
We think of the c’s as constants (fixed numbers) while x and y vary.

Example:
• Suppose you are a movie star and you have a deal which gives you a $10 million fee per movie + 10% of the gross
• How is your income related to the gross?

Mean and variance of a linear function
Suppose y (the data y) is a linear function of x.
How are the mean and variance (standard deviation) of y related to those of x?
Let us look at our temperature example. Suppose we first multiply by (9/5) and then add 32.
>> cel = [ -10 0 10 15 20 25 30 35 ]';
>> mul = (9/5)*cel;
>> fahr = 32+mul;
>> mean([ cel mul fahr])
ans =
15.625000000000000 28.125000000000000 60.125000000000000
>> std([ cel mul fahr])
ans =
15.221577729375776 27.398839912876394 27.398839912876394
[Figure: dot plots of cel, mul and fahr on a common axis from 0 to 150]
Interpret
• When we multiply cel by 9/5 we scale both the mean and the standard deviation by the same factor, 9/5 (here both increase)
• If we add a constant (32 in our case) we simply increase the mean (by the value of the constant) but leave the overall dispersion unaffected

Suppose $y = c_0 + c_1 x$
Then,
$$\bar{y} = c_0 + c_1 \bar{x}$$
$$s_y = |c_1| s_x$$
$$s_y^2 = c_1^2 s_x^2$$

Example:
• Suppose our movie star makes 10 pictures and the mean and standard deviation of the gross on the films are 100 and 30 million.
• What are the mean and standard deviation of the star’s income?

Example:
• Suppose x has mean 100 and standard deviation 10
• What are the mean, standard deviation and variance of:
Y = -2x?
Y = 5+x?
Y = 5-2x?
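One way to work these examples is to code the two rules for a linear function once and apply them. This is only a sketch: the helper names lin_mean and lin_std are illustrative, and the movie star's income is read from the deal described earlier as income = 10 + 0.1*gross (in millions).

```matlab
% Rules for y = c0 + c1*x, written as small anonymous functions.
lin_mean = @(c0, c1, xbar) c0 + c1 * xbar;   % ybar = c0 + c1*xbar
lin_std  = @(c1, sx) abs(c1) * sx;           % sy = |c1| * sx

% Movie star: gross has mean 100 and standard deviation 30 (millions).
income_mean = lin_mean(10, 0.1, 100);        % 20 (million)
income_std  = lin_std(0.1, 30);              % 3 (million)
fprintf('income: mean %g, sd %g\n', income_mean, income_std);

% x has mean 100 and standard deviation 10; each row is [c0 c1].
coef = [0 -2; 5 1; 5 -2];
for k = 1:size(coef, 1)
    c0 = coef(k, 1);
    c1 = coef(k, 2);
    m = lin_mean(c0, c1, 100);
    s = lin_std(c1, 10);
    fprintf('Y = %g + (%g)x: mean %g, sd %g, variance %g\n', c0, c1, m, s, s^2);
end
```

Note that the sign of the slope changes the mean but not the standard deviation, which is why the absolute value appears in the rule.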

Linear combinations
We may want a variable to be related to several others instead of just one. We will assume that Y is a function of X, Z, … rather than just a function of X.
Example:
Suppose our movie star also gets 5 percent of all sales of the CD released with the movie.
How is the star’s income related to the film’s gross and CD sales (in millions of dollars)?

When a variable y is linearly related to several others, we call it a linear combination.
$$y = c_0 + c_1 x_1 + c_2 x_2 + \cdots + c_k x_k$$
y is a linear combination of the x’s.
$c_i$ is the coefficient of $x_i$.

Important example: Portfolios
• Suppose you have $100 to invest
• Let x1 be the return on asset 1. If x1 = .1, and you put all your money into asset 1, then you will have $110 at the end of the period
• Let x2 be the return on asset 2. If x2 = .15, and you put all your money into asset 2, then you will have $115 at the end of the period
• Suppose you put 1/2 of your money into 1 and 1/2 into 2
• What will happen?
At the end of the period you will have:
(100)*.5*(1+.1) + (100)*.5*(1+.15)
= 100*(1+.5*.1+.5*.15) = 100*(1 + Rp)
So the return is Rp = .5*.1 + .5*.15 = .125.
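The same arithmetic as a short MATLAB sketch. The vectors w (fractions of wealth) and x (asset returns) and the name M for the initial wealth are illustrative choices.

```matlab
M = 100;                       % initial wealth in dollars
w = [0.5 0.5];                 % fraction of wealth in asset 1 and asset 2
x = [0.10 0.15];               % returns on asset 1 and asset 2

Rp = sum(w .* x);              % portfolio return: 0.125
wealth = M * (1 + Rp);         % end-of-period wealth: 112.5

% Same ending wealth computed asset by asset.
direct = sum(M * w .* (1 + x));

fprintf('Rp = %.3f, ending wealth = %.1f\n', Rp, wealth);
```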
To generalize, let $w_1$ be the fraction of your wealth you invest in asset 1.
Let $w_2$ be the fraction of your wealth you invest in asset 2.
Let M be your wealth.
The w’s are called the portfolio weights.
Then, at the end of the period, you have:
$$w_1 M(1 + x_1) + w_2 M(1 + x_2) = M(w_1 + w_2 + w_1 x_1 + w_2 x_2) = M(1 + w_1 x_1 + w_2 x_2)$$
where the last step uses $w_1 + w_2 = 1$. Hence the return is,
$$R_p = w_1 x_1 + w_2 x_2$$
This is beautiful (…some people get a kick out of weird stuff!)
The return on the portfolio is just a linear combination of the asset returns where the coefficients are the portfolio weights.
• Suppose we have m assets
• The return on the ith asset is $x_i$
• Put $w_i$ fraction of your wealth into asset i
• Your portfolio is determined by the portfolio weights $w_i$
• Then, the return on the portfolio is:
$$R_p = w_1 x_1 + w_2 x_2 + \cdots + w_m x_m = \sum_{i=1}^{m} w_i x_i$$

Notice that the portfolio weights always sum up to one.
(If I invest 30% of my wealth in asset 1, then I have to invest 70% of my wealth in asset 2)
Questions:
1. Can some portfolio weights be negative while the weights still sum up to one? (What does it mean to invest –30% of your wealth in asset 1 and 130% in asset 2?)
2. What is the equally weighted portfolio?
3. What is the value weighted portfolio?

Example (the country data again)
Let us use our country data and suppose that we had put .5 into USA and .5 into Hong Kong.
What would our returns have been?
In MATLAB:
>> port = .5*honkong + .5*usa
For each month, we get the portfolio return as ½*honkong + ½*usa.
honkong   usa     port
0.02      0.04    0.030
0.06      -0.03   0.015
0.02      0.01    0.015
-0.03     0.01    -0.010
0.08      0.05    0.065

How do the returns on this portfolio compare with those of Hong Kong and USA?
It looks like the mean for my portfolio is right in between the means of USA and Hong Kong.
What about the standard deviation?
[Figure: scatter plot of Mean (vertical axis) against StDev (horizontal axis) for honkong, port and usa]

Let us try a portfolio with three stocks.
Let us go short on Canada (i.e. we borrow Canada to invest in the other stocks)
>> port = -.5*canada + usa + .5*honkong
[Figure: scatter plot of Mean (vertical axis) against StDev (horizontal axis) for honkong, port, usa and canada]
Clearly, forming portfolios is an interesting thing to do!

• Basic question: why would we form portfolios?
• Maybe the portfolio has a nice mean and variance
(i.e., nice “average return” and nice “risk”)
• There are some basic formulae that relate the
mean and standard deviation of a linear
combination to the means, variances and
covariances of the input variables
• We can apply these formulae to understand how
the mean and variance of a portfolio depend on the
input assets. These formulae constitute the basic
part of the tool-kit of those who really understand
finance

Mean and variance of a linear combination
First, we consider the case where we have only two inputs
Suppose $y = c_0 + c_1 x_1 + c_2 x_2$
Then,
$$\bar{y} = c_0 + c_1 \bar{x}_1 + c_2 \bar{x}_2$$
$$s_y^2 = c_1^2 s_{x_1}^2 + c_2^2 s_{x_2}^2 + 2 c_1 c_2 s_{x_1 x_2}$$

Example:
Going back to our movie star, suppose the average sales of CD’s is 5 million and the standard deviation is 1 million.
Assume the correlation between gross and CD sales is .8
1. What is the mean and standard deviation of the star’s income?
2. How would the answer change if the correlation were 0?
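A hedged sketch of how the two-input formulas apply here. It assumes the income is 10 + 0.1*gross + 0.05*CD sales (in millions), as described in the earlier examples, and recovers the covariance from the correlation via $s_{x_1 x_2} = r \, s_{x_1} s_{x_2}$. All variable names are illustrative.

```matlab
c0 = 10;  c = [0.10 0.05];     % fee, share of gross, share of CD sales
m = [100 5];                   % mean gross and mean CD sales (millions)
s = [30 1];                    % standard deviations (millions)
rho = 0.8;                     % correlation between gross and CD sales

sxy = rho * s(1) * s(2);       % covariance implied by the correlation
inc_mean = c0 + sum(c .* m);   % mean income: 20.25
inc_var = c(1)^2 * s(1)^2 + c(2)^2 * s(2)^2 + 2 * c(1) * c(2) * sxy;
inc_sd = sqrt(inc_var);        % about 3.04

% With zero correlation the covariance term drops out; the mean is unchanged.
inc_sd0 = sqrt(c(1)^2 * s(1)^2 + c(2)^2 * s(2)^2);    % about 3.00

fprintf('mean %.2f, sd %.4f (rho = 0.8), sd %.4f (rho = 0)\n', inc_mean, inc_sd, inc_sd0);
```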
Example:
>> port = .5*honkong + .5*usa
For each month, we get the portfolio return as ½*honkong + ½*usa
honkong   usa     port
0.02      0.04    0.030
0.06      -0.03   0.015
0.02      0.01    0.015
-0.03     0.01    -0.010
0.08      0.05    0.065
........
The mean returns on USA and Hong Kong are .01346 and .02103.
Knowing what the portfolio returns are, we can easily compute the mean return for the portfolio (i.e., it is the sample mean of the portfolio returns): .01724.
We can now confirm the validity of our formula:
.01724 = .5*.01346 + .5*.02103
Let us do the same exercise for the variance:
Diagonal elements are variances, off diagonal elements are covariances (this is a variance-covariance matrix)
>> cov([ honkong usa port])
Covariances
          honkong      usa          port
honkong   0.00521497
usa       0.00103037   0.00110774
port      0.00312267   0.00106906   0.00209586
As before, we can check the formula:
.0021
= (.5)*(.5)*.00521 + (.5)*(.5)*.00111 + 2*(.5)*(.5)*.00103
= .25*.00521 + .25*.00111 + .5*.00103

Let us do it one more time:
>> port = .25*usa + .75*honkong
>> cov([ honkong usa port])
Covariances
          honkong      usa          port
honkong   0.00521497
usa       0.00103037   0.00110774
port      0.00416882   0.00104972   0.00338905
.0034 =
(.25)*(.25)*.00111 + (.75)*(.75)*.0052 + (2)*(.25)*(.75)*(.00103)
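Both checks can be scripted. The sketch below types in the variances and covariance of honkong and usa from the tables above and applies the two-input variance formula to each set of weights; the helper name pvar is illustrative.

```matlab
var_hk  = 0.00521497;          % variance of honkong
var_usa = 0.00110774;          % variance of usa
cov_hu  = 0.00103037;          % covariance of honkong and usa

% Two-input formula: c1^2*s1^2 + c2^2*s2^2 + 2*c1*c2*s12
pvar = @(c1, c2) c1^2 * var_hk + c2^2 * var_usa + 2 * c1 * c2 * cov_hu;

v_equal  = pvar(0.50, 0.50);   % .5 honkong + .5 usa, about 0.00209586
v_tilted = pvar(0.75, 0.25);   % .75 honkong + .25 usa, about 0.00338905

fprintf('variance (.5, .5): %.8f\nvariance (.75, .25): %.8f\n', v_equal, v_tilted);
```

The small differences in the last digit come from rounding in the printed covariance table.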

Example:
y = .5x1 + .5x2
[Figure: scatter plot of x2 against x1; at each point we plot the value of y. The dashed lines are drawn at the mean of x1 and x2]
The variances and covariance are:
      x1          x2
x1    1.334636
x2    -1.208679   1.106238
Then, the variance of y is
0.0058105 = .5*.5*1.3346 + .5*.5*1.106 + 2*.5*.5*(-1.208679)
Why is the variance of y so much smaller than those of the x’s?

Example:
y = .5x1 + .5x2
[Figure: scatter plot of x2 against x1; at each point we plot the value of y. The dashed lines are drawn at the mean of x1 and x2]
The variances and covariance are:
      x1         x2
x1    1.158167
x2    1.046490   0.9609463
Then, the variance of y is
1.053 = .5*.5*1.158 + .5*.5*.961 + 2*.5*.5*1.0465
Why is the variance of y not so much smaller than those of the x’s?
Example:
y = .5x1 + .5x2
[Figure: scatter plot of x2 against x1; at each point we plot the value of y. The dashed lines are drawn at the mean of x1 and x2]
The variances and covariance are:
      x1          x2
x1    1.3870537
x2    0.1976187   0.8247886
Then, the variance of y is
0.65175 = .5*.5*1.387 + .5*.5*.8248 + 2*.5*.5*.1976
Why is the variance of y so much smaller than those of the x’s?

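K inputs:
Suppose,
$$y = c_0 + c_1 x_1 + c_2 x_2 + c_3 x_3 + \cdots + c_k x_k$$
Then,
$$\bar{y} = c_0 + c_1 \bar{x}_1 + c_2 \bar{x}_2 + c_3 \bar{x}_3 + \cdots + c_k \bar{x}_k$$
$$s_y^2 = c_1^2 s_{x_1}^2 + c_2^2 s_{x_2}^2 + c_3^2 s_{x_3}^2 + \cdots + 2\,(c_1 c_2 s_{x_1 x_2} + c_1 c_3 s_{x_1 x_3} + c_3 c_2 s_{x_3 x_2} + \cdots)$$
The variance has one squared term per input and one covariance term per pair of inputs.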

Example:
With three inputs, $y = c_0 + c_1 x_1 + c_2 x_2 + c_3 x_3$:
$$\bar{y} = c_0 + c_1 \bar{x}_1 + c_2 \bar{x}_2 + c_3 \bar{x}_3$$
$$s_y^2 = c_1^2 s_{x_1}^2 + c_2^2 s_{x_2}^2 + c_3^2 s_{x_3}^2 + 2\,(c_1 c_2 s_{x_1 x_2} + c_1 c_3 s_{x_1 x_3} + c_3 c_2 s_{x_3 x_2})$$

Example:
>> port = .1*fidel + .4*eqmrkt + .5*windsor
>> cov([ port fidel eqmrkt windsor])
Covariances
          port         fidel        eqmrkt       windsor
port      0.00306760
fidel     0.00280224   0.00320210
eqmrkt    0.00369384   0.00319150   0.00470021
windsor   0.00261967   0.00241087   0.00298922   0.00236580
.0030676 = (.1)*(.1)*.003202 + (.4)*(.4)*.0047 + (.5)*(.5)*.0023658
+ 2*((.1)*(.4)*.00319 + (.1)*(.5)*.00241 + (.4)*(.5)*.00299)
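For more than two inputs the bookkeeping is easier in a loop: one squared term per input and one covariance term per pair. A sketch under the assumption that the covariance matrix S below is typed in from the fidel, eqmrkt and windsor entries of the table above (filled out symmetrically); the names S, c and v are illustrative.

```matlab
% Variance-covariance matrix of the three funds (rows and columns:
% fidel, eqmrkt, windsor), copied from the printed table.
S = [0.00320210 0.00319150 0.00241087;
     0.00319150 0.00470021 0.00298922;
     0.00241087 0.00298922 0.00236580];
c = [0.1 0.4 0.5];             % portfolio weights on fidel, eqmrkt, windsor

k = numel(c);
v = 0;
for i = 1:k
    v = v + c(i)^2 * S(i, i);                  % variance term for input i
    for j = i+1:k
        v = v + 2 * c(i) * c(j) * S(i, j);     % covariance term for pair (i, j)
    end
end

fprintf('portfolio variance = %.7f\n', v);     % compare with .0030676
```

The same number comes out of the matrix product c*S*c', which is a compact way of writing the double sum.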

Example:
Cut from a Finance Textbook:

