
BSMA 301 Statistics

Dr. Eyram Kwame

December 7, 2020

1 / 137
Outline
- Recommended Textbooks
- Introduction
- Descriptive Statistics
- Numerical Descriptive Techniques
- Graphical Descriptive Techniques
- Elements of Probability
- Random Variables
- Discrete Probability Distribution
- Binomial Distribution
- Poisson Distribution
- Continuous Random Variables
- Joint Probability Density Function
- Expectation and Variance
- Moment Generating Function
- Uniform Distribution
- Exponential Distribution
- Normal Distribution
- Parameter Estimation
- Hypothesis Testing
- Introduction to Regression
Recommended Textbooks

- Navidi, William. Statistics for Engineers and Scientists, 4th Edition, McGraw-Hill Education, 2015. [1]
- Ross, Sheldon M. Introduction to Probability and Statistics for Engineers and Scientists, 3rd Edition, Elsevier Academic Press, 2009. [2]
5 / 137
Introduction

- Statistics is the science of learning from data. It comprises data collection, description, and analysis for making inferences.
- The two main branches of statistics are descriptive and inferential statistics.
- Information is processed data.
7 / 137
Introduction

- A population is the entire collection of items of interest. It is often very large and may be infinite. A descriptive measure of a population is called a parameter.
- A sample is a set of data drawn from a population. A descriptive measure of a sample is called a statistic.
8 / 137
Introduction

- Descriptive statistics deals with methods of organizing, summarizing, and presenting data in a convenient and informative way.
- Statistical inference is the process of making an estimation, prediction, or decision about a population based on sample data.
9 / 137
Introduction

- Because populations are typically large, a sample of reasonable size is adequate for drawing conclusions or making estimates about the population based on the information the sample provides.
- Simple random, stratified random, cluster, and convenience sampling are common sampling techniques [1].
10 / 137
Numerical Descriptive Techniques

- Central location is measured with the mean, median, and mode. The arithmetic mean of a data set is

  μ = (1/N) Σ_{i=1}^{N} x_i   or   x̄ = (1/n) Σ_{i=1}^{n} x_i

  Note that μ is the population mean while x̄ is the sample mean.
12 / 137
Numerical Descriptive Techniques (contd.)

- Dispersion (or variability) is measured with the range and the variance. The variances of a population (σ²) and of a sample (s²) are

  σ² = (1/N) Σ_{i=1}^{N} (x_i − μ)²,

  s² = (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)²
13 / 137
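The formulas above can be checked numerically. The lecture uses Matlab and Excel for computation; the sketch below does the same arithmetic in Python on a small made-up data set, highlighting the N vs. n − 1 denominators.

```python
# Population vs. sample mean and variance, per the formulas above.
# The data set is invented purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(data)

mu = sum(data) / N                                   # population mean mu
pop_var = sum((x - mu) ** 2 for x in data) / N       # population variance: divide by N
xbar = mu                                            # sample mean uses the same formula
s2 = sum((x - xbar) ** 2 for x in data) / (N - 1)    # sample variance: divide by n - 1

print(mu, pop_var, s2)
```

Note that s² is always slightly larger than σ² on the same data, since it divides by n − 1 rather than N.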
Numerical Descriptive Techniques (contd.)

- Relative standing is a measure of the relationship between a data value and the rest of the data. Relative standing is measured with percentile rankings and quartiles. The lower quartile Q_L is the 25th percentile of a data set. The middle quartile Q_M is the median, or 50th percentile. The upper quartile Q_U is the 75th percentile.
14 / 137
Graphical Descriptive Techniques

- Histograms, bar charts, and pie charts are common types of graphical representation of data.
- In this lecture, we shall use Matlab and Microsoft Excel to present processed data graphically.
15 / 137
Elements of Probability

- Sample space: the set of all possible outcomes of an experiment. It is denoted by S. Let n(S) denote the total count of all the items in the sample space.
- An event is any subset A of the sample space. Let n(A) denote the total count of all the items in the event.
17 / 137
Elements of Probability

- Mutually exclusive events: two or more events are said to be mutually exclusive if they cannot occur at the same time.
- Probability is the measure of the likelihood of an event occurring during an experiment.
- The probability of A is

  P(A) = n(A) / n(S).    (1)
18 / 137
Axioms of Probability
For any event A of an experiment having a sample space S, the following axioms hold:
- A1: 0 ≤ P(A) ≤ 1
- A2: P(S) = 1
- A3: For any sequence of mutually exclusive events A₁, A₂, A₃, …, Aₙ, we have

  P(⋃_{i=1}^{n} A_i) = Σ_{i=1}^{n} P(A_i)
Conditional Probability
- The intersection of events A and B is the event that both event A and event B occur at the same time. It is denoted A ∩ B.
- Conditional probability is the probability of an event (say A) given that an event (say B) has occurred:

  P(A|B) = P(A ∩ B) / P(B)    (2)
20 / 137
Random Variables

- A random variable (r.v.) is a variable (say X) that takes on numerical values determined by the outcome of a random phenomenon.
- Consider rolling two fair dice. If our interest lies only in the sum of the values and not the individual values of the dice, then our r.v. X is the sum, with possible values {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
23 / 137
Random Variables

- There are two types of random variables.
- A discrete r.v. is any r.v. that can take on a countable number of values.
  E.g., X = number of heads observed in an experiment that flips a coin 3 times.
24 / 137
Random Variables (contd.)

- A continuous random variable is any r.v. with uncountable values. An excellent example of a continuous r.v. is the time to complete a task (X = time to write a statistics exam).
25 / 137
Random Variables

- A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values.
- The probability distribution of the r.v. X, the number of heads obtained in an experiment that flips a coin 3 times, is given as

  x      0      1      2      3
  p(x)   0.125  0.375  0.375  0.125
27 / 137
D.P.D

- The requirements for a distribution to be considered a discrete probability distribution are as follows:

  0 ≤ p(x) ≤ 1    (3)

  Σ_x p(x) = 1    (4)
29 / 137
D.P.D (contd.)

The population mean μ of a discrete r.v. is the weighted average of all its values. This parameter is called the expected value (expectation) of X and is represented as

  E(X) = μ = Σ_x x p(x)    (5)
30 / 137
D.P.D (contd.)

The population variance of a discrete r.v. is defined as

  Var(X) = σ² = Σ_x (x − μ)² p(x)
             = Σ_x x² p(x) − μ²
31 / 137
D.P.D

Properties of expected value and variance:
1. E(c) = c
2. Var(c) = 0
3. E(X + c) = E(X) + c
4. Var(X + c) = Var(X)
5. E(cX) = cE(X)
6. Var(cX) = c²Var(X)
32 / 137
An Example

Using historical records, the manager of a company has determined the probability distribution of X, the number of employees absent per day:

  x      0      1      2     3     4     5     6
  p(x)   0.005  0.025  0.31  0.34  0.22  0.08  0.02

(a) Compute the following probabilities:
  (i) P(2 ≤ X ≤ 5)
  (ii) P(X > 5)
  (iii) P(X < 4)
(b) Calculate the mean and the standard deviation of the population.
33 / 137
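This example can be worked directly from equations (3)–(5) and the variance shortcut. A sketch in Python (the same sums are easy in Excel):

```python
# Absenteeism example: probabilities, mean, and standard deviation
# from the discrete probability distribution above.
p = {0: 0.005, 1: 0.025, 2: 0.31, 3: 0.34, 4: 0.22, 5: 0.08, 6: 0.02}

p_2_to_5 = sum(p[x] for x in range(2, 6))     # P(2 <= X <= 5)
p_gt_5 = sum(p[x] for x in p if x > 5)        # P(X > 5)
p_lt_4 = sum(p[x] for x in p if x < 4)        # P(X < 4)

mean = sum(x * px for x, px in p.items())               # mu = sum of x p(x)
var = sum(x**2 * px for x, px in p.items()) - mean**2   # sigma^2 = sum x^2 p(x) - mu^2
std = var ** 0.5
```

The probabilities come out to 0.95, 0.02, and 0.68, with mean 3.065 absentees per day.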
Bivariate Distribution
A bivariate probability distribution of random variables X and Y is a table or formula that gives the joint probabilities p(x, y) for all pairs of x and y.
The requirements for a discrete bivariate distribution are as follows:

  0 ≤ p(x, y) ≤ 1        (6)

  Σ_x Σ_y p(x, y) = 1    (7)
Bivariate Distribution

Given p(x, y) for x and y, we have

  E(X) = μ_x = Σ_x x p(x)    (8)

  E(Y) = μ_y = Σ_y y p(y)    (9)

where p(x) and p(y) are known as the marginal probabilities of x and y respectively.
35 / 137
Bivariate Distribution
The variances and covariance are given as

  Var(X) = σ_x² = Σ_x (x − μ_x)² p(x) = Σ_x x² p(x) − μ_x²    (10)

  Var(Y) = σ_y² = Σ_y (y − μ_y)² p(y) = Σ_y y² p(y) − μ_y²    (11)

  COV(X, Y) = σ_xy = Σ_x Σ_y (x − μ_x)(y − μ_y) p(x, y) = Σ_x Σ_y xy p(x, y) − μ_x μ_y    (12)
Bivariate Distribution

The coefficient of correlation, which measures the strength of the linear relationship between two random variables, is defined as

  r = σ_xy / (σ_x σ_y)    (13)

Note that |r| ≤ 1.
37 / 137
Bivariate Distribution

Properties of expected values and variance:
1. E(X + Y) = E(X) + E(Y)
2. Var(X + Y) = Var(X) + Var(Y) + 2COV(X, Y)
If X and Y are independent, COV(X, Y) = 0 and thus
3. Var(X + Y) = Var(X) + Var(Y)
38 / 137
Example

Given the bivariate distribution of r.v. X and Y as

         x = 1   x = 2
  y = 1   0.28    0.42
  y = 2   0.12    0.18

(a) Compute the marginal probabilities p(x) and p(y).
(b) Compute μ_x, μ_y, σ_x and σ_y.
(c) Compute COV(X, Y) and r.
(d) Compute E(X + Y) and Var(X + Y).
39 / 137
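Parts (a)–(c) of this example follow mechanically from equations (8)–(13). A Python sketch:

```python
# 2x2 bivariate table above; keys of `joint` are (x, y) pairs.
joint = {(1, 1): 0.28, (2, 1): 0.42, (1, 2): 0.12, (2, 2): 0.18}

# Marginal probabilities: sum the joint table over the other variable.
px = {x: sum(pr for (xx, _), pr in joint.items() if xx == x) for x in (1, 2)}
py = {y: sum(pr for (_, yy), pr in joint.items() if yy == y) for y in (1, 2)}

mu_x = sum(x * pr for x, pr in px.items())
mu_y = sum(y * pr for y, pr in py.items())
e_xy = sum(x * y * pr for (x, y), pr in joint.items())
cov = e_xy - mu_x * mu_y    # equation (12)
```

For this particular table COV(X, Y) = 0 (so r = 0): each joint entry equals the product of its marginals, i.e. X and Y happen to be independent.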
Example
After analysing several months of sales data, the owner of an appliance store produced the following joint probability distribution of the numbers of refrigerators and stoves sold daily:

            Refrigerators
  Stoves    0      1      2
  0         0.08   0.14   0.12
  1         0.09   0.17   0.13
  2         0.05   0.18   0.04

(a) Compute COV(X, Y) and r.
(b) Compute the following conditional probabilities:
  (i) P(1 Ref | 0 Stove)
  (ii) P(2 Ref | 2 Stove)
40 / 137
Binomial Distribution

- The binomial experiment consists of a fixed number of trials. We represent the number of trials by n.
- Each trial has 2 possible outcomes. We label one outcome a success and the other a failure.
- The probability of success is p and that of failure is 1 − p.
42 / 137
Binomial Distribution

- The trials are independent. Thus, the outcome of one trial does not affect the outcome of any other trial.
- If X represents the number of successes that occur in the n trials, then X is said to be a binomial r.v. with parameters (n, p).
43 / 137
Binomial Distribution

  P(X = x) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ = [n! / (x!(n − x)!)] pˣ (1 − p)ⁿ⁻ˣ    (14)

  μ = E(X) = np    (15)

  σ² = Var(X) = np(1 − p)    (16)
44 / 137
Examples of Bin(n, p)

Given a binomial r.v. X with n = 10 and p = 0.6, compute
(a) P(X = 3)
(b) P(X = 5)
(c) P(X ≤ 4)
(d) P(6 ≤ X ≤ 9)
45 / 137
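These four quantities come straight from the pmf in (14); cumulative probabilities are just sums of pmf terms. A Python sketch:

```python
# Bin(10, 0.6) computations via the binomial pmf (14).
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.6
p3 = binom_pmf(3, n, p)                                # (a) P(X = 3)
p5 = binom_pmf(5, n, p)                                # (b) P(X = 5)
p_le4 = sum(binom_pmf(x, n, p) for x in range(5))      # (c) P(X <= 4)
p_6_9 = sum(binom_pmf(x, n, p) for x in range(6, 10))  # (d) P(6 <= X <= 9)
```

The answers are approximately 0.0425, 0.2007, 0.1662, and 0.6271 respectively.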
Examples of Bin(n, p)

A certain type of tomato seed germinates 80% of the time. A backyard farmer planted 25 seeds. What is the probability that
(a) exactly 20 of the seeds germinated?
(b) more than 20 of the seeds germinated?
(c) 24 or fewer of the seeds germinated?
(d) What is the expected number of germinated seeds?
(e) What is the standard deviation?
46 / 137
Poisson Distribution

A Poisson experiment is characterized by the following properties:
- The number of successes that occur in any interval is independent of the number that occur in any other interval.
- The probability of a success in an interval is the same for all equal-size intervals.
48 / 137
Poisson Distribution

- The probability of a success in an interval is proportional to the size of the interval.
- The probability of more than one success in an interval approaches zero as the interval becomes smaller.
49 / 137
Poisson Distribution

- The Poisson r.v. counts the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
- The probability that a Poisson r.v. X assumes a value x in a specific interval with parameter λ > 0 is

  P(X = x) = λˣ e^{−λ} / x!   for x = 0, 1, 2, …    (17)
Poisson Distribution

The expectation and variance of X ∼ Poisson(λ) are given as

  E(X) = λ      (18)

  Var(X) = λ    (19)

Example: Given X ∼ Poisson(2), compute
(a) P(X = 0)
(b) P(X ≤ 3)
(c) P(X ≥ 5)
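The three parts follow from the pmf in (17), with the complement rule handling part (c). A Python sketch:

```python
# Poisson(2) computations via the pmf (17).
import math

def pois_pmf(x, lam):
    """P(X = x) = lam^x e^{-lam} / x!"""
    return lam**x * math.exp(-lam) / math.factorial(x)

lam = 2.0
p0 = pois_pmf(0, lam)                                # (a) P(X = 0) = e^{-2}
p_le3 = sum(pois_pmf(x, lam) for x in range(4))      # (b) P(X <= 3)
p_ge5 = 1 - sum(pois_pmf(x, lam) for x in range(5))  # (c) P(X >= 5) = 1 - P(X <= 4)
```

These evaluate to roughly 0.1353, 0.8571, and 0.0527.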
Poisson Distribution

The number of accidents that occur at a busy intersection follows a Poisson distribution with a mean of 3.5 per week. Find the probability of the following events:
(a) No accidents occur in one week.
(b) Five or more accidents occur in one week.
(c) Two accidents occur today.
52 / 137
Continuous r.v.

A random variable X is said to be a continuous r.v. if ∃ a non-negative function f(x) defined ∀ x ∈ ℝ s.t. for any set B ⊂ ℝ,

  P(X ∈ B) = ∫_B f(x) dx    (20)

f(x) is known as the probability density function (pdf) of the r.v. X.
54 / 137
Requirement for a pdf

A function f(x) can only be classified as a pdf iff

  f(x) ≥ 0    (21)

  P(X ∈ ℝ) = ∫_{−∞}^{∞} f(x) dx = 1    (22)

  P(a ≤ X ≤ b) = ∫_a^b f(x) dx    (23)

  P(X = a) = ∫_a^a f(x) dx = 0    (24)
55 / 137
Requirement for a p.d.f.

The relationship between the cumulative distribution function F(·) and the pdf f(x) is expressed by

  F(a) = P(X ∈ (−∞, a]) = ∫_{−∞}^{a} f(x) dx    (25)
56 / 137
Requirement for a pdf

Suppose that X is a continuous r.v. with p.d.f.

  f(x) = k(36x − 6x²),  0 < x < 6
  f(x) = 0,             otherwise

Compute: (a) the value of k; (b) P(X > 3).
57 / 137
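The value of k comes from the normalization requirement (22), and P(X > 3) from (23). A numerical sketch in Python, with simple midpoint-rule integration standing in for the analytic integrals:

```python
# f(x) = k(36x - 6x^2) on (0, 6): find k from normalization, then P(X > 3).
def integrate(f, a, b, n=100_000):
    """Composite midpoint rule approximation of the integral of f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(lambda x: 36 * x - 6 * x**2, 0, 6)   # analytic value is 216
k = 1 / total                                          # so k = 1/216
p_gt_3 = integrate(lambda x: k * (36 * x - 6 * x**2), 3, 6)
```

P(X > 3) = 0.5, as expected: this density is symmetric about x = 3.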
Joint p.d.f.

Random variables X and Y are said to be jointly continuous if ∃ a non-negative function f(x, y) defined ∀ (x, y) ∈ ℝ² s.t. for any set C ⊆ ℝ²,

  P((X, Y) ∈ C) = ∫∫_C f(x, y) dx dy    (26)
59 / 137
Joint p.d.f.

If A and B are any sets in ℝ, then with C = {(x, y) : x ∈ A, y ∈ B} ⊆ ℝ², we have

  P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy    (27)

because

  F(a, b) = P(X ∈ (−∞, a], Y ∈ (−∞, b]) = ∫_{−∞}^{b} ∫_{−∞}^{a} f(x, y) dx dy
Joint p.d.f.

If X and Y are jointly continuous, they are individually continuous and their pdfs can be obtained as follows:

  P(X ∈ A) = P(X ∈ A, Y ∈ ℝ)
           = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx
           = ∫_A f_X(x) dx    (28)
61 / 137
Joint p.d.f.

where

  f_X(x) = ∫_{−∞}^{∞} f(x, y) dy    (29)

Similarly,

  f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx    (30)
62 / 137
Joint p.d.f.

The joint p.d.f. of X and Y is given by

  f(x, y) = k e^{−3x−2y},  (x, y) ∈ ℝ²₊
  f(x, y) = 0,             otherwise

Compute: (a) the value of k; (b) P(X > 1, Y < 1); (c) P(X < Y); (d) P(X < r).
63 / 137
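Because e^{−3x−2y} factors into e^{−3x} · e^{−2y}, this joint density is that of independent exponential variables with rates 3 and 2, which gives k and parts (b)–(c) in closed form. A Python sketch, cross-checking P(X < Y) with a one-dimensional numerical integral of the marginal in (30):

```python
# Joint pdf k*e^{-3x-2y} on the positive quadrant.
import math

k = 3 * 2   # since the double integral of e^{-3x-2y} is (1/3)(1/2), k = 6

# (b) By independence of X ~ Exp(3) and Y ~ Exp(2):
p_b = math.exp(-3) * (1 - math.exp(-2))   # P(X > 1, Y < 1)

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# (c) P(X < Y) = integral over y of P(X < y) f_Y(y)
#             = integral of (1 - e^{-3y}) * 2e^{-2y} dy = 3/5
# (truncating the upper limit at 20, where the tail is negligible)
p_c = integrate(lambda y: (1 - math.exp(-3 * y)) * 2 * math.exp(-2 * y), 0, 20)
```
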
Joint p.d.f.

The r.v. X and Y are said to be independent if

  f(x, y) = f_X(x) f_Y(y)  ∀ x, y.    (31)

If X and Y have joint p.d.f. f(x, y), then the conditional p.d.f. of X, given Y = y, is defined ∀ values of y s.t. f_Y(y) > 0 by

  f_{X|Y}(x|y) = f(x, y) / f_Y(y)    (32)
Joint p.d.f.

Then for any set A,

  P(X ∈ A | Y = y) = ∫_A f_{X|Y}(x|y) dx

See page 107 for an example.

65 / 137
Expectation

The expected value of a continuous r.v. is defined as

  E(X) = ∫_{−∞}^{∞} x f(x) dx.    (33)

Given r.v. X, define Y = g(X) to be a new r.v. We define the expectation of Y = g(X) as

  E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.    (34)
67 / 137
Expectation

The expected value of a continuous r.v. X is normally referred to as the mean or the first moment of X.
For a joint p.d.f. f(x, y), we have

  E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy.    (35)

As an example, show that E[X + Y] = E[X] + E[Y].
68 / 137
Expectation

If X is a r.v. with mean μ, then the variance of X is defined by

  Var(X) = E[(X − μ)²] = E[X²] − μ²    (36)

where

  E[X²] = ∫_{−∞}^{∞} x² f(x) dx.    (37)
69 / 137
mgf

The moment generating function (mgf) φ(t) of the r.v. X is defined ∀ t by

  φ(t) = E[e^{tX}] = ∫_{−∞}^{∞} e^{tx} f(x) dx    (38)

By successively differentiating φ(t), all moments of the r.v. X can be obtained.
mgf

For example,

  φ′(t) = (d/dt) E[e^{tX}] = E[(d/dt) e^{tX}] = E[X e^{tX}]    (39)

Hence φ′(0) = E[X]. Similarly, φ″(0) = E[X²].
72 / 137
mgf

In general, the nth derivative of φ(t) evaluated at t = 0 gives the nth moment:

  φ⁽ⁿ⁾(0) = E[Xⁿ] = ∫_{−∞}^{∞} xⁿ f(x) dx,  n ≥ 1.    (40)
73 / 137
mgf

Suppose X and Y are independent r.v. with mgfs φ_X(t) and φ_Y(t) respectively. Then the mgf of X + Y is given by

  φ_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX} e^{tY}] = φ_X(t) φ_Y(t).    (41)
74 / 137
Example

A r.v. X, which represents the weight (in kg) of an article, has density function given by

  f(x) = x − 8,   8 ≤ x < 9
  f(x) = 10 − x,  9 ≤ x ≤ 10
  f(x) = 0,       otherwise

Compute: (a) the mean and variance of X.
75 / 137
Example

The manufacturer sells the article for a fixed price of $5.00. There is a guarantee to refund the purchase amount to any customer who finds the weight of the article to be less than 8.3 kg. The cost of production is related to the weight of the article as x/7.5 + 0.35. Compute the expected profit per article.
76 / 137
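A numerical sketch for both parts, using equations (33) and (36) for the mean and variance, and reading the refund policy as "revenue is $5 unless x < 8.3, in which case revenue is 0" (that reading is an assumption):

```python
# Triangular weight density on [8, 10]: mean, variance, expected profit.
def f(x):
    if 8 <= x < 9:
        return x - 8
    if 9 <= x <= 10:
        return 10 - x
    return 0.0

def integrate(g, a, b, n=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x), 8, 10)               # = 9 by symmetry
var = integrate(lambda x: (x - mean) ** 2 * f(x), 8, 10)  # = 1/6

p_refund = integrate(f, 8, 8.3)                           # P(X < 8.3) = 0.045
exp_cost = integrate(lambda x: (x / 7.5 + 0.35) * f(x), 8, 10)
# Expected profit = expected revenue minus expected production cost
# (assumes the refunded article's production cost is still incurred).
exp_profit = 5 * (1 - p_refund) - exp_cost
```

Under this reading the expected profit works out to about $3.23 per article.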
Uniform distribution

A r.v. X is said to be uniformly distributed over the interval [α, β] if its p.d.f. is given by

  f(x) = 1/(β − α),  α ≤ x ≤ β
  f(x) = 0,          otherwise

For α ≤ a ≤ b ≤ β,

  P(a ≤ X ≤ b) = ∫_a^b 1/(β − α) dx = (b − a)/(β − α)
79 / 137
Example

The amount of time it takes for a student to complete a statistics quiz is uniformly distributed between 30 and 60 minutes. One student is selected at random. Find the probability of the following events.
(a) The student requires more than 55 minutes to complete the quiz.
(b) The student completes the quiz in a time between 30 and 40 minutes.
(c) The student completes the quiz in exactly 37.23 minutes.

81 / 137
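All three parts use the uniform probability formula (b − a)/(β − α) directly. A small Python sketch:

```python
# Quiz times X ~ U(30, 60).
alpha, beta = 30.0, 60.0

def p_between(a, b):
    """P(a <= X <= b) for X ~ U(alpha, beta), clipping [a, b] to the support."""
    a, b = max(a, alpha), min(b, beta)
    return max(b - a, 0.0) / (beta - alpha)

p_more_55 = p_between(55, beta)      # (a) 5/30 = 1/6
p_30_40 = p_between(30, 40)          # (b) 10/30 = 1/3
p_exact = p_between(37.23, 37.23)    # (c) a single point has probability 0
```

Part (c) illustrates equation (24): for any continuous r.v., P(X = a) = 0.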
Exponential distribution

A continuous r.v. X with parameter λ > 0 and p.d.f. given by

  f(x) = λe^{−λx},  x ≥ 0
  f(x) = 0,         otherwise

is said to be an exponential r.v.
The mgf of an exponential r.v. X is given by

  φ(t) = E[e^{tX}] = λ/(λ − t)    (42)

Obtain the expected value and variance of X.
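One way to check the answer (E[X] = 1/λ, Var(X) = 1/λ²) is to differentiate the mgf (42) at t = 0 numerically, per equation (40). A sketch with finite differences, taking λ = 2 as an arbitrary illustrative value:

```python
# Moments of the exponential from its mgf phi(t) = lam/(lam - t).
lam = 2.0
phi = lambda t: lam / (lam - t)

h = 1e-4
m1 = (phi(h) - phi(-h)) / (2 * h)                 # phi'(0)  = E[X]   = 1/lam
m2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2     # phi''(0) = E[X^2] = 2/lam^2
var = m2 - m1**2                                  # Var(X) = 1/lam^2
```

With λ = 2 this gives E[X] ≈ 0.5 and Var(X) ≈ 0.25, matching the analytic derivatives.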
Exponential distribution

The lifetime of an alkaline battery (measured in hours) is exponentially distributed with λ = 0.02.
(a) What are the mean and standard deviation of the battery life?
(b) Determine the probability that a battery will last between 10 and 15 hours.
(c) Compute P(X > 20).
85 / 137
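Integrating the exponential pdf gives the cdf F(x) = 1 − e^{−λx}, which makes all three parts one-liners. A Python sketch:

```python
# Battery life X ~ Exp(0.02); probabilities via the cdf F(x) = 1 - e^{-lam x}.
import math

lam = 0.02
mean = 1 / lam    # (a) mean = 1/lam = 50 hours
std = 1 / lam     # (a) the exponential's std dev also equals 1/lam

F = lambda x: 1 - math.exp(-lam * x)
p_10_15 = F(15) - F(10)     # (b) P(10 < X < 15) = e^{-0.2} - e^{-0.3}
p_gt_20 = 1 - F(20)         # (c) P(X > 20) = e^{-0.4}
```

Part (b) evaluates to about 0.078 and part (c) to about 0.670.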
Normal Distribution

A r.v. X is said to be normally distributed with parameters μ and σ², and we write X ∼ N(μ, σ²), if its p.d.f. is

  f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²))    (43)

for all x ∈ ℝ.
88 / 137
Normal distribution

The normal p.d.f. f(x) is a bell-shaped curve that is symmetric about μ.

[Figure 1: The bell-shaped normal distribution curve is symmetric about μ]
89 / 137
Normal distribution

The parameters μ and σ² represent the mean and variance of the distribution respectively. Thus

  E[X] = μ,  Var(X) = σ²

Given a normal r.v. X and constants α, β, the r.v. defined as Y = αX + β is normally distributed with mean αμ + β and variance α²σ².
90 / 137
Normal distribution

Therefore, if X ∼ N(μ, σ²), then

  Z = (X − μ)/σ    (44)

is a normal r.v. with mean 0 and variance 1. The r.v. Z is said to have the standard normal distribution. We shall obtain probabilities of a normal r.v. by converting it into the standard normal r.v.
Normal distribution
Thus,

  P(X < b) = P((X − μ)/σ < (b − μ)/σ) = P(Z < (b − μ)/σ)

Similarly,

  P(a < X < b) = P((a − μ)/σ < (X − μ)/σ < (b − μ)/σ)
               = P((a − μ)/σ < Z < (b − μ)/σ)
               = P(Z < (b − μ)/σ) − P(Z < (a − μ)/σ)
92 / 137
Normal distribution

[Figure 2: P(Z < −a) and P(Z > a)]

Note that due to symmetry, we have

  P(Z < −a) = P(Z > a)    (45)
93 / 137
Example of Normal distribution

Q1. If X ∼ N(3, 16), compute
(a) P(X < 12)
(b) P(X < −2)
(c) P(3 < X < 8)
94 / 137
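Each part standardizes via (44) and looks up the standard normal cdf. Instead of tables, a Python sketch can evaluate Φ through `math.erf`, using Φ(z) = (1 + erf(z/√2))/2:

```python
# Q1: X ~ N(3, 16), i.e. mu = 3 and sigma = 4 (16 is the variance).
import math

def norm_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 3, 4
p_a = norm_cdf(12, mu, sigma)                          # (a) = Phi(2.25)
p_b = norm_cdf(-2, mu, sigma)                          # (b) = Phi(-1.25)
p_c = norm_cdf(8, mu, sigma) - norm_cdf(3, mu, sigma)  # (c) = Phi(1.25) - 0.5
```

The answers are approximately 0.9878, 0.1056, and 0.3944 — matching the standard normal table values for z = 2.25 and z = 1.25.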
Example of Normal distribution

Q2. The power W dissipated in a resistor is proportional to the square of the voltage V (i.e. W = rV²). If r = 2.5 and V can be assumed to be normally distributed with mean 5 and standard deviation 1, compute
(a) E[W]
(b) P(W > 150)
95 / 137
Parameter Estimation

We can use sample data to estimate a population parameter in two ways.
A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.
98 / 137
Parameter Estimation

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.
An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.
99 / 137
Parameter Estimation

Note: The sample mean x̄ is an unbiased estimator of the population mean μ (i.e. E[x̄] = μ).
An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger.

  Var(x̄) = σ²/n    (46)
100 / 137
Parameter Estimation

If there are two or more unbiased estimators of a parameter, the estimator with the least variance is said to be relatively efficient.
101 / 137
Confidence Intervals for the mean of a Normally Distributed Population

Known σ²: Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ but a known variance σ². Though x̄ is an unbiased estimator of μ, we do not expect x̄ = μ but rather x̄ ≈ μ.
102 / 137
Confidence Intervals for the mean of a Normally Distributed Population

Based on the Central Limit Theorem (see page 204 of the textbook), we have

  z = √n (x̄ − μ)/σ    (47)

Therefore, for any α ∈ (0, 1),

  P(−z_{α/2} < √n (x̄ − μ)/σ < z_{α/2}) = 1 − α
𝜎 103 / 137
Confidence Intervals for the mean of a Normally Distributed Population

The probability 1 − α is called the confidence level.
Therefore, the confidence interval estimate of μ for known variance is given as

  μ ∈ (x̄ ± (σ/√n) z_{α/2})    (48)
104 / 137
Confidence Intervals for Normal mean with an unknown σ²

Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ and variance σ². To construct a (1 − α) × 100% confidence interval, we define a new random variable

  t = √n (x̄ − μ)/s    (49)

with n − 1 degrees of freedom.
105 / 137
Confidence Intervals for Normal mean with an unknown σ²

Note that

  s² = (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)²

and

  P(−t_{α/2, n−1} < √n (x̄ − μ)/s < t_{α/2, n−1}) = 1 − α
106 / 137
Confidence Intervals for Normal mean with an unknown σ²

Therefore, the confidence interval estimate of μ for an unknown variance is given as

  μ ∈ (x̄ ± (s/√n) t_{α/2, n−1})    (50)
107 / 137
Confidence Intervals for Normal mean with an unknown σ²

Suppose that when a signal having value μ is transmitted from location A, the value received at location B is normally distributed with mean μ and variance 4. To reduce the error, the same value is sent 9 times. If the sequence of values received is
  5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5
construct
- a 95% confidence interval for μ
- a 99% confidence interval for μ
- 95% and 99% confidence intervals for μ, assuming the variance is unknown.
108 / 137
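The known-variance parts follow equation (48). A Python sketch for those two intervals; the critical values 1.96 and 2.575 are the usual table values for z₀.₀₂₅ and z₀.₀₀₅:

```python
# Signal example: known variance sigma^2 = 4, n = 9 repeated transmissions.
import math

data = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
n = len(data)
xbar = sum(data) / n     # sample mean = 9
sigma = 2.0              # known standard deviation (variance 4)

z95, z99 = 1.96, 2.575   # z_{alpha/2} table values for alpha = 0.05, 0.01
half95 = z95 * sigma / math.sqrt(n)
half99 = z99 * sigma / math.sqrt(n)
ci95 = (xbar - half95, xbar + half95)
ci99 = (xbar - half99, xbar + half99)
```

This gives roughly (7.69, 10.31) at 95% and (7.28, 10.72) at 99%; as always, higher confidence widens the interval. The unknown-variance parts replace σ with s and z with t₀.₀₂₅,₈ per (50).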
Confidence Intervals for Variance (σ²) of a Normal Distribution

Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ and variance σ². We can construct a confidence interval for σ² by using the fact that the sample variance s² is an unbiased, consistent estimator of σ².
109 / 137
Confidence Intervals for Variance (σ²) of a Normal Distribution

We define a new random variable

  (n − 1) s²/σ² ∼ χ²_{n−1}    (51)

where χ²_{n−1} is known as the chi-squared distribution with n − 1 degrees of freedom.
110 / 137
Confidence Intervals for σ²

Hence

  P(χ²_{1−α/2, n−1} ≤ (n − 1)s²/σ² ≤ χ²_{α/2, n−1}) = 1 − α

  P((n − 1)s²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2, n−1}) = 1 − α    (52)
111 / 137
Confidence Intervals for σ²

The weights of a random sample of cereal boxes, each supposed to weigh 1 kg, are listed below:
  1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96
Estimate the variance of the entire population of cereal box weights with 90% confidence.
112 / 137
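Applying (52) with n = 8 needs the two chi-squared critical values for 7 degrees of freedom; the values below (χ²₀.₀₅,₇ ≈ 14.067 and χ²₀.₉₅,₇ ≈ 2.167) are standard table values. A Python sketch:

```python
# Cereal-box example: 90% confidence interval for the population variance.
import statistics

data = [1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96]
n = len(data)
s2 = statistics.variance(data)   # sample variance (n - 1 in the denominator)

# Chi-squared critical values for 7 df, alpha = 0.10 (from tables).
chi2_upper, chi2_lower = 14.067, 2.167

ci = ((n - 1) * s2 / chi2_upper,   # lower bound uses the larger critical value
      (n - 1) * s2 / chi2_lower)   # upper bound uses the smaller one
```

The interval is roughly (0.00046, 0.00299) kg²; note the asymmetry about s², which is characteristic of chi-squared-based intervals.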
Hypothesis Testing

Instead of constructing a confidence interval for a parameter of a population with a known distribution, we make an emphatic statement about the parameter and then use the available sample data to test the validity or otherwise of our statement.
115 / 137
H.T for the Mean of a Normally Distributed Population

Suppose that x₁, x₂, …, xₙ is a sample of size n from a population which is normally distributed with an unknown mean μ and a known (or unknown) variance σ².
116 / 137
H.T for the Mean of a Normally Distributed Population

Suppose we are interested in testing the null hypothesis

  H₀: μ = μ₀    (53)

against the alternative hypothesis

  H₁: μ ≠ μ₀    (54)

where μ₀ is a specified constant.
117 / 137
H.T for the Mean of a Normally Distributed Population

  Decision           H₀ is True         H₀ is False
  Reject H₀          Type I Error       Correct Decision
  Do not reject H₀   Correct Decision   Type II Error

The significance-level-α test is to reject H₀ if

  |x̄ − μ₀| > z_{α/2} σ/√n    (55)

and accept otherwise.
118 / 137
H.T for the Mean of a Normally Distributed Population
That is to say:
- Reject H₀ if (√n/σ)|x̄ − μ₀| > z_{α/2}
- Accept H₀ if (√n/σ)|x̄ − μ₀| ≤ z_{α/2}
119 / 137
One-Sided Hypothesis Tests

When the testing statement is given


as follows:
𝐻0 : 𝜇 = 𝜇 0
(56)
𝐻1 : 𝜇 > 𝜇 0
we reject 𝐻0 when 𝑥̄, the point
estimate of 𝜇, is significantly
greater than 𝜇0.
120 / 137
One-Sided Hypothesis Tests

Thus
I Reject 𝐻0 if (√𝑛/𝜎) (𝑥̄ − 𝜇0) > 𝑧𝛼
I Accept 𝐻0 if (√𝑛/𝜎) (𝑥̄ − 𝜇0) ≤ 𝑧𝛼
This decision criterion is called a
one-sided critical region.

121 / 137
Summary

𝐻0        𝐻1        Test Statistic (TS)       Significance Level 𝛼 Test
𝜇 = 𝜇0   𝜇 ≠ 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if |TS| > 𝑧𝛼/2
𝜇 ≤ 𝜇0   𝜇 > 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if TS > 𝑧𝛼
𝜇 ≥ 𝜇0   𝜇 < 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if TS < −𝑧𝛼

122 / 137
Examples

Test the following:
I 𝐻0: 𝜇 = 100 vs 𝐻1: 𝜇 ≠ 100 with 𝜎 = 10, 𝑛 = 100, 𝑥̄ = 100 and 𝛼 = 0.05
I 𝐻0: 𝜇 = 50 vs 𝐻1: 𝜇 < 50 with 𝜎 = 15, 𝑛 = 100, 𝑥̄ = 48 and 𝛼 = 0.05
I 𝐻0: 𝜇 = 50 vs 𝐻1: 𝜇 > 50 with 𝜎 = 5, 𝑛 = 9, 𝑥̄ = 51 and 𝛼 = 0.03
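The three tests above can be worked through with a short script (a sketch; the z quantiles come from Python's `statistics.NormalDist` rather than a table):

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Return (test statistic, reject H0?) for the z-tests in the summary table."""
    ts = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "two-sided":          # H1: mu != mu0
        reject = abs(ts) > NormalDist().inv_cdf(1 - alpha / 2)
    elif alternative == "greater":          # H1: mu > mu0
        reject = ts > NormalDist().inv_cdf(1 - alpha)
    else:                                   # H1: mu < mu0
        reject = ts < -NormalDist().inv_cdf(1 - alpha)
    return ts, reject

print(z_test(100, 100, 10, 100, 0.05, "two-sided"))  # TS = 0.0, do not reject
print(z_test(48, 50, 15, 100, 0.05, "less"))         # TS ~ -1.33, do not reject
print(z_test(51, 50, 5, 9, 0.03, "greater"))         # TS = 0.6, do not reject
```

In all three cases the test statistic falls inside the acceptance region, so 𝐻0 is not rejected.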
123 / 137
Example

A random sample of 18 young adult men (20–30 years
old) was taken. Each man was asked how many
minutes of sport he watched on TV daily. The
responses are listed below.
64, 50, 48, 65, 74, 66, 37, 45, 68, 65, 58, 55, 52, 63,
59, 57, 74, 65
Test to determine at 5% significance level whether
there is enough statistical evidence to infer that the
mean amount of TV watched daily by all young men
is greater than 50 minutes.
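One way to work this example (a sketch: since 𝜎 is unknown, the sample standard deviation is used and the statistic is compared with the t critical value 𝑡_{0.05,17} ≈ 1.740 taken from a standard table):

```python
from math import sqrt
import statistics

minutes = [64, 50, 48, 65, 74, 66, 37, 45, 68, 65, 58, 55, 52, 63,
           59, 57, 74, 65]

n = len(minutes)                    # 18
xbar = statistics.mean(minutes)     # sample mean, about 59.17
s = statistics.stdev(minutes)       # sample standard deviation

# H0: mu = 50 vs H1: mu > 50 at alpha = 0.05
ts = (xbar - 50) / (s / sqrt(n))    # about 3.91
t_crit = 1.740                      # t_{0.05, 17}, from a t table

print(f"xbar = {xbar:.2f}, TS = {ts:.2f}, reject H0: {ts > t_crit}")
```

Since the test statistic exceeds the critical value, there is evidence at the 5% level that the mean daily viewing time exceeds 50 minutes.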

124 / 137
Introduction to Regression

Regression analysis is a technique for
developing a mathematical model that
describes the relationship between a
set of variables.
In many situations, there is a single
response variable 𝑌 that depends on a
set of input variables 𝑥1, 𝑥2, · · · , 𝑥𝑟.

127 / 137
Introduction to Regression

The simplest type of relationship


between 𝑌 and the input variables
𝑥1, 𝑥2, · · · , 𝑥𝑟 , is a linear
relationship.
Thus
𝑌 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 +· · ·+ 𝛽𝑟 𝑥𝑟 (57)

128 / 137
Introduction to Regression

However, (57) almost never holds
exactly in practice. The best that can
be obtained is a relationship subject
to random error.
Thus a linear regression equation is
𝑌 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + · · · + 𝛽𝑟 𝑥𝑟 + 𝜉
(58)
with E[𝜉] = 0. Hence
E[𝑌 |𝑥] = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + · · · + 𝛽𝑟 𝑥𝑟
129 / 137
Introduction to Regression

The constants 𝛽𝑖 ∀ 𝑖 = 0, 1, · · · , 𝑟 are


called the regression coefficients and
are usually estimated from a data set.
A regression equation containing a
single independent variable (i.e.
𝑟 = 1) is called a simple regression
equation whereas an equation
containing many independent
variables (i.e. 𝑟 > 1) is called a
multiple regression equation. 130 / 137
Introduction to Regression

Least Squares Estimators of


Regression Parameters
Consider a simple linear regression
equation
𝑌 = 𝛽0 + 𝛽1 𝑥 + 𝜉 (59)
Then we can rewrite the equation as
𝑌 − 𝛽0 − 𝛽1 𝑥 = 𝜉

131 / 137
Introduction to Regression

We define the sum of squared errors as
𝑆𝑆 = Σ_{𝑖=1}^{𝑛} (𝑌𝑖 − 𝛽0 − 𝛽1𝑥𝑖)² (60)
The least squares method chooses the
estimators of 𝛽0 and 𝛽1 that minimize 𝑆𝑆.
132 / 137
Introduction to Regression

The least squares method gives the estimates
𝛽0 = 𝑦̄ − 𝛽1𝑥̄ (61)
𝛽1 = 𝑆𝑥𝑦 / 𝑆𝑥² (62)
where
𝑥̄ = (1/𝑛) Σ_{𝑖=1}^{𝑛} 𝑥𝑖
133 / 137
Introduction to Regression

𝑦̄ = (1/𝑛) Σ_{𝑖=1}^{𝑛} 𝑦𝑖
𝑆𝑥𝑦 = (1/(𝑛 − 1)) (Σ_{𝑖=1}^{𝑛} 𝑥𝑖𝑦𝑖 − 𝑛𝑥̄𝑦̄)
𝑆𝑥² = (1/(𝑛 − 1)) (Σ_{𝑖=1}^{𝑛} 𝑥𝑖² − 𝑛𝑥̄²)
134 / 137
Introduction to Regression
Attempting to analyze the
relationship between advertising and
sales, the owner of a furniture store
recorded the monthly advertising
budget ($) and sales ($1,000) for 8
months as follows
Advert 23 46 60 28 33 25 31 36
Sales 9.6 11.3 12.8 8.9 12.5 12.0 11.4 12.6
How much should the store spend on
advertising if a sales value of $50,000
is desired? 135 / 137
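A sketch of the computation using the least squares formulas above (note that the advertising level the answer calls for lies far outside the observed data, so the prediction is an extrapolation):

```python
from statistics import mean

advert = [23, 46, 60, 28, 33, 25, 31, 36]                 # monthly advertising ($)
sales = [9.6, 11.3, 12.8, 8.9, 12.5, 12.0, 11.4, 12.6]    # sales ($1,000)

n = len(advert)
xbar, ybar = mean(advert), mean(sales)

# S_xy and S_x^2; the 1/(n - 1) factors cancel when forming beta1
sxy = sum(x * y for x, y in zip(advert, sales)) - n * xbar * ybar
sx2 = sum(x * x for x in advert) - n * xbar ** 2

beta1 = sxy / sx2              # slope, about 0.0623
beta0 = ybar - beta1 * xbar    # intercept, about 9.19

# Invert y = beta0 + beta1 * x for sales of $50,000 (y = 50 in $1,000 units)
x_needed = (50 - beta0) / beta1
print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.4f}, "
      f"advertising needed ~ ${x_needed:.0f}")
```

The fitted line is roughly 𝑦 = 9.19 + 0.0623𝑥, so a sales value of $50,000 corresponds to an advertising budget of about $655, well beyond the range of the observed budgets.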
Thank you

136 / 137
References
[1] Navidi, W.
Statistics for Engineers and Scientists, fourth ed.
McGraw-Hill, 2015.
[2] Ross, S.
Introduction to Probability and Statistics for Engineers and
Scientists, fourth ed.
Elsevier Academic Press, 2009.

137 / 137
