
BSMA 301 Statistics

Dr. Eyram Kwame

December 7, 2020

1 / 137
Outline
- Recommended Textbooks
- Introduction
- Descriptive Statistics
- Numerical Descriptive Techniques
- Graphical Descriptive Techniques
- Elements of Probability
- Random Variables
- Discrete Probability Distribution
- Binomial Distribution
- Poisson Distribution
- Continuous Random Variables
- Joint Probability Density Function
- Expectation and Variance
- Moment Generating Function
- Uniform Distribution
- Exponential Distribution
- Normal Distribution
- Parameter Estimation
- Hypothesis Testing
- Introduction to Regression
Recommended Textbooks

- Navidi, William. Statistics for Engineers and Scientists, 4th Edition, McGraw-Hill Education, 2015. [1]
- Ross, Sheldon M. Introduction to Probability and Statistics for Engineers and Scientists, 3rd Edition, Elsevier Academic Press, 2009. [2]
5 / 137
Introduction

- Statistics is the science of learning from data. It comprises data collection, description, and analysis for making inferences.
- The two main branches of statistics are descriptive and inferential statistics.
- Information is processed data.
7 / 137
Introduction

- A population is the entire collection of items of interest. It is often very large and may be infinite. A descriptive measure of a population is called a parameter.
- A sample is a set of data drawn from a population. A descriptive measure of a sample is called a statistic.
8 / 137
Introduction

- Descriptive statistics deals with methods of organizing, summarizing, and presenting data in a convenient and informative way.
- Statistical inference is the process of making an estimation, prediction, or decision about a population based on sample data.
9 / 137
Introduction

- Because populations are typically large, a sample of reasonable size is adequate for drawing conclusions or making estimates about the population based on the information the sample provides.
- Simple random, stratified random, cluster, and convenience sampling are common sampling techniques [1].
10 / 137
Numerical Descriptive Techniques

- Central location is measured with the mean, median, and mode. The arithmetic mean of a data set is

  μ = (1/N) Σ_{i=1}^{N} x_i   or   x̄ = (1/n) Σ_{i=1}^{n} x_i

  Note that μ is the population mean while x̄ is the sample mean.
12 / 137
Numerical Descriptive Techniques (contd.)

- Dispersion (or variability) is measured with the range and the variance. The variances of a population (σ²) and of a sample (s²) are

  σ² = (1/N) Σ_{i=1}^{N} (x_i − μ)²,

  s² = (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)²
13 / 137
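The formulas above can be checked numerically. The lecture uses Matlab and Excel for computation; the sketch below does the same arithmetic in Python on a small made-up data set, highlighting the N vs. n − 1 denominators.

```python
# Population vs. sample mean and variance, per the formulas above.
# The data set is invented purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(data)

mu = sum(data) / N                                   # population mean mu
pop_var = sum((x - mu) ** 2 for x in data) / N       # population variance: divide by N
xbar = mu                                            # sample mean uses the same formula
s2 = sum((x - xbar) ** 2 for x in data) / (N - 1)    # sample variance: divide by n - 1

print(mu, pop_var, s2)
```

Note that s² is always slightly larger than σ² on the same data, since it divides by n − 1 rather than N.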
Numerical Descriptive Techniques (contd.)

- Relative standing is a measure of the relationship between a data value and the rest of the data. Relative standing is measured with percentile rankings and quartiles. The lower quartile Q_L is the 25th percentile of a data set. The middle quartile Q_M is the median, or 50th percentile. The upper quartile Q_U is the 75th percentile.
14 / 137
Graphical Descriptive Techniques

- Histograms, bar charts, and pie charts are common types of graphical representation of data.
- In this lecture, we shall use Matlab and Microsoft Excel to present processed data graphically.
15 / 137
Elements of Probability

- Sample space: the set of all possible outcomes of an experiment. It is denoted by S. Let n(S) denote the total count of all the items in the sample space.
- An event is any subset A of the sample space. Let n(A) denote the total count of all the items in the event.
17 / 137
Elements of Probability

- Mutually exclusive events: two or more events are said to be mutually exclusive if they cannot occur at the same time.
- Probability is the measure of the likelihood of an event occurring during an experiment.
- The probability of A is

  P(A) = n(A) / n(S).    (1)
18 / 137
Axioms of Probability
For any event A of an experiment having a sample space S, the following axioms hold:
- A1: 0 ≤ P(A) ≤ 1
- A2: P(S) = 1
- A3: For any sequence of mutually exclusive events A₁, A₂, A₃, …, Aₙ, we have

  P(⋃_{i=1}^{n} A_i) = Σ_{i=1}^{n} P(A_i)
Conditional Probability
- The intersection of events A and B is the event that both event A and event B occur at the same time. It is denoted A ∩ B.
- Conditional probability is the probability of an event (say A) given that an event (say B) has occurred:

  P(A|B) = P(A ∩ B) / P(B)    (2)
20 / 137
Random Variables

- A random variable (r.v.) is a variable (say X) that takes on numerical values determined by the outcome of a random phenomenon.
- Consider rolling two fair dice. If our interest lies only in the sum of the values and not the individual values of the dice, then our r.v. X is the sum, with possible values {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
23 / 137
Random Variables

- There are two types of random variables.
- A discrete r.v. is any r.v. that can take on a countable number of values.
  E.g., X = number of heads observed in an experiment that flips a coin 3 times.
24 / 137
Random Variables (contd.)

- A continuous random variable is any r.v. with uncountable values. An excellent example of a continuous r.v. is the time to complete a task (X = time to write a statistics exam).
25 / 137
Random Variables

- A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values.
- The probability distribution of the r.v. X, the number of heads obtained in an experiment that flips a coin 3 times, is given as

  x      0      1      2      3
  p(x)   0.125  0.375  0.375  0.125
27 / 137
D.P.D

- The requirements for a distribution to be considered a discrete probability distribution are as follows:

  0 ≤ p(x) ≤ 1    (3)

  Σ_x p(x) = 1    (4)
29 / 137
D.P.D (contd.)

The population mean μ of a discrete r.v. is the weighted average of all its values. This parameter is called the expected value (expectation) of X and is represented as

  E(X) = μ = Σ_x x p(x)    (5)
30 / 137
D.P.D (contd.)

The population variance of a discrete r.v. is defined as

  Var(X) = σ² = Σ_x (x − μ)² p(x)
             = Σ_x x² p(x) − μ²
31 / 137
D.P.D

Properties of expected value and variance:
1. E(c) = c
2. Var(c) = 0
3. E(X + c) = E(X) + c
4. Var(X + c) = Var(X)
5. E(cX) = cE(X)
6. Var(cX) = c²Var(X)
32 / 137
An Example

Using historical records, the manager of a company has determined the probability distribution of X, the number of employees absent per day:

  x      0      1      2     3     4     5     6
  p(x)   0.005  0.025  0.31  0.34  0.22  0.08  0.02

(a) Compute the following probabilities:
  (i) P(2 ≤ X ≤ 5)
  (ii) P(X > 5)
  (iii) P(X < 4)
(b) Calculate the mean and the standard deviation of the population.
33 / 137
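This example can be worked directly from equations (3)–(5) and the variance shortcut. A sketch in Python (the same sums are easy in Excel):

```python
# Absenteeism example: probabilities, mean, and standard deviation
# from the discrete probability distribution above.
p = {0: 0.005, 1: 0.025, 2: 0.31, 3: 0.34, 4: 0.22, 5: 0.08, 6: 0.02}

p_2_to_5 = sum(p[x] for x in range(2, 6))     # P(2 <= X <= 5)
p_gt_5 = sum(p[x] for x in p if x > 5)        # P(X > 5)
p_lt_4 = sum(p[x] for x in p if x < 4)        # P(X < 4)

mean = sum(x * px for x, px in p.items())               # mu = sum of x p(x)
var = sum(x**2 * px for x, px in p.items()) - mean**2   # sigma^2 = sum x^2 p(x) - mu^2
std = var ** 0.5
```

The probabilities come out to 0.95, 0.02, and 0.68, with mean 3.065 absentees per day.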
Bivariate Distribution
A bivariate probability distribution of random variables X and Y is a table or formula that gives the joint probabilities p(x, y) for all pairs of x and y.
The requirements for a discrete bivariate distribution are as follows:

  0 ≤ p(x, y) ≤ 1        (6)

  Σ_x Σ_y p(x, y) = 1    (7)
Bivariate Distribution

Given p(x, y) for x and y, we have

  E(X) = μ_x = Σ_x x p(x)    (8)

  E(Y) = μ_y = Σ_y y p(y)    (9)

where p(x) and p(y) are known as the marginal probabilities of x and y respectively.
35 / 137
Bivariate Distribution
The variances and covariance are given as

  Var(X) = σ_x² = Σ_x (x − μ_x)² p(x) = Σ_x x² p(x) − μ_x²    (10)

  Var(Y) = σ_y² = Σ_y (y − μ_y)² p(y) = Σ_y y² p(y) − μ_y²    (11)

  COV(X, Y) = σ_xy = Σ_x Σ_y (x − μ_x)(y − μ_y) p(x, y) = Σ_x Σ_y xy p(x, y) − μ_x μ_y    (12)
Bivariate Distribution

The coefficient of correlation, which measures the strength of the linear relationship between two random variables, is defined as

  r = σ_xy / (σ_x σ_y)    (13)

Note that |r| ≤ 1.
37 / 137
Bivariate Distribution

Properties of expected values and variance:
1. E(X + Y) = E(X) + E(Y)
2. Var(X + Y) = Var(X) + Var(Y) + 2COV(X, Y)
If X and Y are independent, COV(X, Y) = 0 and thus
3. Var(X + Y) = Var(X) + Var(Y)
38 / 137
Example

Given the bivariate distribution of r.v. X and Y as

         x = 1   x = 2
  y = 1   0.28    0.42
  y = 2   0.12    0.18

(a) Compute the marginal probabilities p(x) and p(y).
(b) Compute μ_x, μ_y, σ_x and σ_y.
(c) Compute COV(X, Y) and r.
(d) Compute E(X + Y) and Var(X + Y).
39 / 137
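Parts (a)–(c) of this example follow mechanically from equations (8)–(13). A Python sketch:

```python
# 2x2 bivariate table above; keys of `joint` are (x, y) pairs.
joint = {(1, 1): 0.28, (2, 1): 0.42, (1, 2): 0.12, (2, 2): 0.18}

# Marginal probabilities: sum the joint table over the other variable.
px = {x: sum(pr for (xx, _), pr in joint.items() if xx == x) for x in (1, 2)}
py = {y: sum(pr for (_, yy), pr in joint.items() if yy == y) for y in (1, 2)}

mu_x = sum(x * pr for x, pr in px.items())
mu_y = sum(y * pr for y, pr in py.items())
e_xy = sum(x * y * pr for (x, y), pr in joint.items())
cov = e_xy - mu_x * mu_y    # equation (12)
```

For this particular table COV(X, Y) = 0 (so r = 0): each joint entry equals the product of its marginals, i.e. X and Y happen to be independent.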
Example
After analysing several months of sales data, the owner of an appliance store produced the following joint probability distribution of the numbers of refrigerators and stoves sold daily:

            Refrigerators
  Stoves    0      1      2
  0         0.08   0.14   0.12
  1         0.09   0.17   0.13
  2         0.05   0.18   0.04

(a) Compute COV(X, Y) and r.
(b) Compute the following conditional probabilities:
  (i) P(1 Ref | 0 Stove)
  (ii) P(2 Ref | 2 Stove)
40 / 137
Binomial Distribution

- The binomial experiment consists of a fixed number of trials. We represent the number of trials by n.
- Each trial has 2 possible outcomes. We label one outcome a success and the other a failure.
- The probability of success is p and that of failure is 1 − p.
42 / 137
Binomial Distribution

- The trials are independent. Thus, the outcome of one trial does not affect the outcome of any other trial.
- If X represents the number of successes that occur in the n trials, then X is said to be a binomial r.v. with parameters (n, p).
43 / 137
Binomial Distribution

  P(X = x) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ = [n! / (x!(n − x)!)] pˣ (1 − p)ⁿ⁻ˣ    (14)

  μ = E(X) = np    (15)

  σ² = Var(X) = np(1 − p)    (16)
44 / 137
Examples of Bin(n, p)

Given a binomial r.v. X with n = 10 and p = 0.6, compute
(a) P(X = 3)
(b) P(X = 5)
(c) P(X ≤ 4)
(d) P(6 ≤ X ≤ 9)
45 / 137
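These four quantities come straight from the pmf in (14); cumulative probabilities are just sums of pmf terms. A Python sketch:

```python
# Bin(10, 0.6) computations via the binomial pmf (14).
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.6
p3 = binom_pmf(3, n, p)                                # (a) P(X = 3)
p5 = binom_pmf(5, n, p)                                # (b) P(X = 5)
p_le4 = sum(binom_pmf(x, n, p) for x in range(5))      # (c) P(X <= 4)
p_6_9 = sum(binom_pmf(x, n, p) for x in range(6, 10))  # (d) P(6 <= X <= 9)
```

The answers are approximately 0.0425, 0.2007, 0.1662, and 0.6271 respectively.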
Examples of Bin(n, p)

A certain type of tomato seed germinates 80% of the time. A backyard farmer planted 25 seeds. What is the probability that
(a) exactly 20 of the seeds germinated?
(b) more than 20 of the seeds germinated?
(c) 24 or fewer of the seeds germinated?
(d) What is the expected number of germinated seeds?
(e) What is the standard deviation?
46 / 137
Poisson Distribution

A Poisson experiment is characterized by the following properties:
- The number of successes that occur in any interval is independent of the number that occur in any other interval.
- The probability of a success in an interval is the same for all equal-size intervals.
48 / 137
Poisson Distribution

- The probability of a success in an interval is proportional to the size of the interval.
- The probability of more than one success in an interval approaches zero as the interval becomes smaller.
49 / 137
Poisson Distribution

- The Poisson r.v. counts the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
- The probability that a Poisson r.v. X assumes a value x in a specific interval with parameter λ > 0 is

  P(X = x) = λˣ e^{−λ} / x!   for x = 0, 1, 2, …    (17)
Poisson Distribution

The expectation and variance of X ∼ Poisson(λ) are given as

  E(X) = λ      (18)

  Var(X) = λ    (19)

Example: Given X ∼ Poisson(2), compute
(a) P(X = 0)
(b) P(X ≤ 3)
(c) P(X ≥ 5)
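The three parts follow from the pmf in (17), with the complement rule handling part (c). A Python sketch:

```python
# Poisson(2) computations via the pmf (17).
import math

def pois_pmf(x, lam):
    """P(X = x) = lam^x e^{-lam} / x!"""
    return lam**x * math.exp(-lam) / math.factorial(x)

lam = 2.0
p0 = pois_pmf(0, lam)                                # (a) P(X = 0) = e^{-2}
p_le3 = sum(pois_pmf(x, lam) for x in range(4))      # (b) P(X <= 3)
p_ge5 = 1 - sum(pois_pmf(x, lam) for x in range(5))  # (c) P(X >= 5) = 1 - P(X <= 4)
```

These evaluate to roughly 0.1353, 0.8571, and 0.0527.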
Poisson Distribution

The number of accidents that occur at a busy intersection follows a Poisson distribution with a mean of 3.5 per week. Find the probability of the following events:
(a) No accidents occur in one week.
(b) Five or more accidents occur in one week.
(c) Two accidents occur today.
52 / 137
Continuous r.v.

A random variable X is said to be a continuous r.v. if ∃ a non-negative function f(x) defined ∀ x ∈ ℝ s.t. for any set B ⊂ ℝ,

  P(X ∈ B) = ∫_B f(x) dx    (20)

f(x) is known as the probability density function (pdf) of the r.v. X.
54 / 137
Requirement for a pdf

A function f(x) can only be classified as a pdf iff

  f(x) ≥ 0    (21)

  P(X ∈ ℝ) = ∫_{−∞}^{∞} f(x) dx = 1    (22)

  P(a ≤ X ≤ b) = ∫_a^b f(x) dx    (23)

  P(X = a) = ∫_a^a f(x) dx = 0    (24)
55 / 137
Requirement for a p.d.f.

The relationship between the cumulative distribution function F(·) and the pdf f(x) is expressed by

  F(a) = P(X ∈ (−∞, a]) = ∫_{−∞}^{a} f(x) dx    (25)
56 / 137
Requirement for a pdf

Suppose that X is a continuous r.v. with p.d.f.

  f(x) = k(36x − 6x²),  0 < x < 6
  f(x) = 0,             otherwise

Compute: (a) the value of k; (b) P(X > 3).
57 / 137
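The value of k comes from the normalization requirement (22), and P(X > 3) from (23). A numerical sketch in Python, with simple midpoint-rule integration standing in for the analytic integrals:

```python
# f(x) = k(36x - 6x^2) on (0, 6): find k from normalization, then P(X > 3).
def integrate(f, a, b, n=100_000):
    """Composite midpoint rule approximation of the integral of f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(lambda x: 36 * x - 6 * x**2, 0, 6)   # analytic value is 216
k = 1 / total                                          # so k = 1/216
p_gt_3 = integrate(lambda x: k * (36 * x - 6 * x**2), 3, 6)
```

P(X > 3) = 0.5, as expected: this density is symmetric about x = 3.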
Joint p.d.f.

Random variables X and Y are said to be jointly continuous if ∃ a non-negative function f(x, y) defined ∀ (x, y) ∈ ℝ² s.t. for any set C ⊆ ℝ²,

  P((X, Y) ∈ C) = ∫∫_C f(x, y) dx dy    (26)
59 / 137
Joint p.d.f.

If A and B are any sets in ℝ, then with C = {(x, y) : x ∈ A, y ∈ B} ⊆ ℝ², we have

  P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy    (27)

because

  F(a, b) = P(X ∈ (−∞, a], Y ∈ (−∞, b]) = ∫_{−∞}^{b} ∫_{−∞}^{a} f(x, y) dx dy
Joint p.d.f.

If X and Y are jointly continuous, they are individually continuous and their pdfs can be obtained as follows:

  P(X ∈ A) = P(X ∈ A, Y ∈ ℝ)
           = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx
           = ∫_A f_X(x) dx    (28)
61 / 137
Joint p.d.f.

where

  f_X(x) = ∫_{−∞}^{∞} f(x, y) dy    (29)

Similarly,

  f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx    (30)
62 / 137
Joint p.d.f.

The joint p.d.f. of X and Y is given by

  f(x, y) = k e^{−3x−2y},  (x, y) ∈ ℝ²₊
  f(x, y) = 0,             otherwise

Compute: (a) the value of k; (b) P(X > 1, Y < 1); (c) P(X < Y); (d) P(X < r).
63 / 137
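Because e^{−3x−2y} factors into e^{−3x} · e^{−2y}, this joint density is that of independent exponential variables with rates 3 and 2, which gives k and parts (b)–(c) in closed form. A Python sketch, cross-checking P(X < Y) with a one-dimensional numerical integral of the marginal in (30):

```python
# Joint pdf k*e^{-3x-2y} on the positive quadrant.
import math

k = 3 * 2   # since the double integral of e^{-3x-2y} is (1/3)(1/2), k = 6

# (b) By independence of X ~ Exp(3) and Y ~ Exp(2):
p_b = math.exp(-3) * (1 - math.exp(-2))   # P(X > 1, Y < 1)

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# (c) P(X < Y) = integral over y of P(X < y) f_Y(y)
#             = integral of (1 - e^{-3y}) * 2e^{-2y} dy = 3/5
# (truncating the upper limit at 20, where the tail is negligible)
p_c = integrate(lambda y: (1 - math.exp(-3 * y)) * 2 * math.exp(-2 * y), 0, 20)
```
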
Joint p.d.f.

The r.v. X and Y are said to be independent if

  f(x, y) = f_X(x) f_Y(y)  ∀ x, y.    (31)

If X and Y have joint p.d.f. f(x, y), then the conditional p.d.f. of X, given Y = y, is defined ∀ values of y s.t. f_Y(y) > 0 by

  f_{X|Y}(x|y) = f(x, y) / f_Y(y)    (32)
Joint p.d.f.

Then for any set A,

  P(X ∈ A | Y = y) = ∫_A f_{X|Y}(x|y) dx

See page 107 for an example.

65 / 137
Expectation

The expected value of a continuous r.v. is defined as

  E(X) = ∫_{−∞}^{∞} x f(x) dx.    (33)

Given r.v. X, define Y = g(X) to be a new r.v. We define the expectation of Y = g(X) as

  E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.    (34)
67 / 137
Expectation

The expected value of a continuous r.v. X is normally referred to as the mean or the first moment of X.
For a joint p.d.f. f(x, y), we have

  E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy.    (35)

As an example, show that E[X + Y] = E[X] + E[Y].
68 / 137
Expectation

If X is a r.v. with mean μ, then the variance of X is defined by

  Var(X) = E[(X − μ)²] = E[X²] − μ²    (36)

where

  E[X²] = ∫_{−∞}^{∞} x² f(x) dx.    (37)
69 / 137
mgf

The moment generating function (mgf) φ(t) of the r.v. X is defined ∀ t by

  φ(t) = E[e^{tX}] = ∫_{−∞}^{∞} e^{tx} f(x) dx    (38)

By successively differentiating φ(t), all moments of the r.v. X can be obtained.
mgf

For example,

  φ′(t) = (d/dt) E[e^{tX}] = E[(d/dt) e^{tX}] = E[X e^{tX}]    (39)

Hence φ′(0) = E[X]. Similarly, φ″(0) = E[X²].
72 / 137
mgf

In general, the nth derivative of φ(t) evaluated at t = 0 gives the nth moment:

  φ⁽ⁿ⁾(0) = E[Xⁿ] = ∫_{−∞}^{∞} xⁿ f(x) dx,  n ≥ 1.    (40)
73 / 137
mgf

Suppose X and Y are independent r.v. with mgfs φ_X(t) and φ_Y(t) respectively. Then the mgf of X + Y is given by

  φ_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX} e^{tY}] = φ_X(t) φ_Y(t).    (41)
74 / 137
Example

A r.v. X, which represents the weight (in kg) of an article, has density function given by

  f(x) = x − 8,   8 ≤ x < 9
  f(x) = 10 − x,  9 ≤ x ≤ 10
  f(x) = 0,       otherwise

Compute: (a) the mean and variance of X.
75 / 137
Example

The manufacturer sells the article for a fixed price of $5.00. There is a guarantee to refund the purchase amount to any customer who finds the weight of the article to be less than 8.3 kg. The cost of production is related to the weight of the article as x/7.5 + 0.35. Compute the expected profit per article.
76 / 137
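A numerical sketch for both parts, using equations (33) and (36) for the mean and variance, and reading the refund policy as "revenue is $5 unless x < 8.3, in which case revenue is 0" (that reading is an assumption):

```python
# Triangular weight density on [8, 10]: mean, variance, expected profit.
def f(x):
    if 8 <= x < 9:
        return x - 8
    if 9 <= x <= 10:
        return 10 - x
    return 0.0

def integrate(g, a, b, n=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x), 8, 10)               # = 9 by symmetry
var = integrate(lambda x: (x - mean) ** 2 * f(x), 8, 10)  # = 1/6

p_refund = integrate(f, 8, 8.3)                           # P(X < 8.3) = 0.045
exp_cost = integrate(lambda x: (x / 7.5 + 0.35) * f(x), 8, 10)
# Expected profit = expected revenue minus expected production cost
# (assumes the refunded article's production cost is still incurred).
exp_profit = 5 * (1 - p_refund) - exp_cost
```

Under this reading the expected profit works out to about $3.23 per article.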
Uniform distribution

A r.v. X is said to be uniformly distributed over the interval [α, β] if its p.d.f. is given by

  f(x) = 1/(β − α),  α ≤ x ≤ β
  f(x) = 0,          otherwise

For α ≤ a ≤ b ≤ β,

  P(a ≤ X ≤ b) = ∫_a^b 1/(β − α) dx = (b − a)/(β − α)
79 / 137
Example

The amount of time it takes for a student to complete a statistics quiz is uniformly distributed between 30 and 60 minutes. One student is selected at random. Find the probability of the following events.
(a) The student requires more than 55 minutes to complete the quiz.
(b) The student completes the quiz in a time between 30 and 40 minutes.
(c) The student completes the quiz in exactly 37.23 minutes.

81 / 137
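All three parts use the uniform probability formula (b − a)/(β − α) directly. A small Python sketch:

```python
# Quiz times X ~ U(30, 60).
alpha, beta = 30.0, 60.0

def p_between(a, b):
    """P(a <= X <= b) for X ~ U(alpha, beta), clipping [a, b] to the support."""
    a, b = max(a, alpha), min(b, beta)
    return max(b - a, 0.0) / (beta - alpha)

p_more_55 = p_between(55, beta)      # (a) 5/30 = 1/6
p_30_40 = p_between(30, 40)          # (b) 10/30 = 1/3
p_exact = p_between(37.23, 37.23)    # (c) a single point has probability 0
```

Part (c) illustrates equation (24): for any continuous r.v., P(X = a) = 0.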
Exponential distribution

A continuous r.v. X with parameter λ > 0 and p.d.f. given by

  f(x) = λe^{−λx},  x ≥ 0
  f(x) = 0,         otherwise

is said to be an exponential r.v.
The mgf of an exponential r.v. X is given by

  φ(t) = E[e^{tX}] = λ/(λ − t)    (42)

Obtain the expected value and variance of X.
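One way to check the answer (E[X] = 1/λ, Var(X) = 1/λ²) is to differentiate the mgf (42) at t = 0 numerically, per equation (40). A sketch with finite differences, taking λ = 2 as an arbitrary illustrative value:

```python
# Moments of the exponential from its mgf phi(t) = lam/(lam - t).
lam = 2.0
phi = lambda t: lam / (lam - t)

h = 1e-4
m1 = (phi(h) - phi(-h)) / (2 * h)                 # phi'(0)  = E[X]   = 1/lam
m2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2     # phi''(0) = E[X^2] = 2/lam^2
var = m2 - m1**2                                  # Var(X) = 1/lam^2
```

With λ = 2 this gives E[X] ≈ 0.5 and Var(X) ≈ 0.25, matching the analytic derivatives.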
Exponential distribution

The lifetime of an alkaline battery (measured in hours) is exponentially distributed with λ = 0.02.
(a) What are the mean and standard deviation of the battery life?
(b) Determine the probability that a battery will last between 10 and 15 hours.
(c) Compute P(X > 20).
85 / 137
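Integrating the exponential pdf gives the cdf F(x) = 1 − e^{−λx}, which makes all three parts one-liners. A Python sketch:

```python
# Battery life X ~ Exp(0.02); probabilities via the cdf F(x) = 1 - e^{-lam x}.
import math

lam = 0.02
mean = 1 / lam    # (a) mean = 1/lam = 50 hours
std = 1 / lam     # (a) the exponential's std dev also equals 1/lam

F = lambda x: 1 - math.exp(-lam * x)
p_10_15 = F(15) - F(10)     # (b) P(10 < X < 15) = e^{-0.2} - e^{-0.3}
p_gt_20 = 1 - F(20)         # (c) P(X > 20) = e^{-0.4}
```

Part (b) evaluates to about 0.078 and part (c) to about 0.670.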
Normal Distribution

A r.v. X is said to be normally distributed with parameters μ and σ², and we write X ∼ N(μ, σ²), if its p.d.f. is

  f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²))    (43)

for all x ∈ ℝ.
88 / 137
Normal distribution

The normal p.d.f. f(x) is a bell-shaped curve that is symmetric about μ.

[Figure 1: The bell-shaped normal distribution curve is symmetric about μ]
89 / 137
Normal distribution

The parameters μ and σ² represent the mean and variance of the distribution respectively. Thus

  E[X] = μ,  Var(X) = σ²

Given a normal r.v. X and constants α, β, the r.v. defined as Y = αX + β is normally distributed with mean αμ + β and variance α²σ².
90 / 137
Normal distribution

Therefore, if X ∼ N(μ, σ²), then

  Z = (X − μ)/σ    (44)

is a normal r.v. with mean 0 and variance 1. The r.v. Z is said to have the standard normal distribution. We shall obtain probabilities of a normal r.v. by converting it into the standard normal r.v.
Normal distribution
Thus,

  P(X < b) = P((X − μ)/σ < (b − μ)/σ) = P(Z < (b − μ)/σ)

Similarly,

  P(a < X < b) = P((a − μ)/σ < (X − μ)/σ < (b − μ)/σ)
               = P((a − μ)/σ < Z < (b − μ)/σ)
               = P(Z < (b − μ)/σ) − P(Z < (a − μ)/σ)
92 / 137
Normal distribution

[Figure 2: P(Z < −a) and P(Z > a)]

Note that due to symmetry, we have

  P(Z < −a) = P(Z > a)    (45)
93 / 137
Example of Normal distribution

Q1. If X ∼ N(3, 16), compute
(a) P(X < 12)
(b) P(X < −2)
(c) P(3 < X < 8)
94 / 137
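Each part standardizes via (44) and looks up the standard normal cdf. Instead of tables, a Python sketch can evaluate Φ through `math.erf`, using Φ(z) = (1 + erf(z/√2))/2:

```python
# Q1: X ~ N(3, 16), i.e. mu = 3 and sigma = 4 (16 is the variance).
import math

def norm_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 3, 4
p_a = norm_cdf(12, mu, sigma)                          # (a) = Phi(2.25)
p_b = norm_cdf(-2, mu, sigma)                          # (b) = Phi(-1.25)
p_c = norm_cdf(8, mu, sigma) - norm_cdf(3, mu, sigma)  # (c) = Phi(1.25) - 0.5
```

The answers are approximately 0.9878, 0.1056, and 0.3944 — matching the standard normal table values for z = 2.25 and z = 1.25.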
Example of Normal distribution

Q2. The power W dissipated in a resistor is proportional to the square of the voltage V (i.e. W = rV²). If r = 2.5 and V can be assumed to be normally distributed with mean 5 and standard deviation 1, compute
(a) E[W]
(b) P(W > 150)
95 / 137
Parameter Estimation

We can use sample data to estimate a population parameter in two ways.
A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.
98 / 137
Parameter Estimation

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.
An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.
99 / 137
Parameter Estimation

Note: The sample mean x̄ is an unbiased estimator of the population mean μ (i.e. E[x̄] = μ).
An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger.

  Var(x̄) = σ²/n    (46)
100 / 137
Parameter Estimation

If there are two or more unbiased estimators of a parameter, the estimator with the least variance is said to be relatively efficient.
101 / 137
Confidence Intervals for the mean of a Normally Distributed Population

Known σ²: Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ but a known variance σ². Though x̄ is an unbiased estimator of μ, we do not expect x̄ = μ but rather x̄ ≈ μ.
102 / 137
Confidence Intervals for the mean of a Normally Distributed Population

Based on the Central Limit Theorem (see page 204 of the textbook), we have

  z = √n (x̄ − μ)/σ    (47)

Therefore, for any α ∈ (0, 1),

  P(−z_{α/2} < √n (x̄ − μ)/σ < z_{α/2}) = 1 − α
𝜎 103 / 137
Confidence Intervals for the mean of a Normally Distributed Population

The probability 1 − α is called the confidence level.
Therefore, the confidence interval estimate of μ for known variance is given as

  μ ∈ (x̄ ± (σ/√n) z_{α/2})    (48)
104 / 137
Confidence Intervals for Normal mean with an unknown σ²

Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ and variance σ². To construct a (1 − α) × 100% confidence interval, we define a new random variable

  t = √n (x̄ − μ)/s    (49)

with n − 1 degrees of freedom.
105 / 137
Confidence Intervals for Normal mean with an unknown σ²

Note that

  s² = (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)²

and

  P(−t_{α/2, n−1} < √n (x̄ − μ)/s < t_{α/2, n−1}) = 1 − α
106 / 137
Confidence Intervals for Normal mean with an unknown σ²

Therefore, the confidence interval estimate of μ for an unknown variance is given as

  μ ∈ (x̄ ± (s/√n) t_{α/2, n−1})    (50)
107 / 137
Confidence Intervals for Normal mean with an unknown σ²

Suppose that when a signal having value μ is transmitted from location A, the value received at location B is normally distributed with mean μ and variance 4. To reduce the error, the same value is sent 9 times. If the sequence of values received is
  5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5
construct
- a 95% confidence interval for μ
- a 99% confidence interval for μ
- 95% and 99% confidence intervals for μ, assuming the variance is unknown.
108 / 137
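The known-variance parts follow equation (48). A Python sketch for those two intervals; the critical values 1.96 and 2.575 are the usual table values for z₀.₀₂₅ and z₀.₀₀₅:

```python
# Signal example: known variance sigma^2 = 4, n = 9 repeated transmissions.
import math

data = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
n = len(data)
xbar = sum(data) / n     # sample mean = 9
sigma = 2.0              # known standard deviation (variance 4)

z95, z99 = 1.96, 2.575   # z_{alpha/2} table values for alpha = 0.05, 0.01
half95 = z95 * sigma / math.sqrt(n)
half99 = z99 * sigma / math.sqrt(n)
ci95 = (xbar - half95, xbar + half95)
ci99 = (xbar - half99, xbar + half99)
```

This gives roughly (7.69, 10.31) at 95% and (7.28, 10.72) at 99%; as always, higher confidence widens the interval. The unknown-variance parts replace σ with s and z with t₀.₀₂₅,₈ per (50).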
Confidence Intervals for Variance (σ²) of a Normal Distribution

Suppose that x₁, x₂, …, xₙ is a sample from a normally distributed population having an unknown mean μ and variance σ². We can construct a confidence interval for σ² by using the fact that the sample variance s² is an unbiased, consistent estimator of σ².
109 / 137
Confidence Intervals for Variance (σ²) of a Normal Distribution

We define a new random variable

  (n − 1) s²/σ² ∼ χ²_{n−1}    (51)

where χ²_{n−1} is known as the chi-squared distribution with n − 1 degrees of freedom.
110 / 137
Confidence Intervals for σ²

Hence

  P(χ²_{1−α/2, n−1} ≤ (n − 1)s²/σ² ≤ χ²_{α/2, n−1}) = 1 − α

  P((n − 1)s²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2, n−1}) = 1 − α    (52)
111 / 137
Confidence Intervals for σ²

The weights of a random sample of cereal boxes, each supposed to weigh 1 kg, are listed below:
  1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96
Estimate the variance of the entire population of cereal box weights with 90% confidence.
112 / 137
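Applying (52) with n = 8 needs the two chi-squared critical values for 7 degrees of freedom; the values below (χ²₀.₀₅,₇ ≈ 14.067 and χ²₀.₉₅,₇ ≈ 2.167) are standard table values. A Python sketch:

```python
# Cereal-box example: 90% confidence interval for the population variance.
import statistics

data = [1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96]
n = len(data)
s2 = statistics.variance(data)   # sample variance (n - 1 in the denominator)

# Chi-squared critical values for 7 df, alpha = 0.10 (from tables).
chi2_upper, chi2_lower = 14.067, 2.167

ci = ((n - 1) * s2 / chi2_upper,   # lower bound uses the larger critical value
      (n - 1) * s2 / chi2_lower)   # upper bound uses the smaller one
```

The interval is roughly (0.00046, 0.00299) kg²; note the asymmetry about s², which is characteristic of chi-squared-based intervals.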
Hypothesis Testing

Instead of constructing a confidence interval for a parameter of a population with a known distribution, we make an emphatic statement about the parameter and then use the available sample data to test the validity or otherwise of our statement.
115 / 137
H.T for the Mean of a Normally Distributed Population

Suppose that x₁, x₂, …, xₙ is a sample of size n from a population which is normally distributed with an unknown mean μ and a known (or unknown) variance σ².
116 / 137
H.T for the Mean of a Normally Distributed Population

Suppose we are interested in testing the null hypothesis

  H₀: μ = μ₀    (53)

against the alternative hypothesis

  H₁: μ ≠ μ₀    (54)

where μ₀ is a specified constant.
117 / 137
H.T for the Mean of a Normally Distributed Population

  Decision           H₀ is True         H₀ is False
  Reject H₀          Type I Error       Correct Decision
  Do not reject H₀   Correct Decision   Type II Error

The significance-level-α test is to reject H₀ if

  |x̄ − μ₀| > z_{α/2} σ/√n    (55)

and accept otherwise.
118 / 137
H.T for the Mean of a Normally Distributed Population
That is to say:
- Reject H₀ if (√n/σ)|x̄ − μ₀| > z_{α/2}
- Accept H₀ if (√n/σ)|x̄ − μ₀| ≤ z_{α/2}
119 / 137
One-Sided Hypothesis Tests

When the testing statement is given


as follows:
𝐻0 : 𝜇 = 𝜇 0
(56)
𝐻1 : 𝜇 > 𝜇 0
we reject 𝐻0 when 𝑥̄, the point
estimate of 𝜇, is significantly
greater than 𝜇0.
120 / 137
One-Sided Hypothesis Tests

Thus
I Reject 𝐻0 if (√𝑛/𝜎) (𝑥̄ − 𝜇0) > 𝑧𝛼
I Accept 𝐻0 if (√𝑛/𝜎) (𝑥̄ − 𝜇0) ≤ 𝑧𝛼
This decision criterion is called a
one-sided critical region.

121 / 137
Summary

𝐻0        𝐻1        Test Statistic (TS)       Significance Level 𝛼 Test
𝜇 = 𝜇0   𝜇 ≠ 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if |TS| > 𝑧𝛼/2
𝜇 ≤ 𝜇0   𝜇 > 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if TS > 𝑧𝛼
𝜇 ≥ 𝜇0   𝜇 < 𝜇0   (√𝑛/𝜎)(𝑥̄ − 𝜇0)        Reject 𝐻0 if TS < −𝑧𝛼

122 / 137
Examples

Test the following:
I 𝐻0: 𝜇 = 100 vs 𝐻1: 𝜇 ≠ 100 with 𝜎 = 10, 𝑛 = 100, 𝑥̄ = 100 and 𝛼 = 0.05
I 𝐻0: 𝜇 = 50 vs 𝐻1: 𝜇 < 50 with 𝜎 = 15, 𝑛 = 100, 𝑥̄ = 48 and 𝛼 = 0.05
I 𝐻0: 𝜇 = 50 vs 𝐻1: 𝜇 > 50 with 𝜎 = 5, 𝑛 = 9, 𝑥̄ = 51 and 𝛼 = 0.03
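The three tests above can be worked through with a short script (a sketch; the z quantiles come from Python's `statistics.NormalDist` rather than a table):

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Return (test statistic, reject H0?) for the z-tests in the summary table."""
    ts = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "two-sided":          # H1: mu != mu0
        reject = abs(ts) > NormalDist().inv_cdf(1 - alpha / 2)
    elif alternative == "greater":          # H1: mu > mu0
        reject = ts > NormalDist().inv_cdf(1 - alpha)
    else:                                   # H1: mu < mu0
        reject = ts < -NormalDist().inv_cdf(1 - alpha)
    return ts, reject

print(z_test(100, 100, 10, 100, 0.05, "two-sided"))  # TS = 0.0, do not reject
print(z_test(48, 50, 15, 100, 0.05, "less"))         # TS ~ -1.33, do not reject
print(z_test(51, 50, 5, 9, 0.03, "greater"))         # TS = 0.6, do not reject
```

In all three cases the test statistic falls inside the acceptance region, so 𝐻0 is not rejected.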
123 / 137
Example

A random sample of 18 young adult men (20–30 years
old) was taken. Each man was asked how many
minutes of sport he watched on TV daily. The
responses are listed below.
64, 50, 48, 65, 74, 66, 37, 45, 68, 65, 58, 55, 52, 63,
59, 57, 74, 65
Test to determine at 5% significance level whether
there is enough statistical evidence to infer that the
mean amount of TV watched daily by all young men
is greater than 50 minutes.
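One way to work this example (a sketch: since 𝜎 is unknown, the sample standard deviation is used and the statistic is compared with the t critical value 𝑡_{0.05,17} ≈ 1.740 taken from a standard table):

```python
from math import sqrt
import statistics

minutes = [64, 50, 48, 65, 74, 66, 37, 45, 68, 65, 58, 55, 52, 63,
           59, 57, 74, 65]

n = len(minutes)                    # 18
xbar = statistics.mean(minutes)     # sample mean, about 59.17
s = statistics.stdev(minutes)       # sample standard deviation

# H0: mu = 50 vs H1: mu > 50 at alpha = 0.05
ts = (xbar - 50) / (s / sqrt(n))    # about 3.91
t_crit = 1.740                      # t_{0.05, 17}, from a t table

print(f"xbar = {xbar:.2f}, TS = {ts:.2f}, reject H0: {ts > t_crit}")
```

Since the test statistic exceeds the critical value, there is evidence at the 5% level that the mean daily viewing time exceeds 50 minutes.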

124 / 137
Introduction to Regression

Regression analysis is a technique for
developing a mathematical model that
describes the relationship between a
set of variables.
In many situations, there is a single
response variable 𝑌 that depends on a
set of input variables 𝑥1, 𝑥2, · · · , 𝑥𝑟.

127 / 137
Introduction to Regression

The simplest type of relationship


between 𝑌 and the input variables
𝑥1, 𝑥2, · · · , 𝑥𝑟 , is a linear
relationship.
Thus
𝑌 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 +· · ·+ 𝛽𝑟 𝑥𝑟 (57)

128 / 137
Introduction to Regression

However, (57) almost never holds
exactly in practice. The best that can
be obtained is a relationship subject
to random error.
Thus a linear regression equation is
𝑌 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + · · · + 𝛽𝑟 𝑥𝑟 + 𝜉
(58)
with E[𝜉] = 0. Hence
E[𝑌 |𝑥] = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + · · · + 𝛽𝑟 𝑥𝑟
129 / 137
Introduction to Regression

The constants 𝛽𝑖 ∀ 𝑖 = 0, 1, · · · , 𝑟 are


called the regression coefficients and
are usually estimated from a data set.
A regression equation containing a
single independent variable (i.e.
𝑟 = 1) is called a simple regression
equation whereas an equation
containing many independent
variables (i.e. 𝑟 > 1) is called a
multiple regression equation. 130 / 137
Introduction to Regression

Least Squares Estimators of


Regression Parameters
Consider a simple linear regression
equation
𝑌 = 𝛽0 + 𝛽1 𝑥 + 𝜉 (59)
Then we can rewrite the equation as
𝑌 − 𝛽0 − 𝛽1 𝑥 = 𝜉

131 / 137
Introduction to Regression

We define the sum of squared errors as
𝑆𝑆 = Σ_{𝑖=1}^{𝑛} (𝑌𝑖 − 𝛽0 − 𝛽1𝑥𝑖)² (60)
The least squares method chooses the
estimators of 𝛽0 and 𝛽1 that minimize 𝑆𝑆.
132 / 137
Introduction to Regression

The least squares method gives the estimates
𝛽0 = 𝑦̄ − 𝛽1𝑥̄ (61)
𝛽1 = 𝑆𝑥𝑦 / 𝑆𝑥² (62)
where
𝑥̄ = (1/𝑛) Σ_{𝑖=1}^{𝑛} 𝑥𝑖
133 / 137
Introduction to Regression

𝑦̄ = (1/𝑛) Σ_{𝑖=1}^{𝑛} 𝑦𝑖
𝑆𝑥𝑦 = (1/(𝑛 − 1)) (Σ_{𝑖=1}^{𝑛} 𝑥𝑖𝑦𝑖 − 𝑛𝑥̄𝑦̄)
𝑆𝑥² = (1/(𝑛 − 1)) (Σ_{𝑖=1}^{𝑛} 𝑥𝑖² − 𝑛𝑥̄²)
134 / 137
Introduction to Regression
Attempting to analyze the
relationship between advertising and
sales, the owner of a furniture store
recorded the monthly advertising
budget ($) and sales ($1,000) for 8
months as follows
Advert 23 46 60 28 33 25 31 36
Sales 9.6 11.3 12.8 8.9 12.5 12.0 11.4 12.6
How much should the store spend on
advertising if a sales value of $50,000
is desired? 135 / 137
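A sketch of the computation using the least squares formulas above (note that the advertising level the answer calls for lies far outside the observed data, so the prediction is an extrapolation):

```python
from statistics import mean

advert = [23, 46, 60, 28, 33, 25, 31, 36]                 # monthly advertising ($)
sales = [9.6, 11.3, 12.8, 8.9, 12.5, 12.0, 11.4, 12.6]    # sales ($1,000)

n = len(advert)
xbar, ybar = mean(advert), mean(sales)

# S_xy and S_x^2; the 1/(n - 1) factors cancel when forming beta1
sxy = sum(x * y for x, y in zip(advert, sales)) - n * xbar * ybar
sx2 = sum(x * x for x in advert) - n * xbar ** 2

beta1 = sxy / sx2              # slope, about 0.0623
beta0 = ybar - beta1 * xbar    # intercept, about 9.19

# Invert y = beta0 + beta1 * x for sales of $50,000 (y = 50 in $1,000 units)
x_needed = (50 - beta0) / beta1
print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.4f}, "
      f"advertising needed ~ ${x_needed:.0f}")
```

The fitted line is roughly 𝑦 = 9.19 + 0.0623𝑥, so a sales value of $50,000 corresponds to an advertising budget of about $655, well beyond the range of the observed budgets.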
Thank you

136 / 137
References
[1] Navidi, W.
Statistics for Engineers and Scientists, fourth ed.
McGraw-Hill, 2015.
[2] Ross, S.
Introduction to Probability and Statistics for Engineers and
Scientists, fourth ed.
Elsevier Academic Press, 2009.

137 / 137
