Basics of Statistics

Review of Statistics:
Why do we have a committee in the decision-making process?
In statistics we collect data, then we summarize the data and make decisions based on the summary and
sometimes on further analysis
→Mean
→Variance & standard deviation
Mean: The mean represents the central tendency of the data values. This leads us to the notion of “random
variables.”
Suppose you want to invest money in some venture. This is the only asset you have. If you lose it you will
be out on the street.
A 15 % B 50%
It is granted. Here there is a 60% chance you will lose it all.
You are very rich. You can easily absorb the loss.
µ (mu) = population mean.

Random variables model the uncertainties that we face before making a decision, which is especially
relevant in business.
The random variables can be qualitative or quantitative. We are going to study quantitative random
variables.
Quantitative variables are of two types: discrete variables and continuous variables.
Discrete Variable:
A discrete random variable assumes specific, separated values on the number line, which can be fractional
or whole numbers. The values can be arranged in an ordered list; they are countable.
Continuous Variable:
A continuous random variable assumes values from a continuous interval on the number line. By its very
nature, there are infinitely many values that the random variable can assume.
Difference between continuous and discrete variable:

Aspects/Nature of random variable:
Whatever kind of random variable we are facing, we need to know two aspects of its nature:
• The set of values it can assume. This is called the sample space and is denoted by S.
• An idea of the likelihood of each value.
Both pieces of information are included in what we call the distribution of the random variable.
[0,10] means x ranges from 0 to 10. Both 0 and 10 are included.
(0,10) means x ranges from 0 to 10, but 0 and 10 are not included.
(0,10] means x ranges from 0 to 10; 0 is not included, but 10 is.
{1,1,2,2,4} means x includes 4 values.

How do we write distribution?
This is an example of a tabular distribution. Here f(x) is the probability mass function. For a discrete
distribution, the probability masses must always sum to exactly 1.0.
Example: Casting a fair (unbiased) six-sided die
Event: Any value or set of values the random variable can take can be called an event. For example, x<3 is an event.
Here 1, 1.5 and 2 are less than 3.
The probability of an event is equal to the sum of the probabilities of the outcomes that define the event.
P(x<3)=0.1+0.2+0.15=0.45
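The event calculation above can be sketched in Python. This is a minimal illustration, assuming the values and probability masses listed in the Excel table of these notes; the names `pmf` and `event_probability` are made up for this example.

```python
# Tabular distribution taken from the Excel table in these notes:
# each value of x is mapped to its probability mass f(x).
pmf = {1: 0.10, 1.5: 0.20, 2: 0.15, 3: 0.05, 3.2: 0.20, 4: 0.30}

def event_probability(pmf, condition):
    """Sum the probability masses of every outcome that satisfies `condition`."""
    return sum(p for x, p in pmf.items() if condition(x))

# A valid discrete distribution has masses that sum to 1.0.
total = sum(pmf.values())

# The event x < 3 is made up of the outcomes 1, 1.5 and 2.
p_less_than_3 = event_probability(pmf, lambda x: x < 3)

print(round(total, 10), round(p_less_than_3, 10))
```

Passing the event as a condition keeps the same function usable for other events, such as x >= 3.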
Information of a distribution:
There are two pieces of summarized information we always need to know about any distribution.
1. Mean
2. Standard deviation
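Expected value of X: E(X) = ∑ x*f(x), the sum of each value multiplied by its probability.
Standard Deviation: SD = √Variance, where Variance = ∑ f(x)*(x − E(X))^2.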
Using Excel:
X      f(X)    x*f(X)   x − E(X)   (x − E(X))^2   f(X)*(x − E(X))^2
1      0.1     0.1      -1.69      2.8561         0.28561
1.5    0.2     0.3      -1.19      1.4161         0.28322
2      0.15    0.3      -0.69      0.4761         0.071415
3      0.05    0.15      0.31      0.0961         0.004805
3.2    0.2     0.64      0.51      0.2601         0.05202
4      0.3     1.2       1.31      1.7161         0.51483
Sum            2.69                               1.2119

E(X) = 2.69
Variance = 1.2119
SD = 1.100863
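The same spreadsheet columns can be reproduced outside Excel. The following is a small Python sketch, assuming the six values and probabilities from the table; the variable names are illustrative only.

```python
import math

# Values of x and their probability masses f(x), as in the Excel table.
pmf = {1: 0.10, 1.5: 0.20, 2: 0.15, 3: 0.05, 3.2: 0.20, 4: 0.30}

# E(X): sum of x * f(x) over all values.
mean = sum(x * p for x, p in pmf.items())

# Variance: sum of f(x) * (x - E(X))^2 over all values.
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(f"E(X) = {mean:.2f}, Variance = {variance:.4f}, SD = {sd:.4f}")
```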
Cumulative Probability:
We are mostly going to work with the normal distribution.
• It is a continuous distribution
• It is bell shaped
• It is perfectly symmetric about the mean.
From the curve we see that σ1 < σ2
How to state a normal distribution:
x ~ N(µ, σ)
It means x is distributed normally with mean µ and standard deviation σ.
x ~ N(100, 25)
x is distributed normally with a mean of 100 and a standard deviation of 25.
Two questions:
1. How do we compute probabilities associated with a normal distribution?
2. If we already know the probability, how do we find the corresponding value of the random variable?
Remember, in a continuous distribution f(x) is a mathematical function called the probability density
function. For a discrete distribution it is called the probability mass function.
The curve is called the probability density function (PDF) curve.
How do we calculate probability?
P(x=a)=0 in a continuous distribution, where a is a specific value from the sample space.
In the case of a continuous distribution:
P(x=18)=0
P(x=12)=0
(because a continuous distribution has infinitely many possible values)
P(x=a) is read as the probability that x is exactly equal to a.
The total probability of all outcomes is equal to 1, that is, the total area under the PDF curve is 1.
For a continuous distribution we normally calculate the probability of intervals.
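Both questions about the normal distribution can be illustrated with Python's standard library. This is only a sketch: it uses the distribution with mean 100 and standard deviation 25 from these notes, while the interval from 75 to 125 and the probability 0.90 are arbitrary values chosen for illustration.

```python
from statistics import NormalDist

# x is normally distributed with mean 100 and standard deviation 25.
x = NormalDist(mu=100, sigma=25)

# Question 1: the probability of an interval is the difference between
# two cumulative probabilities. Here: P(75 < x < 125).
p_interval = x.cdf(125) - x.cdf(75)

# Question 2: given a cumulative probability, find the value of x.
# Here: the value that has 90% of the area to its left.
x_at_90 = x.inv_cdf(0.90)

print(round(p_interval, 4), round(x_at_90, 2))
```

`cdf` plays the same role as a cumulative NORM.DIST call in Excel, and `inv_cdf` goes in the opposite direction, from a probability back to a value.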

Z score:
It is the distance of a value of the random variable from the mean, measured in units of SD: Z = (x − µ)/σ.
Standard Normal Distribution:
We can find Z(x) for any x of a distribution.
In the case of a normal distribution, the Z(x) associated with it
• is always normally distributed
• always has a mean of zero
• always has an SD of one
Z is called the standard normal distribution. All other normal distributions are arbitrary distributions, but they
can be converted to the standard normal distribution.
The standard normal distribution has a mean of 0 and an SD of 1.
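A short Python sketch of the conversion to the standard normal distribution follows. It assumes the distribution with mean 100 and SD 25 used earlier in these notes; the value 130 and the helper name `z_score` are chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the mean, measured in units of SD."""
    return (x - mu) / sigma

mu, sigma = 100, 25
value = 130                       # illustrative value of x

z = z_score(value, mu, sigma)     # (130 - 100) / 25

# The cumulative probability is the same whether it is computed on the
# original scale or on the standard normal scale.
p_original = NormalDist(mu, sigma).cdf(value)
p_standard = NormalDist(0, 1).cdf(z)

print(z, round(p_original, 4), round(p_standard, 4))
```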
Sum of independent random variables:

Consider three processes: the first is the admission application, the second is the summary of applications, and the
third is decision making. The process times t1, t2 and t3 have means of 60, 120 and 180 minutes and
standard deviations of 15, 30 and 60 minutes.
What is the distribution of the total production time per unit, T = t1 + t2 + t3?
Another question:
Labor cost: for P1, W1 = $20/hour; for P2, W2 = $40/hour; for P3, W3 = $60/hour.
What is the distribution of the per-unit labor cost?
Here we treat the process times as independent random variables.
How do we determine whether two random variables are independent?
For example, consider the relationship between drinking coffee and GPA.
Let X = number of cups of coffee taken per day
Y = GPA.
Covariance(x, y) = ∑ (xi − µx)(yi − µy) / n will be zero if the two variables are independent; otherwise it
will generally have a positive or negative value.
A positive value means a direct relationship between x and y, and a negative value means an inverse
relationship.
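The covariance formula can be tried out in Python. The sketch below uses the formula as written in these notes (dividing by n); the coffee and GPA numbers are invented purely to show a positive and a negative result and are not real data.

```python
def covariance(xs, ys):
    """Covariance as in the notes: sum of (xi - mean_x)(yi - mean_y), divided by n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Hypothetical data, for illustration only.
cups = [1, 2, 3, 4]                  # cups of coffee per day
gpa_direct = [2.0, 2.5, 3.0, 3.5]    # GPA rising with coffee
gpa_inverse = [3.5, 3.0, 2.5, 2.0]   # GPA falling with coffee

cov_direct = covariance(cups, gpa_direct)     # positive: direct relationship
cov_inverse = covariance(cups, gpa_inverse)   # negative: inverse relationship

print(cov_direct, cov_inverse)
```

With real samples the covariance of independent variables is rarely exactly zero, so in practice a value close to zero is what suggests no linear relationship.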
Under the assumption of independence we have the following results.
• T is normally distributed
• The mean or expected value of T: E(T) = E(t1) + E(t2) + E(t3)
= 60 + 120 + 180 = 360 minutes
As T is the sum of the random variables t1, t2 and t3, the mean or expected value of T is the sum of the means
of t1, t2 and t3.
Now we have to determine the standard deviation, but to calculate the SD we first need the variance.
Var(T) = Var(t1) + Var(t2) + Var(t3)
= 15^2 + 30^2 + 60^2 = 225 + 900 + 3600 = 4725 min^2
Standard deviation σT = √Var(T) = √(4725 min^2) = 68.74 min
The answer is: T is normally distributed with a mean of 360 minutes and a standard deviation of 68.74 minutes.
Now, what is the chance that a product will take more than 7 hours to produce?
7 hours = 420 minutes
The question is P(T > 420 minutes)
=1-NORM.DIST(420,360,68.74,1)
=0.191371
Note:
To calculate the probability that T is less than a value, use NORM.DIST(value, mean, SD, 1).
For the probability that T is more than a value, use 1-NORM.DIST(value, mean, SD, 1).
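The production-time calculation can be cross-checked in Python. This is a minimal sketch that assumes the means (60, 120, 180 minutes) and standard deviations (15, 30, 60 minutes) used in the calculation above, with `NormalDist.cdf` standing in for the cumulative NORM.DIST call.

```python
import math
from statistics import NormalDist

# Means and standard deviations of t1, t2, t3 in minutes.
means = [60, 120, 180]
sds = [15, 30, 60]

# For a sum of independent random variables, the means add and the variances add.
mean_T = sum(means)
var_T = sum(s ** 2 for s in sds)
sd_T = math.sqrt(var_T)

# P(T > 420) is 1 minus the cumulative probability up to 420.
T = NormalDist(mean_T, sd_T)
p_more_than_7h = 1 - T.cdf(420)

print(mean_T, var_T, round(sd_T, 2), round(p_more_than_7h, 4))
```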
Solution for distribution of labor cost:
W = labor cost at P1 + labor cost at P2 + labor cost at P3
= W1*(t1/60) + W2*(t2/60) + W3*(t3/60)
= 20*(t1/60) + 40*(t2/60) + 60*(t3/60)
= t1/3 + 2*t2/3 + t3
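Under the assumption of independence:
1. W is normally distributed
2. E(W) = (1/3)E(t1) + (2/3)E(t2) + E(t3)
= (60/3) + (2*120/3) + 180
= 20 + 80 + 180 = $280
Now we need the standard deviation, and for that we first calculate the
variance.
Var(W) = (1/3)^2*Var(t1) + (2/3)^2*Var(t2) + Var(t3)
= ((1/3)*15)^2 + ((2/3)*30)^2 + (60)^2
= (5)^2 + (20)^2 + (60)^2
= 25 + 400 + 3600
= 4025
So SD σW = √4025 = 63.44
So the answer is: W is normally distributed with a mean of $280 and a standard deviation of $63.44.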
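The rules used for W can be wrapped in one small function. The Python sketch below is an illustration only, assuming independent normal variables; the function name `linear_combination` is made up for this example.

```python
import math

def linear_combination(coeffs, means, sds):
    """Mean and SD of a1*x1 + a2*x2 + ... for independent random variables."""
    mean = sum(a * m for a, m in zip(coeffs, means))
    variance = sum((a * s) ** 2 for a, s in zip(coeffs, sds))
    return mean, math.sqrt(variance)

# Labor cost: hourly wages divided by 60 give the cost per minute of each process.
coeffs = [20 / 60, 40 / 60, 60 / 60]
means = [60, 120, 180]   # minutes
sds = [15, 30, 60]       # minutes

mean_W, sd_W = linear_combination(coeffs, means, sds)
print(round(mean_W, 2), round(sd_W, 2))
```

Setting every coefficient to 1 gives the total production time T, so the same function covers both questions.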
Summarizing Theory:
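Let Y = a1*x1 + a2*x2 + a3*x3, where every xi is normally distributed and independent of the others, and ai is
the coefficient of xi, for every value of i.
Then the following results hold:
1. Y is normally distributed
2. E(Y) = a1*E(x1) + a2*E(x2) + a3*E(x3)
3. Var(Y) = (a1σx1)^2 + (a2σx2)^2 + (a3σx3)^2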
