Uzair Talpur 1811162 BBA 4B Statistical Inference Assignment
1811162
BBA 4B
Assignment 1
Inferential statistics are used to draw inferences about a population from a sample. Consider
an experiment in which 10 subjects who performed a task after 24 hours of sleep deprivation
scored 12 points lower than 10 subjects who performed after a normal night's sleep. Is the
difference real or could it be due to chance? How much larger could the real difference be
than the 12 points found in the sample? These are the types of questions answered by
inferential statistics.
There are two main methods used in inferential statistics: estimation and
hypothesis testing. In estimation, the sample is used to estimate a parameter and a confidence
interval about the estimate is constructed. In the most common use of hypothesis testing, a
"straw man" null hypothesis is put forward and it is determined whether the data are strong
enough to reject it. For the sleep deprivation study, the null hypothesis would be that sleep
deprivation has no effect on performance.
Unbiasedness
The sample mean x̄ is an unbiased estimator of the population mean μ, because E(x̄) = μ.
The sample variance S² = (1/n) Σ(xᵢ − x̄)² is a biased estimator of the population variance σ², because E(S²) = ((n − 1)/n) σ² ≠ σ².
An unbiased estimator of the population variance σ² is given by s² = (1/(n − 1)) Σ(xᵢ − x̄)², because E(s²) = σ².
The distinction between S² and s² is that only the denominators differ. S² is the
variance of the sample observations, but s² is the 'unbiased estimator' of the variance σ²
in the population.
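A small simulation can make the bias concrete. This sketch (population, seed, and sample size are arbitrary choices) draws many samples of size n = 5 and averages the two variance estimates:

```python
# Simulation sketch: the divide-by-n sample variance S^2 underestimates
# sigma^2 on average, while the divide-by-(n-1) version s^2 does not.
# The normal population (mu = 0, sigma = 2, so sigma^2 = 4) is hypothetical.
import random

random.seed(1)
n = 5             # small sample size, where the bias is most visible
trials = 200_000

sum_S2 = sum_s2 = 0.0
for _ in range(trials):
    x = [random.gauss(0, 2) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sum_S2 += ss / n          # biased:   E[S^2] = (n-1)/n * sigma^2 = 3.2
    sum_s2 += ss / (n - 1)    # unbiased: E[s^2] = sigma^2 = 4.0

print(round(sum_S2 / trials, 2))  # close to 3.2, below the true 4.0
print(round(sum_s2 / trials, 2))  # close to 4.0
```

The averaged S² settles near (n − 1)/n times the true variance, which is exactly the bias the formula above predicts.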
Consistency
A desirable property of a good estimator is that its accuracy should increase when the sample
becomes larger. That is, the estimator is expected to come closer to the parameter as the size
of the sample increases.
A statistic tₙ computed from a sample of n observations is said to be a Consistent Estimator
of a parameter θ if it converges in probability to θ as n tends to infinity. This means that the
larger the sample size n, the smaller the chance that the difference between tₙ and θ will
exceed any fixed value. In symbols, given any arbitrarily small positive quantity ε,
P(|tₙ − θ| > ε) → 0 as n → ∞.
Efficiency
If we confine ourselves to unbiased estimates, there will, in general, exist more than one
consistent estimator of a parameter. For example, in sampling from a normal population
N(μ, σ²), when σ² is known, the sample mean x̄ is an unbiased and consistent estimator of μ.
From symmetry it follows immediately that the sample median (Md) is also an unbiased
estimator of μ, which is the same as the population median. Also, for large n,
Var(Md) ≈ πσ²/(2n), which is larger than Var(x̄) = σ²/n; the mean is therefore the more
efficient of the two estimators.
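A simulation can show the efficiency gap directly. This sketch (standard normal population, n = 25, both arbitrary choices) compares the sampling variance of the mean and of the median:

```python
# Sketch comparing efficiency: for normal data both the sample mean and the
# sample median are unbiased for mu, but the mean has the smaller variance.
# Theory predicts Var(Md)/Var(mean) ~ pi/2 for large n.
import random
import statistics

random.seed(3)
mu, sigma, n, trials = 0.0, 1.0, 25, 20_000

means, medians = [], []
for _ in range(trials):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(x) / n)
    medians.append(statistics.median(x))

v_mean = statistics.pvariance(means)
v_med = statistics.pvariance(medians)
print(v_mean < v_med)          # True: the mean is the more efficient estimator
print(v_med / v_mean)          # roughly pi/2, i.e. about 1.57
```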
Sufficiency
A statistic t is said to be sufficient for θ if the joint distribution of the sample can be
factorized as f(x₁, x₂, …, xₙ; θ) = g(t, θ) · h(x₁, x₂, …, xₙ),
where g(t, θ) is the sampling distribution of t and contains θ, but h(x₁, x₂, …, xₙ) is
independent of θ.
Since every occurrence of the parameter θ in the joint distribution of all the sample
observations is contained in the distribution of the statistic t, it is said that t alone can
provide all 'information' about θ and is therefore "sufficient" for θ.
Sufficient estimators are the most desirable kind of estimators, but unfortunately they exist in
only relatively few cases. If a sufficient estimator exists, it can be found by the method of
maximum likelihood.
In random sampling from a Normal population N(μ, σ²), the sample mean x̄ is a sufficient
estimator of μ.
Assignment 2
For example, when we toss a coin, we get either a Head or a Tail; only two outcomes
are possible (H, T). But if we toss two coins, three kinds of events can occur: both
coins show heads, both show tails, or one shows heads and one shows tails,
i.e. (H, H), (H, T), (T, T).
Types
Theoretical Probability
It is based on the possible chances of something happening. Theoretical probability is
based mainly on reasoning about the experiment. For example, if a coin is tossed, the
theoretical probability of getting a head is ½.
Experimental Probability
It is based on the observations of an experiment. The experimental probability is
calculated as the number of times an event occurs divided by the total number of trials.
For example, if a coin is tossed 10 times and heads is recorded 6 times, then the
experimental probability for heads is 6/10, or 3/5.
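The two notions can be compared in a quick simulation. This sketch (10,000 tosses and the seed are arbitrary choices) estimates the experimental probability of heads and checks it against the theoretical ½:

```python
# Sketch of experimental vs theoretical probability: simulate coin tosses
# and compare the observed relative frequency of heads with 1/2.
import random

random.seed(4)
tosses = 10_000
heads = sum(random.choice("HT") == "H" for _ in range(tosses))
experimental = heads / tosses
print(experimental)  # near the theoretical probability 0.5
```

With more tosses, the experimental probability tends to drift closer to the theoretical value.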
Axiomatic Probability
In axiomatic probability, a set of rules or axioms is laid down which applies to all types of
probability. These axioms were set by Kolmogorov and are known as Kolmogorov's three
axioms. With the axiomatic approach to probability, the chances of occurrence or
non-occurrence of events can be quantified.
Conditional Probability is the likelihood of an event or outcome occurring given that a
previous event or outcome has occurred.
Events In Probability
The sample space for the tossing of three coins simultaneously is given by:
S = {(T, T, T), (T, T, H), (T, H, T), (T, H, H), (H, T, T), (H, T, H), (H, H, T), (H, H, H)}
Suppose we want to find only the outcomes which have at least two heads; then the set of
all such possibilities can be given as:
E = {(H, T, H), (H, H, T), (H, H, H), (T, H, H)}
Thus, an event is a subset of the sample space, i.e., E is a subset of S.
There could be a lot of events associated with a given sample space. For any event to occur,
the outcome of the experiment must be an element of the set of event E.
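The event-as-subset idea is easy to reproduce in code. This sketch builds the three-coin sample space with itertools and filters out the "at least two heads" event, matching the sets S and E above:

```python
# Sketch: enumerate the sample space for three coin tosses and pick out
# the event "at least two heads" as a subset of it.
from itertools import product

S = list(product("HT", repeat=3))         # all 2^3 = 8 outcomes
E = [w for w in S if w.count("H") >= 2]   # the event: at least two heads

print(len(S))           # 8
print(len(E))           # 4
print(len(E) / len(S))  # probability of the event: 0.5
```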
Types
Impossible and Sure Events
If the probability of occurrence of an event is 0, such an event is called an impossible
event and if the probability of occurrence of an event is 1, it is called a sure event. In other
words, the empty set ϕ is an impossible event and the sample space S is a sure event.
Simple Events
Any event consisting of a single point of the sample space is known as a simple event in
probability. For example, if S = {56 , 78 , 96 , 54 , 89} and E = {78} then E is a simple event.
Compound Events
Contrary to the simple event, if any event consists of more than one single point of the
sample space then such an event is called a compound event. Considering the same example
again, if S = {56 ,78 ,96 ,54 ,89}, E1 = {56 ,54 }, E2 = {78 ,56 ,89 } then, E1 and
E2 represent two compound events.
Exhaustive Events
A set of events is called exhaustive if all the events together make up the entire sample space.
Complementary Events
For any event E1 there exists another event E1‘ which represents the remaining elements of
the sample space S.
E1‘ = S − E1
If a die is rolled, the sample space S is given as S = {1, 2, 3, 4, 5, 6}. If event
E1 represents all the outcomes which are greater than 4, then E1 = {5, 6} and
E1‘ = {1, 2, 3, 4}.
Thus E1‘ is the complement of the event E1.
Similarly, the complements of E1, E2, E3, …, En will be represented as E1‘, E2‘,
E3‘, …, En‘.
Just as one die has six outcomes and two dice have 6² = 36 outcomes, the probability
experiment of rolling three dice has 6³ = 216 outcomes. This idea generalizes further for
more dice: if we roll n dice then there are 6ⁿ outcomes.
We can also consider the possible sums from rolling several dice. The smallest possible sum
occurs when all of the dice are the smallest, or one each. This gives a sum of three when we
are rolling three dice. The greatest number on a die is six, which means that the greatest
possible sum occurs when all three dice are sixes. The sum of this situation is 18.
When n dice are rolled, the least possible sum is n and the greatest possible sum is 6n.
There is one possible way three dice can total 3
3 ways for 4
6 for 5
10 for 6
15 for 7
21 for 8
25 for 9
27 for 10
27 for 11
25 for 12
21 for 13
15 for 14
10 for 15
6 for 16
3 for 17
1 for 18
As discussed above, for three dice the possible sums include every number from three to 18.
The probabilities can be calculated by using a counting strategy and recognizing that we are
looking for ways to partition a number into exactly three whole numbers. For example, the
only way to obtain a sum of three is 3 = 1 + 1 + 1. Since each die is independent of the
others, a sum such as four can be obtained in three different ways:
1+1+2
1+2+1
2+1+1
Further counting arguments can be used to find the number of ways of forming the other
sums. The partitions for each sum follow:
3=1+1+1
4=1+1+2
5=1+1+3=2+2+1
6=1+1+4=1+2+3=2+2+2
7=1+1+5=2+2+3=3+3+1=1+2+4
8=1+1+6=2+3+3=4+3+1=1+2+5=2+2+4
9=6+2+1=4+3+2=3+3+3=2+2+5=1+3+5=1+4+4
10 = 6 + 3 + 1 = 6 + 2 + 2 = 5 + 3 + 2 = 4 + 4 + 2 = 4 + 3 + 3 = 1 + 4 + 5
11 = 6 + 4 + 1 = 1 + 5 + 5 = 5 + 4 + 2 = 3 + 3 + 5 = 4 + 3 + 4 = 6 + 3 + 2
12 = 6 + 5 + 1 = 4 + 3 + 5 = 4 + 4 + 4 = 5 + 2 + 5 = 6 + 4 + 2 = 6 + 3 + 3
13 = 6 + 6 + 1 = 5 + 4 + 4 = 3 + 4 + 6 = 6 + 5 + 2 = 5 + 5 + 3
14 = 6 + 6 + 2 = 5 + 5 + 4 = 4 + 4 + 6 = 6 + 5 + 3
15 = 6 + 6 + 3 = 6 + 5 + 4 = 5 + 5 + 5
16 = 6 + 6 + 4 = 5 + 5 + 6
17 = 6 + 6 + 5
18 = 6 + 6 + 6
When three different numbers form the partition, such as 7 = 1 + 2 + 4, there are 3! (3 × 2 × 1)
= 6 different ways of permuting these numbers, so this counts toward six outcomes in the
sample space. When the partition has exactly two equal numbers, such as 5 = 2 + 2 + 1, there
are three different ways of permuting these numbers, and a triple such as 3 = 1 + 1 + 1 can
occur in only one way.
Specific Probabilities
We divide the total number of ways to obtain each sum by the total number of outcomes in
the sample space, or 216. The results are:
Probability of a sum of 3: 1/216 = 0.5%
Probability of a sum of 4: 3/216 = 1.4%
Probability of a sum of 5: 6/216 = 2.8%
Probability of a sum of 6: 10/216 = 4.6%
Probability of a sum of 7: 15/216 = 7.0%
Probability of a sum of 8: 21/216 = 9.7%
Probability of a sum of 9: 25/216 = 11.6%
Probability of a sum of 10: 27/216 = 12.5%
Probability of a sum of 11: 27/216 = 12.5%
Probability of a sum of 12: 25/216 = 11.6%
Probability of a sum of 13: 21/216 = 9.7%
Probability of a sum of 14: 15/216 = 7.0%
Probability of a sum of 15: 10/216 = 4.6%
Probability of a sum of 16: 6/216 = 2.8%
Probability of a sum of 17: 3/216 = 1.4%
Probability of a sum of 18: 1/216 = 0.5%
As can be seen, the extreme values of 3 and 18 are least probable. The sums that are exactly
in the middle are the most probable. This corresponds to what was observed when two dice
were rolled.
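The whole table can be verified by brute force. This sketch enumerates all 6³ = 216 outcomes for three dice and tallies the number of ways each sum occurs:

```python
# Sketch verifying the counts and probabilities above: enumerate all
# 216 three-dice outcomes and count how many produce each sum.
from itertools import product
from collections import Counter

counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=3))

for total in range(3, 19):
    pct = round(counts[total] / 216 * 100, 1)
    print(total, counts[total], pct)
# Sums 10 and 11 are the most likely, each with 27/216 = 12.5%;
# 3 and 18 are the rarest, each with 1/216.
```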
Assignment 3
The term Venn diagram is not foreign since we all have had Mathematics, especially
Probability and Algebra. Now, for a layman, the Venn diagram is a pictorial exhibition of all
possible real relations between a collection of varying sets of items. It is made up of several
overlapping circles or oval shapes, with each representing a single set or item.
Venn diagrams depict complex and theoretical relationships and ideas for a better and easier
understanding. These diagrams are also used professionally: professors use them to display
complex mathematical concepts, scientists use them for classification, and businesses use
them to develop sales strategies.
The first Venn diagram example is in Mathematics, where they are useful when covering Set
Theory and Probability topics.
Consider two sets, A = {1, 5, 6, 7, 8, 9, 10, 12} and B = {2, 3, 4, 6, 7, 9, 11, 12, 13}. The
section where the two circles overlap holds the numbers contained in both Set A and Set B,
{6, 7, 9, 12}, referred to as the intersection of A and B. The two sets put together give their
union, which comprises all the objects in A and B: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}.
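The same set relations fall straight out of Python's built-in set operations, using the A and B above:

```python
# Sketch of the Venn-diagram relations: intersection (the overlap) and
# union (everything in either set) via Python set operators.
A = {1, 5, 6, 7, 8, 9, 10, 12}
B = {2, 3, 4, 6, 7, 9, 11, 12, 13}

print(sorted(A & B))  # intersection: [6, 7, 9, 12]
print(sorted(A | B))  # union: the numbers 1 through 13
```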
Assignment 4
The multiplication rule is a way to find the probability of two events happening at the same
time (it is also one of the AP Statistics formulas). There are two multiplication rules. The
general multiplication rule formula is P(A ∩ B) = P(A) P(B|A), and the specific
multiplication rule is P(A and B) = P(A) * P(B). P(B|A) means "the probability of B
happening given that A has occurred".
The specific multiplication rule, P(A and B) = P(A) * P(B), is only valid if the two events are
independent. In other words, it only works if one event does not change the probability of the
other event.
Examples of independent events:
Owning a cat and getting a weekly paycheck.
Finding a parking space and having a coin for the meter.
Buying a book and then buying a coffee.
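The specific rule for independent events can be checked by exact enumeration. This sketch uses two fair dice (a standard textbook example, not from the text above) and confirms that P(both sixes) equals P(six) × P(six):

```python
# Sketch of the specific multiplication rule: for independent events A
# ("first die is 6") and B ("second die is 6"), P(A and B) = P(A) * P(B).
from itertools import product
from fractions import Fraction

space = list(product(range(1, 7), repeat=2))   # 36 equally likely pairs
p_both = Fraction(sum(a == 6 and b == 6 for a, b in space), len(space))
p_a = Fraction(1, 6)                           # P(a single die shows 6)

print(p_both)               # 1/36
print(p_both == p_a * p_a)  # True: the multiplication rule holds
```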
Assignment 5
Suppose you are ordering a sandwich at the deli. There are 5 choices for bread, 4 choices for
meat, 12 choices for vegetables, and 3 choices for a sauce. How many different sandwiches
can be ordered? If you choose a sandwich at random, what's the probability that you get
turkey and mayonnaise on your sandwich?
In order to answer this probability question you need to know:
The total number of sandwiches that can be ordered.
The number of sandwiches that can be ordered that involve turkey and mayonnaise.
In each case, you can use the fundamental counting principle to help.
A sandwich is made by choosing a bread, a meat, a vegetable, and a sauce. There are 5
outcomes for the event of choosing bread, 4 outcomes for the event of choosing meat, 12
outcomes for the event of choosing vegetables, and 3 outcomes for the event of choosing a
sauce. The total number of sandwiches that can be ordered is: 5⋅4⋅12⋅3=720
A sandwich with turkey and mayonnaise is made by choosing a bread, turkey, a vegetable,
and mayonnaise. There are 5 outcomes for the event of choosing bread, there is 1 outcome for
the event of choosing turkey, there are 12 outcomes for the event of choosing vegetables, and
there is 1 outcome for the event of choosing mayonnaise. The total number of sandwiches
with turkey and mayonnaise that can be ordered is: 5⋅1⋅12⋅1=60
The probability of a sandwich with turkey and mayonnaise is 60/720 = 1/12.
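The fundamental counting principle used above is just multiplication of the option counts. A minimal sketch of the sandwich calculation (only the counts matter, so no actual menu items are needed):

```python
# Sketch of the fundamental counting principle for the sandwich example:
# multiply the number of choices at each independent step.
breads, meats, vegetables, sauces = 5, 4, 12, 3

total = breads * meats * vegetables * sauces      # every possible sandwich
turkey_mayo = breads * 1 * vegetables * 1         # turkey and mayo are fixed

print(total)                # 720
print(turkey_mayo)          # 60
print(turkey_mayo / total)  # 1/12, about 0.083
```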
Discrete random variables can take on either a finite or at most a countably infinite set of
discrete values (for example, the integers). Their probability distribution is given by
a probability mass function which directly maps each value of the random variable to a
probability. For example, the value x₁ takes on the probability p₁, the value x₂
takes on the probability p₂, and so on. The probabilities pᵢ must satisfy two
requirements: every probability pᵢ is a number between 0 and 1, and the sum of all the
probabilities is 1 (p₁ + p₂ + … + p_k = 1).
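A probability mass function can be sketched as a plain dictionary; the values below are hypothetical, chosen only to satisfy the two requirements just stated:

```python
# Sketch of a probability mass function as a dict mapping each value of a
# discrete random variable to its probability (made-up numbers).
pmf = {0: 0.1, 1: 0.25, 2: 0.4, 3: 0.25}

# Requirement 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in pmf.values())
# Requirement 2: the probabilities sum to 1.
assert abs(sum(pmf.values()) - 1.0) < 1e-9

# Probabilities of ranges are sums of point masses:
print(pmf[1] + pmf[2])  # P(1 <= X <= 2)
```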
Continuous random variables, on the other hand, take on values that vary continuously within
one or more real intervals, and have a cumulative distribution function (CDF) that is
absolutely continuous. As a result, the random variable has an uncountably infinite number of
possible values, each of which has probability 0, though ranges of such values can have
nonzero probability. The resulting probability distribution of the random variable can be
described by a probability density, where a probability is found by taking the area under the
curve.
Several specialized discrete probability distributions are useful for specific applications. For
business applications, three frequently used discrete distributions are:
Binomial
Geometric
Poisson
You use the binomial distribution to compute probabilities for a process where only one of
two possible outcomes may occur on each trial. The geometric distribution is related to the
binomial distribution; you use the geometric distribution to determine the probability that a
specified number of trials will take place before the first success occurs. You can use
the Poisson distribution to measure the probability that a given number of events will occur
during a given time frame.
1. Bernoulli Distribution
2. Binomial Distribution
3. Hypergeometric Distribution
4. Negative Binomial Distribution
5. Geometric Distribution
6. Poisson Distribution
7. Multinomial Distribution
Assignment 6
Assignment 7
The median is known as a measure of location; that is, it tells us where the data are. We do
not need to know all the exact values to calculate the median; if we made the smallest value
even smaller or the largest value even larger, it would not change the value of the median.
Thus the median does not use all the information in the data, and so it can be shown to be
less efficient than the mean or average, which does use all values of the data. The range is an
important measurement, for figures at the top and bottom of it denote the findings furthest
removed from the generality. However, they do not give much indication of the spread of
observations about the mean. This is where the standard deviation (SD) comes in. The
theoretical basis of the standard deviation is complex and need not trouble the ordinary user.
A practical point to note here is that, when the population from which the data arise has a
distribution that is approximately "Normal" (or Gaussian), the standard deviation provides a
useful basis for interpreting the data in terms of probability. The Normal distribution is
represented by a family of curves defined uniquely by two parameters, which are the mean
and the standard deviation of the population. The curves are always symmetrically bell
shaped, but the extent to which the bell is compressed or flattened out depends on the
standard deviation of the population. However, the mere fact that a curve is bell shaped does
not mean that it represents a Normal distribution, because other distributions may have a
similar sort of shape.
Mathematical expectation (advantages)
Assignment 8
Take 20 items from a departmental store and their prices, then calculate the
arithmetic mean and standard deviation, and plot the values in a graph.
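The calculation part of this assignment can be sketched as follows; the 20 prices below are made-up placeholders, to be replaced with the prices actually collected from the store:

```python
# Sketch for the assignment: mean and standard deviation of 20 item
# prices (hypothetical values; substitute the real ones).
import statistics

prices = [120, 45, 60, 250, 35, 80, 150, 95, 40, 70,
          110, 55, 300, 65, 85, 130, 75, 90, 180, 50]

mean = statistics.mean(prices)
sd = statistics.stdev(prices)   # sample SD, the divide-by-(n-1) form

print(mean)
print(round(sd, 2))
```

The printed values can then be marked on a hand-drawn or spreadsheet bar chart of the prices to complete the graph part.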
Assignment 9
Hypothesis
A hypothesis is an assumption that is made on the basis of some evidence. It is the initial
point of any investigation and translates the research questions into a prediction. It includes
components like variables, the population, and the relation between the variables. A research
hypothesis is a hypothesis that is used to test the relationship between two or more variables.
Level of significance
The significance level, also known as alpha or α, is a measure of the strength of the evidence
that must be present in your sample before you will reject the null hypothesis and conclude
that the effect is statistically significant. The researcher determines the significance level
before conducting the experiment.
The significance level is the probability of rejecting the null hypothesis when it is true. For
example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists
when there is no actual difference. Lower significance levels indicate that you require
stronger evidence before you will reject the null hypothesis.
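The operational meaning of α = 0.05 can be checked by simulation. This sketch (a two-sided z-test on a normal mean, with a made-up setup where the null hypothesis is actually true) estimates how often the test rejects:

```python
# Sketch of the significance level as a long-run Type I error rate:
# when H0 is true, a test at the 5% level rejects in ~5% of samples.
import random

random.seed(5)
n, trials, rejections = 30, 20_000, 0
for _ in range(trials):
    # H0 is TRUE here: the data really come from mean 0, sd 1.
    x = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(x) / n) / (1 / n**0.5)   # z statistic under H0
    if abs(z) > 1.96:                 # two-sided 5% critical value
        rejections += 1

print(rejections / trials)  # close to 0.05
```

The observed rejection rate hovers near 0.05, which is exactly the 5% risk of a false positive described above.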