As Report Thejaswin
GRADED PROJECT
ADVANCED STATISTICS
Done By
Thejaswin S
Problem 1
A physiotherapist with a male football team is interested in studying the relationship between
foot injuries and the positions at which the players play, based on the data collected.
1.1 What is the probability that a randomly chosen player would suffer an injury?
P(Player Injured) = N(players injured)/ N(Total Players)
P(Player Injured) = 145/235
P(Player Injured) = 0.617
1.3 What is the probability that a randomly chosen player plays in a striker position
and has a foot injury?
P(Striker and Injured) = 45/235
P(Striker and Injured) = 0.191
1.4 What is the probability that a randomly chosen injured player is a striker?
P(Striker / Injured) = P(Injured and Striker) / P(Injured)
P(Striker / Injured) = (45/235) / (145/235)
P(Striker / Injured) = 45/145
P(Striker / Injured) = 0.31
1.5 What is the probability that a randomly chosen injured player is either a forward
or an attacking midfielder?
Since a player has only one position, the two events are mutually exclusive and their
probabilities add.
P((Forward or Attacking midfielder) / Injured) =
(P(Forward and Injured) + P(Attacking midfielder and Injured)) / P(Injured)
P((Forward or Attacking midfielder) / Injured) = (56/235 + 24/235) / (145/235)
P((Forward or Attacking midfielder) / Injured) = 80/145
P((Forward or Attacking midfielder) / Injured) = 0.552
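The arithmetic for Problem 1 can be checked with a short script. This is a minimal sketch that uses only the counts quoted in the answers above (235 players, 145 injured, 45 injured strikers, 56 injured forwards, 24 injured attacking midfielders); the variable names are illustrative and not taken from the original work.

```python
from fractions import Fraction

# Counts quoted in the worked answers above.
total_players = 235
injured = 145
striker_injured = 45
forward_injured = 56
att_mid_injured = 24

# 1.1 Marginal probability of an injury.
p_injured = Fraction(injured, total_players)

# 1.3 Joint probability: striker AND injured.
p_striker_and_injured = Fraction(striker_injured, total_players)

# 1.4 Conditional probability: P(A given B) = P(A and B) / P(B).
p_striker_given_injured = p_striker_and_injured / p_injured

# 1.5 Forward and attacking midfielder are mutually exclusive positions,
# so their joint probabilities with "injured" simply add.
p_fwd_or_mid_given_injured = (
    Fraction(forward_injured, total_players)
    + Fraction(att_mid_injured, total_players)
) / p_injured

results = {
    "P(Injured)": p_injured,
    "P(Striker and Injured)": p_striker_and_injured,
    "P(Striker / Injured)": p_striker_given_injured,
    "P((Forward or Attacking midfielder) / Injured)": p_fwd_or_mid_given_injured,
}
for label, p in results.items():
    print(f"{label} = {p} = {float(p):.3f}")
```

Exact fractions avoid rounding until the final print, which makes it easy to see that the conditional probabilities reduce to 45/145 and 80/145.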
Problem 2
According to the studies carried out by the organization, the probability of a radiation
leak in case of a fire is 20%, the probability of a radiation leak in case of a
mechanical failure is 50%, and the probability of a radiation leak in case of a human error is
10%. The studies also showed the following:
2.1 What are the probabilities of a fire, a mechanical failure, and a human error,
respectively?
Let's denote the events as follows:
F = Accident is a fire hazard
M = Accident is a mechanical failure
H = Accident is a human error
R = Radiation leakage occurs
Given:
P(R/F) = 20% = 0.2
P(R/M) = 50% = 0.5
P(R/H) = 10% = 0.1
P(R∩F) = 0.1% = 0.001
P(R∩M) = 0.15% = 0.0015
P(R∩H) = 0.12% = 0.0012
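Each probability follows from P(cause) = P(R∩cause) / P(R/cause):
P(F) = P(R∩F) / P(R/F)
P(F) = 0.001 / 0.2 = 0.005
P(M) = P(R∩M) / P(R/M)
P(M) = 0.0015 / 0.5 = 0.003
P(H) = P(R∩H) / P(R/H)
P(H) = 0.0012 / 0.1 = 0.012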
2.3 Suppose there has been a radiation leak in the reactor for which the definite
cause is not known. What is the probability that it has been caused by:
A Fire. A Mechanical Failure. A Human Error.
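Taking fire, mechanical failure, and human error as the only causes of a leak,
P(R) = 0.001 + 0.0015 + 0.0012 = 0.0037, and by Bayes' theorem
P(cause/R) = P(R∩cause) / P(R). The probabilities that the radiation leak has been
caused by a Fire, a Mechanical Failure, and a Human Error are therefore
approximately 0.270, 0.405, and 0.324, respectively.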
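The calculations for 2.1 and 2.3 can be reproduced as follows. This is a sketch built only from the given values of P(R/cause) and P(R∩cause); it assumes that fire, mechanical failure, and human error are the only ways a leak can occur, and the dictionary names are illustrative.

```python
# Given: conditional leak probabilities P(R/cause) and joint probabilities P(R∩cause).
p_leak_given = {"fire": 0.2, "mechanical": 0.5, "human": 0.1}
p_leak_and = {"fire": 0.001, "mechanical": 0.0015, "human": 0.0012}

# 2.1: P(cause) = P(R∩cause) / P(R/cause)
p_cause = {c: p_leak_and[c] / p_leak_given[c] for c in p_leak_and}

# Total probability of a leak, assuming these three causes are exhaustive.
p_leak = sum(p_leak_and.values())

# 2.3: Bayes' theorem, P(cause/R) = P(R∩cause) / P(R)
p_cause_given_leak = {c: p_leak_and[c] / p_leak for c in p_leak_and}

for c in p_leak_and:
    print(f"{c}: P(cause) = {p_cause[c]:.4f}, P(cause/R) = {p_cause_given_leak[c]:.3f}")
```

The three posterior probabilities sum to 1, which is a quick sanity check on the result.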
Problem 3:
The breaking strength of gunny bags used for packaging cement is normally
distributed with a mean of 5 kg per sq. centimeter and a standard deviation of 1.5 kg
per sq. centimeter. The quality team of the cement company wants to know the
following about the packaging material to better understand wastage or pilferage
within the supply chain. Answer the questions below based on the given information.
(Provide an appropriate visual representation of your answers, without which marks
will be deducted.)
3.1 What proportion of the gunny bags have a breaking strength less than 3.17 kg
per sq cm?
3.2 What proportion of the gunny bags have a breaking strength of at least 3.6 kg per
sq cm?
Fig 3.2 Distribution Plot
The proportion of the gunny bags that have a breaking strength of at least 3.6 kg per sq cm
is about 82.47%.
3.3 What proportion of the gunny bags have a breaking strength between 5 and 5.5
kg per sq cm?
3.4 What proportion of the gunny bags have a breaking strength NOT between 3 and
7.5 kg per sq cm?
Fig 3.4 Distribution Plot
The proportion of the gunny bags that have a breaking strength NOT between 3 and
7.5 kg per sq cm is 13.9%.
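All four proportions in Problem 3 follow from the normal CDF with the stated mean (5) and standard deviation (1.5). The sketch below uses `statistics.NormalDist` from the Python standard library; it is not necessarily the tool used for the original plots, and the variable names are illustrative.

```python
from statistics import NormalDist

# Breaking strength ~ Normal(mean = 5, sd = 1.5), as stated in the problem.
strength = NormalDist(mu=5, sigma=1.5)

# 3.1 P(X < 3.17)
p_below_3_17 = strength.cdf(3.17)

# 3.2 P(X >= 3.6) = 1 - P(X < 3.6)
p_at_least_3_6 = 1 - strength.cdf(3.6)

# 3.3 P(5 < X < 5.5)
p_between_5_and_5_5 = strength.cdf(5.5) - strength.cdf(5)

# 3.4 P(X < 3 or X > 7.5): sum of the two tails
p_not_between_3_and_7_5 = strength.cdf(3) + (1 - strength.cdf(7.5))

for label, p in [
    ("3.1 less than 3.17", p_below_3_17),
    ("3.2 at least 3.6", p_at_least_3_6),
    ("3.3 between 5 and 5.5", p_between_5_and_5_5),
    ("3.4 not between 3 and 7.5", p_not_between_3_and_7_5),
]:
    print(f"{label}: {p:.4f}")
```

For a continuous distribution, "at least" and "more than" give the same probability, so no continuity adjustment is needed in 3.2.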
Problem 4:
4.1 What is the probability that a randomly chosen student gets a grade below 85 on
this exam?
The probability that a randomly chosen student gets a grade below 85 on this exam
is 0.826.
4.2 What is the probability that a randomly selected student scores between 65 and
87?
4.3 What should be the passing cut-off so that 75% of the students clear the exam?
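The three questions in Problem 4 use the same normal-distribution machinery, with 4.3 requiring the inverse CDF: if 75% of students are to clear the exam, the cut-off is the 25th percentile of the grade distribution. The sketch below illustrates the method only. The mean of 70 and standard deviation of 10 are hypothetical placeholders, because the exam's actual parameters are not reproduced in this report, so the printed numbers will not match the reported answers.

```python
from statistics import NormalDist

# HYPOTHETICAL parameters for illustration only; substitute the exam's
# actual mean and standard deviation.
grades = NormalDist(mu=70, sigma=10)

# 4.1 P(grade < 85)
p_below_85 = grades.cdf(85)

# 4.2 P(65 < grade < 87)
p_between_65_and_87 = grades.cdf(87) - grades.cdf(65)

# 4.3 For 75% of students to pass, 25% must fall below the cut-off,
# so the cut-off is the 25th percentile.
cutoff = grades.inv_cdf(0.25)

print(f"P(grade < 85) = {p_below_85:.4f}")
print(f"P(65 < grade < 87) = {p_between_65_and_87:.4f}")
print(f"Passing cut-off = {cutoff:.2f}")
```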
Problem 5:
5.1 Earlier experience of Zingaro with this particular client is favorable as the stone
surface was found to be of adequate hardness. However, Zingaro has reason to
believe now that the unpolished stones may not be suitable for printing. Do you think
Zingaro is justified in thinking so?
Let's say,
Null Hypothesis (H0): The unpolished stones are suitable for printing.
Alternate Hypothesis (H1): The unpolished stones are not suitable for printing.
We assume a 5% significance level.
t-statistic: -3.2422320501414053
p-value: 0.0014655150194628353
5.2 Is the mean hardness of the polished and unpolished stones the same?
Null Hypothesis (H0): The mean hardness of the polished and unpolished stones
is equal.
Alternate Hypothesis (H1): The mean hardness of the polished and
unpolished stones is not equal.
Since the two samples are independent and the alternative is two-sided, we perform an
independent two-tailed t-test.
Ttest_indResult(statistic=-3.2422320501414053,
pvalue=0.0014655150194628353)
Since the p-value is less than the 0.05 significance level (alpha), we reject the null hypothesis.
No, the mean hardness of the polished and unpolished stones is not the same.
The mean hardness of polished stones is 147.78.
The mean hardness of unpolished stones is 134.11.
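To make the independent two-sample test concrete, the sketch below computes the pooled-variance (Student's) t statistic by hand. The two samples are hypothetical and stand in for the stone hardness data, which is not reproduced in this report; `pooled_t_statistic` is an illustrative helper, not a library function. The `Ttest_indResult` output above suggests the original analysis used `scipy.stats.ttest_ind`, which also returns the p-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2


# HYPOTHETICAL hardness readings, for illustration only.
unpolished = [130.0, 128.0, 141.0, 135.0, 126.0, 138.0]
polished = [149.0, 144.0, 152.0, 139.0, 150.0, 146.0]

t_stat, dof = pooled_t_statistic(unpolished, polished)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

A negative statistic here means the first sample (unpolished) has the lower mean, which matches the sign convention of the reported result.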
Problem 6
Aquarius health club, one of the largest and most popular cross-fit gyms in the
country has been advertising a rigorous program for body conditioning. The program
is considered successful if the candidate is able to do more than 5 push-ups, as
compared to when he/she enrolled in the program. Using the sample data provided
can you conclude whether the program is successful? (Consider the level of
Significance as 5%)
Note that this is a problem of the paired-t-test. Since the claim is that the training will
make a difference of more than 5, the null and alternative hypotheses must be
formed accordingly.
We perform a paired t-test on the two push-up counts because both are measured on the
same candidates, once at enrollment and once after the program.
We compute the difference in push-ups before and after the program for each candidate.
Let's say,
Null Hypothesis (H0): The mean difference in pushups is less than or equal to 5.
Alternative Hypothesis (H1): The mean difference in pushups is greater than 5.
Given significance level is 5% (0.05)
t-statistic: 19.322619811082458
p-value: 2.292*10^(-35)
The p-value is less than alpha (0.05), so we reject the null hypothesis.
Thus the program is successful in making a difference of more than 5 push-ups.
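The test statistic for this one-sided paired test is the mean of the per-candidate differences, shifted by the claimed margin of 5 and divided by its standard error. The sketch below shows that calculation on hypothetical before/after counts, since the sample data is not reproduced in this report; `paired_t_statistic` is an illustrative helper. With SciPy available, `scipy.stats.ttest_1samp(diffs, popmean=5, alternative="greater")` would give the corresponding one-sided p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after, margin):
    """t statistic for H0: mean(after - before) <= margin vs H1: mean difference > margin."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    t = (mean(diffs) - margin) / standard_error
    return t, n - 1


# HYPOTHETICAL push-up counts for eight candidates, for illustration only.
before = [20, 25, 18, 30, 22, 27, 24, 19]
after = [27, 33, 24, 38, 30, 33, 32, 25]

t_stat, dof = paired_t_statistic(before, after, margin=5)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

Because the alternative is "greater than 5", only a large positive t statistic counts as evidence against the null hypothesis.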
Problem 7
Dental implant data: The hardness of metal implants in dental cavities depends on
multiple factors, such as the method of implant, the temperature at which the metal is
treated, the alloy used as well as the dentists who may favor one method above
another and may work better in his/her favorite method. The response is the variable
of interest.
7.1 Test whether there is any difference among the dentists on implant hardness.
State the null and alternative hypotheses. Note that both types of alloys cannot be
considered together. You must state the null and alternative hypotheses separately
for the two types of alloys.
To test whether there is any difference among the dentists on the implant
hardness, we can use a one-way ANOVA (Analysis of Variance) test as there is
only 1 factor involved. This test will help us determine if there are significant
differences in implant hardness based on the dentists.
The null and alternative hypotheses for the one-way ANOVA test are as follows:
For Alloy 1:
Null Hypothesis (H0): The mean implant hardness is the same for all dentists.
Alternative Hypothesis (H1): The mean implant hardness differs for at least one
dentist.
For Alloy 2:
Null Hypothesis (H0): The mean implant hardness is the same for all dentists.
Alternative Hypothesis (H1): The mean implant hardness differs for at least one
dentist.
7.2. Before the hypotheses may be tested, state the required assumptions. Are the
assumptions fulfilled? Comment separately on both alloy types.
For Alloy 1:
Independence of observations: Assuming that each hardness measurement is
from a different implant, this assumption is likely to be fulfilled.
Normality: The p-value for every dentist group is greater than alpha (0.05), so we fail
to reject the null hypothesis of normality; the data are consistent with a normal
distribution.
For Alloy 2:
Independence of observations: Assuming that each hardness measurement is
from a different implant, this assumption is likely to be fulfilled.
Normality: To check the normality assumption, we can perform a normality test
on the hardness measurements for each dentist group. For example, we can use
the Shapiro-Wilk test.
Let's say,
Null Hypothesis: The data follow a normal distribution.
Alternate Hypothesis: The data do not follow a normal distribution.
Shapiro-Wilk Test for Alloy 2:
Dentist 1: ShapiroResult(statistic=0.9039731621742249,
pvalue=0.27593979239463806)
Dentist 2: ShapiroResult(statistic=0.9392004013061523,
pvalue=0.5735077857971191)
Dentist 3: ShapiroResult(statistic=0.9340971112251282,
pvalue=0.5213080644607544)
The p-value for all dentists is greater than alpha (0.05). Therefore we fail to reject the
null hypothesis; the hardness values for each dentist are consistent with a normal
distribution.
7.3. Irrespective of your conclusion in 2, we will continue with the testing procedure.
What do you conclude regarding whether implant hardness depends on dentists?
Clearly state your conclusion. If the null hypothesis is rejected, is it possible to
identify which pairs of dentists differ?
For Alloy1:
Null Hypothesis (H0): There is no difference among dentists on implant hardness
Alternative Hypothesis (H1): There is a difference among dentists on implant
hardness.
F-statistic: 1.1232073892024739
p-value: 0.3417393954842689
As the p-value is greater than alpha (0.05), we fail to reject the null hypothesis.
There is no significant difference among the dentists on the implant hardness for
Alloy 1, so no pairwise comparison of dentists is needed.
For Alloy2:
Null Hypothesis (H0): There is no difference among dentists on implant hardness
Alternative Hypothesis (H1): There is a difference among dentists on implant
hardness.
F-statistic: 0.26968540577569117
p-value: 0.7659030899578484
As the p-value is greater than alpha (0.05), we fail to reject the null hypothesis.
There is no significant difference among the dentists on the implant hardness for
Alloy 2, so no pairwise comparison of dentists is needed.
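The F-statistic in a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it by hand for three groups of hypothetical hardness readings, one group per dentist; the implant data itself is not reproduced in this report, and `one_way_anova_f` is an illustrative helper rather than a library function. A library routine such as `scipy.stats.f_oneway` would also return the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one sample per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL hardness readings for three dentists, for illustration only.
dentist_groups = [
    [792.0, 803.0, 715.0, 790.0, 772.0],
    [782.0, 752.0, 762.0, 798.0, 740.0],
    [765.0, 735.0, 789.0, 702.0, 770.0],
]

f_stat, df_between, df_within = one_way_anova_f(dentist_groups)
print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) degrees of freedom")
```

An F-statistic close to or below 1, as in the two results reported above, indicates that the variation between dentists is no larger than the variation within each dentist's measurements.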
7.4. Now test whether there is any difference among the methods on the hardness of
dental implants, separately for the two types of alloys. What are your conclusions? If
the null hypothesis is rejected, is it possible to identify which pairs of methods differ?
For Alloy1:
Null Hypothesis (H0): There is no significant difference in implant hardness
among methods.
Alternate Hypothesis (H1): There is a significant difference in implant hardness
among methods.
p-value: 0.004
Since the p-value is less than alpha (0.05), we reject the null hypothesis.
Thus there is a significant difference in implant hardness among methods.
We can perform post-hoc tests (Tukey's HSD test, or pairwise tests with a Bonferroni
correction) to identify which pairs of methods differ significantly in terms of
implant hardness for Alloy 1.
For Alloy2:
Null Hypothesis (H0): There is no significant difference in implant hardness
among methods.
Alternate Hypothesis (H1): There is a significant difference in implant hardness
among methods.
p-value: 0.000006
Since the p-value is less than alpha (0.05), we reject the null hypothesis.
Thus there is a significant difference in implant hardness among methods.
We can perform post-hoc tests (Tukey's HSD test, or pairwise tests with a Bonferroni
correction) to identify which pairs of methods differ significantly in terms of
implant hardness for Alloy 2.
7.5. Now test whether there is any difference among the temperature levels on the
hardness of dental implant, separately for the two types of alloys. What are your
conclusions? If the null hypothesis is rejected, is it possible to identify which levels of
temperatures differ?
For Alloy1:
Null Hypothesis (H0): There is no significant difference in implant hardness
among temperature levels.
Alternate Hypothesis (H1): There is a significant difference in implant hardness
among temperature levels.
For Alloy2:
Null Hypothesis (H0): There is no significant difference in implant hardness
among temperature levels.
Alternate Hypothesis (H1): There is a significant difference in implant hardness
among temperature levels.
7.6. Consider the interaction effect of dentist and method and comment on the
interaction plot, separately for the two types of alloys?
We can perform a two-way ANOVA with interaction to test for the main effects of
'Dentist' and 'Method' and for their interaction.
For Alloy 1:
1. The p-value for the factor 'Dentist' is 0.01, which is less than the significance
level of 0.05. Therefore, we reject the null hypothesis and conclude that there is
a significant difference in implant hardness among different dentists for Alloy 1.
2. The p-value for the factor 'Method' is 0.0002, which is less than 0.05. Hence,
we reject the null hypothesis and conclude that there is a significant difference in
implant hardness among different methods for Alloy 1.
3. The p-value for the interaction term 'C(Dentist):C(Method)' is 0.006, which is
less than 0.05. Therefore, we reject the null hypothesis and conclude that there
is a significant interaction effect between 'Dentist' and 'Method' on implant
hardness for Alloy 1.
For Alloy 2:
1. The p-value for the factor 'Dentist' is 0.371, which is greater than 0.05. As a
result, we fail to reject the null hypothesis, indicating that there is no significant
difference in implant hardness among different dentists for Alloy 2.
2. The p-value for the factor 'Method' is 0.000004, which is less than 0.05. Thus, we
reject the null hypothesis and conclude that there is a significant difference in
implant hardness among different methods for Alloy 2.
3. The p-value for the interaction term 'C(Dentist):C(Method)' is 0.09, which is greater
than 0.05. Hence, we fail to reject the null hypothesis, suggesting that there is no
significant interaction effect between 'Dentist' and 'Method' on implant hardness
for Alloy 2.
For Alloy 1, both dentists and methods significantly influence implant hardness,
and there is a significant interaction effect between dentists and methods.
For Alloy 2, only the method significantly influences implant hardness, while the
effect of dentists and the interaction between dentists and methods are not
significant.
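An interaction plot is drawn from the mean response in each Dentist × Method cell: roughly parallel lines suggest no interaction, while lines that diverge or cross suggest one. The sketch below computes those cell means from hypothetical records, since the implant data is not reproduced in this report; the record layout and variable names are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL records of (dentist, method, hardness), for illustration only.
records = [
    (1, 1, 800.0), (1, 1, 810.0),
    (1, 2, 815.0), (1, 2, 805.0),
    (1, 3, 790.0), (1, 3, 780.0),
    (2, 1, 795.0), (2, 1, 805.0),
    (2, 2, 800.0), (2, 2, 790.0),
    (2, 3, 700.0), (2, 3, 720.0),
]

# Group hardness values by (dentist, method) cell and average each cell.
cells = defaultdict(list)
for dentist, method, hardness in records:
    cells[(dentist, method)].append(hardness)
cell_means = {key: mean(values) for key, values in cells.items()}

# Change in mean hardness from Method 1 to Method 3, per dentist.
# Unequal changes mean the lines in the interaction plot are not parallel.
dentists = sorted({d for d, _ in cell_means})
drop = {d: cell_means[(d, 3)] - cell_means[(d, 1)] for d in dentists}

for d in dentists:
    row = ", ".join(f"method {m}: {cell_means[(d, m)]:.1f}" for m in (1, 2, 3))
    print(f"dentist {d} -> {row} (change M1 to M3: {drop[d]:+.1f})")
```

In this made-up data the drop from Method 1 to Method 3 is much larger for dentist 2 than for dentist 1, which is the kind of pattern a significant interaction term reflects.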
Fig 7.6.2 Interaction Plot for Alloy 2
7.7. Now consider the effect of both factors, dentist and method, separately on each
alloy. What do you conclude? Is it possible to identify which dentists are different,
which methods are different, and which interaction levels are different?
Based on the provided conclusions and post-hoc test results:
Table 7.7.2 Tukey HSD Results for comparison on methods
For Alloy 1:
The null hypothesis is rejected, indicating that there is a significant difference in
implant hardness among different methods for Alloy 1.
The post-hoc test (Tukey's HSD) results show that there is a significant
difference in implant hardness between Method 1 and Method 3, as well as
between Method 2 and Method 3. However, there is no significant difference
between Method 1 and Method 2.
For Alloy 2:
The null hypothesis is rejected, implying that there is a significant difference in
implant hardness among different methods for Alloy 2.
The post-hoc test (Tukey's HSD) results show that there is a significant
difference in implant hardness between Method 1 and Method 3, as well as
between Method 2 and Method 3. However, there is no significant difference
between Method 1 and Method 2.
Conclusions:
For both Alloy 1 and Alloy 2, there is a significant difference in implant hardness
among different methods.
In both alloys, Method 3 shows significantly different implant hardness compared to
Method 1 and Method 2, while Method 1 and Method 2 do not differ significantly
from each other.