AS Graded Project Suchi Solanki


ADVANCED STATISTICS PROJECT REPORT

DSBA

By

SUCHI SOLANKI

22-Sep-2022

Index
Problem 1

1.1 What is the probability that a randomly chosen player would suffer an injury?

1.2 What is the probability that a player is a forward or a winger?

1.3 What is the probability that a randomly chosen player plays in a striker position and has a foot injury?

1.4 What is the probability that a randomly chosen injured player is a striker?

1.5 What is the probability that a randomly chosen injured player is either a forward or an attacking midfielder?

Problem 2

2.1 What are the probabilities of a fire, a mechanical failure, and a human error respectively?

2.2 What is the probability of a radiation leak?

2.3 Suppose there has been a radiation leak in the reactor for which the definite cause is not known. What is the probability that it has been caused by a fire, a mechanical failure, or a human error?

Problem 3

3.1 What proportion of the gunny bags have a breaking strength less than 3.17 kg per sq cm?

3.2 What proportion of the gunny bags have a breaking strength at least 3.6 kg per sq cm?

3.3 What proportion of the gunny bags have a breaking strength between 5 and 5.5 kg per sq cm?

3.4 What proportion of the gunny bags have a breaking strength NOT between 3 and 7.5 kg per sq cm?

Problem 4

4.1 What is the probability that a randomly chosen student gets a grade below 85 on this exam?

4.2 What is the probability that a randomly selected student scores between 65 and 87?

4.3 What should be the passing cut-off so that 75% of the students clear the exam?

Problem 5

5.1 Earlier experience of Zingaro with this particular client is favorable as the stone surface was found to be of adequate hardness. However, Zingaro has reason to believe now that the unpolished stones may not be suitable for printing. Do you think Zingaro is justified in thinking so?

5.2 Is the mean hardness of the polished and unpolished stones the same?

Problem 6

Problem 7

7.1 Test whether there is any difference among the dentists on the implant hardness. State the null and alternative hypotheses. Note that both types of alloys cannot be considered together. You must state the null and alternative hypotheses separately for the two types of alloys.

7.2 Before the hypotheses may be tested, state the required assumptions. Are the assumptions fulfilled? Comment separately on both alloy types.

7.3 Irrespective of your conclusion in 2, we will continue with the testing procedure. What do you conclude regarding whether implant hardness depends on dentists? Clearly state your conclusion. If the null hypothesis is rejected, is it possible to identify which pairs of dentists differ?

7.4 Now test whether there is any difference among the methods on the hardness of dental implant, separately for the two types of alloys. What are your conclusions? If the null hypothesis is rejected, is it possible to identify which pairs of methods differ?

7.5 Now test whether there is any difference among the temperature levels on the hardness of dental implant, separately for the two types of alloys. What are your conclusions? If the null hypothesis is rejected, is it possible to identify which levels of temperatures differ?

7.6 Consider the interaction effect of dentist and method and comment on the interaction plot, separately for the two types of alloys.

7.7 Now consider the effect of both factors, dentist, and method, separately on each alloy. What do you conclude? Is it possible to identify which dentists are different, which methods are different, and which interaction levels are different?

Problem 1
A physiotherapist with a male football team is interested in studying the relationship between foot injuries and the positions at which the players play, based on the data collected.

1.1 What is the probability that a randomly chosen player would suffer an injury?
** Solution -------
Total number of players (N) = 235

Total number of injured players (I) = 145

The probability of the event is

P(I) = I / N

P(I) = 145 / 235

P(I) = 0.6170212765957447 ≈ 0.62
The probability that a randomly chosen player would suffer an injury is 0.62.

1.2 What is the probability that a player is a forward or a winger?


** Solution -------
Total forwards (F) = 94

Total wingers (W) = 29

Total number of players (N) = 235

A player holds only one position, so the two events are mutually exclusive and

P(F U W) = P(F) + P(W)

P(F) = F/N = 94/235
P(W) = W/N = 29/235

P(F U W) = 123/235 = 0.5234042553191489 ≈ 0.52

The probability that a player is a forward or a winger is 0.52.



1.3 What is the probability that a randomly chosen player plays in a striker position and
has a foot injury?
** Solution -------
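Number of strikers with a foot injury (I1) = 45

Total number of players (N) = 235

"Striker and injured" is a joint event, so its probability is taken over all players:

P(S ∩ I) = I1 / N = 45 / 235

P(S ∩ I) = 0.1914893617 ≈ 0.19

The probability that a randomly chosen player plays in a striker position and has a foot injury is 0.19. Dividing by the 77 strikers (injured + not injured) instead gives 45/77 ≈ 0.58, which is the conditional probability that a striker is injured, not the joint probability asked for here.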

1.4 What is the probability that a randomly chosen injured player is a striker?
** Solution -------
Number of injured strikers (S) = 45

Total number of injured players (n1) = 145

The probability of the event is

P(S | I) = S / n1 = 45 / 145

P(S | I) = 0.3103448275862069 ≈ 0.31

The probability that a randomly chosen injured player is a striker is 0.31.

1.5 What is the probability that a randomly chosen injured player is either a forward or an attacking
midfielder?

** Solution -------
Injured forwards (F) = 56

Injured attacking midfielders (M) = 24

Total number of injured players (N) = 145

The probability of the event is P(F U M) = P(F) + P(M)

P(F) = F/N = 56/145
P(M) = M/N = 24/145

P(F U M) = 80/145 = 0.5517241379310345 ≈ 0.55

The probability that a randomly chosen injured player is either a forward or an attacking midfielder is 0.55.
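
The probabilities in Problem 1 are all ratios of counts, and the only thing that changes is the denominator: all players for marginal and joint events, a subgroup for conditional ones. The sketch below is a minimal illustration using only the counts quoted in the solutions above; the variable names are illustrative and not taken from the report.

```python
# Counts quoted in the solutions above (variable names are illustrative).
total_players = 235
injured_players = 145
forwards, wingers, strikers = 94, 29, 77
injured_strikers = 45
injured_forwards = 56
injured_att_midfielders = 24

# 1.1 Marginal probability of an injury.
p_injury = injured_players / total_players

# 1.2 Forward or winger: positions are mutually exclusive, so the counts add.
p_forward_or_winger = (forwards + wingers) / total_players

# 1.3 Joint probability: striker AND injured, taken over all players.
p_striker_and_injured = injured_strikers / total_players

# Conditional probability of an injury GIVEN that the player is a striker.
p_injured_given_striker = injured_strikers / strikers

# 1.4 Conditional probability: striker GIVEN injured.
p_striker_given_injured = injured_strikers / injured_players

# 1.5 Forward or attacking midfielder GIVEN injured.
p_fwd_or_am_given_injured = (
    injured_forwards + injured_att_midfielders
) / injured_players

results = {
    "P(injured)": p_injury,
    "P(forward or winger)": p_forward_or_winger,
    "P(striker and injured)": p_striker_and_injured,
    "P(injured | striker)": p_injured_given_striker,
    "P(striker | injured)": p_striker_given_injured,
    "P(forward or att. midfielder | injured)": p_fwd_or_am_given_injured,
}
for label, p in results.items():
    print(f"{label}: {p:.4f}")
```

Keeping the joint and the conditional versions side by side makes it easier to see which denominator each question calls for.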

Problem 2
The data required to find the probabilities of the events are as follows:

• The probability of a radiation leak in case of a fire is 20%.
• The probability of a radiation leak in case of a mechanical failure is 50%.
• The probability of a radiation leak in case of a human error is 10%.
• The probability of a radiation leak occurring simultaneously with a fire is 0.1%.
• The probability of a radiation leak occurring simultaneously with a mechanical failure is 0.15%.
• The probability of a radiation leak occurring simultaneously with a human error is 0.12%.

2.1 What are the probabilities of a fire, a mechanical failure, and a human error
respectively?
** Solution -------
Each probability follows from P(cause) = P(leak and cause) / P(leak | cause).

• Radiation leak occurring simultaneously with a fire: 0.1%
• Probability of a radiation leak in case of a fire: 20%

P(Fire) = 0.1% / 20% = 0.005

• Radiation leak occurring simultaneously with a mechanical failure: 0.15%
• Probability of a radiation leak in case of a mechanical failure: 50%

P(Mechanical_Failure) = 0.15% / 50% = 0.003

• Radiation leak occurring simultaneously with a human error: 0.12%
• Probability of a radiation leak in case of a human error: 10%

P(Human_Error) = 0.12% / 10% = 0.012

From the above calculations we conclude:

• The probability of a fire is 0.005
• The probability of a mechanical failure is 0.003
• The probability of a human error is 0.012

2.2 What is the probability of a radiation leak?


** Solution -------
• The probability of a radiation leak occurring simultaneously with a fire is 0.1%.
• The probability of a radiation leak occurring simultaneously with a mechanical failure is 0.15%.
• The probability of a radiation leak occurring simultaneously with a human error is 0.12%.

Taking these three causes as the only sources of a leak, the joint probabilities add:

P(Radiation_Leak) = 0.1% + 0.15% + 0.12% = 0.37%
• So we conclude that the probability of a radiation leak is 0.37%.
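
The two steps above (dividing each joint probability by the matching conditional probability, then summing the joint probabilities) can be sketched as follows. This is a minimal illustration using only the percentages given in the problem; the dictionary names are illustrative, and the sum in the last step assumes the three causes are mutually exclusive and the only sources of a leak.

```python
# P(leak | cause), written as fractions rather than percentages.
p_leak_given_cause = {
    "fire": 0.20,
    "mechanical_failure": 0.50,
    "human_error": 0.10,
}

# P(leak and cause), written as fractions rather than percentages.
p_leak_and_cause = {
    "fire": 0.001,
    "mechanical_failure": 0.0015,
    "human_error": 0.0012,
}

# 2.1: P(cause) = P(leak and cause) / P(leak | cause)
p_cause = {
    cause: p_leak_and_cause[cause] / p_leak_given_cause[cause]
    for cause in p_leak_and_cause
}

# 2.2: P(leak) is the sum of the joint probabilities
# (assumes the causes are mutually exclusive and exhaustive).
p_leak = sum(p_leak_and_cause.values())

for cause, p in p_cause.items():
    print(f"P({cause}) = {p:.3f}")
print(f"P(leak) = {p_leak:.4f}")
```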

2.3 Suppose there has been a radiation leak in the reactor for which the definite cause is not known. What is the probability that it has been caused by:

• A fire
• A mechanical failure
• A human error

** Solution -------
• The probability of a radiation leak occurring simultaneously with a fire is 0.1%.
• The probability of a radiation leak occurring simultaneously with a mechanical failure is 0.15%.
• The probability of a radiation leak occurring simultaneously with a human error is 0.12%.
• The probability of a radiation leak is 0.37%.

By Bayes' theorem, P(cause | leak) = P(leak and cause) / P(leak):

P(Fire | Leak) = 0.1% / 0.37%
P(Mechanical_Failure | Leak) = 0.15% / 0.37%
P(Human_Error | Leak) = 0.12% / 0.37%

The conclusions are as follows:

The probability that the radiation leak was caused by a fire is 0.2702702702702703 ≈ 0.27 (27%)
The probability that the radiation leak was caused by a mechanical failure is 0.4054054054054054 ≈ 0.41 (41%)
The probability that the radiation leak was caused by a human error is 0.32432432432432434 ≈ 0.32 (32%)
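
As a quick check on 2.3, the three posterior probabilities should sum to 1, because the leak is attributed to exactly one of the three causes. The sketch below is a minimal illustration with the joint probabilities from the problem statement; the names are illustrative.

```python
# P(leak and cause) from the problem statement, as fractions.
p_leak_and_cause = {
    "fire": 0.001,
    "mechanical_failure": 0.0015,
    "human_error": 0.0012,
}

# Total probability of a leak (the causes are assumed mutually exclusive).
p_leak = sum(p_leak_and_cause.values())

# Bayes' theorem: P(cause | leak) = P(leak and cause) / P(leak)
posterior = {
    cause: joint / p_leak for cause, joint in p_leak_and_cause.items()
}

for cause, p in posterior.items():
    print(f"P({cause} | leak) = {p:.4f}")
print(f"sum of posteriors = {sum(posterior.values()):.4f}")
```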

Problem 3
3.1 What proportion of the gunny bags have a breaking strength less than 3.17 kg
per sq cm?
** Solution -------
• μ (Mean) = 5
• σ (Standard Deviation) = 1.5
• X (Gunny Bag Strength) = 3.17
Z = (X - μ)/σ = -1.22; CDF value = P(Z < -1.22) = 0.1112

• Thus we conclude that 11.12% of the gunny bags have a breaking strength less than 3.17 kg per sq cm.

3.2 What proportion of the gunny bags have a breaking strength at least 3.6 kg per sq cm?

** Solution -------
• μ (Mean) = 5
• σ (Standard Deviation) = 1.5
• X (Gunny Bag Strength) = 3.6
Z = (X - μ)/σ = -0.933; P(X ≥ 3.6) = 1 - CDF(Z) = 0.8247

• Thus we conclude that 82.47% of the gunny bags have a breaking strength of at least 3.6 kg per sq cm.

3.3 What proportion of the gunny bags have a breaking strength between 5 and 5.5 kg per sq cm?
** Solution -------
• μ (Mean) = 5
• σ (Standard Deviation) = 1.5
• X1 = 5.5
• X2 = 5
Z1 = (X1 - μ)/σ = 0.333
Z2 = (X2 - μ)/σ = 0.0

P(5 < X < 5.5) = CDF(Z1) - CDF(Z2) = 0.1306

• Thus we conclude that 13.06% of the gunny bags have a breaking strength between 5 and 5.5 kg per sq cm.

3.4 What proportion of the gunny bags have a breaking strength NOT between 3 and 7.5 kg per sq cm?
** Solution -------
• μ (Mean) = 5
• σ (Standard Deviation) = 1.5
• X1 = 3
• X2 = 7.5

Z1 = (X1 - μ)/σ = -1.33

Z2 = (X2 - μ)/σ = 1.67

P(X < 3 or X > 7.5) = CDF(Z1) + (1 - CDF(Z2)) = 0.1390

• Thus we conclude that the proportion of gunny bags with a breaking strength not between 3 and 7.5 kg per sq cm is 13.9%.
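
The four proportions in Problem 3 are all combinations of the normal CDF: a lower tail, an upper tail, a difference of two CDF values, and the sum of two tails. The sketch below is one way to reproduce them with the standard library's `statistics.NormalDist`; the report does not say which library was used, so this is an assumption made for illustration, and the variable names are illustrative.

```python
from statistics import NormalDist

# Breaking strength modelled as Normal(mean=5, sd=1.5), as given in the problem.
strength = NormalDist(mu=5, sigma=1.5)

# 3.1 Lower tail: P(X < 3.17)
p_below_3_17 = strength.cdf(3.17)

# 3.2 Upper tail: P(X >= 3.6) = 1 - P(X < 3.6)
p_at_least_3_6 = 1 - strength.cdf(3.6)

# 3.3 Interval: P(5 < X < 5.5) = CDF(5.5) - CDF(5)
p_between_5_and_5_5 = strength.cdf(5.5) - strength.cdf(5)

# 3.4 Complement of an interval: P(X < 3) + P(X > 7.5)
p_not_between_3_and_7_5 = strength.cdf(3) + (1 - strength.cdf(7.5))

print(f"3.1: {p_below_3_17:.4f}")
print(f"3.2: {p_at_least_3_6:.4f}")
print(f"3.3: {p_between_5_and_5_5:.4f}")
print(f"3.4: {p_not_between_3_and_7_5:.4f}")
```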

Problem 4
4.1 What is the probability that a randomly chosen student gets a grade below 85 on this exam?
** Solution -------
• μ (Mean) = 77
• σ (Standard Deviation) = 8.5
• X = 85
Z = (X - μ)/σ = 0.94; CDF value = 0.8267

• Conclusion: the probability that a randomly chosen student gets a grade below 85 on this exam is 82.67%, or about 83%.

4.2 What is the probability that a randomly selected student scores between 65 and 87?
** Solution -------
• μ (Mean) = 77
• σ (Standard Deviation) = 8.5
• X1 = 65
• X2 = 87

Z1 = (X1 - μ)/σ = -1.41

Z2 = (X2 - μ)/σ = 1.18

P(65 < X < 87) = CDF(Z2) - CDF(Z1) = 0.8013

• The probability that a randomly selected student scores between 65 and 87 is 80.13%.

4.3 What should be the passing cut-off so that 75% of the students clear the exam?

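** Solution -------
• μ (Mean) = 77
• σ (Standard Deviation) = 8.5

For 75% of the students to clear the exam, 25% must fall below the cut-off, so the cut-off is the 25th percentile of the distribution.

The minimum score required is 71.27.

• Conclusion: the passing cut-off so that 75% of the students clear the exam is 71.27.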
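
Problem 4 uses the same normal-CDF reasoning, plus the inverse CDF for the cut-off: if 75% of students are to score above the cut-off, the cut-off is the value with 25% of the distribution below it. The sketch below uses the standard library's `statistics.NormalDist` as an illustrative stand-in for whatever tool produced the report's figures; the variable names are illustrative.

```python
from statistics import NormalDist

# Exam grades modelled as Normal(mean=77, sd=8.5), as given in the problem.
grades = NormalDist(mu=77, sigma=8.5)

# 4.1 P(grade < 85)
p_below_85 = grades.cdf(85)

# 4.2 P(65 < grade < 87)
p_between_65_and_87 = grades.cdf(87) - grades.cdf(65)

# 4.3 Cut-off with 75% of students above it, i.e. the 25th percentile.
passing_cutoff = grades.inv_cdf(0.25)

print(f"4.1: {p_below_85:.4f}")
print(f"4.2: {p_between_65_and_87:.4f}")
print(f"4.3: {passing_cutoff:.2f}")
```

Passing 0.75 to `inv_cdf` instead would give the score that only 25% of students exceed, which is the opposite of what the question asks.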

Problem 5
5.1 Earlier experience of Zingaro with this particular client is favourable as the stone surface was found to
be of adequate hardness. However, Zingaro has reason to believe now that the unpolished stones may not
be suitable for printing. Do you think Zingaro is justified in thinking so?

** Solution -------
Analysing the data set

Step 1: Define the null and alternative hypotheses

Stones with a BHN of at least 150 are suitable for printing, which means any stone with BHN < 150 is not suitable for printing.

H0: µ(Unpolished) ≥ 150

HA: µ(Unpolished) < 150

Step 2: Decide the significance level

The significance level (α) is given as 5%.

α = 0.05
The standard deviation (σ) is known as well.

Step 3: Identify the test statistic

Sample size (n) = 75

The hypothesis compares a single mean against the fixed value 150, so a one-sample t-test is used.

Step 4: Calculate the p-value and test statistic

t-statistic : [-4.1646296 -1.22891066]

p-value : [8.34257399e-05 2.22998968e-01]

The first entry of each array is taken to be the unpolished sample, since only that p-value is below α.

Step 5: Decide whether to reject or fail to reject the null hypothesis

p-value (Unpolished) = 8.34257399e-05
α = 0.05
• Conclusion: the p-value is less than α, hence we reject the null hypothesis.
We conclude that the unpolished stones may not be suitable for printing.
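
Because the alternative hypothesis is one-sided (mean below 150), the p-value to compare with α is a lower-tail p-value. The sketch below shows the conversion, assuming the p-values reported in Step 4 are two-sided (the report does not state this; it is the default in common libraries such as SciPy) and assuming the first entry of each array belongs to the unpolished sample. The helper name `lower_tail_p` is hypothetical.

```python
def lower_tail_p(t_stat, p_two_sided):
    """Convert a two-sided p-value into a one-sided (lower-tail) p-value."""
    half = p_two_sided / 2
    # A negative statistic lies in the lower tail; a positive one does not.
    return half if t_stat < 0 else 1 - half


alpha = 0.05

# (t-statistic, two-sided p-value) pairs copied from Step 4.
reported = {
    "first entry": (-4.1646296, 8.34257399e-05),
    "second entry": (-1.22891066, 2.22998968e-01),
}

decisions = {}
for name, (t_stat, p_two_sided) in reported.items():
    p_one_sided = lower_tail_p(t_stat, p_two_sided)
    decisions[name] = p_one_sided < alpha
    verdict = "reject H0" if decisions[name] else "fail to reject H0"
    print(f"{name}: one-sided p = {p_one_sided:.3e} -> {verdict}")
```

Under these assumptions only the first entry leads to rejecting the null hypothesis at the 5% level.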

Q.5.2 Is the mean hardness of the polished and unpolished stones the same?

** Solution -------

H0: μ(Unpolished) = μ(Treated and Polished)
HA: μ(Unpolished) ≠ μ(Treated and Polished)

A two-sample independent t-test gives:

tstat : -3.242232050141406
p-value : 0.001465515019462831

Conclusion:
• The p-value is less than the level of significance, so we reject the null hypothesis.
• The mean hardness of the polished and unpolished stones is not the same.
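
For reference, the statistic behind a two-sample independent t-test with a pooled variance can be written out in a few lines. The sketch below is a minimal illustration on made-up hardness readings (hypothetical values, not the Zingaro data) and assumes equal variances in the two groups; whether the report's test made that assumption is not stated. The function name `pooled_t_statistic` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic with a pooled variance (equal-variance assumption)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical hardness readings, for illustration only.
unpolished = [130.0, 142.0, 125.0, 150.0, 138.0]
polished = [148.0, 152.0, 145.0, 155.0, 150.0]

t_stat = pooled_t_statistic(unpolished, polished)
degrees_of_freedom = len(unpolished) + len(polished) - 2
print(f"t = {t_stat:.4f}, df = {degrees_of_freedom}")
```

The sign of the statistic only reflects the order of the two samples; swapping them flips the sign and leaves a two-sided p-value unchanged.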

Problem 6

** Solution -------
Data description

Step 1: Define the null and alternative hypotheses

H0: μ(After) - μ(Before) ≤ 5 (null hypothesis)

HA: μ(After) - μ(Before) > 5 (alternative hypothesis: the program is successful)

Step 2: Decide the significance level

The significance level (α) is given as 5%.

α = 0.05

Step 3: Identify the test statistic

• The sample size is n = 100, i.e. n > 30. The Before and After values are paired, so a paired t-test is used.

Step 4: Calculate the p-value and test statistic

tstat : -19.323
p-value for one-tail : 1.1460209626255983e-35

Step 5: Decide whether to reject or fail to reject the null hypothesis

Paired two-sample t-test p-value = 1.1460209626255983e-35

Conclusion:

• The p-value is less than the level of significance, so we reject the null hypothesis.

• We have enough evidence to reject the null hypothesis in favour of the alternative hypothesis, i.e. the program is successful.
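
A paired t-test works on the per-subject differences, and a threshold such as the 5 in the hypotheses enters by subtracting it from the mean difference. The sketch below is a minimal illustration on made-up Before and After values (hypothetical, not the report's data); the function name and the `min_gain` parameter are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after, min_gain=0.0):
    """t statistic for H0: mean(after - before) <= min_gain."""
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return (mean(diffs) - min_gain) / standard_error


# Hypothetical paired measurements, for illustration only.
before = [20, 25, 30, 22, 28, 35]
after = [27, 31, 35, 30, 33, 43]

t_vs_zero = paired_t_statistic(before, after)               # any improvement
t_vs_five = paired_t_statistic(before, after, min_gain=5)   # improvement above 5
print(f"t (gain > 0) = {t_vs_zero:.3f}")
print(f"t (gain > 5) = {t_vs_five:.3f}")
```

Testing against 5 always gives a smaller statistic than testing against 0 on the same data, so it matters which of the two the reported statistic corresponds to. With differences taken as After minus Before, an improvement shows up as a positive statistic; reversing the order of subtraction only flips the sign.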

Problem 7

Q.7.1 Test whether there is any difference among the dentists on the implant hardness. State the null and
alternative hypotheses. Note that both types of alloys cannot be considered together. You must state the
null and alternative hypotheses separately for the two types of alloys.?

** Solution -------
Data description

Hypothesis formation:

(A)
H10 : There is no difference among dentists for Alloy 1
H1A : There is a difference among dentists for Alloy 1

(B)
H20 : There is no difference among dentists for Alloy 2
H2A : There is a difference among dentists for Alloy 2

Q 7.2. Before the hypotheses may be tested, state the required assumptions. Are the assumptions
fulfilled? Comment separately on both alloy types?

** Solution -------
To check whether the data are normally distributed, we perform the Shapiro-Wilk test.

alpha = 0.05

H0 : the data come from a normally distributed population.
HA : the data do not come from a normally distributed population.

If the p-value ≤ 0.05, the null hypothesis is rejected and there is evidence that the data tested are not normally distributed. If the p-value > 0.05, the null hypothesis cannot be rejected.

The results show that the data are not normally distributed (p-value < 0.05) for every group except Alloy 2, Method 3 and Dentist 4.

Given these results, we further conduct the Anderson-Darling test.

The results show that the samples are not normally distributed (the test statistic exceeds the critical value), except for Dentist 4, Alloy 2 and Alloy 1.

We further conduct Levene's test (H0: the groups have equal variances).

From the results of Levene's test we draw the following conclusions:

• The p-values for Dentist and Method are less than 0.05, so we reject the null hypothesis (the Dentist and Method groups do not have homogeneity of variance).

• The p-value for Alloy is greater than 0.05, so we fail to reject the null hypothesis.

Q.7.3.Irrespective of your conclusion in 2, we will continue with the testing procedure. What do you
conclude regarding whether implant hardness depends on dentists? Clearly state your conclusion. If the
null hypothesis is rejected, is it possible to identify which pairs of dentists differ?

** Solution -------
Converting all the data types to categorical

To answer this question, we first separate the data for Alloy 1 and Alloy 2 and perform a one-way ANOVA on each.

For Alloy 1

Conclusion - The p-value is more than alpha (0.05). Thus, we fail to reject the null hypothesis (H10).
For Alloy 2
An analysis of variance model is created using anova_lm on a linear OLS model.

Conclusion - The p-value is more than alpha (0.05). Thus, we fail to reject the null hypothesis (H20).

• Conclusion for this question - There is no evidence of a difference among dentists for Alloy 1 or Alloy 2.
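
The one-way ANOVA used here compares the variation between group means with the variation inside the groups. The sketch below computes the F statistic by hand on made-up hardness values for three groups (hypothetical, not the report's data) to show what `anova_lm` reports in its F column; it does not compute the p-value, which requires the F distribution. The function name `one_way_anova_f` is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - mean(group)) ** 2 for group in groups for x in group
    )
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical hardness values for three dentists, for illustration only.
hardness_by_dentist = [
    [700.0, 710.0, 720.0],
    [710.0, 720.0, 730.0],
    [720.0, 730.0, 740.0],
]

f_stat, df_between, df_within = one_way_anova_f(hardness_by_dentist)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ by more than the within-group scatter would explain; the p-value then decides whether that is significant at the chosen alpha.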
Q.7.4. Now test whether there is any difference among the methods on the hardness of
dental implant, separately for the two types of alloys. What are your conclusions? If the null
hypothesis is rejected, is it possible to identify which pairs of methods differ?

** Solution -------

(A) Alloy 1
H10 : There is no difference among methods for Alloy 1
H1A : There is a difference among methods for Alloy 1

(B) Alloy 2
H20 : There is no difference among methods for Alloy 2
H2A : There is a difference among methods for Alloy 2

For Alloy 1

The p-value is less than alpha (0.05). Thus, we reject the null hypothesis (H10).

For Alloy 2

The p-value is less than alpha (0.05). Thus, we reject the null hypothesis (H20).

Conclusion - Both null hypotheses are rejected, so we can say that there is a difference among methods for Alloy 1 and Alloy 2.

Q.7.5. Now test whether there is any difference among the temperature levels on the hardness of dental
implant, separately for the two types of alloys. What are your conclusions? If the null hypothesis is
rejected, is it possible to identify which levels of temperatures differ?

** Solution -------

(A) Alloy 1
H10 : There is no difference among temperature levels for Alloy 1
H1A : There is a difference among temperature levels for Alloy 1

(B) Alloy 2
H20 : There is no difference among temperature levels for Alloy 2
H2A : There is a difference among temperature levels for Alloy 2

For Alloy 1

The p-value is greater than alpha (0.05). Thus, we fail to reject the null hypothesis (H10).

For Alloy 2

The p-value is greater than alpha (0.05). Thus, we fail to reject the null hypothesis (H20).

• Conclusion - We fail to reject both null hypotheses, so there is no evidence of a difference among temperature levels for Alloy 1 or Alloy 2.

Q.7.6. Consider the interaction effect of dentist and method and comment on the interaction plot,
separately for the two types of alloys?

** Solution -------
Checking interaction between Dentist and Method for Alloy 1

Checking interaction between Dentist and Method for Alloy 2

• Conclusion:
Alloy 1

• Dentist: the p-value is greater than the significance level, so Dentist is not a major factor in the response for Alloy 1.
• Method: the p-value is less than the significance level, so Method is a significant factor in the response for Alloy 1.
   - Method 1: the response is the highest, and it decreases when the method changes to 2 or 3.
   - Method 2 is consistent across all dentists.
   - Method 3 is lower than Methods 1 and 2.
   - Method 3 is the lowest for Dentists 4 and 5.

Alloy 2

• Dentist: the p-value is greater than the significance level, so Dentist is not a major factor in the response for Alloy 2.
• Method: the p-value is less than the significance level, so Method is the most significant factor in the response for Alloy 2.
   - Method 3 is the most variable and is lower than the other methods.
   - Methods 1 and 2 show a decline as the dentist changes from 1 to 5.
   - Method 2 is consistent across all dentists.
   - Method 2 is the lowest for Dentists 3 and 5.

7.7 Now consider the effect of both factors, dentist, and method, separately on
each alloy. What do you conclude? Is it possible to identify which dentists are
different, which methods are different, and which interaction levels are different?

** Solution -------

Hypotheses for the interaction between Dentist and Method for Alloy 1

H10 : There is no interaction between Dentist and Method for Alloy 1

H1A : There is an interaction between Dentist and Method for Alloy 1

• The p-value for the interaction between Dentist and Method is less than alpha (0.05). Thus, we reject the null hypothesis H10.
• There is an interaction between Dentist and Method for Alloy 1.

Hypotheses for the interaction between Dentist and Method for Alloy 2

H20 : There is no interaction between Dentist and Method for Alloy 2

H2A : There is an interaction between Dentist and Method for Alloy 2

• The p-value for the interaction between Dentist and Method is greater than alpha (0.05). Thus, we fail to reject the null hypothesis H20.
• There is no evidence of an interaction between Dentist and Method for Alloy 2.

• Conclusion:

• Only Alloy 1 shows an interaction effect between Dentist and Method.

• The hypothesis tests and ANOVA alone do not identify which dentists, which methods, and which interaction levels are different; another technique, such as a post-hoc pairwise comparison (for example Tukey's HSD), is needed to examine the individual levels.

***************** END of the Report *************************************
