
AS GUIDED

PROJECT
REPORT

P a g e 1 | 19
Contents:

Question 1: What are the probabilities of a fire, a mechanical failure, and a human error
respectively?

Question 2: What is the probability of a radiation leak?

Question 3: Suppose there has been a radiation leak in the reactor for which the definite cause
is not known. What is the probability that it has been caused by a) a fire b) a mechanical failure
c) a human error?

Question 4: What is the probability that a randomly chosen student gets a grade below 85 on
this exam?

Question 5: What is the probability that a randomly selected student scores between 65 and
87?

Question 6: What should be the passing cut-off so that 75% of the students clear the exam?

Question 7: Define the problem and perform an Exploratory Data Analysis
- Problem definition, questions to be answered - Data background and contents - Univariate analysis - Bivariate analysis

Question 8: Illustrate the insights based on EDA Key meaningful observations on individual
variables and the relationship between variables

Question 9: Do the users spend more time on the new landing page than the old landing page?
- State the null and alternate hypotheses - Conduct the hypothesis test and compute the p-value - Write down conclusions from the test results

Question 10: Does the converted status depend on the preferred language?
- State the null and alternate hypotheses - Conduct the hypothesis test and compute the p-value -
Write down conclusions from the test results

Question 11: Is the mean time spent on the new page same for the different language users?
- State the null and alternate hypotheses - Check the assumptions of the hypothesis test. - Conduct
the hypothesis test and compute the p-value - Write down conclusions from the test results

Question 12: Actionable Insights & Recommendations - Actionable Insights - Business Recommendations

Question 1: What are the probabilities of a fire, a mechanical failure, and a human
error respectively?
Ans: First, let’s define the events: F = Fire, M = Mechanical Failure, H = Human Error, R = Radiation Leak, N = No accident.
The probabilities given in the problem are:
Prob(R|F) = 0.2, Prob(R|M) = 0.5, Prob(R|H) = 0.1
Prob(R ∩ F) = 0.001, Prob(R ∩ M) = 0.0015, Prob(R ∩ H) = 0.0012
Since Prob(R ∩ A) = Prob(R|A) × Prob(A) for any event A, the probability of each accident is the joint probability divided by the conditional probability:
Prob(F) = Prob(R ∩ F) / Prob(R|F) = 0.001 / 0.2 = 0.005
Prob(M) = Prob(R ∩ M) / Prob(R|M) = 0.0015 / 0.5 = 0.003
Prob(H) = Prob(R ∩ H) / Prob(R|H) = 0.0012 / 0.1 = 0.012
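The arithmetic above can be checked with a short script. This is a minimal sketch: the variable names are illustrative, and only the six probabilities given in the problem are used.

```python
# Given in the problem: P(R | cause) and P(R ∩ cause) for each cause.
p_leak_given = {"fire": 0.2, "mechanical": 0.5, "human": 0.1}
p_leak_and = {"fire": 0.001, "mechanical": 0.0015, "human": 0.0012}

# P(cause) = P(R ∩ cause) / P(R | cause)
p_cause = {c: p_leak_and[c] / p_leak_given[c] for c in p_leak_given}

for cause, p in p_cause.items():
    print(f"P({cause}) = {p:.3f}")
```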
Question 2: What is the probability of a radiation leak?
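There are three possible types of accident: fire, mechanical failure, and human error.
Probability of no accident: Prob(N) = 1 - (0.005 + 0.003 + 0.012) = 0.98
Without an accident there is no leak, so Prob(R|N) = 0 and Prob(R ∩ N) = Prob(R|N) × Prob(N) = 0.
By the total probability theorem:
Prob(R) = Prob(R ∩ F) + Prob(R ∩ M) + Prob(R ∩ H) + Prob(R ∩ N) = 0.001 + 0.0015 + 0.0012 + 0 = 0.0037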
Question 3: Suppose there has been a radiation leak in the reactor for which the
definite cause is not known. What is the probability that it has been caused by a) a fire
b) a mechanical failure c) a human error?
Prob(F|R) = Prob(R ∩ F) / Prob(R) = 0.001 / 0.0037 = 0.270
Prob(M|R) = Prob(R ∩ M) / Prob(R) = 0.0015 / 0.0037 = 0.405
Prob(H|R) = Prob(R ∩ H) / Prob(R) = 0.0012 / 0.0037 = 0.324
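The total probability and the three posterior probabilities can be reproduced as follows. This is a sketch under the same assumptions as the answer (the three causes are treated as mutually exclusive, and no accident means no leak); the variable names are illustrative.

```python
# Joint probabilities P(R ∩ cause) given in the problem.
p_leak_and = {"fire": 0.001, "mechanical": 0.0015, "human": 0.0012}

# Total probability of a leak; the "no accident" case contributes 0.
p_leak = sum(p_leak_and.values())

# Bayes' theorem: P(cause | R) = P(R ∩ cause) / P(R)
posterior = {c: p / p_leak for c, p in p_leak_and.items()}

print(f"P(R) = {p_leak:.4f}")
for cause, p in posterior.items():
    print(f"P({cause} | R) = {p:.3f}")
```

The three posteriors sum to 1, which is a quick sanity check on the calculation.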
Question 4: What is the probability that a randomly chosen student gets a grade
below 85 on this exam?
Using the Z-score formula:
Z = (X - μ) / σ
where:
X = the value we want to find the probability for (85 in this case)
μ = the mean (77)
σ = the standard deviation (8.5)
Z = (85 - 77) / 8.5
Z = 0.941176

Now, we can use a standard normal distribution table or a calculator to find the cumulative probability corresponding to the Z-score of 0.941176.

From the standard normal distribution table, with Z rounded to 0.94, the cumulative probability (area under the curve to the left of Z) is approximately 0.8264.

Therefore, the probability that a randomly chosen student gets a grade below 85 is
approximately 0.8264, or 82.64%.
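The same probability can be computed without a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the mean and standard deviation from the answer. Software evaluates the cumulative distribution at the unrounded Z, so the result can differ from a table lookup in the third or fourth decimal place.

```python
from statistics import NormalDist

mu, sigma = 77, 8.5                  # mean and standard deviation of the grades
grades = NormalDist(mu=mu, sigma=sigma)

z = (85 - mu) / sigma                # Z-score of a grade of 85
p_below_85 = grades.cdf(85)          # P(X < 85)

print(f"Z = {z:.6f}")
print(f"P(X < 85) = {p_below_85:.4f}")
```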
Question 5: What is the probability that a randomly selected student scores between 65 and 87?

Ans:
For 65:

Z1 = (65 - 77) / 8.5

Z1 = -1.411765
For 87:
Z2 = (87 - 77) / 8.5

Z2 = 1.176471

Using the standard normal distribution table or a calculator, we find the cumulative
probabilities corresponding to Z1 and Z2.

For Z1 = -1.411765 (rounded to -1.41), the cumulative probability is approximately 0.0793.

For Z2 = 1.176471 (rounded to 1.18), the cumulative probability is approximately 0.8810.

The probability of scoring between 65 and 87 is the difference between the cumulative
probabilities:

0.8810 - 0.0793 = 0.8017

Therefore, the probability that a randomly selected student scores between 65 and 87 is
approximately 0.8017, or 80.17%.
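As a cross-check, the interval probability is the difference of two cumulative probabilities. This sketch again uses the standard-library `statistics.NormalDist`; because it works with unrounded Z-scores, its result may differ slightly from a table-based value.

```python
from statistics import NormalDist

grades = NormalDist(mu=77, sigma=8.5)

# P(65 < X < 87) = P(X < 87) - P(X < 65)
p_low = grades.cdf(65)
p_high = grades.cdf(87)
p_between = p_high - p_low

print(f"P(X < 65) = {p_low:.4f}")
print(f"P(X < 87) = {p_high:.4f}")
print(f"P(65 < X < 87) = {p_between:.4f}")
```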
Question 6: What should be the passing cut-off so that 75% of the students clear the
exam?
Ans: For 75% of the students to clear the exam, 75% must score at or above the cut-off, so only 25% score below it. The cut-off is therefore the 25th percentile of the grade distribution.
From the standard normal distribution table or a calculator, the Z-score corresponding to a cumulative probability of 0.25 is approximately -0.6745.
Using the Z-score formula:
Z = (X - μ) / σ
Substituting the known values:
-0.6745 = (X - 77) / 8.5
Solving for X:
X - 77 = -0.6745 * 8.5
X - 77 = -5.73325
X = 71.26675
Therefore, the passing cut-off should be set at approximately 71.27 for 75% of the
students to clear the exam.
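If clearing the exam means scoring at or above the cut-off, then 75% of students clear it exactly when 25% score below it, so the cut-off is the 25th percentile of the grade distribution. A minimal sketch with the standard-library `statistics.NormalDist`, using the mean and standard deviation from the answer:

```python
from statistics import NormalDist

grades = NormalDist(mu=77, sigma=8.5)

# 25% of students fall below the cut-off, so 75% score at or above it.
cutoff = grades.inv_cdf(0.25)
share_clearing = 1 - grades.cdf(cutoff)

print(f"cut-off = {cutoff:.2f}")
print(f"share of students clearing = {share_clearing:.2%}")
```

Using `inv_cdf(0.75)` instead would give the grade that only 25% of students reach, which is the opposite of what the question asks.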

Question 7: Define the problem and perform an Exploratory Data Analysis - Problem
definition, questions to be answered - Data background and contents - Univariate analysis
- Bivariate analysis

Ans: Data Overview


The initial steps to get an overview of any dataset are to:

• observe the first few rows of the dataset, to check whether the dataset has been loaded properly
• get information about the number of rows and columns in the dataset
• find out the data types of the columns, to ensure that the data is stored in the preferred format and the value of each property is as expected
• check the statistical summary of the dataset, to get an overview of the numerical columns of the data

Shape of dataset: (100, 6)
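The overview steps listed above map onto a handful of pandas calls. The sketch below runs them on a tiny stand-in DataFrame: the column names and values are made up for illustration and are not the report's data, which has 100 rows and 6 columns.

```python
import pandas as pd

# Toy stand-in for the real data; column names and values are hypothetical.
df = pd.DataFrame(
    {
        "group": ["control", "treatment", "control", "treatment"],
        "time_spent": [3.5, 7.1, 4.2, 5.3],
        "converted": ["no", "yes", "no", "yes"],
    }
)

print(df.head())      # first few rows: was the data loaded properly?
print(df.shape)       # (number of rows, number of columns)
print(df.dtypes)      # data type of each column
print(df.describe())  # statistical summary of the numerical columns
```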

UNIVARIATE ANALYSIS:

control 50
treatment 50

BIVARIATE ANALYSIS:

Do the users spend more time on the new landing page than the existing landing page?

H0: The mean time spent by the users on the new page is equal to the mean time spent by the users on the old page.
Ha: The mean time spent by the users on the new page is greater than the mean time spent by the users on the old page.
This is a one-tailed test concerning two population means from two independent
populations.
The population standard deviations are unknown. Based on this information, a two-sample independent t-test is the appropriate test.
The sample standard deviation of the time spent on the new page is: 1.82
The sample standard deviation of the time spent on the old page is: 2.58
The p-value is 0.0001392381225166549

As the p-value 0.0001392381225166549 is less than the 5% level of significance, we reject the
null hypothesis.
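A one-tailed two-sample t-test of this kind can be run with `scipy.stats.ttest_ind`. The sketch below is illustrative only: the samples are made up and are not the report's data, the `alternative` argument requires SciPy 1.6 or later, and `equal_var=False` (Welch's variant, which does not assume equal population variances) is an assumption of this sketch rather than something the report states.

```python
from scipy import stats

# Made-up minutes spent on each page; NOT the report's data.
time_new = [6.2, 7.1, 5.8, 6.9, 7.4, 5.5, 6.6, 7.0]
time_old = [4.1, 5.9, 2.8, 6.3, 3.7, 5.2, 4.6, 3.0]

# One-tailed test, Ha: mean(new) > mean(old).
# equal_var=False selects Welch's t-test (assumption of this sketch).
t_stat, p_value = stats.ttest_ind(
    time_new, time_old, equal_var=False, alternative="greater"
)

alpha = 0.05  # 5% level of significance
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```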

Is the conversion rate (the proportion of users who visit the landing page and get
converted) for the new page greater than the conversion rate for the old page?

H0: The conversion rate of the new page is equal to the conversion rate of the old page.

Ha: The conversion rate of the new page is greater than the conversion rate of the old page.
This is a one-tailed test concerning two population proportions from two independent
populations.
Based on this information, a two-proportion z-test would be the most appropriate.

The numbers of users served the new and old pages are 50 and 50 respectively.

The p-value is 0.008026308204056278

As the p-value 0.008026308204056278 is less than the 5% level of significance, we reject the
null hypothesis.
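The mechanics of a one-tailed two-proportion z-test can be written out with the standard library alone. In this sketch the sample sizes (50 and 50) come from the report, but the conversion counts are made up for illustration, so the resulting p-value is not the one reported above.

```python
from math import sqrt
from statistics import NormalDist

n_new, n_old = 50, 50          # sample sizes stated in the report
conv_new, conv_old = 30, 20    # hypothetical conversion counts

p_new, p_old = conv_new / n_new, conv_old / n_old
# Pooled proportion: the common conversion rate assumed under H0.
p_pool = (conv_new + conv_old) / (n_new + n_old)

se = sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))
z = (p_new - p_old) / se

# One-tailed p-value for Ha: p_new > p_old
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```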

Does the converted status depend on the preferred language?

H0: The converted status is independent of the preferred language.

Ha: The converted status is dependent on the preferred language.

This is a test of independence concerning two categorical variables: converted status and preferred language.

Based on this information, a chi-square test for independence would be the most
appropriate.

The p-value is 0.2129888748754345

As the p-value 0.2129888748754345 is greater than the 5% level of significance, we fail to reject
the null hypothesis.
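A chi-square test of independence takes a contingency table of counts as input. The sketch below uses `scipy.stats.chi2_contingency` on a made-up 2 x 3 table (converted status by three languages); the counts are hypothetical and are not the report's data.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = converted (no, yes), columns = three languages.
observed = [
    [12, 18, 16],
    [20, 14, 20],
]

chi2, p_value, dof, expected = chi2_contingency(observed)

alpha = 0.05  # 5% level of significance
print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The degrees of freedom are (rows - 1) x (columns - 1), which is 2 for a 2 x 3 table.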

Is the mean time spent on the new page the same for the different language users?

H0: The mean time spent on the new landing page is the same across all preferred languages.

Ha: At least one of the mean times spent on the new landing page differs from the others across the
preferred languages.
This is a problem concerning three population means. Based on this information,
a one-way ANOVA test would be the most appropriate.
Assumption check (normality): the p-value is 0.8040016293525696, so normality of the time spent is not rejected.

Levene’s test
H0: All the population variances are equal

Ha: At least one variance is different from the rest

The p-value is 0.46711357711340173

Since the p-value is greater than the 5% level of significance, we fail to reject the null hypothesis, so the assumption of equal variances is reasonable.

One-way ANOVA test

The p-value is 0.43204138694325955

As the p-value 0.43204138694325955 is greater than the 5% level of significance, we fail to
reject the null hypothesis.

Draw inference

Since the p-value is greater than the 5% level of significance, we fail to reject the null hypothesis.
This means there is no significant evidence that the mean time spent on the new landing page differs
by preferred language.
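The equal-variance check and the one-way ANOVA can both be run with SciPy. The sketch below uses `scipy.stats.levene` and `scipy.stats.f_oneway` on three made-up groups; the group names and values are hypothetical and are not the report's data.

```python
from scipy import stats

# Made-up minutes on the new page for three language groups.
lang_a = [5.9, 6.4, 7.1, 5.2, 6.8, 6.0]
lang_b = [6.3, 5.7, 6.9, 6.1, 5.5, 7.0]
lang_c = [6.6, 5.8, 6.2, 7.3, 5.4, 6.5]

# Assumption check: Levene's test, H0: all group variances are equal.
_, p_levene = stats.levene(lang_a, lang_b, lang_c)

# One-way ANOVA, H0: all group means are equal.
f_stat, p_anova = stats.f_oneway(lang_a, lang_b, lang_c)

alpha = 0.05  # 5% level of significance
print(f"Levene p-value = {p_levene:.4f}")
print(f"ANOVA F = {f_stat:.3f}, p-value = {p_anova:.4f}")
print("reject H0" if p_anova < alpha else "fail to reject H0")
```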

Question 12: Actionable Insights & Recommendations - Actionable Insights - Business Recommendations

Ans: Conclusion and Business Recommendations

Conclusions:
• To answer whether users spend more time on the new landing page than on the existing landing page, a two-sample independent t-test was performed. The test gave a p-value of 0.0001, which is less than the 5% level of significance, so the null hypothesis is rejected. In context, there is significant evidence that the mean time spent by the users on the new page is greater than the mean time spent by the users on the old page.
• To answer whether the conversion rate for the new page is greater than the conversion rate for the old page, a two-proportion z-test was performed. The test gave a p-value of 0.008, which is less than the 5% level of significance, so the null hypothesis is rejected. In context, there is significant evidence that the conversion rate of the new landing page is greater than the conversion rate of the old landing page.
• To answer whether the conversion status and the preferred language are related, a chi-square test for independence was performed. The test gave a p-value of 0.213, which is more than the 5% level of significance, so we fail to reject the null hypothesis. In context, there is no significant evidence that the conversion status depends on the preferred language.
• To answer whether the time spent on the new landing page differs by preferred language, a one-way ANOVA test was performed. The test gave a p-value of 0.432, which is more than the 5% level of significance, so we fail to reject the null hypothesis. In context, there is no significant evidence that the mean time spent on the new landing page differs across the preferred languages.

Recommendations:
• E-News Express should fully implement the new landing page, as it gains significantly more traction than the old landing page.
• The greater time spent on the new landing page suggests that users prefer it.
• It might be beneficial to retire the old landing page, as it yields a lower average time spent and a lower conversion rate.
• The new landing page has a higher conversion rate, so more resources should be directed towards it, as it has more opportunity to increase membership.
• Deploy the new landing page in all the existing preferred languages. As neither the average time spent on the new page nor the conversion status differs significantly across the preferred languages, the new page can be expected to perform similarly in each of them.
• Perhaps consider adding more languages to the portal to reach a wider audience.

THANK
YOU
