
Task Two

Question 1

The lifetime of a particular type of TV follows a normal distribution with μ = 3500 hours and σ = 300
hours.
a. > pnorm(3000, mean= 3500, sd=300, lower.tail = T)

[1] 0.04779035
Interpretation: The probability that a randomly selected TV lasts less than 3000 hours is about 4.8%.
Equivalently, there is about a 95.2% chance that a randomly selected TV of this type lasts more than
3000 hours.
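
As a quick check, the same probability can be obtained by standardising 3000 hours to a z-score, z = (3000 - 3500)/300 ≈ -1.67, and evaluating the standard normal distribution at z. The sketch below is only an illustration; the variable names are my own and not part of the original answer.

```r
# Parameters of the TV lifetime distribution, taken from the question
mu <- 3500
sigma <- 300

# Standardise the cut-off of 3000 hours
z <- (3000 - mu) / sigma

# Probability computed directly and via the z-score; the two should agree
p_direct <- pnorm(3000, mean = mu, sd = sigma)
p_z <- pnorm(z)

print(c(z = z, p_direct = p_direct, p_z = p_z))
```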

b. set.seed(0)
n <- 100000
sample_means <- rep(NA, n)
for (i in 1:n) {
  sample_means[i] <- mean(rnorm(9, mean = 3500, sd = 300))
}
> sum(sample_means < 3000) / length(sample_means)

[1] 0
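Interpretation: None of the 100000 simulated samples of 9 TVs had a mean lifetime below 3000 hours, so
the estimated probability is 0. This is expected: the mean of 9 TVs has a standard error of 300/√9 = 100
hours, so 3000 hours lies 5 standard errors below the mean of 3500 and the true probability is about
2.9e-07, which is far too small to be observed in 100000 simulations. The probability is much smaller than
in part (a) because sample means vary less than individual lifetimes, and it would become even smaller
with a larger sample size.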
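
The simulation result can be compared with the exact value, since the mean of 9 independent normal lifetimes is itself normal with mean 3500 and standard deviation 300/√9. The sketch below is an illustration under that assumption; it also uses a vectorised version of the simulation (a matrix with one simulated sample per row) instead of the loop, so the individual random draws differ from those above even with the same seed.

```r
mu <- 3500
sigma <- 300
n_tv <- 9                      # TVs per sample

# Exact probability from the sampling distribution of the mean
se_mean <- sigma / sqrt(n_tv)  # standard error of the sample mean
p_exact <- pnorm(3000, mean = mu, sd = se_mean)

# Vectorised simulation: each row of the matrix is one sample of 9 TVs
set.seed(0)
n_sim <- 100000
draws <- matrix(rnorm(n_sim * n_tv, mean = mu, sd = sigma), ncol = n_tv)
sample_means <- rowMeans(draws)
p_sim <- mean(sample_means < 3000)

print(c(se_mean = se_mean, p_exact = p_exact, p_sim = p_sim))
```

The exact probability is so small that a simulated proportion of 0 is the most likely outcome at this number of repetitions.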

Question 2
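## For the QQ-plot (Weights is assumed here to be the weight column of PlantGrowth)
> Weights <- PlantGrowth$weight
> qqnorm(Weights, pch = 1, frame = TRUE, main = "A QQ-plot of plant weights")
> qqline(Weights, col = "steelblue", lwd = 2)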

## For the boxplots
> library("ggpubr")
> ggboxplot(PlantGrowth, x = "group", y = "weight", color = "group",
    palette = c("#00AFBB", "#E7B800", "#FC4E07"),
    order = c("ctrl", "trt1", "trt2"),
    title = "Boxplot of plant weight by group",
    ylab = "Weight", xlab = "Treatment")

## For the density plot
library("ggplot2")
ggplot(PlantGrowth, aes(x = weight, fill = group)) +
  geom_density(position = "identity", alpha = 0.5) +
  labs(x = "Weight", y = "Density") +
  scale_fill_discrete(name = "Group") +
  scale_x_continuous(limits = c(2, 8))

Discussion: The QQ-plot suggests that the plant weights are approximately normally distributed, because
most of the points lie close to the straight reference line.
[Figure: boxplot of weight (y-axis, about 3.5 to 6.0) by group (x-axis: ctrl, trt1, trt2)]

Discussion: The boxplot above shows the plant weights for each of the three groups. Treatment 2 has the
highest weights, followed by the control group, while treatment 1 has the lowest. The single highest
weight, 6.31, is found in treatment 2.
Discussion: The density plot shows the distribution of plant weights within each group. It confirms
that the highest weights are recorded for treatment 2.
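
The readings from the plots can be backed up with a few numbers per group. The sketch below is only an illustration using the built-in PlantGrowth data; the object names are my own.

```r
# Minimum, median and maximum weight in each group of the built-in PlantGrowth data
w_min <- tapply(PlantGrowth$weight, PlantGrowth$group, min)
w_med <- tapply(PlantGrowth$weight, PlantGrowth$group, median)
w_max <- tapply(PlantGrowth$weight, PlantGrowth$group, max)

# One row per group, one column per summary
group_summary <- cbind(min = w_min, median = w_med, max = w_max)
print(group_summary)
```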

2. Significance Testing

The most appropriate statistical test of significance here is the two-sample (independent) t-test. This is
because we are comparing the means of just two of the three available groups (ctrl and trt2).

## R code for the two-sample t-test
library(dplyr)   # provides %>% and filter()
## keep only the control and treatment 2 rows
val <- PlantGrowth %>%
  filter(group %in% c("ctrl", "trt2"))
t.test(val$weight ~ val$group, var.equal = TRUE, conf.level = 0.90)
Results
Two Sample t-test

data: val$weight by val$group


t = -2.134, df = 18, p-value = 0.04685
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
-0.89541481 -0.09258519
sample estimates:
mean in group ctrl mean in group trt2
5.032 5.526

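Interpretation: The p-value of 0.047 is below the 0.10 significance level that matches the 90% confidence
level (and also below 0.05), and the 90% confidence interval for the difference in means (ctrl minus
trt2), -0.895 to -0.093, does not contain 0. There is therefore a statistically significant difference
between the mean weight of the control group (5.032) and that of treatment 2 (5.526), with treatment 2
higher by about 0.49.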
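
To make clear where these numbers come from, the pooled-variance t statistic, p-value and 90% confidence interval can be recomputed by hand. This is a sketch for illustration, assuming the built-in PlantGrowth data; the variable names are my own.

```r
# Weights of the two groups being compared
ctrl <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]
trt2 <- PlantGrowth$weight[PlantGrowth$group == "trt2"]
n1 <- length(ctrl)
n2 <- length(trt2)

# Pooled variance and standard error of the difference in means
sp2 <- ((n1 - 1) * var(ctrl) + (n2 - 1) * var(trt2)) / (n1 + n2 - 2)
se_diff <- sqrt(sp2 * (1 / n1 + 1 / n2))

# t statistic, degrees of freedom and two-sided p-value
mean_diff <- mean(ctrl) - mean(trt2)
t_stat <- mean_diff / se_diff
df <- n1 + n2 - 2
p_value <- 2 * pt(-abs(t_stat), df)

# 90% confidence interval for the difference in means
ci90 <- mean_diff + c(-1, 1) * qt(0.95, df) * se_diff

print(c(t = t_stat, df = df, p = p_value))
print(ci90)
```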

Assumptions made:

a. Normality of the data


b. Equality or homogeneity of variance of the two groups
c. Adequacy of sample size.
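
Assumption (b) can be checked rather than only assumed. One common option, which is my own suggestion and not part of the original analysis, is the F test for equal variances, var.test() from base R. A minimal sketch:

```r
# Control and treatment 2 rows only; droplevels() removes the unused trt1 level
val <- droplevels(subset(PlantGrowth, group %in% c("ctrl", "trt2")))

# Sample variance of weight in each group
group_vars <- tapply(val$weight, val$group, var)

# F test of the null hypothesis that the two group variances are equal
f_test <- var.test(weight ~ group, data = val)

print(group_vars)
print(f_test)
```

A large p-value here would be consistent with using var.equal = TRUE in the t-test; a small one would point towards the Welch version (var.equal = FALSE).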

C. To check whether the data are normally distributed, we use the Kolmogorov-Smirnov test below,
together with the QQ-plot

ks.test(val$weight,y)

Two-sample Kolmogorov-Smirnov test

data: val$weight and y


D = 0.93606, p-value = 2.331e-15
alternative hypothesis: two-sided

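Interpretation: The null hypothesis of this two-sample test is that val$weight and y come from the same
distribution, so a p-value below the 0.05 alpha level rejects that hypothesis; it does not show that the
data are normally distributed. The definition of y is not shown, but D = 0.936 means the two samples
hardly overlap, which suggests that y was not on the same scale as the weights (for example, draws from a
normal distribution with a different mean and standard deviation). This test therefore cannot be used as
evidence of normality here. The evidence for approximate normality comes from the QQ-plot above, where
the points lie close to the reference line.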
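
A normality check that does not depend on a separately generated sample y could look like the sketch below. It is an illustration only and not the original analysis: it applies the Shapiro-Wilk test to each group, and a one-sample Kolmogorov-Smirnov test to the residuals from the group means. Because the standard deviation is estimated from the same data, the Kolmogorov-Smirnov p-value is only approximate.

```r
# Control and treatment 2 rows only; droplevels() removes the unused trt1 level
val <- droplevels(subset(PlantGrowth, group %in% c("ctrl", "trt2")))

# Shapiro-Wilk p-value for each group (null hypothesis: the group is normal)
sw_p <- tapply(val$weight, val$group, function(w) shapiro.test(w)$p.value)

# Residuals from the group means, compared against a normal distribution
# with mean 0 and the residuals' own standard deviation
res <- val$weight - ave(val$weight, val$group)
ks_res <- ks.test(res, "pnorm", mean = 0, sd = sd(res))

print(sw_p)
print(ks_res)
```

In both tests a p-value above 0.05 means there is no evidence against normality, which is the opposite reading from the one used for a test of difference.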

3. In order to minimize confounding if I were to carry out this experiment, I would do the following:

• I would stratify (block) the plant plots, so that the true effect of each group on the total weight of
yield can be seen without being mixed up with differences between plots
• I would assign the treatments to the plants at random