
Introduction to Statistics and Computation with Data Homework 5

http://home.icts.res.in/~athreya/Teaching/ISCD24
Semester II 2023/24

Due: February 29th, 2024, 10pm

1. Using the rexp and runif commands, generate 100 samples each from Exponential(1) and Uniform(0, 1).
Then, using the qqnorm and qqline functions, describe how each sample deviates from the normal distribution.
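One possible way to set up the two samples and their normal Q-Q plots is sketched below. The seed, the variable names and the side-by-side layout are assumptions made for illustration; only rexp, runif, qqnorm and qqline come from the problem statement.

```r
set.seed(1)  # assumed seed, only so that the run is reproducible

n <- 100
exp_sample  <- rexp(n, rate = 1)            # 100 draws from Exponential(1)
unif_sample <- runif(n, min = 0, max = 1)   # 100 draws from Uniform(0, 1)

op <- par(mfrow = c(1, 2))                  # two panels side by side

# Sample quantiles against theoretical normal quantiles
qqnorm(exp_sample, main = "Exponential(1) sample")
qqline(exp_sample, col = "red")             # reference line through the quartiles

qqnorm(unif_sample, main = "Uniform(0, 1) sample")
qqline(unif_sample, col = "red")

par(op)                                     # restore the previous layout
```

Points that bend away from the reference line in the tails are where a sample departs from normality, so the two panels can be compared tail by tail.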
2. The dataset SuppandiZoomLog.csv contains data on meetings that Suppandi held in 2023 on four
topics: Air, Earth, Fire and Water, across cities in the world. The variables are: Topic of the meeting;
Start Date/time of the meeting (at Venue); End Date/time of the meeting (at Venue); Duration of
the meeting (in minutes); Participants (users logged in); Venue of the meeting (city name).
Using ggplot, plot the following graphs:
(a) Bar chart describing the distribution over Venue, stacked by Topic.
(b) Box plot, with jittered points, of meeting durations for each Topic.
(c) Polar-coordinate bar chart of the distribution of meetings across weekdays (hint: use the package lubridate).
(d) Histogram of meetings over starting time (in hour units).
(e) Frame a hypothesis on any of the variables in the data set and try to test it as best as you
can, graphically and with empirical statistics.
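The sketch below shows the general shape of plots (a) to (d) on a tiny made-up data frame. The column names (Topic, Venue, Start, Duration) and every value in it are assumptions for illustration and are not taken from SuppandiZoomLog.csv; the real file's headers and date format would need to be checked before adapting it.

```r
library(ggplot2)
library(lubridate)

# Toy stand-in for the real data: names and values are hypothetical.
meetings <- data.frame(
  Topic    = c("Air", "Earth", "Fire", "Water", "Air", "Fire"),
  Venue    = c("CityA", "CityA", "CityB", "CityB", "CityC", "CityC"),
  Start    = ymd_hm(c("2023-01-02 09:30", "2023-01-03 14:00",
                      "2023-01-04 16:15", "2023-01-06 11:00",
                      "2023-01-07 18:45", "2023-01-08 08:00")),
  Duration = c(45, 60, 30, 90, 50, 75)
)

# Derived variables: day of the week and starting hour
meetings$Weekday   <- wday(meetings$Start, label = TRUE)
meetings$StartHour <- hour(meetings$Start)

# (a) Bar chart over Venue, stacked by Topic (stacking is geom_bar's default)
p_bar <- ggplot(meetings, aes(x = Venue, fill = Topic)) + geom_bar()

# (b) Box plot of Duration per Topic with jittered points on top
p_box <- ggplot(meetings, aes(x = Topic, y = Duration)) +
  geom_boxplot(outlier.shape = NA) +   # hide outliers so points are not drawn twice
  geom_jitter(width = 0.2)

# (c) Bar chart of meetings per weekday in polar coordinates
p_polar <- ggplot(meetings, aes(x = Weekday)) + geom_bar() + coord_polar()

# (d) Histogram of starting hour, one bin per hour
p_hist <- ggplot(meetings, aes(x = StartHour)) + geom_histogram(binwidth = 1)

print(p_bar)  # print the other objects the same way to view them
```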
3. (Sums of Rolls) Suppose we wish to simulate in R the experiment that we did in class during the
first week: rolling a die repeatedly and noting down the sum of the rolls. We can use the functions sample, matrix and apply.

> x = c(1,2,3,4,5,6)
> probx= c(1/6,1/6,1/6,1/6,1/6,1/6)
> Rolls=sample(x, size=1500, replace=T, prob=probx)
> Rollm=matrix(Rolls, nrow = 5)
> Rollsums = apply(Rollm, 2, sum)
> library(ggplot2)
> density = function(x,a,s){ (1/((2*pi)^(0.5)*s ))* exp(-(x-a)^2/(2*s^2))}
> dfrolls = data.frame(Rollsums)
> mu = mean(dfrolls$Rollsums)
> sigma= sd(dfrolls$Rollsums)
> ggplot(data=dfrolls) + geom_histogram(mapping=aes(x=Rollsums,y=..density..), color="#00846b", fil

(a) Describe the commands matrix and apply.


(b) From the picture, what does $\int_{12}^{21} \texttt{density}(x, \texttt{mu}, \texttt{sigma})\, dx$ approximate?
(c) If
$$\text{Area under the histogram between 12 and 21} \approx \int_a^b \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx,$$
then what would be your guess for $a$ and $b$?
(d) Repeat the above for the Dice data from class, available on the course website.
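As a small illustration of the building blocks used in this problem, the sketch below runs matrix and apply on six hand-picked rolls and then evaluates an integral of the hand-written normal density numerically. The toy rolls, the name normal_density (chosen to avoid masking R's built-in density) and the use of the exact mean and standard deviation of a sum of five fair dice in place of the sample values are all assumptions.

```r
# Toy input (assumed): six rolls, grouped as two rolls per experiment
rolls <- c(1, 6, 3, 4, 2, 5)

# matrix() fills column by column, so the columns are (1,6), (3,4), (2,5)
roll_matrix <- matrix(rolls, nrow = 2)

# apply() with MARGIN = 2 applies sum to every column: one sum per experiment
roll_sums <- apply(roll_matrix, 2, sum)
print(roll_matrix)
print(roll_sums)

# Same formula as the homework's density(x, a, s), under a different name
normal_density <- function(x, a, s) {
  (1 / (sqrt(2 * pi) * s)) * exp(-(x - a)^2 / (2 * s^2))
}

# Assumed parameters: exact mean and sd of the sum of five fair dice
mu    <- 5 * 3.5
sigma <- sqrt(5 * 35 / 12)

# Numerical integral of the density over [12, 21] ...
area_numeric <- integrate(function(x) normal_density(x, mu, sigma),
                          lower = 12, upper = 21)$value

# ... compared with the same quantity from the built-in normal CDF
area_pnorm <- pnorm(21, mean = mu, sd = sigma) - pnorm(12, mean = mu, sd = sigma)
print(c(area_numeric, area_pnorm))
```

With simulated data, mu and sigma would instead be the sample mean and standard deviation computed from Rollsums, as in the homework code.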

4. Let X be a Normal random variable with mean µ = 0 and standard deviation σ = 1,
x = −2, −1.9, −1.8, ..., 0, ..., 1.9, 2, m = 100, p = 0.4 and Y ∼ Binomial(m, p).
(a) Using the inbuilt pnorm, plot the distribution function P(X ≤ x).
(b) Let $Z = \frac{Y - mp}{\sqrt{mp(1-p)}}$. Using the inbuilt pbinom, plot the distribution function P(Z ≤ x).

(c) Plot the histogram of 1000 samples generated from Binomial(m, p) and the density plot of a Normal
with mean mp and variance mp(1 − p), as in the previous question. What can you conclude?
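A minimal base-R sketch of the three plots follows. It relies on the identity P(Z ≤ x) = P(Y ≤ mp + x√(mp(1−p))), which follows directly from the definition of Z. The seed, colours, plot types and variable names are assumptions, and base graphics are used here only to keep the example short.

```r
set.seed(1)  # assumed seed, for reproducibility only

m <- 100
p <- 0.4
x <- seq(-2, 2, by = 0.1)            # the grid -2, -1.9, ..., 1.9, 2

# (a) Standard normal distribution function on the grid
normal_cdf <- pnorm(x)

# (b) P(Z <= x) rewritten as a statement about Y, then evaluated with pbinom
binom_sd  <- sqrt(m * p * (1 - p))
binom_cdf <- pbinom(m * p + x * binom_sd, size = m, prob = p)

plot(x, normal_cdf, type = "l", ylab = "distribution function")
lines(x, binom_cdf, type = "s", col = "red")   # step curve for the discrete Z

# Largest vertical distance between the two curves on the grid
max_gap <- max(abs(normal_cdf - binom_cdf))
print(max_gap)

# (c) Histogram of 1000 binomial draws with the matching normal density on top
y <- rbinom(1000, size = m, prob = p)
hist(y, freq = FALSE, main = "Binomial(100, 0.4) samples")
curve(dnorm(x, mean = m * p, sd = binom_sd), add = TRUE, col = "red")
```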
