Statistical Inference Course Project
Daniel Villegas
03/21/2015
Overview
Here we study some statistical properties of the exponential distribution, using data generated by
simulation.
First, we set up the environment for the simulation and compute the theoretical mean and variance
as follows:
require(ggplot2)
## Loading required package: ggplot2
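set.seed(1234)   # fix the random seed so the results are reproducible
lambda <- 0.2    # rate parameter of the exponential distribution
pop_size <- 40   # number of observations in each sample
N <- 1000        # number of simulated samples
# For an exponential distribution with rate lambda: mean = 1/lambda, variance = 1/lambda^2
theoreticalMean <- 1.0 / lambda
theoreticalVar <- 1.0 / lambda ^ 2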
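As a quick sanity check on the formulas above, the following sketch recovers the same moments by numerically integrating the exponential density. It uses only base R (`integrate`, `dexp`). The names `num_mean`, `num_m2`, `num_var` and `var_of_mean` are illustrative and are not part of the original analysis.

```r
lambda <- 0.2

# First and second moments of Exp(lambda), integrated numerically over [0, Inf)
num_mean <- integrate(function(x) x * dexp(x, rate = lambda), 0, Inf)$value
num_m2 <- integrate(function(x) x^2 * dexp(x, rate = lambda), 0, Inf)$value
num_var <- num_m2 - num_mean^2

# Closed-form values used in the report
theoreticalMean <- 1 / lambda     # mean = 1/lambda
theoreticalVar <- 1 / lambda^2    # variance = 1/lambda^2

# Variance of the mean of 40 independent draws: sigma^2 / n
var_of_mean <- theoreticalVar / 40
```

The numerical values should agree with the closed forms up to the integration tolerance. `var_of_mean` is the quantity whose square root appears in the denominator when the sample means are standardized.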
Simulations
Sample Mean
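# Pool of exponential draws to sample from
dist <- rexp(N * pop_size, lambda)
# One row per sample size 2..N (var() needs at least two observations)
asymp <- data.frame(Sid=rep(0, N - 1), Mean=rep(0, N - 1), Var=rep(0, N - 1))
for(i in 2:N){
  samp <- sample(dist, i)       # sample of size i, drawn without replacement
  asymp$Sid[i-1] <- i
  asymp$Mean[i-1] <- mean(samp)
  asymp$Var[i-1] <- var(samp)
}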
Sample Variance
Distribution
expdata <- data.frame(Sid=rep(0,N), Mean=rep(0, N), Var=rep(0, N), CLT=rep(0, N))
dist <- rexp(N * pop_size, lambda)
for(i in 1:N){
  samp <- sample(dist, pop_size)   # one sample of pop_size observations
  expdata$Sid[i] <- i
  expdata$Mean[i] <- mean(samp)
  expdata$Var[i] <- var(samp)
  # Standardized sample mean: (mean - mu) / sqrt(sigma^2 / n)
  expdata$CLT[i] <- (expdata$Mean[i] - theoreticalMean) / sqrt(theoreticalVar / pop_size)
}
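The loop above can also be written without explicit indexing. The sketch below is an alternative, not the author's method: it draws each sample of 40 directly from `rexp` instead of resampling from a pre-generated pool. The names `sim`, `sample_means`, `sample_vars`, `mu`, `sigma` and `z` are assumptions made for illustration.

```r
set.seed(1234)
lambda <- 0.2
pop_size <- 40   # observations per sample, as in the report
N <- 1000        # number of simulated samples

# One row per simulated sample, one column per observation
sim <- matrix(rexp(N * pop_size, rate = lambda), nrow = N, ncol = pop_size)

sample_means <- rowMeans(sim)
sample_vars <- apply(sim, 1, var)

# Standardize the means: (mean - mu) / (sigma / sqrt(n))
mu <- 1 / lambda
sigma <- 1 / lambda
z <- (sample_means - mu) / (sigma / sqrt(pop_size))
```

If the Central Limit Theorem approximation holds, `z` should have a mean close to 0 and a standard deviation close to 1.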
[Figure: sample mean (Mean) against sample size (Sid), for Sid from 0 to 1000]
As the figure above shows, the sample mean approaches the theoretical mean as the sample size grows. The
theoretical mean is 5 and the experimental mean is 5.012078.
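The convergence described above can also be illustrated with a running mean over a single stream of draws. This is a minimal sketch under different assumptions from the report, which draws a separate sample for every sample size. The names `x`, `running_mean`, `k` and `abs_error` are illustrative.

```r
set.seed(1234)
lambda <- 0.2
x <- rexp(40000, rate = lambda)

# Mean of the first k draws, for every k from 1 to length(x)
running_mean <- cumsum(x) / seq_along(x)

# Absolute distance from the theoretical mean 1 / lambda at a few sample sizes
k <- c(10, 100, 1000, 40000)
abs_error <- abs(running_mean[k] - 1 / lambda)
```

The entries of `abs_error` typically shrink as `k` grows, although the decrease need not be monotone for any single simulated stream.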
[Figure: sample variance (Var) against sample size (Sid), for Sid from 0 to 1000]
As the figure above shows, the sample variance approaches the theoretical variance as the sample size
grows. The theoretical variance is 25 and the experimental variance is 25.19515.
Distribution
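# Standard normal density evaluated on a grid, for comparison with the histogram
gauss <- data.frame(x=seq(-10,10,0.1))
gauss$y <- (1/sqrt(2*pi))*exp((-gauss$x ^ 2.0)/2.0)
# colour is set outside aes() so the line is drawn in red without adding a legend entry
ggplot(expdata, aes(x=CLT)) + geom_histogram(aes(y=..density..), binwidth=0.2) +
  geom_line(data=gauss, aes(x=x, y=y), colour='red')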
[Figure: density histogram of the standardized sample means (CLT) with the standard normal curve overlaid in red]
In the plot above we compare the distribution of the standardized sample means with a Gaussian distribution
centered on 0 with standard deviation 1. The histogram follows the Gaussian curve closely, which shows that
the distribution of the sample means is approximately normal.
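A numeric complement to the visual comparison is to check how much of the standardized data falls inside the central 95% interval of the standard normal, and to compare a few quantiles. The sketch below regenerates standardized means from fresh `rexp` draws, which is an assumption: it does not reuse the report's `expdata`. The names `coverage`, `probs` and `comparison` are illustrative.

```r
set.seed(1234)
lambda <- 0.2
n <- 40      # observations per sample
N <- 1000    # number of simulated samples

# Simulate N sample means and standardize them
means <- replicate(N, mean(rexp(n, rate = lambda)))
z <- (means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means inside the central 95% normal interval
coverage <- mean(abs(z) <= qnorm(0.975))

# Empirical quantiles next to the standard normal quantiles
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
comparison <- data.frame(prob = probs,
                         empirical = as.numeric(quantile(z, probs)),
                         normal = qnorm(probs))
```

A `coverage` value near 0.95 and similar quantiles in `comparison` are consistent with approximate normality. With only 40 observations per sample, some right skew inherited from the exponential distribution is still to be expected.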