
Statistical Inference Project Report

Daniel Villegas
03/21/2015
Overview
This report studies some statistical properties of the exponential distribution, using data generated by
simulation.
First, we set up the environment for the simulation and compute the theoretical mean and variance:
require(ggplot2)
## Loading required package: ggplot2
set.seed(1234)
lambda <- 0.2
pop_size <- 40
N <- 1000
theoreticalMean <- 1.0 / lambda
theoreticalVar <- 1.0 / lambda ^ 2
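For reference, the two theoretical values computed above follow from the standard moments of an exponential distribution with rate λ. With λ = 0.2 they evaluate to 5 and 25. The last line, the variance of the mean of n = 40 independent draws (n corresponds to pop_size), is the quantity used later to standardize the sample means.

```latex
\begin{align*}
\mathbb{E}[X] &= \frac{1}{\lambda} = \frac{1}{0.2} = 5, \\
\operatorname{Var}(X) &= \frac{1}{\lambda^{2}} = \frac{1}{0.2^{2}} = 25, \\
\operatorname{Var}(\bar{X}_{n}) &= \frac{1}{\lambda^{2} n} = \frac{25}{40} = 0.625 \quad (n = 40).
\end{align*}
```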

Simulations
Sample Mean
# Pool of exponential draws to subsample from
dist <- rexp(N * pop_size, lambda)
# One row per sample size i = 2..N (var() needs at least two observations)
asymp <- data.frame(Sid=2:N, Mean=rep(0, N - 1), Var=rep(0, N - 1))
for(i in 2:N){
  samp <- sample(dist, i)
  asymp$Mean[i-1] <- mean(samp)
  asymp$Var[i-1] <- var(samp)
}
Sample Variance
Distribution
expdata <- data.frame(Sid=rep(0,N), Mean=rep(0, N), Var=rep(0, N), CLT=rep(0, N))
dist <- rexp(N * pop_size, lambda)
for(i in 1:N){
  # Draw one sample of size pop_size from the pool
  samp <- sample(dist, pop_size)
  expdata$Sid[i] <- i
  expdata$Mean[i] <- mean(samp)
  expdata$Var[i] <- var(samp)
  # Standardized sample mean, used for the comparison with N(0, 1)
  expdata$CLT[i] <- (expdata$Mean[i] - theoreticalMean) / sqrt(theoreticalVar / pop_size)
}
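The loop above can also be written without explicit indexing. The following is a minimal sketch, not the report's method: it assumes independent draws taken straight from rexp (rather than resampling a pre-generated pool with sample), and the names n_sims, sample_size, draws and sim are introduced here for illustration only.

```r
# Sketch under stated assumptions: each row of `draws` is one fresh sample
# of 40 exponentials, instead of a resample from a shared pool.
set.seed(1234)
lambda <- 0.2
sample_size <- 40     # plays the role of pop_size in the report
n_sims <- 1000        # plays the role of N in the report

theoretical_mean <- 1 / lambda
theoretical_var <- 1 / lambda^2

# n_sims x sample_size matrix of exponential draws
draws <- matrix(rexp(n_sims * sample_size, rate = lambda), nrow = n_sims)

sim <- data.frame(
  Sid  = seq_len(n_sims),
  Mean = rowMeans(draws),
  Var  = apply(draws, 1, var)
)

# Standardize each sample mean: (mean - mu) / (sigma / sqrt(n))
sim$CLT <- (sim$Mean - theoretical_mean) / sqrt(theoretical_var / sample_size)

summary(sim$CLT)
```

The resulting sim data frame has the same columns as expdata, so it should be usable with the same plotting calls.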

Sample Mean versus Theoretical Mean

# Setting colour outside aes() draws the reference line in red instead of creating a legend entry
ggplot(asymp, aes(x=Sid, y=Mean)) + geom_line() + geom_hline(yintercept=theoreticalMean, colour='red')

[Figure: sample mean (Mean) against sample size (Sid, 0 to 1000), with a horizontal line at the theoretical mean.]
As the figure above shows, the sample mean approaches the theoretical mean as the sample size grows. For
comparison, the theoretical mean is 5 and the experimental mean is 5.012078.
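The convergence described here is the law of large numbers. One way to see it directly is a running mean over a single sequence of draws, sketched below. This differs from the report's approach of drawing a new subsample of size i for each i, and the names x, running_mean and sizes are illustrative assumptions.

```r
set.seed(1234)
lambda <- 0.2
n <- 40000

# One long sequence of exponential draws
x <- rexp(n, rate = lambda)

# running_mean[k] is the mean of the first k draws
running_mean <- cumsum(x) / seq_along(x)

# Absolute error against the theoretical mean 1 / lambda at a few sizes
sizes <- c(10, 100, 1000, 40000)
data.frame(n = sizes, error = abs(running_mean[sizes] - 1 / lambda))
```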

Sample Variance versus Theoretical Variance


# Setting colour outside aes() draws the reference line in red instead of creating a legend entry
ggplot(asymp, aes(x=Sid, y=Var)) + geom_line() + geom_hline(yintercept=theoreticalVar, colour='red')

[Figure: sample variance (Var) against sample size (Sid, 0 to 1000), with a horizontal line at the theoretical variance.]
As the figure above shows, the sample variance approaches the theoretical variance as the sample size grows. For
comparison, the theoretical variance is 25 and the experimental variance is 25.19515.
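For reference, R's var computes the sample variance with the n - 1 denominator, which is an unbiased estimator of the population variance. This is consistent with the curve settling around 1/λ² = 25:

```latex
\begin{align*}
S^{2} &= \frac{1}{n-1} \sum_{i=1}^{n} \left( x_{i} - \bar{x} \right)^{2}, \\
\mathbb{E}\!\left[ S^{2} \right] &= \operatorname{Var}(X) = \frac{1}{\lambda^{2}} = 25 .
\end{align*}
```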

Distribution
# Standard normal density on a grid, for overlay on the histogram
gauss <- data.frame(x=seq(-10,10,0.1))
gauss$y <- (1/sqrt(2*pi))*exp((-gauss$x ^ 2.0)/2.0)
ggplot(expdata, aes(x=CLT)) + geom_histogram(aes(y=..density..), binwidth=0.2) +
geom_line(data=gauss, aes(x=x, y=y), colour='red')

[Figure: density histogram of the standardized sample means (CLT, x-axis from -10 to 10) with the standard normal density overlaid in red.]
The plot above compares the distribution of the standardized sample means with a Gaussian distribution centered
on 0 with standard deviation 1. The histogram follows the Gaussian curve closely, which indicates that the
distribution of the sample means is approximately normal.
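A numerical complement to the visual comparison is to set empirical quantiles of the standardized means against standard normal quantiles. The sketch below is an illustration rather than part of the original analysis: it regenerates the standardized means with independent rexp draws, and the names z, probs and comparison are assumptions.

```r
set.seed(1234)
lambda <- 0.2
sample_size <- 40
n_sims <- 1000

# Standardized means of n_sims independent samples of size sample_size
z <- replicate(n_sims, {
  x <- rexp(sample_size, rate = lambda)
  (mean(x) - 1 / lambda) / sqrt((1 / lambda^2) / sample_size)
})

# Empirical quantiles next to the matching standard normal quantiles
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
comparison <- data.frame(
  prob      = probs,
  empirical = as.numeric(quantile(z, probs)),
  normal    = qnorm(probs)
)
comparison
```

With samples of size 40 some of the exponential's right skew remains in the sample means, so small discrepancies between the two columns, especially in the tails, are to be expected.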
