Written Assignment Unit 2-Statistical Inference

Some say that optimal estimators should be preferred, while others advocate the use of more
robust estimators. What is your opinion?
An optimal estimator is an estimator that performs better than any other estimator with respect
to a given criterion. A robust estimator, on the other hand, is an estimator that is not sensitive
to misspecification of the theoretical model. Hence, a robust estimator may be somewhat inferior
to an optimal estimator in the context of an assumed model. However, if the assumed model is in
fact not a good description of reality, then the robust estimator will tend to perform better than
the estimator denoted optimal. (Yakir, 2011)
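The trade-off described above can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings that are not from the assignment: samples of size 50, a Normal(0, 1) model, and a misspecified version in which about 10% of the observations have standard deviation 10. It compares the sample average (optimal under the Normal model) with the sample median (more robust).

```r
set.seed(1)
n <- 50       # assumed sample size
reps <- 5000  # number of simulated samples

# Assumed model holds: every observation is Normal(0, 1)
clean <- replicate(reps, {
  x <- rnorm(n)
  c(average = mean(x), median = median(x))
})

# Model is misspecified: about 10% of observations have sd = 10
contaminated <- replicate(reps, {
  x <- rnorm(n, sd = ifelse(runif(n) < 0.1, 10, 1))
  c(average = mean(x), median = median(x))
})

# The true center is 0, so the MSE is the mean of the squared estimates
mse.clean <- rowMeans(clean^2)
mse.cont <- rowMeans(contaminated^2)
rbind(mse.clean, mse.cont)
```

Under these assumptions the average should have the smaller MSE when the model holds, and the median the smaller MSE when it does not.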
Consider the example of house prices in cities, where some areas have a low cost of living and
others a high one. A population or a sample can be formed for the low-cost or the high-cost areas,
for example a sample of 2,500 from a population of 5,000. Robust estimators help here when the
assumed model is less true to the real situation. An optimal estimator is easy to apply in a
linear situation like this one: with the costs known, prices range from high to low, with few
outliers in living space (area) relative to value (cost), and the distribution may appear
symmetric, with a low standard deviation among the values. The Uniform distribution supports this
point. Judging by the values, their variance, and their standard deviation, the mid-range
estimator may estimate the cost more accurately than the sample average: the average gives a good
overall view, but outliers may introduce error into it. (Yakir, 2011)
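The comparison between the sample average and the mid-range under a Uniform model can be checked by simulation. This is a minimal sketch with assumed values that do not come from the assignment: samples of size 100 from a Uniform(0, 10) distribution, whose true center is 5.

```r
set.seed(2)
n <- 100      # assumed sample size
reps <- 5000  # number of simulated samples

est <- replicate(reps, {
  x <- runif(n, min = 0, max = 10)
  c(average = mean(x), midrange = (min(x) + max(x)) / 2)
})

# MSE of each estimator around the true center, 5
mse.unif <- rowMeans((est - 5)^2)
mse.unif
```

The result only speaks to data that really are Uniform; the mid-range depends entirely on the two extreme observations.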
Reference
Yakir, B. (2011). Introduction to Statistical Thinking (With R, Without Calculus). Jerusalem,
IL: The Hebrew University of Jerusalem, Department of Statistics. Retrieved from
https://my.uopeople.edu/pluginfile.php/1524588/mod_resource/content/5/Textbook%20Statistical%20Inference.pdf
1. The distribution of the variable "frequency" is:
__ Skewed to the left, __ Symmetric, __ Skewed to the right.
Mark the most appropriate option and explain your selection
After the data were loaded and the distribution was displayed in R, it was evident that most of
the observations fall in the range from 0 to 10, with a tail of larger values on the right. The
distribution is skewed to the right, which is consistent with the mean being greater than the
median.
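The mean-versus-median check can be tried on stand-in data. The sketch below does not use the transfusion file; it assumes 748 hypothetical right-skewed counts drawn from a geometric distribution with mean 5.5, chosen only to resemble the description above.

```r
set.seed(3)
# Hypothetical stand-in for transfusion$frequency: 748 right-skewed counts
freq <- rgeom(748, prob = 1 / 6.5)

freq.mean <- mean(freq)
freq.median <- median(freq)
c(mean = freq.mean, median = freq.median)
# hist(freq) would display the long right tail
```

On the real data the same comparison is mean(transfusion$frequency) against median(transfusion$frequency).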
2. The number of outlier observations in the variable "frequency" is: 45
Explain each step in the computation of the number of outlier observations
The outliers are in the upper tail. Counting them by eye proved unreliable, so I used the
following rule instead: an observation is an outlier if it lies above the upper limit
Q3 + 1.5 x (Q3 - Q1), where Q1 and Q3 are the first and third quartiles.
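The steps of the outlier computation can be sketched as follows. The data here are a hypothetical stand-in (748 geometric counts), not the transfusion file, so the count it produces is not the answer above; with the real data, x would be transfusion$frequency.

```r
set.seed(4)
# Hypothetical stand-in for transfusion$frequency
x <- rgeom(748, prob = 1 / 6.5)

# Step 1: first and third quartiles
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Step 2: limits at 1.5 interquartile ranges beyond the quartiles
upper <- q[2] + 1.5 * iqr
lower <- q[1] - 1.5 * iqr

# Step 3: count the observations outside the limits
n.out <- sum(x > upper | x < lower)
n.out
```

Because counts cannot be negative and the lower limit falls below zero for data like these, all outliers are in the upper tail.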
3. Which of the following theoretical models is most appropriate to describe the distribution of
the variable "frequency"?
__ Binomial, __ Poisson, __ Uniform, __ Exponential, __ Normal.
Mark the most appropriate option and explain your selection
After consideration, the Poisson distribution is the most appropriate model, since frequency is
a discrete random variable.
Estimating Parameters:
4. The estimated value of the expectation of the measurement "frequency" is: 5.515
Explain your answer
The sample average is the natural estimator of the expectation:
>mean(transfusion$frequency)
[1] 5.514706
5. The estimated value of the standard deviation of the measurement "frequency" is: 5.839307 or
2.348341
Explain your answer
The sample standard deviation is a reasonable estimator of the standard deviation of the
measurement, so we may apply the code:
>sd(transfusion$frequency)
[1] 5.839307
Alternatively, if we model the measurement with a Poisson distribution, the variance is equal to
the expectation, so the standard deviation is the square root of the estimated expectation:
>sqrt(5.514706)
[1] 2.348341
6. The estimated value of the standard deviation of the estimator that produced the estimate in
question 4 is: 0.2135062 or 0.08586385
Explain your answer
The standard deviation of the sample average is equal to the standard deviation of the
measurement divided by the square root of the sample size. The standard deviation of the
measurement was estimated in question 5.
The sample consists of 748 people.
The following code computes the two values:
Using the sample standard deviation:
>sd(transfusion$frequency)/sqrt(748)
[1] 0.2135062
Assuming the Poisson distribution:
>sqrt(mean(transfusion$frequency))/sqrt(748)
[1] 0.08586385
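The rule that the standard deviation of the sample average equals the standard deviation of the measurement divided by the square root of the sample size can be checked by simulation. This sketch assumes a Poisson model with a hypothetical expectation of 5.5 (close to the estimate in question 4) and does not use the transfusion data.

```r
set.seed(5)
n <- 748    # sample size, as in the assignment
lam <- 5.5  # hypothetical Poisson expectation

# Sample averages of 10,000 simulated samples
x.bar <- replicate(10^4, mean(rpois(n, lambda = lam)))

sd.sim <- sd(x.bar)                 # standard deviation seen in the simulation
sd.formula <- sqrt(lam) / sqrt(n)   # sd of the measurement over sqrt(n)
c(simulated = sd.sim, formula = sd.formula)
```

The two numbers should agree closely when the Poisson assumption holds.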
7. The estimated value of λ for the variable "monetary" is: 0.0007253333

Attach the R code for conducting the computation
The estimator of the parameter λ of the exponential distribution is one over the sample
average. The estimated value can be computed with the code:
>1/mean(transfusion$monetary)
[1] 0.0007253333
8. The estimated value of the MSE of the estimator of λ is: 0.0007641609
Attach the R code for conducting the computation
The MSE of the estimator can be estimated by simulation:
>lam = 1/mean(transfusion$monetary)
>lam.hat = rep(0,10^5)
>for(i in 1:10^5)
+ {
+ X = rexp(748,lam)
+ lam.hat[i] = 1/mean(X)
+ }
>var(lam.hat)+(mean(lam.hat)-lam)^2
[1] 0.0007641609
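A self-contained version of this simulation can be run without the data file. The sketch below assumes a hypothetical rate lam = 0.0007 in place of the estimate from the data and uses fewer repetitions. For comparison it also computes lam^2/n, the usual large-sample approximation to the variance of 1/mean for exponential data; that approximation is an addition here and is not part of the assignment.

```r
set.seed(6)
lam <- 0.0007  # hypothetical rate, standing in for 1/mean(transfusion$monetary)
n <- 748       # sample size, as in the assignment

# Estimate of the rate in each of 10,000 simulated samples
lam.hat <- replicate(10^4, 1 / mean(rexp(n, rate = lam)))

# MSE = variance + squared bias
mse.sim <- var(lam.hat) + (mean(lam.hat) - lam)^2
mse.approx <- lam^2 / n  # large-sample approximation (assumption, see above)
c(simulated = mse.sim, approximation = mse.approx)
```

The MSE is on the scale of the squared rate, so for a rate this small it is a very small number.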
