
Natalia Wood

MTH245(E40A)

Data Analysis Project


Project Summary & Findings

a. The data used for this study was extracted from the National Climatic Data Center of the
United States of America. This center is in charge of collecting, analyzing, and monitoring
information related to environmental changes in the U.S. and making it available to the public
and to whoever may need it. The information obtained from this center is the average annual
temperature in the U.S., collected for over a century, from 1895 to 2013. The data shows that
the temperature ranges from 50.16 degrees Fahrenheit in 1917 (the minimum) to 55.33 degrees
Fahrenheit in 2012 (the maximum). These values suggest that the temperatures were taken around
spring or fall, or in parts of the U.S. where winter and summer are not too severe.
b. We can see from the histogram below that the distribution of the data is somewhat skewed to
the right. This can be seen from the fact that the mean (green vertical line) is greater than the
median (red vertical line). We superimpose a normal distribution curve with the same mean and
standard deviation as the data. We observe that the bars in the histogram are not symmetrically
distributed around the mean.

Another way to confirm this is to look at the boxplot below:


The distance between the first quartile Q1 and the median is 0.43, and the distance between
the median and the third quartile Q3 is 0.79. A longer upper half of the box is what commonly
happens when a distribution is skewed to the right.
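
The comparison of the two quartile distances can be condensed into a single number. The sketch below uses the quartile (Bowley) skewness coefficient, which is not part of the original analysis and is introduced here only as an illustration; the function name is hypothetical. A positive result points to right skew.

```python
def quartile_skewness(lower_gap, upper_gap):
    """Bowley skewness from the gaps (median - Q1) and (Q3 - median).

    The value lies between -1 and 1; positive means the upper half of
    the box is longer, i.e. the distribution leans to the right.
    """
    return (upper_gap - lower_gap) / (upper_gap + lower_gap)


# Distances read from the boxplot discussion: median - Q1 and Q3 - median.
skew = quartile_skewness(0.43, 0.79)
print(f"quartile skewness = {skew:.3f}")
```

Note that the two distances add up to 1.22, the interquartile range used in item c.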

c. We use the 1.5IQR criterion. From the table of descriptive statistics in part A, c, we know
that IQR = 1.22 and Q3 = 52.78, so Q3 + 1.5IQR = 54.61. Any value above 54.61 is a potential
outlier. The only potential outlier is 55.53, corresponding to the average annual temperature
in 2012.
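
The 1.5IQR check can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, Q1 is derived as Q3 minus IQR, and the two values tested are the minimum and maximum quoted in item a rather than the full data set.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k*IQR criterion."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


q3 = 52.78          # third quartile from the descriptive statistics
iqr = 1.22          # interquartile range from the descriptive statistics
q1 = q3 - iqr       # first quartile, derived from the two values above

lower, upper = tukey_fences(q1, q3)
print(f"fences: {lower:.2f} to {upper:.2f}")

# Minimum (1917) and maximum (2012) as quoted in item a.
extremes = [50.16, 55.33]
flagged = [x for x in extremes if x < lower or x > upper]
print("potential outliers:", flagged)
```

Only the maximum falls outside the fences; the minimum stays above the lower fence, so no low outliers are flagged.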

d. I think the measure that best represents the data, in this particular case, is the sample
mean. It is not too far from the median and, although the distribution is skewed to the right,
it is almost normal. Given that there are several values for the mode, the mode is not as good
as the mean or the median for characterizing the given data.

e. The data goes from 50.16 to 55.3, i.e. a range of 5.14 degrees Fahrenheit, with a standard
deviation of s = 0.915.

f. We can be 90% confident that the mean average annual temperature in the U.S. is between
52.05 and 52.32 degrees Fahrenheit.
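
As a rough check, the interval can be reproduced with a z-based formula, mean ± z·s/√n. The sketch below assumes a sample mean of 52.186 (the value that appears in the outlier discussion of the exploratory section), s = 0.915 and n = 119; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, s, n, level=0.90):
    """Two-sided z interval for a mean: mean +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value
    margin = z * s / sqrt(n)
    return mean - margin, mean + margin


# Assumed inputs: sample mean 52.186, s = 0.915, n = 119.
low, high = z_confidence_interval(52.186, 0.915, 119)
print(f"90% CI: ({low:.2f}, {high:.2f})")
```

With these rounded inputs the limits agree with the interval reported in item f to two decimals.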

g. Null hypothesis H0: μ = 52.1
Alternative hypothesis H1: μ > 52.1

Test statistic: z = 1.02
P-value = 0.15

h. Since the P-value (0.15) is greater than the significance level of 10%, there is insufficient
evidence to reject the null hypothesis, so we are not able to conclude that the mean average
annual temperature in the U.S. exceeds 52.1 degrees Fahrenheit.
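
The test in items g and h can be sketched as a one-sided z test. The inputs are assumptions taken from elsewhere in the report (sample mean 52.186, s = 0.915, n = 119), and the function name is hypothetical. Because these inputs are rounded, the computed statistic may differ from the reported z = 1.02 in the second decimal.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(mean, mu0, s, n):
    """Return (z, p) for H0: mu = mu0 against H1: mu > mu0."""
    z = (mean - mu0) / (s / sqrt(n))
    p = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p


alpha = 0.10  # significance level used in item h
z, p = upper_tail_z_test(mean=52.186, mu0=52.1, s=0.915, n=119)
print(f"z = {z:.3f}, p = {p:.3f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

The P-value stays above 0.10, so the decision matches item h: the null hypothesis is not rejected.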

i. By part C, we know the slope of the trend line is approximately 0.013. Since this is a small
positive value, we can conclude that the average temperature in the U.S. has risen slowly over
the years, by about 0.013 °F per year.

j. y(2016) = 27.328 + 0.013(2016) = 53.536 °F


y(2500) = 27.328 + 0.013(2500) = 59.828 °F
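
The two predictions can be checked by evaluating the trend line directly. This is a minimal sketch that uses the rounded intercept and slope quoted above; the function name is hypothetical.

```python
INTERCEPT = 27.328  # trend-line intercept in degrees Fahrenheit
SLOPE = 0.013       # trend-line slope in degrees Fahrenheit per year


def predict_temperature(year):
    """Average annual temperature predicted by the linear trend line."""
    return INTERCEPT + SLOPE * year


for year in (2016, 2500):
    print(f"y({year}) = {predict_temperature(year):.3f} °F")
```

The year 2016 lies just past the observed range (1895 to 2013), while 2500 is a long extrapolation, so the second value depends entirely on the trend staying linear.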

k. I trust the prediction for 2016 because it is only a short time after 2013, the last year in
the data. The prediction for 2500 is much less reliable: we don't know which factors could
emerge in 487 years, so the average temperature could have a very different value.

Exploratory Data Analysis

A.

1.

2.
3.

b. Presence of definite outliers. We know by the 1.5IQR criterion that there is a possible
outlier of 55.13 °F. If we remove this point, we obtain the graphs below:

We can see a difference of 52.186 – 51.985 = 0.201 °F, which is a remarkable difference, given
that there are 119 points in total. So we can conclude that 55.13 is a definite outlier.

c.

B.
a. Given that the sample is large (n = 119), we use a z statistic instead of a t statistic.
b. Scatter plot and trend line:

How has this project helped you understand the purpose and value of the study of statistics?

I have learned the importance of generating sample data from populations. This is a powerful tool
for predicting phenomena, so that we can make the best choices in a particular setting, saving
money and time and minimizing losses of all kinds. Studying statistics is therefore essential in
the process of becoming a professional.
