Statistics Chapter 4 Project

Are You Hydrated?
For our Chapter 4 Statistics project we decided to compare the amount of water a
person drinks to the number of meals they eat per day. Something that everyone learns in
health class is how much water you are supposed to drink: half of your body weight (in
pounds) in ounces. Going into this project we thought that the more meals a person eats each
day, the more water they would drink. We also considered the number of athletes in our
school. Only a very small percentage of students at our school do not play sports, so we
expected water consumption to be higher than normal.
We collected our data through a Google survey. We chose this method because we had to
ask two questions and get at least 50 responses. We sent the survey to every student who
currently attends Fowler High School and received over 50 responses in less than 24 hours.
The survey allowed each student to respond only once and to type in only numbers. This made
collection much easier and more organized, especially when we transferred everything to
Google Sheets.
An explanatory variable is the variable used to explain or predict changes in another
variable. In our case we used the number of meals someone eats per day as our explanatory
variable. We did this because we believe that people tend to drink more water while they are
eating. A response variable is the one that changes in response to the explanatory variable.
For us that is the amount of water someone drinks per day, which we believed may depend on
how many meals they eat per day: if one increased, so would the other. For example, someone
who eats 4 meals a day might drink 5 glasses of water per day.
After collecting our data we used Google Sheets to make a scatter plot. Once the graph
was created, it was very obvious that there was no clear correlation between the two
variables. The next step was to add the least-squares regression line, better known as the
best-fit line. Once we added it in Google Sheets we could see that the graph shows just a
very small amount of positive correlation. If we had to choose one description, though, it
would be no correlation.
We then looked at our graph and decided that there were no influential points. If we
were to take away any one of the data points, our regression line would either not change or
change only slightly. Our overall data would still have little to no correlation.
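The check described above can also be done with numbers instead of by eye. This is only a
minimal sketch: the answers below are made up for illustration (they are not our survey
data), and the helper name slope is our own choice. It removes one point at a time and
records how far the slope of the least-squares line moves.

```python
# Hypothetical survey answers, NOT the project's real data.
meals = [1, 2, 2, 3, 3, 4]      # x: meals per day
glasses = [3, 2, 4, 3, 5, 3]    # y: glasses of water per day


def slope(xs, ys):
    """Least-squares slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


full_slope = slope(meals, glasses)

# Drop each point in turn and record how much the slope changes.
slope_changes = []
for i in range(len(meals)):
    xs = meals[:i] + meals[i + 1:]
    ys = glasses[:i] + glasses[i + 1:]
    slope_changes.append(slope(xs, ys) - full_slope)

largest_change = max(abs(change) for change in slope_changes)
print(f"slope with all points: {full_slope:.3f}")
print(f"largest change after removing one point: {largest_change:.3f}")
```

A point whose removal moves the slope much more than the others would be a candidate
influential point.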
Our next step was to do some calculations. Most of our work was done with the help of
Google Sheets. We first decided to find (x̄, ȳ), the point of means, because it was very
simple. We used the AVERAGE function in Google Sheets to find the mean of our x and y
columns. Our (x̄, ȳ) was (2.408, 3.245). This is important because the least-squares
regression line always passes through this point.
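As a rough illustration of the point of means, the sketch below finds the two averages and
confirms that a least-squares line goes through them. The answers are made up (they are not
our survey data), and the names b0 and b1 for the intercept and slope are our own.

```python
# Hypothetical survey answers, NOT the project's real data.
meals = [1, 2, 2, 3, 3, 4]      # x: meals per day
glasses = [3, 2, 4, 3, 5, 3]    # y: glasses of water per day

n = len(meals)
x_bar = sum(meals) / n          # same result as AVERAGE on the x column
y_bar = sum(glasses) / n        # same result as AVERAGE on the y column

# Least-squares slope and intercept.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(meals, glasses))
sxx = sum((x - x_bar) ** 2 for x in meals)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar         # this formula is why the line hits (x_bar, y_bar)

predicted_at_mean = b0 + b1 * x_bar
print(f"point of means: ({x_bar:.3f}, {y_bar:.3f})")
print(f"line evaluated at x_bar: {predicted_at_mean:.3f}")
```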
The second thing that we calculated was our coefficient of determination, or r². We
actually did not have to calculate anything to find this number, because Google Sheets gave
it to us once our graph was created. r² is the fraction of the variation in the response
variable that the regression line explains, so it falls between 0 and 1. The correlation
coefficient r is the number that falls on a number line from -1 to 1 and shows the type and
strength of correlation. As we mentioned above, our graph showed no correlation, but the
regression line showed a very slight positive relationship between the variables.
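For anyone who wants to see where these numbers come from, here is a minimal sketch of
computing r and r² by hand. The answers are made up (they are not our survey data), and the
function name correlation is our own.

```python
import math

# Hypothetical survey answers, NOT the project's real data.
meals = [1, 2, 2, 3, 3, 4]      # x: meals per day
glasses = [3, 2, 4, 3, 5, 3]    # y: glasses of water per day


def correlation(xs, ys):
    """Pearson correlation coefficient r, always between -1 and 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(meals, glasses)
r_squared = r ** 2              # coefficient of determination, between 0 and 1

print(f"r = {r:.3f}")
print(f"r squared = {r_squared:.3f}")
```

The sign of r gives the direction of the relationship, while r² drops the sign and only
says how much of the variation the line accounts for.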
Marginal change, in simple terms, is the slope of the linear regression line. It is how
many units the response variable is expected to change for each one-unit increase in the
explanatory variable. In this survey it is how many more glasses of water we would expect
someone to drink for each additional meal they eat per day. In our case it was very hard to
judge since we had no correlation. The points did not follow a clear line, which is what an
r² of 0.006 tells us.
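Written as a formula, the marginal change is the slope b_1 of the least-squares line. The
symbols b_0 (intercept) and b_1 (slope) are our own labels here, with x as meals per day and
y as glasses of water per day:

```latex
\hat{y} = b_0 + b_1 x, \qquad
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
b_0 = \bar{y} - b_1 \bar{x}
```

Each extra meal changes the predicted number of glasses by b_1, so a slope close to 0 means
the prediction barely moves no matter how many meals someone eats.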
The amount of explained variation in our data set was 0.006 (0.6%), which means that
the amount unexplained was 0.994 (99.4%). We found this by subtracting r² from 1. This tells
us that almost 100% of the variation in water intake is not explained by the number of
meals; the two variables do not relate like we thought they would.
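The arithmetic behind these percentages is short enough to sketch. The only input is the
r² of 0.006 reported above; taking the positive square root for r is our assumption, based
on the slight positive trend we saw in the graph.

```python
import math

r_squared = 0.006                 # coefficient of determination from our graph

explained = r_squared             # share of variation the line explains
unexplained = 1 - r_squared       # share of variation left unexplained

# Assumption: r is positive because the trend line slopes slightly upward.
r = math.sqrt(r_squared)

print(f"explained:   {explained:.1%}")
print(f"unexplained: {unexplained:.1%}")
print(f"r is about {r:.3f}")
```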
We do not believe there are any lurking variables, although we cannot know whether
everyone answered honestly. Some people may not know exactly how much water they drink, so
in our survey we asked them to give an average or a close guess. The answer options were
whole numbers only, so people had to round either up or down.
The final thing we did was use our linear regression line to predict values that are
not in our data set. Interpolation is predicting ŷ from an x value that is within the range
of our data set. In our data set, if we look at 3 meals per day, our ŷ would be around 3
glasses of water, the option closest to our linear regression line. Extrapolation is
predicting ŷ from an x value that is beyond the range of our data set. We chose to see how
many glasses of water someone might drink if they ate 5 meals per day. After looking at the
other points and where our line would sit, we would say somewhere between 4 and 5 glasses of
water.
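The difference between the two kinds of prediction can be sketched in a few lines. The
answers below are made up (they are not our survey data, so the predictions will not match
ours), and the function names fit_line and predict are our own.

```python
# Hypothetical survey answers, NOT the project's real data.
meals = [1, 2, 2, 3, 3, 4]      # x: meals per day
glasses = [3, 2, 4, 3, 5, 3]    # y: glasses of water per day


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict(x, xs, intercept, slope):
    """Return (y_hat, kind); kind says whether x lies inside the observed x range."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return intercept + slope * x, kind


intercept, slope = fit_line(meals, glasses)
y_hat_3, kind_3 = predict(3, meals, intercept, slope)
y_hat_5, kind_5 = predict(5, meals, intercept, slope)

print(f"3 meals: {y_hat_3:.2f} glasses ({kind_3})")
print(f"5 meals: {y_hat_5:.2f} glasses ({kind_5})")
```

Extrapolated values deserve less trust, because nothing in the data shows that the line
keeps the same shape outside the range that was surveyed.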
In conclusion, we were very surprised by our results from the survey. Our hypothesis in
the beginning was a positive correlation between the number of meals eaten per day and the
amount of water drunk per day. Once we gathered all of our data and put it into a graph we
were both very confused. Why did our graph look like a perfect chart? It was because our
answers were all whole numbers that were very close to each other. If we were to change
anything in our survey, we would ask about the total amount of liquids someone drinks
instead of only the amount of water. It was brought to our attention that not everyone
drinks water. A lot of kids who play sports drink sports drinks, protein shakes, and more.
In the end we do not believe the number of meals eaten each day and the amount of water
drunk each day are well correlated.
