Regression Project Report

Megan Griessen

Algebra 2
Kiker 1

Adults who smoke cigarettes have a 73% chance of being affected by lung cancer or lung
disease. I researched the female smoking prevalence, as a percentage, in 15 countries from
different parts of the world, including Albania, Argentina, Australia, Belgium, Belize, Cambodia,
Canada, Estonia, Finland, France, Germany, Greece, and Guyana. I then researched the
percentage of women who survive past the age of 65 in the same countries. Anyone who smokes
or has ever smoked a cigarette would value this information. I wanted to see whether there was a
correlation between the percentage of women who smoke in these countries and the percentage
of women who live past 65. I believe that people who smoke have a worse chance of reaching the
age of 65 than those who don't.
I made a scatter plot of the data points I collected. I expected survival to age 65 to go
down as the smoking prevalence went up, but it stayed fairly constant and even showed a
positive correlation. The data did not turn out as I expected: survival was not noticeably lower
where the smoking prevalence was higher. I think survival rates also depend on other factors,
such as whether a country is poor or rich and the kind of health care it has.
I also looked at the regression equation that could describe the data points. The cubic
function that described the data points was: y = -5.099 × 10^-5 x^3 - 0.0101x^2 + 76.7716. To
figure out whether this equation was the best fit, I ran each type of regression on the data and
compared the r^2 values. Linear: r^2 = 0.45557. Quadratic: r^2 = 0.49149. Exponential:
r^2 = 0.44534. Power: r^2 = 0.4576. Cubic: r^2 = 0.49151. I chose the cubic function because it
had the highest r^2 value. Even so, that value is still well below 1, the value that indicates a
perfect fit. This means that the data was fairly spread out and the correlation between the two
variables was not strong.
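The comparison above can be sketched in a few lines of Python. This is only an illustration, not the calculator's own routine: the r^2 values are the ones reported above, while the names r_squared, r2_by_model, best_model, and margin are assumptions introduced for this example. The helper shows the usual definition, r^2 = 1 - SS_res / SS_tot.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# r^2 values as reported in this report (not recomputed here).
r2_by_model = {
    "linear": 0.45557,
    "quadratic": 0.49149,
    "exponential": 0.44534,
    "power": 0.4576,
    "cubic": 0.49151,
}

# Pick the model with the largest reported r^2.
best_model = max(r2_by_model, key=r2_by_model.get)

# How far ahead of the runner-up (quadratic) the cubic actually is.
margin = r2_by_model["cubic"] - r2_by_model["quadratic"]

print(best_model)
print(round(margin, 5))
```

The cubic comes out on top, but only by about 0.00002 over the quadratic, so the two functions fit the data almost equally well.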
To find a slope, we need the linear form. The linear equation for this set of data points is
y = 0.4681x + 79.21, which means that as the smoking prevalence rate goes up, so does the
survival rate. This was unexpected because I expected a negative slope. The y-intercept is
79.21, which means that when the smoking prevalence rate is 0, the predicted survival rate is
79.21. Since the survival rate is so high without smoking, it doesn't seem like smoking affects the
survival rate that much. Here are the predicted y-values for x-values of 0, 10, and 20:

x-value    y-value
0          76.77716
10         84.37
20         89.63
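For comparison with the table above, the linear model can be evaluated at the same x-values. This is a minimal sketch, assuming the slope of 0.4681 and the y-intercept of 79.21 reported in this report; the names SLOPE, INTERCEPT, and linear_prediction are introduced only for this example.

```python
SLOPE = 0.4681      # reported slope of the linear fit
INTERCEPT = 79.21   # reported y-intercept of the linear fit


def linear_prediction(x):
    """Predicted survival-to-65 percentage for a smoking prevalence of x percent."""
    return SLOPE * x + INTERCEPT


# Evaluate the line at the same x-values used in the table.
for x in (0, 10, 20):
    print(x, round(linear_prediction(x), 2))
```

Under these assumptions the line predicts about 79.21, 83.89, and 88.57, each within about 2.5 points of the tabulated values.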

I used the cubic equation that my calculator came up with and three different x-values to come
up with three predicted y-values, which are shown in the table above.
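The predicted values can also be used as a rough consistency check on the cubic equation. The sketch below is only that, a check under stated assumptions: it takes the x^3 coefficient (-5.099 × 10^-5) and the constant (76.7716) as printed, assumes the predicted values at x = 10 and x = 20 came from the same cubic, and solves for the x^2 and x coefficients that would reproduce them. The names A3, A0, TABLE, and implied_coefficients are hypothetical and exist only for this example.

```python
A3 = -5.099e-5                   # x^3 coefficient as printed in the report
A0 = 76.7716                     # constant term as printed in the report
TABLE = {10: 84.37, 20: 89.63}   # predicted y-values reported for x = 10 and x = 20


def implied_coefficients(table):
    """Solve a2*x^2 + a1*x = y - A3*x^3 - A0 at two x-values for (a2, a1)."""
    (x1, y1), (x2, y2) = sorted(table.items())
    rhs1 = y1 - A3 * x1 ** 3 - A0
    rhs2 = y2 - A3 * x2 ** 3 - A0
    # Cramer's rule on the 2x2 system in the unknowns a2 and a1.
    det = x1 ** 2 * x2 - x2 ** 2 * x1
    a2 = (rhs1 * x2 - rhs2 * x1) / det
    a1 = (x1 ** 2 * rhs2 - x2 ** 2 * rhs1) / det
    return a2, a1


a2, a1 = implied_coefficients(TABLE)
print(round(a2, 4), round(a1, 3))
```

Under these assumptions the sketch gives an x^2 coefficient of about -0.0102 and an x coefficient of about 0.867. The first is close in size to the printed 0.0101, with a negative sign; the second suggests a linear term that the equation as printed does not show. Because the predicted values are rounded to two decimal places, this is only an approximate check.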

[Scatter plot: Female Survival to Age 65. x-axis: Female Smoking Prevalence (0 to 40);
y-axis: Survival to Age 65 (0 to 100).]

Anti-smoking campaigns could use this kind of data to encourage people to stop smoking. They
could also use it to predict what will happen if female smoking prevalence increases. This
information could also educate our society and possibly help bring our smoking rates down.
Even though the data I collected did not turn out as I predicted, with the survival rate lower
where the smoking prevalence is higher, I could go further with my research and compare the
smoking rate to the rate of lung cancer and other lung diseases in the same countries I listed
before, because I found that lung cancer affects most smokers.

Citations:
www.databank.com - Female Smoking Prevalence world wide and Female Survival Rate to
age 65 worldwide
