SCRIPT & CONSOLE LungCap2

SCRIPT

#Polynomial Regression

attach(LungCapData2)

summary(LungCapData2)

#Checking Linearity

plot(Height,LungCap, main = "Polynomial Regression", las=1)

#Regression

model1<-lm(LungCap~Height)

summary(model1)

abline(model1, lwd=3, col="red")

#Wrong Method

model2<-lm(LungCap~Height+Height^2)

summary(model2)

#The summary does not change; R still reads this as the same model as model1
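
#Why nothing changes: inside a formula, ^ is a formula operator (term crossing), not arithmetic,
#so Height^2 collapses to Height. A minimal sketch on simulated data; the data frame `toy` and
#its columns `x` and `y` are made up for illustration and are not part of LungCapData2.

```r
# Simulated data with a quadratic trend (hypothetical, for illustration only)
set.seed(1)
toy <- data.frame(x = 1:20)
toy$y <- 3 + 0.5 * toy$x^2 + rnorm(20)

# Terms that R actually extracts from each formula
wrong_terms <- attr(terms(y ~ x + x^2), "term.labels")
right_terms <- attr(terms(y ~ x + I(x^2)), "term.labels")
print(wrong_terms)
print(right_terms)

# The design matrices tell the same story: only the I() version
# carries a separate squared column
print(colnames(model.matrix(y ~ x + x^2, data = toy)))
print(colnames(model.matrix(y ~ x + I(x^2), data = toy)))
```

#Wrapping the power in I() tells R to evaluate it as ordinary arithmetic before building the model.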

#Correct Method Number 1

model2<-lm(LungCap~Height + I(Height^2))

summary(model2)

#Correct Method Number 2

HeightSquare <- Height^2

model2lagi <- lm(LungCap~Height + HeightSquare)

summary(model2lagi)

summary(model2)
#Line for Model 2

lines(smooth.spline(Height, predict(model2)), col="blue", lwd=3)

#The line is curved, following the quadratic pattern in the data
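
#smooth.spline() is used above only to draw a smooth curve through the fitted values.
#A possible alternative, sketched here on simulated data, is to predict on a sorted grid
#and draw that directly. The names `toy`, `x`, `y`, `fit2` and `grid` are illustrative
#assumptions, and the coefficients used to simulate `y` are arbitrary.

```r
# Simulated data (hypothetical; not LungCapData2)
set.seed(2)
toy <- data.frame(x = runif(100, 46, 74))
toy$y <- 16 - 0.75 * toy$x + 0.0095 * toy$x^2 + rnorm(100)

# Quadratic fit; passing data = ... lets predict() accept new data later
fit2 <- lm(y ~ x + I(x^2), data = toy)

# Evenly spaced, sorted x values across the observed range
grid <- data.frame(x = seq(min(toy$x), max(toy$x), length.out = 200))
grid$y_hat <- predict(fit2, newdata = grid)

# Scatter plot with the fitted quadratic curve on top
plot(toy$x, toy$y, las = 1)
lines(grid$x, grid$y_hat, col = "blue", lwd = 3)
```

#This draws the model's own curve rather than a spline approximation of it, and it needs
#the model to be fitted with a data argument instead of attach().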

#Is model2 better than model1? Check using anova

#Comparing 2 models

anova(model1, model2)
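
#The F statistic that anova() reports for two nested models can be recomputed by hand from
#the residual sums of squares. A sketch using the RSS and residual degrees of freedom printed
#in the console output for this comparison; the variable names are illustrative, and the
#result agrees with the printed table only up to rounding of those printed values.

```r
# Values copied from the printed anova(model1, model2) table
rss1 <- 1088.41; df1 <- 652   # model1: LungCap ~ Height
rss2 <- 998.09;  df2 <- 651   # model2: adds I(Height^2)

# F = (drop in RSS per added parameter) / (residual variance of the larger model)
f_stat <- ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)
p_value <- pf(f_stat, df1 - df2, df2, lower.tail = FALSE)
print(f_stat)
print(p_value)

# With exactly one added term, F equals the squared t value of that term
# (7.675 is the printed t value of I(Height^2) in summary(model2))
t_sq <- 7.675^2
print(t_sq)
```

#Both numbers land close to the F value of 58.907 in the printed table.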

#Cubic Method --> add Height raised to the power of 3; is the improvement significant?

model3 <- lm(LungCap~Height + I(Height^2) + I(Height^3))

summary(model3)

anova(model2, model3)
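
#In the console output for model3, none of the individual coefficients is significant, even
#though the quadratic term was significant in model2. Raw powers of Height are strongly
#correlated with each other, which inflates the standard errors. One common alternative
#(not used in this script) is poly(), which builds orthogonal polynomial terms. A sketch on
#simulated data; `toy`, `x`, `y`, `raw_fit` and `orth_fit` are illustrative names.

```r
# Simulated data with a truly quadratic trend (hypothetical; not LungCapData2)
set.seed(3)
toy <- data.frame(x = runif(200, 46, 74))
toy$y <- 16 - 0.75 * toy$x + 0.0095 * toy$x^2 + rnorm(200)

# Same cubic model written with raw powers and with orthogonal polynomials
raw_fit  <- lm(y ~ x + I(x^2) + I(x^3), data = toy)
orth_fit <- lm(y ~ poly(x, 3), data = toy)

# Both parameterisations describe the same cubic, so fitted values agree
same_fit <- isTRUE(all.equal(unname(fitted(raw_fit)),
                             unname(fitted(orth_fit)),
                             tolerance = 1e-6))
print(same_fit)

# Raw powers are almost perfectly correlated over this range of x
print(cor(toy$x, toy$x^2))

# Coefficient table for the orthogonal version
print(summary(orth_fit)$coefficients)
```

#The overall fit and the anova comparison are unaffected by the choice; only the
#per-coefficient tests become easier to read.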

#Line for Model 3

lines(smooth.spline(Height, predict(model3)), col="green", lwd=3, lty=3)

#The difference from the model 2 line is negligible

#Legend

legend(46, 15,
       legend = c("model1 : linear", "model2 : poly x ^2", "model3 : poly x ^3"),
       col = c("red", "blue", "green"), lty = c(1, 1, 3), lwd = 3, bty = "n", cex = 0.9)
CONSOLE
> library(readxl)

> LungCapData2 <- read_excel("D:/MBA UGM/Kuliah online PRA MBA/Statistic for Business Decision/R Studio/LungCapData2.xlsx")

> View(LungCapData2)

> #Polynomial Regression

> attach(LungCapData2)

The following objects are masked from LungCapData2 (pos = 3):

Age, Gender, Height, LungCap, Smoke

> summary(LungCapData2)

      Age            LungCap           Height         Gender
 Min.   : 3.000   Min.   : 0.373   Min.   :46.00   Length:654
 1st Qu.: 8.000   1st Qu.: 3.943   1st Qu.:57.00   Class :character
 Median :10.000   Median : 5.643   Median :61.50   Mode  :character
 Mean   : 9.931   Mean   : 5.910   Mean   :61.14
 3rd Qu.:12.000   3rd Qu.: 7.356   3rd Qu.:65.50
 Max.   :19.000   Max.   :15.379   Max.   :74.00
    Smoke
 Length:654
 Class :character
 Mode  :character

> #Checking Linearity

> plot(Height,LungCap, main = "Polynomial Regression", las=1)

> #Regression

> model1<-lm(LungCap~Height)

> summary(model1)
Call:

lm(formula = LungCap ~ Height)

Residuals:

    Min      1Q  Median      3Q     Max
-5.2550 -0.7986 -0.0120  0.7342  6.3581

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -18.298036   0.544380  -33.61   <2e-16 ***
Height        0.395927   0.008865   44.66   <2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.292 on 652 degrees of freedom

Multiple R-squared: 0.7537, Adjusted R-squared: 0.7533

F-statistic: 1995 on 1 and 652 DF, p-value: < 2.2e-16

> abline(model1, lwd=3, col="red")

> #Wrong Method

> model2<-lm(LungCap~Height+Height^2)

> summary(model2)

Call:

lm(formula = LungCap ~ Height + Height^2)

Residuals:

    Min      1Q  Median      3Q     Max
-5.2550 -0.7986 -0.0120  0.7342  6.3581

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -18.298036   0.544380  -33.61   <2e-16 ***
Height        0.395927   0.008865   44.66   <2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.292 on 652 degrees of freedom

Multiple R-squared: 0.7537, Adjusted R-squared: 0.7533

F-statistic: 1995 on 1 and 652 DF, p-value: < 2.2e-16

> #Correct Method Number 1

> model2<-lm(LungCap~Height + I(Height^2))

> summary(model2)

Call:

lm(formula = LungCap ~ Height + I(Height^2))

Residuals:

    Min      1Q  Median      3Q     Max
-5.4031 -0.6878 -0.0076  0.6577  5.9910

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.080634   4.509553   3.566 0.000389 ***
Height      -0.750147   0.149566  -5.015 6.83e-07 ***
I(Height^2)  0.009466   0.001233   7.675 6.07e-14 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.238 on 651 degrees of freedom


Multiple R-squared: 0.7741, Adjusted R-squared: 0.7734

F-statistic: 1115 on 2 and 651 DF, p-value: < 2.2e-16

> #Correct Method Number 2

> HeightSquare <- Height^2

> model2lagi <- lm(LungCap~Height + HeightSquare)

> summary(model2lagi)

Call:

lm(formula = LungCap ~ Height + HeightSquare)

Residuals:

    Min      1Q  Median      3Q     Max
-5.4031 -0.6878 -0.0076  0.6577  5.9910

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.080634   4.509553   3.566 0.000389 ***
Height       -0.750147   0.149566  -5.015 6.83e-07 ***
HeightSquare  0.009466   0.001233   7.675 6.07e-14 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.238 on 651 degrees of freedom

Multiple R-squared: 0.7741, Adjusted R-squared: 0.7734

F-statistic: 1115 on 2 and 651 DF, p-value: < 2.2e-16

> summary(model2)

Call:

lm(formula = LungCap ~ Height + I(Height^2))


Residuals:

    Min      1Q  Median      3Q     Max
-5.4031 -0.6878 -0.0076  0.6577  5.9910

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.080634   4.509553   3.566 0.000389 ***
Height      -0.750147   0.149566  -5.015 6.83e-07 ***
I(Height^2)  0.009466   0.001233   7.675 6.07e-14 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.238 on 651 degrees of freedom

Multiple R-squared: 0.7741, Adjusted R-squared: 0.7734

F-statistic: 1115 on 2 and 651 DF, p-value: < 2.2e-16

> #Line for Model 2

> lines(smooth.spline(Height, predict(model2)), col="blue", lwd=3)

> #Is model2 better than model1? Check using anova

> #Comparing 2 models

> anova(model1, model2)

Analysis of Variance Table

Model 1: LungCap ~ Height

Model 2: LungCap ~ Height + I(Height^2)

  Res.Df     RSS Df Sum of Sq      F    Pr(>F)
1    652 1088.41
2    651  998.09  1    90.314 58.907 6.069e-14 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> #Cubic Method --> add Height raised to the power of 3; is the improvement significant?

> model3 <- lm(LungCap~Height + I(Height^2) + I(Height^3))

> summary(model3)

Call:

lm(formula = LungCap ~ Height + I(Height^2) + I(Height^3))

Residuals:

    Min      1Q  Median      3Q     Max
-5.3885 -0.6900  0.0069  0.6511  5.9936

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.293e-01  3.803e+01  -0.017    0.987
Height       9.179e-02  1.908e+00   0.048    0.962
I(Height^2) -4.567e-03  3.173e-02  -0.144    0.886
I(Height^3)  7.739e-05  1.749e-04   0.443    0.658

Residual standard error: 1.239 on 650 degrees of freedom

Multiple R-squared: 0.7742, Adjusted R-squared: 0.7731

F-statistic: 742.7 on 3 and 650 DF, p-value: < 2.2e-16

> anova(model2, model3)

Analysis of Variance Table

Model 1: LungCap ~ Height + I(Height^2)

Model 2: LungCap ~ Height + I(Height^2) + I(Height^3)

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    651 998.09
2    650 997.79  1   0.30066 0.1959 0.6582

> #Line for Model 3


> lines(smooth.spline(Height, predict(model3)), col="green", lwd=3, lty=3)

> #Legend

> legend(46,15, legend = c("model1 : linear", "model2 : poly x ^2", "model3 : poly x ^3"), col =
c("red","blue","green"), lty = c(1,1,3), lwd = 3, bty = "n", cex = 0.9)
