AB1202 Statistics and Analysis: Non-Linear Regression
Non-Linear Regression
• Variable Transformation
• Exponential Models
• Power Index Models
• Logistic Regression
NBS 2016S1 AB1202 CCK-STAT-018
Variable Transformation
• What we want is to re-use (linear) multiple regression
techniques.
• So we transform the variables so that the data lie closer
to a line, a plane, or (in higher dimensions) a
hyperplane.
• In theory we can use any function, however complicated
(even one that aligns the data perfectly to a straight line).
But in practice we stick to standard functions, applied in as
simple and understandable a way as possible.
▫ $x \to \log(x)$, $x \to \ln(x)$, $x \to \sqrt{x}$, $x \to x^2$, $x \to \frac{1}{x}$,
$x \to \cos(x)$, $x \to \sin(x)$, $x \to e^{-x}$
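As a rough illustration of why a transformation helps, the sketch below fits a straight line to raw x and to ln(x) and compares the two fits. The data are synthetic and the names `fit_raw`, `fit_log`, `r2_raw` and `r2_log` are made up for this example; none of them come from the course material.

```r
# Synthetic data (an assumption for illustration): y grows roughly with ln(x).
set.seed(1)
x <- 1:50
y <- 2 + 3 * log(x) + rnorm(length(x), sd = 0.2)

# Fit y against raw x, then against the transformed variable ln(x).
fit_raw <- lm(y ~ x)
fit_log <- lm(y ~ log(x))

# Compare how well each straight line fits the data.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
print(c(raw = r2_raw, transformed = r2_log))
```

Because the synthetic data were generated from a logarithmic relationship, the transformed fit should show the higher R-squared; with real data the right transformation has to be chosen by inspecting plots and residuals.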
Exponential Models
• In exponential models, the relationship between the
outcome variable 𝑦 and explanatory variables 𝑥1 and 𝑥2 is:
$y = b_0 \, b_1^{x_1} \, b_2^{x_2}$
• Applying the ln() function to both sides, we get the transformed
linear model:
$\ln y = \ln b_0 + x_1 \ln b_1 + x_2 \ln b_2$
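A minimal sketch of this transformation in R, assuming synthetic data generated from an exponential model with b0 = 5, b1 = 1.2 and b2 = 0.9 (values chosen only for illustration). It regresses ln(y) on x1 and x2, then exponentiates the fitted coefficients to recover estimates of b0, b1 and b2.

```r
# Synthetic data (an assumption): generated from y = b0 * b1^x1 * b2^x2
# with b0 = 5, b1 = 1.2, b2 = 0.9 and small multiplicative noise.
set.seed(42)
n  <- 100
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 10)
y  <- 5 * 1.2^x1 * 0.9^x2 * exp(rnorm(n, sd = 0.05))

# Linear regression on the transformed outcome ln(y).
fit <- lm(log(y) ~ x1 + x2)

# The fitted coefficients estimate ln(b0), ln(b1), ln(b2);
# exponentiate them to return to the original scale.
b <- exp(coef(fit))
names(b) <- c("b0", "b1", "b2")
print(round(b, 3))
```

Note that `log()` in R is the natural logarithm, so it matches ln() in the equations above.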
Power Index Models
• R code: log-transform both variables, then fit a linear model.
# datatext holds the raw data as text; its definition is not shown in this section.
d <- read.delim(textConnection(datatext),
                header = TRUE, sep = "",
                strip.white = TRUE)
# Store the natural logs of y and x in the data frame.
d$lny <- log(d$y)
d$lnx <- log(d$x)
# Regress ln(y) on ln(x) and print the fit summary.
model <- lm(d$lny ~ d$lnx)
summary(model)
• R output:
Call: lm(formula = d$lny ~ d$lnx)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.0625     0.9760   4.162   0.0088 **
d$lnx        -0.6181     0.2787  -2.218   0.0774 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2577 on 5 degrees of freedom
Multiple R-squared: 0.4958, Adjusted R-squared: 0.395
F-statistic: 4.918 on 1 and 5 DF, p-value: 0.07736
• Power Index model is: $\ln(y) = 4.0625 - 0.6181 \ln(x)$.
So, $y = 58.1194 \times x^{-0.6181}$.
• Model significance: p-value = 0.07736. The model is NOT significant
at 5%, but significant at 10%.
• $R^2$ and adj-$R^2$ are very low and not acceptable.
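The back-transformation and the reported statistics can be re-derived from the rounded figures in the output. The sketch below assumes the power form y = b0 × x^b1 implied by the fitted equation; `predict_y` is a hypothetical helper added for illustration, not part of the original slides.

```r
# Coefficients as reported in the summary(model) output.
ln_b0 <- 4.0625
b1    <- -0.6181

# Back-transform the intercept: b0 = exp(ln b0).
b0 <- exp(ln_b0)

# Hypothetical helper (not in the original): predict y on the original scale.
predict_y <- function(x) b0 * x^b1

# Re-derive the model p-value from the rounded F-statistic (1 and 5 DF).
p_value <- pf(4.918, df1 = 1, df2 = 5, lower.tail = FALSE)

# Re-derive adjusted R-squared; n = 7 observations follows from
# 5 residual degrees of freedom plus 2 estimated coefficients.
n      <- 7
adj_r2 <- 1 - (1 - 0.4958) * (n - 1) / (n - 2)

print(c(b0 = b0, p_value = p_value, adj_r2 = adj_r2))
```

Since the inputs are rounded, the recomputed values agree with the printed output only to about three or four significant figures.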
Logistic Regression
• Logistic regression is used to find a model that predicts the
likely outcome of a binary (two-valued) variable from other known data.
• E.g., campaign votes usually end up with two outcomes:
Brexit: Leave vs Remain; US Presidential Election: Clinton vs
Trump; NTU Student Union Presidential Election: Keller vs
Wayne.
• If we know demographics of sample voters and their vote,
can we find a model that describes the voting outcome of any
given individual?
• Other applications:
▫ Predicting rain/no-rain, knowing past wind speeds, cloud
heights, sunny days, etc.
▫ Predicting business or investment success/failure, knowing
revenue, profit, assets, liabilities, management experience, etc.
▫ Predicting surgery success/failure, knowing patient vital signs,
medical treatment done, etc.
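As a rough sketch of the voting example, the code below fits a logistic regression with R's `glm()` and the binomial family. The voter data are simulated, and the single predictor `age`, the data frame `voters` and the relationship used to generate the votes are all assumptions made for illustration.

```r
# Synthetic voter data (an assumption for illustration only).
set.seed(7)
n      <- 500
age    <- runif(n, 18, 80)
# Simulated relationship: older voters are more likely to vote 1.
p_true <- plogis(-3 + 0.06 * age)
vote   <- rbinom(n, size = 1, prob = p_true)
voters <- data.frame(age = age, vote = vote)

# Logistic regression: binomial family with the default logit link.
fit <- glm(vote ~ age, data = voters, family = binomial)

# Predicted probability of vote == 1 for two hypothetical individuals.
new_voters <- data.frame(age = c(25, 70))
p_hat <- predict(fit, newdata = new_voters, type = "response")
print(round(p_hat, 3))
```

With `type = "response"` the predictions are probabilities between 0 and 1; a cut-off such as 0.5 would then be needed to turn them into a predicted binary outcome.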