BStat Assignment 3
Multiple Regression
Using the data collected in Part I, we apply multiple regression to build four regression models,
one for each of the four country income groups, that predict internet usage.
First, we use Microsoft Excel’s regression analysis tool and apply backward elimination to retain only
the statistically significant variables. Given the required significance level of 5% (α = 0.05,
α/2 = 0.025), we use the p-value method to decide which variables are significant.
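The elimination loop described above can be sketched in a few lines of Python. This is only an illustration of the procedure, not the Excel workflow itself: the function `backward_eliminate`, the variable names and the p-values in `HYPOTHETICAL_FITS` are all invented for the example, and in practice the p-values would come from refitting the regression at every step.

```python
from typing import Callable, Dict, List

ALPHA = 0.05  # significance level stated in the text


def backward_eliminate(
    variables: List[str],
    fit_p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = ALPHA,
) -> List[str]:
    """Drop the least significant variable until every p-value is below alpha."""
    remaining = list(variables)
    while remaining:
        p_values = fit_p_values(remaining)  # refit the model on what is left
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining variable is significant
        remaining.remove(worst)
    return remaining


# Hypothetical p-values, keyed by the set of variables in the model.
# They stand in for the regression output produced at each step.
HYPOTHETICAL_FITS = {
    frozenset({"gni", "electricity", "population"}): {
        "gni": 0.010, "electricity": 0.400, "population": 0.620,
    },
    frozenset({"gni", "electricity"}): {"gni": 0.004, "electricity": 0.210},
    frozenset({"gni"}): {"gni": 0.001},
}


def hypothetical_fit(names: List[str]) -> Dict[str, float]:
    """Look up the made-up p-values for the current set of variables."""
    return HYPOTHETICAL_FITS[frozenset(names)]


selected = backward_eliminate(["gni", "electricity", "population"], hypothetical_fit)
print(selected)
```

With these made-up numbers the loop removes one variable per pass and stops as soon as the largest remaining p-value is below 0.05.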
The final regression models obtained by backward elimination are presented below:
Figure 3.1: Regression output and scatter plot for high-income countries.
b. Regression equation:
Y^ = 73.36 + 0.000312X
Y^ = Individuals using the internet (% of the population)
X = GNI per capita (current US$)
According to the model, if GNI per capita increases by 1 unit (1 current USD), internet usage in
high-income countries is predicted to increase by 0.000312 percentage points. Equivalently, each
1,000 USD increase in GNI per capita corresponds to a 0.312 percentage-point increase in internet usage.
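As a quick check of this arithmetic, the fitted equation can be evaluated directly. The sketch below uses the coefficients reported for the high-income model; the function name `predict_internet_usage` and the two GNI values are chosen only for illustration.

```python
B0 = 73.36     # intercept of the high-income model
B1 = 0.000312  # slope: percentage points per 1 USD of GNI per capita


def predict_internet_usage(gni_per_capita: float) -> float:
    """Predicted share of individuals using the internet (% of population)."""
    return B0 + B1 * gni_per_capita


# Two illustrative GNI per capita values that differ by 1,000 USD.
low = predict_internet_usage(30_000)
high = predict_internet_usage(31_000)
change = high - low

print(f"Prediction at 30,000 USD: {low:.3f}%")
print(f"Prediction at 31,000 USD: {high:.3f}%")
print(f"Change per 1,000 USD: {change:.3f} percentage points")
```

Because the model is linear, the change per 1,000 USD is the same whichever starting value is used.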
R-squared equals 0.4362 (Figure 3.1), which means that 43.62% of the variation in internet usage
is explained by the independent variable. The remaining 56.38% is attributed to other variables
not included in the model.
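To make the meaning of R-squared concrete, the sketch below fits a simple least-squares line and computes R-squared as one minus the ratio of residual variation to total variation. The data points are made up for illustration and are not taken from the assignment's dataset; the helper names are likewise assumptions.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def r_squared(xs: List[float], ys: List[float], b0: float, b1: float) -> float:
    """Share of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Made-up sample, used only to show the calculation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_line(xs, ys)
r2 = r_squared(xs, ys, b0, b1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, R-squared = {r2:.2f}")
```

For this toy sample the line explains 60% of the variation in y, and the other 40% is left in the residuals.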
b. Regression equation:
Y^ = -159.699 + 2.253X
Y^ = Individuals using the internet (% of the population)
X = Access to electricity (% of the population)
According to the model, if access to electricity increases by 1 unit (1% of the population), internet
usage in upper-middle-income countries is predicted to increase by 2.253 percentage points.
Equivalently, each 10-percentage-point increase in access to electricity corresponds to a 22.53
percentage-point increase in internet usage.
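Because this model has a large negative intercept, it helps to see which values of X give a sensible prediction. The sketch below evaluates the reported equation at a few access levels and finds where the fitted line crosses zero; the function name and the chosen access levels are illustrative assumptions.

```python
B0 = -159.699  # intercept of the upper-middle-income model
B1 = 2.253     # slope: percentage points per 1% of population with electricity


def predict_internet_usage(access_to_electricity: float) -> float:
    """Predicted share of individuals using the internet (% of population)."""
    return B0 + B1 * access_to_electricity


# Access level at which the fitted line predicts exactly 0% internet usage.
zero_crossing = -B0 / B1

at_full_access = predict_internet_usage(100.0)
ten_point_change = predict_internet_usage(100.0) - predict_internet_usage(90.0)

print(f"Prediction at 100% access: {at_full_access:.3f}%")
print(f"Change for a 10-point rise in access: {ten_point_change:.2f} points")
print(f"Prediction reaches 0% at about {zero_crossing:.1f}% access")
```

Below roughly 70.9% access the equation returns a negative percentage, so it should only be read within the range of electricity access observed in the sample.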
R-squared equals 0.324 (Figure 3.2), which means that 32.4% of the variation in internet usage is
explained by the independent variable. The remaining 67.6% is attributed to other variables not
included in the model.
b. Regression equation
Y^ = b0 + b1X
Y^ = -2.3902 + 0.0192X
Where:
b0 = -2.3902
b1 = 0.0192
R-squared equals 0.7007 (Figure 3.3), which means that 70.07% of the variation in internet usage
is explained by the independent variable. The remaining 29.93% is attributed to other variables
not included in the model.
Figure 3.4: Regression output and scatter plot for low-income countries.
b. Regression equation
Y^ = 10.077 + 0.282X
Where:
b0 = 10.077
b1 = 0.282
According to the model, if access to electricity increases by 1 unit (1% of the population), internet
usage in low-income countries is predicted to increase by 0.282 percentage points. Equivalently,
each 10-percentage-point increase in access to electricity corresponds to a 2.82 percentage-point
increase in internet usage.
R-squared equals 0.801 (Figure 3.4), which means that 80.1% of the variation in internet usage is
explained by the independent variable. The remaining 19.9% is attributed to other variables not
included in the model.