Exercise - MLR - Colaboratory

6/11/23, 11:26 Exercise_MLR Marcela.ipynb - Colaboratory
# As seen in class, in order to explain the effect of the predictors (X6 to X18) on the outcome,
# we must interpret the standardized regression coefficients (i.e., the coefficients for the regression
# after standardizing all variables).
# Run the standardized regression completing the current file with your own code.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as sc
import statsmodels.api as sm  # provides add_constant and OLS (and qqplot), which are called through sm below
hbat = pd.read_csv('HBAT-CSV.csv', index_col = 0)
# Select predictors (independent variables, X6 through X18) and target (dependent variable, X19: satisfaction)
# First run: confirmatory approach, all independent variables included
variables = hbat.iloc[:, 5:18]
target = hbat[["X19"]]
# Create standardized variables to obtain standardized regression coefficients

from sklearn import preprocessing
z_target = preprocessing.scale(target)
z_variables = preprocessing.scale(variables)
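As a minimal sketch of what the standardization step does, the block below z-scores a small made-up array with NumPy alone. It assumes the population standard deviation (ddof=0), which is what `preprocessing.scale` uses; the array `raw` and the helper `zscore` are hypothetical and not taken from HBAT.

```python
import numpy as np

# Hypothetical data: 5 observations of 2 predictors (not from HBAT-CSV.csv).
raw = np.array([[1.0, 10.0],
                [2.0, 14.0],
                [3.0, 15.0],
                [4.0, 19.0],
                [5.0, 22.0]])

def zscore(a):
    """Center each column on 0 and scale it to unit population std (ddof=0)."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

z = zscore(raw)
print(z.mean(axis=0))  # column means, zero up to floating-point error
print(z.std(axis=0))   # column standard deviations, one up to floating-point error
```

After this transformation every variable is measured in units of its own standard deviation, which is what makes the regression coefficients comparable across predictors.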
# Now build your regression model with the standardized variables.

# Write your code here and respond to the following questions:
Z_test = sm.add_constant(z_variables)  # add the intercept (beta_0) to the model

model_test = sm.OLS(z_target, Z_test).fit()  # sm.OLS(output, input)
predictions_test = model_test.predict(Z_test)
# Print results
model_test.summary()
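To illustrate why the `const` term of a standardized regression is essentially zero, and how standardized coefficients relate to raw ones, here is a small sketch on simulated data. It uses `np.linalg.lstsq` in place of `sm.OLS` only to keep the example dependency-light; the data, the seed, and the helper `ols` are illustrative assumptions, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical simulated data, not HBAT
X = rng.normal(size=(200, 3)) * np.array([1.0, 5.0, 0.2])
y = 2.0 + X @ np.array([1.5, -0.3, 4.0]) + rng.normal(size=200)

def ols(X, y):
    """Least-squares fit with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

raw = ols(X, y)

# Standardize predictors and outcome (population std, ddof=0), then refit.
zX = (X - X.mean(axis=0)) / X.std(axis=0)
zy = (y - y.mean()) / y.std()
std = ols(zX, zy)

# Standardized slope = raw slope * sd(x) / sd(y); the intercept vanishes.
converted = raw[1:] * X.std(axis=0) / y.std()
print(std[0])                            # intercept, zero up to rounding error
print(np.allclose(std[1:], converted))
```

The same reasoning explains a `const` estimate on the order of 1e-15 in a standardized fit: it is zero up to floating-point error.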
# Which predictors are the most important for predicting Customer Satisfaction (X19)?
# To identify the most important predictors, we consider the p-value of each coefficient:
# the important ones are those with a p-value below 0.05.
# In the summary these are x1, x2 and x7 (in table order). Because the predictors were passed as a NumPy array,
# statsmodels labels them x1 to x13 in column order, so they correspond to X6, X7 and X12.
# Among them, x7 has the largest standardized coefficient (0.7444), followed by x1 (0.4422) and x2 (-0.2681).
# Which are the least important?

# The least significant predictors are x10, x13, x5 and x3 (X15, X18, X10 and X8), all with p-values above 0.58.
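The selection described above can be sketched in a few lines. The coefficients and p-values are copied from the regression summary in this document; the mapping from the labels x1..x13 to the columns X6..X18 is an assumption based on the order of `hbat.iloc[:, 5:18]`.

```python
# Standardized coefficients and p-values as reported in the summary (x1..x13).
coefs = [0.4422, -0.2681, 0.0452, 0.1564, -0.0325, 0.3999, 0.7444,
         -0.0615, -0.0736, -0.0037, 0.1115, 0.2408, -0.1535]
pvals = [0.000, 0.001, 0.589, 0.140, 0.585, 0.178, 0.000,
         0.328, 0.397, 0.941, 0.175, 0.385, 0.629]

# Assumption: xk is the k-th selected column, i.e. X(k + 5).
rows = [(f"x{k}", f"X{k + 5}", b, p)
        for k, (b, p) in enumerate(zip(coefs, pvals), start=1)]

# Keep predictors significant at the 5% level, then rank by |beta|.
significant = [r for r in rows if r[3] < 0.05]
ranked = sorted(significant, key=lambda r: abs(r[2]), reverse=True)

for label, column, beta, p in ranked:
    print(f"{label} ({column}): beta = {beta:+.4f}, p = {p:.3f}")
```

Ranking by the absolute standardized coefficient, rather than by p-value alone, follows the exercise statement that effects are read from the standardized coefficients.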
output
https://colab.research.google.com/drive/1ZQYrKemp4yrvormaDQTnl6n3cSvBrnu-#scrollTo=_ts4uIuOKyYB&printMode=true 1/2
OLS Regression Results

Dep. Variable:     y                  R-squared:            0.804
Model:             OLS                Adj. R-squared:       0.774
Method:            Least Squares      F-statistic:          27.11
Date:              Mon, 06 Nov 2023   Prob (F-statistic):   5.89e-25
Time:              10:26:11           Log-Likelihood:       -60.449
No. Observations:  100                AIC:                  148.9
Df Residuals:      86                 BIC:                  185.4
Df Model:          13
Covariance Type:   nonrobust

              coef   std err          t    P>|t|    [0.025    0.975]
const    1.311e-15     0.048   2.75e-14    1.000    -0.095     0.095
x1          0.4422     0.062      7.161    0.000     0.319     0.565
x2         -0.2681     0.080     -3.341    0.001    -0.428    -0.109
x3          0.0452     0.083      0.542    0.589    -0.121     0.211
x4          0.1564     0.105      1.489    0.140    -0.052     0.365
x5         -0.0325     0.059     -0.548    0.585    -0.151     0.086
x6          0.3999     0.294      1.359    0.178    -0.185     0.985
x7          0.7444     0.091      8.155    0.000     0.563     0.926
x8         -0.0615     0.062     -0.985    0.328    -0.186     0.063
x9         -0.0736     0.086     -0.852    0.397    -0.245     0.098
x10        -0.0037     0.050     -0.074    0.941    -0.102     0.095
x11         0.1115     0.081      1.369    0.175    -0.050     0.273
x12         0.2408     0.276      0.873    0.385    -0.307     0.789
x13        -0.1535     0.317     -0.485    0.629    -0.783     0.476

Omnibus:          5.761    Durbin-Watson:       2.285
Prob(Omnibus):    0.056    Jarque-Bera (JB):    5.850
Skew:            -0.587    Prob(JB):            0.0537
Kurtosis:         2.838    Cond. No.            19.9

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

# Now interpret the value of the coefficient associated with X6.
# X6 is the first predictor column, so it appears as x1 in the summary (statsmodels labels the array columns
# x1 to x13 in order; the row labelled x6 is the sixth predictor, X11).
# The X6 coefficient is 0.4422. This means that for each increase of 1 standard deviation in X6 (Product Quality),
# the dependent variable X19 (Customer Satisfaction) is expected to increase by 0.4422 standard deviations,
# assuming all other variables stay constant.
# Since the coefficient is positive, a better perception of product quality is associated with higher customer
# satisfaction. Nevertheless, the p-value must also be considered to determine its importance.
# In general, a coefficient of a standardized model represents the change in the dependent variable (Y), in standard
# deviations, for a 1 standard deviation change in the independent variable (X).

# How does X6 (the perception of Product Quality) affect X19 (Customer Satisfaction)?
# X6 (x1 in the summary) is a significant predictor of customer satisfaction: its p-value is 0.000, below 0.05,
# and its 95% confidence interval [0.319, 0.565] excludes zero. The effect is positive.
# By contrast, the row labelled x6 (X11) has a positive coefficient of 0.3999 but a p-value of 0.178, which is
# greater than 0.05, so it has no statistically significant effect on customer satisfaction. That variable could
# be dropped to obtain a more parsimonious model.

# Remember to interpret the standardized coefficients as standard deviations!
# Since this is a standardized model, all variables were standardized before fitting.
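The fit statistics in the OLS summary can be cross-checked by hand. The sketch below recomputes the adjusted R-squared and the F-statistic from the reported R-squared, number of observations, and number of predictors. Because R-squared is only reported to three decimals, the recomputed F-statistic is expected to differ slightly from the reported 27.11. The variable names are illustrative.

```python
# Values reported in the summary.
r2 = 0.804     # R-squared
n_obs = 100    # No. Observations
k = 13         # Df Model (number of predictors)

# Residual degrees of freedom: observations minus predictors minus intercept.
df_resid = n_obs - k - 1

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / df_resid

# Overall F-statistic: explained variance per predictor over
# unexplained variance per residual degree of freedom.
f_stat = (r2 / k) / ((1 - r2) / df_resid)

print(df_resid)
print(round(adj_r2, 3))
print(round(f_stat, 2))
```

The adjusted R-squared (0.774) is lower than R-squared (0.804) because the model spends 13 degrees of freedom on predictors, several of which are not significant.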
