Assignment SPSS - 6
Dear student,
Below you find the assignment of Week 5 of Research and Statistics.
- Complete the assignment using SPSS and put the answers/tables in the file below.
- Upload the assignment as a PDF to Canvas.
- Go to the online quiz at https://tueindhoven.limequery.com/488385?newtest=Y&lang=en
and fill in the answers there.
Please enter the ID number shown in the orange section of your student ID card.
The computer will process your answers and you will receive personalized feedback on what
you did correctly or incorrectly.
Contents:
• multiple regression for predicting and explaining
• method stepwise and enter
• multicollinearity
• dummy regression
Roman Oana
1656309
Case 1: predicting thermal sensation of museum visitors
Open the file ‘museum2_s.sav’. It contains the following variables:
therm_sens : perceived warmth (how do you feel at the moment?):
-3=cold, -2=cool, -1=slightly cool, 0=neutral, 1=slightly warm, 2=warm and 3=hot
temp_in : the current temperature inside the building (degrees Celsius)
rh_in : the relative humidity of the air inside the building
gender : 0=male, 1=female
age : age of the person in years
clo : index from 1 to 10 of how warm the clothes of the visitor are
We want to predict therm_sens using as predictors the variables: gender, age, clo, temp_in and rh_in.
Descriptive Statistics
                     N      Minimum  Maximum  Mean    Std. Deviation
Age                  1140   9        91       55.82   17.827
gender               1140   0        1        0.64    0.480
therm_sens           1140   -3       3        -0.7    0.902
Clo                  1140   1        10       10      1.276
temp_in              1140   19       24       21.18   1.107
rh_in                1140   40       60       50.40   3.285
Valid N (listwise)   1140
age2
Frequency Percent Valid Percent Cumulative Percent
Valid under 18
between 18 and 64
65 years or older
Total
gender
                Frequency  Percent  Valid Percent  Cumulative Percent
Valid  0        409        35.9     35.9           35.9
       1        731        64.1     64.1           100
       Total    1140       100      100
Correlations
                                  therm_sens  gender   age      clo      temp_in  rh_in
therm_sens  Pearson Correlation   1           -0.079   -0.028   0.140    0.211    -0.164
            Sig. (2-tailed)                   0.008    0.344    0.001    0.001    0.001
            N                     1140        1140     1140     1140     1140     1140
gender      Pearson Correlation   -0.079      1        0.057    -0.042   0.002    0.013
            Sig. (2-tailed)       0.008                0.054    0.158    0.956    0.655
            N                     1140        1140     1140     1140     1140     1140
Age         Pearson Correlation   -0.028      0.057    1        0.045    -0.064   -0.071
            Sig. (2-tailed)       0.344       0.054             0.126    0.031    0.061
            N                     1140        1140     1140     1140     1140     1140
Clo         Pearson Correlation   0.140       -0.042   0.045    1        -0.099   -0.225
            Sig. (2-tailed)       0.001       0.158    0.126             0.001    0.001
            N                     1140        1140     1140     1140     1140     1140
temp_in     Pearson Correlation   0.211       0.002    -0.064   -0.099   1        0.115
            Sig. (2-tailed)       0.001       0.956    0.031    0.001             0.001
            N                     1140        1140     1140     1140     1140     1140
rh_in       Pearson Correlation   -0.164      0.013    -0.071   -0.225   0.115    1
            Sig. (2-tailed)       0.001       0.655    0.061    0.001    0.001
            N                     1140        1140     1140     1140     1140     1140
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
3. Make a regression model to predict therm_sens based on the other available variables
(gender, age, clo, temp_in and rh_in). Use the ENTER method.
In table below:
- Report the (unstandardized) coefficients.
- Report the significance level:
A. Not significant
B. Significant at 10%
C. Significant at 5%
D. Significant at 1%
coeff significance
temp_in 0.196 D
rh_in -0.045 D
gender -0.133 C
Age -0.001 A
Clo 0.088 D
Coefficients (a)
Model            B       Std. Error  Beta    t       Sig.
1  (Constant)    -2.109  0.637               -3.311  0.001
   Gender        -0.133  0.053       -0.071  -2.501  0.013
   Age           -0.001  0.001       -0.026  -0.919  0.358
   Clo           0.088   0.020       0.125   4.302   0.001
   temp_in       0.196   0.023       0.241   8.466   0.001
   rh_in         -0.045  0.008       -0.164  -5.653  0.001
a. Dependent Variable: therm_sens
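The answer codes A to D can be derived mechanically from the p-values in the last column of the Coefficients table. The sketch below is only an illustration in Python, not part of the SPSS procedure. The function name significance_category is made up for this example, and it assumes that each coefficient gets the strictest level it reaches.

```python
def significance_category(p_value: float) -> str:
    """Map a two-tailed p-value to the assignment's answer codes A-D.

    Assumption: a coefficient is reported at the strictest level it reaches.
    """
    if p_value < 0.01:
        return "D"  # significant at 1%
    if p_value < 0.05:
        return "C"  # significant at 5%
    if p_value < 0.10:
        return "B"  # significant at 10%
    return "A"      # not significant


# p-values copied from the Coefficients table of Case 1
p_values = {"gender": 0.013, "age": 0.358, "clo": 0.001,
            "temp_in": 0.001, "rh_in": 0.001}

categories = {name: significance_category(p) for name, p in p_values.items()}
print(categories)
```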
gender:
A. Controlling for all other variables, the score on thermal sensation decreases by 0.133 if the visitor
is male;
B. Controlling for all other variables, the score on thermal sensation decreases by 0.133 if the visitor
is female;
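One way to check the reading of a dummy coefficient is to predict the score for two visitors who differ only on gender. The sketch below uses the unstandardized coefficients reported for Case 1. The visitor profile (age 50, clo 5, 21 degrees, 50% humidity) is hypothetical and chosen only for illustration; any other profile gives the same difference.

```python
# Unstandardized coefficients copied from the Case 1 Coefficients table
COEF = {"const": -2.109, "gender": -0.133, "age": -0.001,
        "clo": 0.088, "temp_in": 0.196, "rh_in": -0.045}


def predict_therm_sens(gender, age, clo, temp_in, rh_in):
    """Predicted thermal sensation from the fitted regression equation."""
    return (COEF["const"]
            + COEF["gender"] * gender
            + COEF["age"] * age
            + COEF["clo"] * clo
            + COEF["temp_in"] * temp_in
            + COEF["rh_in"] * rh_in)


# Hypothetical visitor profile; only the gender code differs between the two calls
profile = dict(age=50, clo=5, temp_in=21, rh_in=50)
male = predict_therm_sens(gender=0, **profile)    # gender coded 0 = male
female = predict_therm_sens(gender=1, **profile)  # gender coded 1 = female

difference = female - male
print(round(difference, 3))
```

Because gender is coded 0=male and 1=female, the coefficient is the predicted score of a female visitor minus that of an otherwise identical male visitor.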
5. What is the goodness-of-fit of the model? 0.102
Is this a high or a low fit?
A High
B Low
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      0.319  0.102     0.098              0.857
a. Predictors: (Constant), rh_in, gender, age, temp_in, clo
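The three fit figures in the Model Summary are linked: R Square is the square of R, and Adjusted R Square corrects R Square for the number of predictors. The sketch below recomputes both from the rounded table values, assuming the standard adjustment formula 1 - (1 - R²)(n - 1)/(n - k - 1) with n = 1140 cases and k = 5 predictors.

```python
n = 1140          # number of cases (Valid N listwise)
k = 5             # predictors: rh_in, gender, age, temp_in, clo
r = 0.319         # multiple correlation R from the Model Summary
r_square = 0.102  # R Square from the Model Summary

# R Square should equal R squared, up to rounding in the table
r_square_from_r = r ** 2

# Standard adjustment for the number of predictors
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(round(r_square_from_r, 3), round(adjusted_r_square, 3))
```

With this many cases and only five predictors the adjustment is small, which is why the two values lie close together.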
Case 2: predicting people’s satisfaction with green in
neighborhood public spaces.
The dataset you are going to analyze in this case is gathered from the questionnaire ‘green in public
spaces’.
The survey includes an experiment where the respondent is asked to indicate his or her evaluation of
hypothetical neighborhoods. The neighborhoods that are presented vary in terms of a number of
attributes. The participants are randomly allocated to one of the two versions of the experiment.
In both versions VR is used to present the neighborhoods. The versions differ in the mode used: a
game-like mode in which the respondent can navigate freely, and a movie mode in which the respondent
appears to walk through the environment without being able to navigate.
The main research question is: what are the effects of the attributes varied in the experiment on the
satisfaction judgement, and what is the effect of VR mode on it?
Each respondent has received four neighborhood variants. Each row in the SPSS table is a judgement
of a neighborhood by a person (so there are 4 rows for each person). A judgment is the unit of
analysis.
1. satgreen Degree of satisfaction with the neighborhood (increasing scale from 0 to 24)
2. gender Gender (1=Male, 0=Female)
3. garden Private garden around your house ( 1=yes, 0=no)
4. size Size of the public space (1=1500m2, 0=750m2)
5. surface Surface of the public space(1=grass, 0=pavement)
6. water Water element in open space ( 1=yes, 0=no)
7. streetgrass Grass along the street (1=yes, 0=no)
8. tree Trees(1=yes, 0=no)
9. vertgreen Vertical greening on facades (1=yes, 0=no)
10. avestories Average number of surrounding building stories (1=six stories, 0=three
stories)
11. mode Mode of the presented virtual environment (0=movie, 1=game)
The variables 4-10 are attributes of the neighborhoods that are varied in the satisfaction judgement
task.
Mode
                Frequency  Percent  Valid Percent  Cumulative Percent
Valid  movie    524        52.2     52.2           52.2
       game     480        47.8     47.8           100.0
       Total    1004       100      100
9. Conduct an independent samples t-test to determine whether there is a difference in the
average satisfaction score between the two modes.
Define Mode as the grouping variable and Satgreen as the dependent variable. Conduct the test
two-tailed, since we have no a priori expectation about the direction in which the modes differ.
9a. What is the average satisfaction score in the movie-mode group? __14.47___
9b. What is the average satisfaction score in the game-mode group? __14.82___
10. Report the t-value of this test (choose the right t-value depending on whether equal
variances may or may not be assumed). The t-value is __-1.089___
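The reported means and t-value can be tied together with a small calculation. The sketch below derives the mean difference and the standard error implied by the t-value, and approximates the two-tailed p-value with the standard normal distribution. That approximation is an assumption made here (reasonable with about 1000 judgements); SPSS itself uses the t distribution. Because the means are rounded, the results are approximate.

```python
import math

mean_movie = 14.47  # answer 9a (mode = 0)
mean_game = 14.82   # answer 9b (mode = 1)
t_value = -1.089    # answer 10

# The negative t-value matches a difference taken as movie minus game
mean_difference = mean_movie - mean_game

# t = difference / standard error, so the implied standard error is:
standard_error = mean_difference / t_value

# Two-tailed p-value, approximating the t distribution by the standard normal
p_two_tailed = math.erfc(abs(t_value) / math.sqrt(2))

print(round(mean_difference, 2), round(standard_error, 3), round(p_two_tailed, 3))
```

A p-value of this size is well above 0.05, so the bivariate test gives no evidence of a difference between the two modes.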
Correlation coefficient
Gender -0.049
garden 0.061
Size of the public space -0.020
Surface of the public space 0.183
water 0.094
Grass along the street 0.097
Trees 0.471
Vertical greening on facades 0.206
Average number of surrounding building stories 0.053
Mode 0.280
13. Which of the variables has the strongest correlation with satisfaction and which one the
weakest?
H. Vertical greening on facades
I. Average number of surrounding building stories
J. Mode
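The strength of a correlation is judged by its absolute value; the sign only gives the direction. A minimal sketch that ranks the coefficients from the table above this way (the dictionary keys are simply the row labels of that table):

```python
# Correlations with satgreen, copied from the correlation table
correlations = {
    "Gender": -0.049,
    "garden": 0.061,
    "Size of the public space": -0.020,
    "Surface of the public space": 0.183,
    "water": 0.094,
    "Grass along the street": 0.097,
    "Trees": 0.471,
    "Vertical greening on facades": 0.206,
    "Average number of surrounding building stories": 0.053,
    "Mode": 0.280,
}

# Compare on absolute value so that negative correlations are ranked correctly
strongest = max(correlations, key=lambda name: abs(correlations[name]))
weakest = min(correlations, key=lambda name: abs(correlations[name]))

print(strongest, weakest)
```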
Focus now on the correlation coefficients of the variables that are related to attributes of the
neighborhood
14. Do all the correlations that ARE significant have the expected sign?
Report in the table below
A. Not significant
B. No
C. Yes
D. No prior expectation
Expected sign
16. Conduct a multiple regression-analysis with satgreen as dependent variable and all other
variables as predictors.
Use the ENTER method.
Report the unstandardized coefficients from the final model, and
report the significance level:
A. Not significant
B. Significant at 5%
C. Significant at 1%
coefficient significance
Gender -0.356 A
Garden 0.768 C
Size of the public space -0.034 A
Surface of the public space 1.962 C
water 0.656 B
Grass along the street 0.735 C
Trees 4.684 C
Vertical greening on facades 2.063 C
Average number of surrounding building stories 0.384 A
Mode 0.191 A
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1  (Constant)    9.108    0.456                                            19.962   0.001
A first question is whether VR mode influences the average satisfaction score (satgreen). The
independent samples t-test already provided an answer to this question. However, the t-test is a
bivariate analysis. The multiple regression analysis that you do next is a multivariate analysis. It
provides a more definitive answer, since any differences between the two mode groups on the other
independent variables are then controlled for.
17b. The value of XXX is __0.191__ (fill in the value of XXX that is referred to in the answer
options A, B and C)
A second question is what the preferences of people (students of the Research and statistics course)
are regarding the different attributes.
Indicate in the table below which attributes have a significant effect on satgreen, use:
A. Not significant
B. Significant at 5%
C. Significant at 1%
                                                                                      Significance
size         Size of the public space (1=1500m2, 0=750m2)                             A
surface      Surface of the public space (1=grass, 0=pavement)                        C
water        Water element in open space (1=water, 0=no water)                        B
streetgrass  Grass along the street (1=yes, 0=no)                                     C
tree         Trees (1=yes, 0=no)                                                      C
vertgreen    Vertical greening on facades (1=yes, 0=no)                               C
avestories   Average number of surrounding building stories (1=six stories,           A
             0=three stories)
18b. The value of XXX is __-0.034__ (fill in the value of XXX that is referred to in the answer
options A, B and C)
Which attribute has the largest influence on the degree of satisfaction? (Make sure that you look at the
standardized beta coefficient).
19. The most important attribute is: ___Trees___
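Standardized betas are comparable across predictors because each unstandardized coefficient is rescaled by the standard deviations: Beta = B × (std. deviation of the predictor) / (std. deviation of the dependent variable). The standard deviations for Case 2 are not included in this document, so the sketch below illustrates the relation with the Case 1 figures instead. Small deviations from the SPSS Beta column come from rounding in the reported B values; age is left out because its B (-0.001) is rounded too coarsely.

```python
# Case 1 figures: unstandardized B, std. deviation of the predictor, Beta reported by SPSS
case1 = {
    "gender":  {"b": -0.133, "sd": 0.480, "beta_reported": -0.071},
    "clo":     {"b": 0.088,  "sd": 1.276, "beta_reported": 0.125},
    "temp_in": {"b": 0.196,  "sd": 1.107, "beta_reported": 0.241},
    "rh_in":   {"b": -0.045, "sd": 3.285, "beta_reported": -0.164},
}
sd_therm_sens = 0.902  # std. deviation of the dependent variable

# Beta = B * sd(x) / sd(y)
beta = {name: row["b"] * row["sd"] / sd_therm_sens for name, row in case1.items()}

# The predictor with the largest absolute Beta has the largest influence
most_important = max(beta, key=lambda name: abs(beta[name]))

for name, value in beta.items():
    print(name, round(value, 3), case1[name]["beta_reported"])
print(most_important)
```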
Predict the satisfaction score of the neighborhood of a female student who has a private
garden.
The predicted satisfaction score is ___18.969___
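A prediction is obtained by filling in the regression equation with the unstandardized coefficients from question 16. The description of the neighborhood is not fully given in the question text here, so the attribute values in the sketch below are an assumption: they are the combination that reproduces the reported score (750m2, grass surface, no water, no street grass, trees, vertical greening, six stories, movie mode). Only gender (female = 0) and garden (yes = 1) come from the question itself.

```python
# Unstandardized coefficients from the regression in question 16
COEF = {"const": 9.108, "gender": -0.356, "garden": 0.768, "size": -0.034,
        "surface": 1.962, "water": 0.656, "streetgrass": 0.735, "tree": 4.684,
        "vertgreen": 2.063, "avestories": 0.384, "mode": 0.191}


def predict_satgreen(**values):
    """Constant plus the sum of coefficient times value for every predictor."""
    return COEF["const"] + sum(COEF[name] * value for name, value in values.items())


# gender and garden follow the question; ALL other values are ASSUMED for illustration
respondent = dict(gender=0, garden=1, size=0, surface=1, water=0,
                  streetgrass=0, tree=1, vertgreen=1, avestories=1, mode=0)

predicted = predict_satgreen(**respondent)
print(round(predicted, 3))
```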
21. Run the last regression again using only game mode data, and report the Model Summary (R²)
and Coefficients. (From the menu choose Data > Select Cases and set the condition Mode = 1.)
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      0.539  0.291     0.277              3.947
a. Predictors:
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1  (Constant)    9.985    0.606                                            16.470   0.001
Then run the regression analysis again, now using only movie mode data.
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      0.605  0.366     0.354              4.288
a. Predictors:
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1  (Constant)    8.487    0.636                                            13.352   0.001
Compare the two sets of estimated values for the coefficients of the 7 attributes.
22. Are there notable differences between the two sets of estimates? And what does that tell
you?
A. The estimates agree more or less with each other. This indicates that the two modes do not have
a big impact on the measurement of preferences
B. There are substantial differences between the estimates. This indicates that the two modes do
have an impact on the measurement of preferences
24. Which mode offers a more accurate prediction of the satisfaction score for the
neighborhood?
A. Game mode
B. Movie mode
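The two Model Summary tables can be compared on more than one fit measure. The sketch below puts the reported values side by side. Which measure the question intends is not stated in this document, so both R Square (share of variance explained) and the standard error of the estimate (typical size of a prediction error) are shown.

```python
# Values copied from the two Model Summary tables
fit = {
    "game":  {"r_square": 0.291, "adj_r_square": 0.277, "std_error": 3.947},
    "movie": {"r_square": 0.366, "adj_r_square": 0.354, "std_error": 4.288},
}

# Higher R Square means a larger share of the variance in satgreen is explained
highest_r_square = max(fit, key=lambda mode: fit[mode]["r_square"])

# Lower standard error of the estimate means smaller typical prediction errors
lowest_std_error = min(fit, key=lambda mode: fit[mode]["std_error"])

print(highest_r_square, lowest_std_error)
```

The two measures need not point to the same mode, because R Square also depends on how much the satisfaction scores vary within each group.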