Ambreen 2338 18990 1 BRM Session 14 SPSS
Data Analysis
Using SPSS
SPSS – What Is It?
• SPSS stands for “Statistical Package for the Social Sciences” and
was first released in 1968.
• SPSS is software for editing and analyzing many kinds of data.
These data may come from almost any source: scientific
research, a customer database, Google Analytics, or even the
server log files of a website. SPSS can open all file formats
commonly used for structured data, such as:
• spreadsheets from MS Excel or OpenOffice;
• plain text files (.txt or .csv).
SPSS Views
Creating a data file and entering data
• Defining Variables
• Name, Type, Width, Decimals, Label, Values, Missing,
Columns, Align, Measure
• Entering Data in SPSS
• Using Data Editor
• Using EXCEL
Descriptive Statistics
Instruction: open the Survey.sav file for this task
• Categorical Variable
• Frequency
• Crosstabs
• Continuous Variable
• Descriptive Statistics
• Missing Data
• Exclude cases listwise
• Exclude cases Pairwise
• Replace with mean
Descriptive Statistics
Missing Values
• The Exclude cases listwise option will include cases in the analysis only
if they have full data on all of the variables listed in your Variables box.
A case is excluded from every analysis if it is missing even one piece of
information, which can severely, and unnecessarily, limit your sample
size (the entire case is removed from the analysis).
• The Exclude cases pairwise option (recommended), however, excludes
a case (person) only from the specific analyses for which it is missing
the required data. The case will still be included in any analysis for
which it has the necessary information.
• The Replace with mean option (not recommended, as it biases the data),
which is available in some SPSS statistical procedures (e.g. multiple
regression), calculates the mean value for the variable and gives every
missing case this value. This option should never be used, as it can
severely distort the results of your analysis, particularly if you have a lot
of missing values.
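The three options can be sketched in plain Python. The dataset, the variable names (optimism, stress), and the values below are all hypothetical, chosen only to make the trade-offs visible:

```python
# Hypothetical survey data: each row is a case; None marks a missing answer.
rows = [
    {"optimism": 4.0, "stress": 2.0},
    {"optimism": 3.0, "stress": None},   # missing stress
    {"optimism": None, "stress": 5.0},   # missing optimism
    {"optimism": 5.0, "stress": 1.0},
]

def exclude_listwise(rows, variables):
    """Keep a case only if it has values for ALL listed variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]

def exclude_pairwise(rows, variables):
    """For each analysis (here, each single variable), drop a case
    only when the value needed for that analysis is missing."""
    return {v: [r[v] for r in rows if r[v] is not None] for v in variables}

def replace_with_mean(rows, variable):
    """Fill every missing value with the variable's mean (distorts variance)."""
    observed = [r[variable] for r in rows if r[variable] is not None]
    m = sum(observed) / len(observed)
    return [r[variable] if r[variable] is not None else m for r in rows]

complete = exclude_listwise(rows, ["optimism", "stress"])      # 2 of 4 cases survive
per_variable = exclude_pairwise(rows, ["optimism", "stress"])  # 3 values per variable
filled = replace_with_mean(rows, "stress")  # the gap becomes the mean of 2, 5 and 1
```

Listwise exclusion throws away half of this tiny sample, while pairwise exclusion keeps three values for each variable — exactly the trade-off described above. The mean-replacement column now has less spread than the real data, which is why that option distorts results.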
Descriptive Statistics
• Assessing Normality
• Descriptives
• Skewness and Kurtosis
• Histogram
Interpreting Skewness and Kurtosis
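As a rough sketch of what the Descriptives output reports, the two statistics can be computed directly. The formulas below are the small-sample-adjusted versions (an assumption about SPSS's exact formulas, though they match its conventions: 0 means symmetric / normal-peaked), and the scores are invented:

```python
import math

def skewness(x):
    """Sample skewness with a small-sample adjustment (0 = symmetric)."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / (n - 1))  # sample SD
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in x)

def kurtosis(x):
    """Sample excess kurtosis with a small-sample adjustment (0 = normal)."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / (n - 1))
    g = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(((v - m) / s) ** 4 for v in x)
    return g - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]  # one high outlier drags the tail right
skew = skewness(scores)            # positive: scores cluster low, tail to the right
kurt = kurtosis([1, 2, 3, 4, 5])   # flat, uniform-like data gives negative kurtosis
```

Positive skewness means a tail to the right, negative a tail to the left; positive kurtosis means a peaked distribution with heavy tails, negative a flat one.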
Reliability of a Scale
• Cronbach’s Alpha
• Values above .7 are considered acceptable; however, values above .8
are preferable.
• Inter-Item Correlation Matrix
• Check for negative values. All values should be positive, indicating that
the items are measuring the same underlying characteristic.
• Corrected Item-Total Correlation
• Values shown in the Item-Total Statistics table indicate the degree to
which each item correlates with the total score.
• Low values (less than .3) indicate that the item is measuring
something different from the scale as a whole.
• If your scale’s overall Cronbach alpha is too low (e.g. less than .7) and
you have checked for incorrectly scored items, you may need to
consider removing items with low item-total correlations.
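These checklist statistics are simple enough to compute by hand. A sketch in plain Python, using a made-up 3-item scale with five respondents (all scores are hypothetical):

```python
import math
from statistics import variance

def pearson(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

def corrected_item_total(items, idx):
    """Correlate one item with the sum of the REMAINING items ('corrected')."""
    rest = [sum(s) - s[idx] for s in zip(*items)]
    return pearson(items[idx], rest)

# Hypothetical responses: 3 items (rows) x 5 respondents (columns)
items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 5],
]
alpha = cronbach_alpha(items)             # about .89 here: above the .8 benchmark
r_item1 = corrected_item_total(items, 0)  # well above the .3 cut-off
```

Dropping an item with a low corrected item-total correlation and recomputing alpha is exactly the "consider removing items" step described above.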
Interpretation (contd.)
Correlation
Interpretation of output from scatterplot
• Step 1: Inspecting the distribution of data points
• Are the data points spread all over the place? This suggests a very low
correlation.
• Are all the points neatly arranged in a narrow cigar shape? This suggests
quite a strong correlation.
• Could you draw a straight line through the main cluster of points, or would a
curved line better represent the points? If a curved line is evident (suggesting
a curvilinear relationship), Pearson correlation should not be used, as it
assumes a linear relationship.
• What is the shape of the cluster? Is it even from one end to the other? Or
does it start off narrow and then get fatter? If this is the case, your data may
be violating the assumption of homoscedasticity.
• Step 2: Determining the direction of the relationship between the variables
• The scatterplot can tell you whether the relationship between your two
variables is positive or negative.
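The visual impressions from Steps 1 and 2 can be checked against the coefficient itself. A minimal sketch of Pearson's r; the two variables and their scores are invented for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation, from -1 (perfect negative)
    through 0 (no linear relationship) to +1 (perfect positive)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

optimism = [2, 3, 4, 5, 6, 7]
stress   = [7, 6, 6, 4, 3, 2]    # falls as optimism rises: a downhill scatterplot

r = pearson_r(optimism, stress)  # strongly negative: a tight, downhill cigar shape
```

A narrow cigar of points gives |r| close to 1, points spread all over give r near 0, and the sign of r matches the uphill or downhill direction seen in Step 2.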
Pearson Correlation
Correlation is often used to explore the relationship among a
group of variables, rather than just two as described above.
It is cumbersome to report all the individual correlation
coefficients in a paragraph; it is better to present
them in a table.
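One way such a table can be assembled: the lower triangle of pairwise Pearson coefficients, with 1s on the diagonal. The variable names and scores below are made up for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def correlation_table(data):
    """data maps variable name -> list of scores (same cases in each list).
    Returns the lower-triangle correlation matrix as tab-separated rows."""
    names = list(data)
    lines = ["\t" + "\t".join(names)]
    for i, a in enumerate(names):
        cells = []
        for j, b in enumerate(names):
            if j < i:
                cells.append(f"{pearson_r(data[a], data[b]):+.2f}")
            elif j == i:
                cells.append("1.00")   # every variable correlates 1 with itself
            else:
                cells.append("")       # upper triangle left blank, as in reports
        lines.append(a + "\t" + "\t".join(cells))
    return "\n".join(lines)

scores = {
    "optimism":  [2, 3, 4, 5, 6, 7],
    "stress":    [7, 6, 6, 4, 3, 2],
    "wellbeing": [3, 4, 4, 6, 6, 8],
}
table = correlation_table(scores)
```

Only the lower triangle is filled in because the matrix is symmetric; repeating the upper half adds no information.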
COMPARING THE CORRELATION
COEFFICIENTS FOR TWO GROUPS
• Sometimes when doing correlational research you may
want to compare the strength of the correlation
coefficients for two separate groups.
• For example, you may want to look at the relationship between
optimism and negative affect for males and females separately.
• Follow the procedure in the handout
• Important:
• Remember, when you have finished looking at males and
females separately you will need to turn the Split File
option off. It stays in place until you specifically turn it off.
• To do this, click on Data, Split File and select the first button:
Analyze all cases, do not create groups.
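What Split File does can be mimicked by partitioning the cases before correlating. A toy sketch — the group labels and all scores are invented:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# Hypothetical cases: (sex, optimism, negative affect)
cases = [
    ("male", 3, 6), ("male", 4, 5), ("male", 5, 3), ("male", 6, 2),
    ("female", 3, 7), ("female", 4, 6), ("female", 5, 6), ("female", 6, 4),
]

# Split File, in effect: group the cases by the grouping variable ...
groups = {}
for sex, opt, neg in cases:
    groups.setdefault(sex, ([], []))
    groups[sex][0].append(opt)
    groups[sex][1].append(neg)

# ... then run the same correlation separately within each group.
r_by_group = {sex: pearson_r(opt, neg) for sex, (opt, neg) in groups.items()}
```

Both correlations here come out negative, but the male group's is stronger; turning Split File off corresponds to going back to correlating over all cases at once.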
Multiple Regression
1. Standard (simultaneous)
2. Hierarchical (sequential)
Standard Multiple Regression
• All the independent (or predictor) variables are entered into
the equation simultaneously.
• Each independent variable is evaluated in terms of its
predictive power (Beta), over and above that offered by all
the other independent variables.
• This is the most commonly used multiple regression
analysis. You would use this approach if you had a set of
variables (e.g. various personality scales) and wanted to
know how much variance in a dependent variable (e.g.
anxiety) they were able to explain (R-square).
• This approach would also tell you how much unique
variance in the dependent variable each of the
independent variables explained.
Steps for interpreting the SPSS output for Multiple Regression
1. Look in the Model Summary table, under the R Square and the Sig. F
Change columns. These are the values that are interpreted.
The R Square value is the amount of variance in the outcome that is accounted for
by the predictor variables you have used.
• Adjusted R Square: When a small sample is involved, the R square value in the
sample tends to be a rather optimistic overestimation of the true value in the
population. The Adjusted R square statistic ‘corrects’ this value to provide a
better estimate of the true population value.
• In the ANOVA Table, Sig. column (contains the p-value):
If the p-value is LESS THAN .05, the model has accounted for a statistically
significant amount of variance in the outcome.
If the p-value is MORE THAN .05, the model has not accounted for a significant
amount of variance in the outcome.
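The quantities in Step 1 can be reproduced on a toy dataset. The sketch below fits ordinary least squares with two hypothetical predictors, then computes R-square and Adjusted R-square (the F-test p-value needs an F distribution, so it is left to SPSS). All data are invented:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small system Ax = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y):
    """Least squares with an intercept, via the normal equations."""
    Xi = [[1.0] + list(row) for row in X]
    k = len(Xi[0])
    XtX = [[sum(r[i] * r[j] for r in Xi) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xi, y)) for i in range(k)]
    return solve(XtX, Xty)  # [b0, b1, b2, ...]

def r_square(X, y, coefs):
    """Share of variance in the outcome accounted for by the predictors."""
    preds = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], row)) for row in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_square(r2, n, k):
    """Corrects R-square downward; matters most when n is small relative to k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical data: two predictors and an outcome with a little noise
X = [[2, 5], [4, 3], [6, 6], [8, 2], [3, 7], [5, 4], [7, 8], [9, 3]]
y = [7.5, 7.1, 5.5, 5.4, 6.4, 6.3, 4.1, 4.3]

coefs = fit_ols(X, y)
r2 = r_square(X, y, coefs)
adj_r2 = adjusted_r_square(r2, n=len(y), k=2)
```

With only eight cases, Adjusted R-square comes out visibly below R-square, which is exactly the small-sample correction described above.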
Steps for interpreting the SPSS output for Multiple Regression
(Contd.)
2. Look in the Coefficients table, under the B, Std. Error, Beta, and Sig. columns.
The B column contains the unstandardized beta coefficients that depict the
magnitude and direction of the effect on the outcome variable. Use these to make
the regression equation for model prediction.
The Std. Error contains the error values associated with the unstandardized beta
coefficients.
The Beta column presents standardized beta coefficients for each predictor
variable. These values have been converted to a common scale so that you can
compare predictors and identify the strongest one. Do not use these values in
the regression equation for prediction.
• The Sig. column shows the p-value associated with each predictor variable.
If a p-value is LESS THAN .05, then that variable has a significant association with
the outcome variable.
If a p-value is MORE THAN .05, then that variable does not have a significant
association with the outcome variable.
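The link between the B and Beta columns is a rescaling: Beta_j = B_j × SD(x_j) / SD(y). A sketch with hypothetical B values and raw scores (the variable names, the B values, and the intercept are all assumptions for illustration, not real output):

```python
from statistics import stdev

# Hypothetical raw scores (needed only for their standard deviations)
mastery = [2, 4, 6, 8, 3, 5, 7, 9]
pcoiss  = [5, 3, 6, 2, 7, 4, 8, 3]
stress  = [7.5, 7.1, 5.2, 5.4, 6.4, 6.3, 4.1, 4.6]

# Hypothetical unstandardized coefficients, as read from the B column
b_mastery, b_pcoiss = -0.5, -0.3

# Standardize: multiply by the predictor's SD, divide by the outcome's SD
beta_mastery = b_mastery * stdev(mastery) / stdev(stress)
beta_pcoiss = b_pcoiss * stdev(pcoiss) / stdev(stress)

# On this common scale, the predictor with the larger |Beta| is the stronger one;
# the B values themselves go into the prediction equation:
def predict_stress(m, p, b0=10.0):   # b0 is an assumed intercept for illustration
    return b0 + b_mastery * m + b_pcoiss * p
```

Comparing the raw B values would be misleading when predictors are measured on different scales; the Betas remove that scale difference, which is why they, not B, identify the better predictor.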
PRESENTING THE RESULTS FROM
MULTIPLE REGRESSION IN THE REPORT
• Multiple regression was used to predict levels of
stress (DV: Perceived Stress Scale) from Mastery
and Perceived Control over Internal States (PCOISS).
After entry of the Mastery Scale and the PCOISS Scale
as IVs, the total variance explained by the model as a
whole was 47.4%, p < .001.
• The Mastery Scale recorded a higher beta value (beta
= –.44, p < .001) than the PCOISS Scale (beta = –.33, p
< .001), showing that perceived control over external
factors (Mastery) is a stronger predictor of perceived
stress than perceived control over internal states
(PCOISS).