Introduction To SPSS TO MOANR

SPSS: Statistical Package for the Social Sciences

Akalu Teshome (PhD)


Senior Researcher & Socio-Economics
Research Coordinator in NARC

08/28/2023 1
 What is a statistical package?
• It is a computer program or set of
programs that provides many different
statistical procedures within a unified
framework.

• The advantages of such packages are many:
 They are much easier to use.
 They make it possible to run complex analyses without
getting bogged down in the details of the
computations.

Overview of SPSS for Windows

 It provides a powerful statistical analysis and data
management system.
 This program can be used to analyze data from
surveys, tests, observations, etc.
 It can perform a variety of data analyses and
presentation functions, including statistical analysis
and graphical presentation of data.

Launching SPSS
 Click the Start button, then Programs, then SPSS for
Windows, and finally the SPSS 20.0 for Windows icon.

 Alternatively, double-click My Computer on the Windows
desktop, then the (C:) drive icon, then Programs, and
finally the SPSS icon.

 The SPSS Data Editor opens, with a window that looks
approximately like the picture displayed below:
[Figure: SPSS Data Editor window]

Windows in SPSS
The two most common windows in SPSS are the Data Editor and the Viewer.
Data Editor
 Displays the contents of the current (working) data
file.
 You can create new data files or modify existing
ones with the Data Editor.
 This window opens automatically when you start
SPSS.
 Earlier versions allow only one data file to be open at
a time; later versions can keep several data files open.
 It has two views, called Data View and Variable View.
 The components are displayed in the picture below:
[Figure: components of the Data Editor window]

Viewer
 This window displays the results of any
statistical procedures you run and other
text.
 In particular, tables, statistics, and charts
are displayed in the Viewer window.
 A Viewer window opens automatically the
first time you run a procedure that
generates output.
 This window is not accessible until output
has been generated.

Working with the Data Editor
 The Data Editor window can be displayed in
one of the two views: Data View or Variable
View.
 The Data View displays the contents of the
data file in the form of a spreadsheet.
 The Variable View defines all variables in the
data file.
 Switching from one view to the other can be
done by clicking the appropriate tab (Data
View or Variable View) at the bottom of the
Data Editor window. (see the picture
underneath).
Data Editor Menus
 The menu bar provides easy access to
most SPSS features. It consists of ten
drop-down menus:

Data Editor Toolbar
 Clicking once on any of these buttons
allows you to perform an action, such as
opening a data file, or selecting a chart
for editing etc.

 The Data View window is a grid,
• whose rows represent subjects (or cases)
and
• whose columns contain values of the
variables (gender, salary, age etc.) for
each subject.
• Each cell of the grid, therefore, will usually
contain the score of one particular subject
on one particular variable.

Variable View
 The Variable View contains descriptions of
the attributes of each variable in the data
file.
 In the Variable View, rows are variables
and columns are variable attributes.
 In this table you can add or delete variables
and modify attributes of variables, including
variable name, data type, number of
digits, and so on.

Variables
 The variable name must begin with a
letter and cannot end with a period.
 In earlier versions the name cannot exceed 8
characters; later versions allow longer names
(up to 64 bytes).
 Variable names that end with an
underscore should be avoided.
 Blanks and special characters (for example
!, ?, " and *) cannot be used.
 To define a variable, click the Variable View
tab at the bottom of the Data Editor
window. This opens the Variable View
window.
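
The naming rules above can be checked mechanically. The following is a minimal Python sketch, not SPSS syntax; the function name `check_variable_name` is hypothetical, and the default limit of 64 characters is an assumption that reflects later SPSS versions (pass 8 for earlier ones).

```python
import re

def check_variable_name(name, max_length=64):
    """Return a list of rule violations for a candidate variable name."""
    problems = []
    if not name or not name[0].isalpha():
        problems.append("must begin with a letter")
    if name.endswith("."):
        problems.append("cannot end with a period")
    if name.endswith("_"):
        problems.append("should not end with an underscore")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    # Blanks and the special characters listed on the slide: ! ? " *
    if re.search(r'[\s!?"*]', name):
        problems.append("contains a blank or special character")
    return problems

print(check_variable_name("gender"))       # an acceptable name
print(check_variable_name("2nd income?"))  # starts with a digit, has a blank and '?'
```

An empty list means the name passes every rule that the sketch checks; it does not cover every restriction SPSS applies (reserved keywords, for example).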
 Enter the new variable name in the
column Name in any blank row. For
example, enter the name gender in the
first row. After entering the name, the
default attributes (Type, Width,...) are
automatically assigned. If you then click
the Type column, the Variable Type
dialog box appears.

Variable Type
 Numeric, Comma, and Dot – you can enter
values with any number of decimal
positions. The Data Editor displays only the
defined number of decimal positions.

 String – used to hold alphanumeric values.

 Date – you can use slashes, dashes,
spaces, commas, or periods as delimiters
between day, month, and year (for example,
dd/mm/yy).

 Scientific notation – for example, 1E6.


Variable Label
 Although variable names can be only 8
characters long in earlier versions,
variable labels can be up to 256
characters long, and these descriptive
labels are displayed in output.

Value Label
 To assign a label:
• Enter the value in the Value text box.
• Enter the label in the Label text box.
• Click Add.
• Repeat these steps until you have defined labels for
all values.
• Click OK.

Missing Values
 With SPSS, there are two forms of missing
values: system-missing and user-defined
missing.
 System-missing values are those that
SPSS automatically treats as missing. The
most common form of this type of value is
a blank in the data file.
 User-defined missing values are those
that the user specifically tells SPSS to
treat as missing. Rather than leaving a
blank in the data file, numbers are often
entered that are meant to represent missing data.

Steps to define missing values
 Select the cell at the intersection of the
Missing attribute column and the variable
you need.
 Click the wizard button.
 In the dialog box displayed, you can set either:
• up to 3 discrete missing values, or
• a range plus one discrete missing value.
 Finally, click OK.
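
As a rough illustration of how user-defined missing values behave during analysis, the Python sketch below (not SPSS syntax) treats blanks, a few discrete codes, and an optional range as missing. The function `is_missing`, the codes 99 and 997 to 999, and the sample ages are all hypothetical.

```python
def is_missing(value, discrete=(), low=None, high=None):
    """True if value is blank (None) or matches a user-defined missing code."""
    if value is None:                 # a blank cell plays the role of system-missing
        return True
    if value in discrete:             # discrete missing codes (the dialog allows up to three)
        return True
    if low is not None and high is not None:
        return low <= value <= high   # inclusive range of missing codes
    return False

ages = [34, None, 99, 27, 998, 45]    # hypothetical survey column
valid = [a for a in ages
         if not is_missing(a, discrete=(99,), low=997, high=999)]
print(valid)
```

Only the values that survive the filter would enter a mean or a frequency table, which is the practical effect of declaring codes as missing.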

Data management under SPSS

 Adding Cases: To insert a new case (row),
click the row below the position where you
wish to enter the new case, click Edit on
the menu bar, and then click Insert Case on the
pull-down menu.

 Deleting Cases: To delete a case, click
the case number that you wish to delete, click
Edit on the menu bar, and then click Clear.

 Inserting new variables: Entering data in
an empty column in the Data View or in an
empty row in the Variable View automatically
creates a new variable with a default name
(the prefix var and a sequential number).

Sorting Data
 To sort the data, choose Sort Cases from the Data
menu. The Sort Cases dialog box is displayed.
 If you select multiple sort variables, cases are
sorted by each variable within categories of the
prior variable on the Sort list.
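
The idea of sorting by each variable within categories of the prior variable can be sketched in Python (not SPSS syntax) with a tuple sort key. The case records and variable names below are hypothetical.

```python
cases = [
    {"id": 1, "gender": "f", "salary": 4200},
    {"id": 2, "gender": "m", "salary": 3900},
    {"id": 3, "gender": "f", "salary": 3100},
    {"id": 4, "gender": "m", "salary": 5000},
]

# Sort by gender first, then by salary within each gender category.
by_gender_then_salary = sorted(cases, key=lambda c: (c["gender"], c["salary"]))

for case in by_gender_then_salary:
    print(case["id"], case["gender"], case["salary"])
```

The order of the elements in the key tuple corresponds to the order of the variables on the Sort list.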

Selecting Cases
 There are occasions on which you will want to
select a subset of cases from your data file for a
particular analysis. You may need to select the
subset based on a formally defined criterion, or
randomly in the case of a very large data file.

 To select a subset of cases:
• Click Data on the main menu, and then
• Click Select Cases on the pull-down
menu. This opens the Select Cases dialog box.
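
The two kinds of selection described above, by criterion and at random, can be sketched in Python (not SPSS syntax). The ages, the cut-off of 30, the sample size of 3, and the fixed seed are hypothetical choices made only so the example is repeatable.

```python
import random

ages = [23, 41, 35, 19, 52, 30]
cases = [{"id": i, "age": age} for i, age in enumerate(ages, start=1)]

# Selection by a defined criterion: keep cases aged 30 or older.
criterion_subset = [c for c in cases if c["age"] >= 30]

# Random selection: draw 3 cases without replacement.
rng = random.Random(0)            # fixed seed, for repeatability only
random_subset = rng.sample(cases, k=3)

print([c["id"] for c in criterion_subset])
print(len(random_subset))
```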

Combining Data Files

 You can combine two files in two
different ways:
• Merge files containing the same
variables but different cases (Add
Cases)
• Merge files containing the same cases
but different variables (Add Variables)

Add Cases
 With one of the files open, you can
add cases to it from an external SPSS file.
To add cases, from the Data menu choose
Merge Files > Add Cases...
 This opens the Add Cases: Read File dialog
box. After you select the file to be included,
a dialog box with the list of variables
appears. The boxes that appear are:

Unpaired variables: variables to be excluded from the new, merged
data file. Variables from the working data file are identified with an
asterisk (*). Variables from the external data file are identified with a
plus sign (+).
Variables in the new working data file: variables to be included in
the new, merged data file. By default, all the variables that match in both
name and data type are included on the list.

Add Variables
 Add Variables merges the working
data file with an external data file
that contains the same cases but
different variables.
 Cases must be sorted in the same
order in both data files.
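
A minimal Python sketch (not SPSS syntax) of what Add Variables does, including the requirement that both files list the cases in the same order. The function `add_variables`, the key variable `id`, and the two small files are hypothetical.

```python
demographics = [{"id": 1, "age": 34}, {"id": 2, "age": 27}, {"id": 3, "age": 45}]
incomes = [{"id": 1, "income": 4200}, {"id": 2, "income": 3100}, {"id": 3, "income": 5000}]

def add_variables(working, external, key="id"):
    """Merge two files that hold the same cases in the same order."""
    working_keys = [case[key] for case in working]
    external_keys = [case[key] for case in external]
    if working_keys != external_keys:
        raise ValueError("cases must appear in the same order in both files")
    # Each merged case carries the variables from both files.
    return [{**w, **e} for w, e in zip(working, external)]

merged = add_variables(demographics, incomes)
print(merged[0])
```

The check on the key order is why both files are sorted before merging: pairing rows by position is only safe when the cases line up.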

Data Transformations
 Computing New Variables
• To create a new variable click on Transform
in Data Editor menu, and then on Compute
from the pull down menu. This opens the
Compute Variable dialog box.

Conditional expressions

 To specify a conditional expression, click If...
in the Compute Variable dialog box. This opens
the If Cases dialog box. You can choose one of
the following alternatives:
 Include all cases: values are calculated for all
cases. This is the default.
 Include if case satisfies condition: values are
calculated only for cases that satisfy the condition.
The expression can include variable names,
constants, arithmetic operators, numeric and
other functions, logical variables, and relational
operators.
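
The effect of computing a variable under a condition can be sketched in Python (not SPSS syntax). The variables `salary`, `years`, and `bonus`, and the condition itself, are hypothetical; cases that fail the condition are left without a value here, which is an assumption about how a newly created target variable behaves.

```python
cases = [
    {"salary": 4000, "years": 2},
    {"salary": 5200, "years": 7},
    {"salary": 3100, "years": 10},
]

for case in cases:
    # Condition built from variable names, constants, and relational/logical operators.
    if case["years"] >= 5 and case["salary"] < 6000:
        case["bonus"] = case["salary"] / 10   # arithmetic expression for the target variable
    else:
        case["bonus"] = None                  # no value computed when the condition fails

print([case["bonus"] for case in cases])
```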

Counting Occurrences
To count occurrences of values within cases, from the menus choose:
Transform > Count...
This dialog box creates a variable that counts the occurrences of the same
value(s) in a list of variables for each case.

 Target Variable: the name of the
variable that receives the counted value.

 Target Label: a descriptive variable label
for the target variable.

 Variables: the selected numeric or string
variables from the source list. The list cannot
contain both numeric and string variables.
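
Counting occurrences within cases amounts to a per-row tally. The Python sketch below (not SPSS syntax) counts how many of three hypothetical questionnaire items, `q1` to `q3`, have the value 5; the function name and the target variable `n_fives` are also hypothetical.

```python
def count_occurrences(case, variables, values):
    """Count how many of the listed variables hold one of the given values."""
    return sum(1 for name in variables if case.get(name) in values)

cases = [
    {"q1": 1, "q2": 5, "q3": 5},
    {"q1": 5, "q2": 5, "q3": 5},
    {"q1": 2, "q2": 3, "q3": 1},
]

for case in cases:
    # n_fives plays the role of the target variable.
    case["n_fives"] = count_occurrences(case, ["q1", "q2", "q3"], {5})

print([case["n_fives"] for case in cases])
```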

Recoding Variables
 You have two options available for
recoding variables.
• You may recode values into the same
variable, which eliminates all record of
the original values.
• You also have the option to create a
new variable containing the recoded
values

Recoding into the Same Variable
 To recode into the same variable, click
Transform > Recode into Same Variables.
This opens the Recode into Same Variables
dialog box. Move the variable to be
recoded to the Variables box.

 Then click Old and New Values. This opens
the Old and New Values dialog box.

On the left, type the old value; on the
right, type the new value; then click
the Add button. Repeat this step for all
values that you want to recode. Then
click Continue.

Recoding into Different Variables
 This works the same way as recoding into the
same variable, except that you are
prompted to enter a new name and
label for the newly generated variable.
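
Recoding into a different variable is a mapping from old values to new values that leaves the original untouched. In the Python sketch below (not SPSS syntax), the variable names and the mapping, which collapses a 5-point scale into three groups, are hypothetical.

```python
# Old value -> new value pairs, as entered one at a time in the dialog box.
old_to_new = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

satisfaction = [1, 4, 3, 5, 2]                               # original variable, kept as is
satisfaction_3 = [old_to_new.get(v) for v in satisfaction]   # newly generated variable

print(satisfaction)
print(satisfaction_3)
```

Recoding into the same variable would instead overwrite `satisfaction`, which is why the original values are lost in that case.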

Ranking data
 To compute ranks, click Transform >
Rank Cases... The Rank Cases dialog box
opens.
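
What a ranking step computes can be sketched in Python (not SPSS syntax). The function `rank_cases` is hypothetical; ascending order and giving tied values the mean of their ranks are assumed here as one common convention, and other tie rules exist.

```python
def rank_cases(values):
    """Ascending ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # positions are 0-based, ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

scores = [50, 70, 70, 90]
print(rank_cases(scores))
```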

Replace missing values
 Replace Missing Values creates new
variables from existing ones, replacing
missing values with estimates computed
with one of several methods.
 To replace missing values:
• Click Transform > Replace Missing Values...
• Select the variable that contains missing values.
• Choose one method of estimation.
• Click OK.
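
One simple estimation method is to replace each missing value with the mean of the observed values in the series. The Python sketch below (not SPSS syntax) illustrates only that method; the function name, the variable names, and the data are hypothetical.

```python
from statistics import mean

def replace_with_series_mean(values):
    """Return a new list in which missing entries (None) become the series mean."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

income = [10, None, 20, 30, None]             # original variable, left unchanged
income_1 = replace_with_series_mean(income)   # new variable with estimates filled in

print(income_1)
```

As in the procedure described above, the result is a new variable; the original keeps its missing values.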

Data Analysis

Evaluating Assumptions
 Many statistical procedures, such as
analysis of variance, require that all
groups come from normal populations with
the same variance.
 Therefore, before choosing a statistical
procedure, we need to test the hypothesis
that all the group variances are equal or
that the samples come from normal
populations.
 If it appears that the assumptions are
violated, we may want to determine
appropriate transformations.

Tests of Normality
 To test whether our data have come from
a normal distribution, we can use the
normal probability plot.

 In a normal probability plot, each
observed value is paired with its expected
value from the normal distribution.

 If the sample is from a normal
distribution, we expect that the points will
fall more or less on a straight line.
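
The pairing behind a normal probability plot can be sketched in Python (not SPSS syntax). The function `normal_plot_pairs` and the sample are hypothetical, and the plotting positions use Blom's formula, (rank - 3/8) / (n + 1/4), which is one common choice and an assumption here rather than a statement of what SPSS computes.

```python
from statistics import NormalDist, mean, stdev

def normal_plot_pairs(sample):
    """Pair each sorted observation with its expected value under normality."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    pairs = []
    for rank, observed in enumerate(sorted(sample), start=1):
        proportion = (rank - 0.375) / (n + 0.25)   # Blom's plotting position
        pairs.append((observed, fitted.inv_cdf(proportion)))
    return pairs

sample = [4.1, 5.0, 5.6, 6.2, 7.3]   # hypothetical measurements
for observed, expected in normal_plot_pairs(sample):
    print(f"{observed:5.2f}  {expected:5.2f}")
```

Plotting the first column against the second gives the points that should fall near a straight line when the sample is approximately normal.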

Exploring your data
 The Explore procedure provides a variety
of descriptive plots and statistics, including
stem-and-leaf plots, box plots, normal
probability plots, and spread-versus-level
plots.
 Also available are the Levene test for homogeneity of
variance, the Shapiro-Wilk and Lilliefors tests
for normality, and several other
estimators of location.
 To run the Explore procedure:
• Click Analyze > Descriptive Statistics >
Explore...
 Select one or more dependent variables.
 Optionally, you can:
• Select one or more factor variables, whose
values will define groups of cases.
• Select an identification variable to label cases.
• Click Statistics for robust estimators, outliers,
percentiles, and frequency tables.
• Click Plots for histograms, normal probability
plots and tests, and spread-versus-level plots
with Levene's statistics.
• Click Options for the treatment of missing
values.

Both. Displays plots and statistics.
Statistics. Displays statistics only.
Plots. Displays plots only (suppresses all statistics).
There are also pushbuttons for Statistics..., Plots...,
and Options...

Cross Tabulation and Measures of Association
 The Crosstabs procedure forms two-way
and multiway tables and provides a
variety of tests and measures of
association for two-way tables.
 If you specify a row, a column, and a layer
factor (control variable), the Crosstabs
procedure forms one panel of associated
statistics and measures for each value of
the layer factor.

To Obtain Crosstabulations
 Click Analyze > Descriptive Statistics > Crosstabs...
 Select one or more row variables and one or
more column variables.
 Optionally, you can:
• Select one or more control variables.
• Click Statistics for tests and measures of
association for two-way tables or subtables.
• Click Cells for observed and expected values,
percentages, and residuals.
• Click Format for controlling the order of
categories.

The Chi-Square Test of Independence
 The chi-square statistic (Pearson
chi-square) is used to test the
hypothesis that the row and column
variables are independent.
 Shortcomings
• It does not tell us the strength or direction
of the association.
• Its value depends on the sample size.
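
The Pearson chi-square statistic sums (observed - expected)² / expected over the cells, where each expected count is row total × column total / n. The Python sketch below (not SPSS syntax) computes it for a hypothetical 2 × 2 table and then for the same table with every count multiplied by ten, to illustrate the dependence on sample size; the function name and the counts are assumptions.

```python
def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

table = [[30, 10], [20, 40]]                                    # hypothetical counts
table_times_ten = [[cell * 10 for cell in row] for row in table]

chi2, df = pearson_chi_square(table)
chi2_big, _ = pearson_chi_square(table_times_ten)
print(round(chi2, 3), df, round(chi2_big, 3))
```

The cell proportions are identical in both tables, yet the statistic is ten times larger for the bigger one, which is why a measure of association is needed alongside the test.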

Measures of Association
 Measures of association are divided
into nominal measures, ordinal
measures, and interval measures.

Independent-Samples T Test
 The Independent-Samples T Test
procedure compares means for two
groups of cases.
 Ideally, for this test, the subjects
should be randomly assigned to two
groups, so that any difference in
response is due to the treatment (or
lack of treatment) and not to other
factors.

Assumptions.
 For the equal-variance t test, the
observations should be independent,
random samples from normal
distributions with the same population
variance.
 For the unequal-variance t test, the
observations should be independent,
random samples from normal
distributions.
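
The two versions of the test differ in how the standard error and the degrees of freedom are formed. The Python sketch below (not SPSS syntax) computes both t statistics; the function name and the data are hypothetical, and the unequal-variance version is written as the Welch test with Welch-Satterthwaite degrees of freedom, which is assumed to correspond to the unequal-variance test described above.

```python
from math import sqrt
from statistics import mean, variance

def t_statistics(group_a, group_b):
    """Return (t, df) for the pooled test and for the Welch test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = mean(group_a) - mean(group_b)
    var_a, var_b = variance(group_a), variance(group_b)   # sample variances

    # Equal-variance version: pool the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    t_pooled = mean_diff / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    df_pooled = n_a + n_b - 2

    # Unequal-variance version: separate variances, approximate df.
    se_a, se_b = var_a / n_a, var_b / n_b
    t_welch = mean_diff / sqrt(se_a + se_b)
    df_welch = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return (t_pooled, df_pooled), (t_welch, df_welch)

treated = [12, 15, 11, 14, 13]   # hypothetical responses
control = [10, 9, 12, 8, 11]

pooled, welch = t_statistics(treated, control)
print(pooled, welch)
```

With equal group sizes and equal sample variances, as in this example, the two versions coincide; they diverge as the variances or group sizes become unequal.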

To Obtain an Independent-Samples T Test
 From the menus choose:
Analyze > Compare Means > Independent-Samples T Test...
 Select one or more quantitative test
variables. A separate t test is computed
for each variable.
 Select a single grouping variable, and
click Define Groups to specify two codes
for the groups that you want to compare.

One-Way ANOVA

 The One-Way ANOVA procedure
produces a one-way analysis of
variance for a quantitative
dependent variable by a single factor
(independent) variable.
 Analysis of variance is used to test
the hypothesis that several means
are equal. This technique is an
extension of the two-sample t test.
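
The F ratio at the heart of a one-way analysis of variance compares the variation between group means with the variation within groups. The following Python sketch (not SPSS syntax) computes it; the function name and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the F ratio with its between-groups and within-groups df."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]   # hypothetical factor levels
f_ratio, df_between, df_within = one_way_anova(groups)
print(f_ratio, df_between, df_within)
```

A large F ratio relative to the F distribution with these degrees of freedom is evidence against the hypothesis that all the means are equal.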

 In addition to determining that
differences exist among the means,
you may want to know which means
differ.
 There are two types of tests for
comparing means: a priori contrasts and
post hoc tests.
• Contrasts are tests set up before
running the experiment.
• Post hoc tests are run after the
experiment has been conducted.
• You can also test for trends across
categories.

One-Way ANOVA Data Considerations
 Data. Factor variable values should be
integers, and the dependent variable
should be quantitative (interval level of
measurement).
 Assumptions. Each group is an
independent random sample from a
normal population.
 The groups should come from normal
populations with equal variances. To test
this assumption, use Levene's
homogeneity-of-variance test.

To Obtain a One-Way Analysis of Variance
 From the menus choose:
Analyze > Compare Means > One-Way ANOVA...
 Select one or more dependent
variables.
 Select a single independent factor
variable.

Bivariate Correlations Data Considerations
 Data. Use symmetric quantitative
variables for Pearson's correlation
coefficient and quantitative variables
or variables with ordered categories
for Spearman's rho and Kendall's
tau-b.
 Assumptions. Pearson's correlation
coefficient assumes that each pair of
variables is bivariate normal.

To Obtain Bivariate Correlations
 From the menus choose:
Analyze > Correlate > Bivariate...
 Select two or more numeric
variables.

Correlation Coefficients
 For quantitative, normally distributed
variables, choose the Pearson correlation
coefficient.
 If your data are not normally distributed
or have ordered categories, choose
Kendall's tau-b or Spearman, which
measure the association between rank
orders.
 Correlation coefficients range in value
from –1 (a perfect negative relationship)
to +1 (a perfect positive relationship). A
value of 0 indicates no linear relationship.
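
The difference between a linear measure and a rank-based measure shows up clearly on data that are monotonic but not linear. The Python sketch below (not SPSS syntax) computes Pearson's r and Spearman's rho by hand; the function names and the data are hypothetical, and the simple ranking helper assumes there are no tied values.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ascending ranks; this simple version assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # perfectly related to x, but not linearly

print(round(pearson_r(x, y), 3), spearman_rho(x, y))
```

Spearman's rho reaches 1 because the rank orders agree exactly, while Pearson's r stays below 1 because the relationship is curved.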

 When interpreting your results, be careful
not to draw any cause-and-effect
conclusions due to a significant
correlation.
 Test of Significance. You can select two-
tailed or one-tailed probabilities. If the
direction of association is known in
advance, select One-tailed. Otherwise,
select Two-tailed.
 Flag significant correlations. Correlation
coefficients significant at the 0.05 level are
identified with a single asterisk, and those
significant at the 0.01 level are identified
with two asterisks.

 Pearson's correlation coefficient is
a measure of linear association.

 Two variables can be perfectly
related, but if the relationship is
not linear, Pearson's correlation
coefficient is not an appropriate
statistic for measuring their
association.

Linear Regression
 Linear regression is used to predict the value of a dependent
variable based on independent variables.

 For example, you can try to predict the income of an
employee (the dependent variable) from
independent variables such as age, education, and
years of experience.
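
A least-squares fit with a single independent variable can be sketched in a few lines of Python (not SPSS syntax). The example above uses several predictors; this sketch uses only one, years of experience, to keep the arithmetic visible. The function `fit_line` and the data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return intercept, slope

years_experience = [1, 3, 5, 7, 9]          # independent variable (hypothetical)
income = [2100, 2500, 2900, 3300, 3700]     # dependent variable (hypothetical)

intercept, slope = fit_line(years_experience, income)
predicted_income = intercept + slope * 6    # prediction for 6 years of experience
print(intercept, slope, predicted_income)
```

With several independent variables the same idea applies, but the coefficients are obtained by solving a system of equations rather than from a single ratio.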

Linear Regression Data Considerations
 The dependent and independent
variables should be quantitative.

 Categorical variables, such as
religion, major field of study, or
region of residence, need to be
recoded to binary (dummy) variables
or other types of contrast variables.

Assumptions.
 For each value of the independent
variable, the distribution of the dependent
variable must be normal.
 The variance of the distribution of the
dependent variable should be constant for
all values of the independent variable.
 The relationship between the dependent
variable and each independent variable
should be linear, and all observations
should be independent.

To Obtain a Linear Regression Analysis
 From the menus choose:
Analyze > Regression > Linear...
 In the Linear Regression dialog box,
select a numeric dependent variable.
 Select one or more numeric
independent variables.
 Optionally, you can:
• Group independent variables into blocks
and specify different entry methods for
different subsets of variables.
• Choose a selection variable to limit the
analysis to a subset of cases having a
particular value(s) for this variable.
• Select a case identification variable for
identifying points on plots.
• Select a numeric WLS Weight variable
for a weighted least squares analysis.

Thank you!!!
