USING SPSS Guide Revised

USING SPSS FOR DATA ENTRY AND ANALYSIS

Dwayne Devonish

NB: Can still be used for later versions of SPSS despite some
operational and cosmetic changes. An update is pending.
USING SPSS
• SPSS = Statistical Package for the Social Sciences
• Three main steps in SPSS:
 Defining variables/items into SPSS (preparing the
data file) – This process involves defining and
labelling each variable as well as assigning numbers
to each possible response (Pallant, 2005).
 Entering the Data – After defining all variables, you
can enter your data.
 Conducting Statistical Analyses – SPSS can
perform a range of statistical analyses on your data
including frequencies, means and standard
deviations and crosstabulations.
QUESTIONS/VARIABLES
• You have to be familiar with different types of
questions or items on a questionnaire. A question or
item on a questionnaire is treated as a variable in
SPSS.
• A variable is any characteristic that can vary. For
example, gender, age and income.
• In SPSS, variables can assume different levels of
measurement.
• Nominal Variables and Ordinal Variables are
sometimes referred to as categorical variables
because they possess distinct labels or categories in
which persons or objects are placed. However,
ordinal variables consist of response
options/categories with some intrinsic order, that is,
these categories are rank-ordered. For example,
Strongly Agree, Agree, Neutral, Disagree and
Strongly Disagree or Primary, Secondary, First
Degree and Post graduate.
QUESTIONS/VARIABLES (2)
• Nominal variables have categories that have no
implied order such as gender (male or female) or
religious affiliation (Christian or Muslim) or type of
car (Nissan or Suzuki).
• Interval and Ratio variables are variables that elicit
numerical or scale data. Data are in the form of
numbers. These variables are also known as
continuous variables. For example, age, height,
weight, etc.
• Nominal, ordinal, interval, or ratio variables
represent different types of items on a
questionnaire.
QUESTIONS/VARIABLES
• All you need to know is that there are really three
basic types of questions/items on a questionnaire:
• Closed-Ended Items - Items with a specified
number of response options from which respondents
either circle or tick one or more – also known as the
categorical variables (nominal or ordinal).
• Open-Ended/Textual Items – Items that allow
respondents to provide textual responses (data in
words; qualitative data).
• Numerical Items – Items that only elicit data in the
form of numbers. Respondents write in a number as
a response (Interval or Ratio).
SPSS Version 11
 SPSS v.11 FOR WINDOWS: This guide presents an
introduction to SPSS 11; however, it may still help
with later versions such as v.15 to v.20.
 For example SPSS v.15-20 have a very similar
structure and operational set-up but with some minor
changes.
• In SPSS, there are two main windows: the Data
Editor Window and Output Viewer.
• The Data Editor is used to define the variables and to
enter the raw data from completed questionnaires.
• There are two different views on the Data Editor
Screen: Variable View and Data View (at bottom left-
hand side of the screen of SPSS window).
• Let’s go through the three main phases in SPSS
Phase 1 – Setting up the Data
Sheet
 In Phase 1 (Setting up the data sheet) -
• In the Variable View (select it by clicking the tab at
the bottom), you can prepare your data sheet
(codebook). This involves defining your
variables/questions in the SPSS data sheet.
• Each row in the variable view represents a
variable to be defined. Relevant characteristics
used to define variables are listed along the top
of the data sheet.
• Characteristics are (from left to right): Name,
Type, Width, Decimals, Label, Values, Missing,
Columns, Align and Measure.
SAMPLE QUESTIONNAIRE
• Let’s use this short questionnaire:
1. Gender Male __ Female __

2. Age _______

3. Income $10-20 _ $21-30 _ $31-45 _ Over $45 _

4. Suggestions _______________________________
Phase 1
• Under Name –
 Type in a name for your variable. The variable
name allows you to identify the question/variable
in SPSS when entering your data from a
questionnaire. The variable name:
o Cannot be longer than 8 characters (in SPSS
V.11); later versions of SPSS allow names
longer than 8 characters
o Must be unique (each variable must have a
different name)
o Must begin with a letter
 For our questionnaire (see previous slide), we
give our first question/variable a name – gender.
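The naming rules above can be checked before you start typing names into Variable View. The following is a minimal Python sketch, not part of SPSS itself: the function name check_names is hypothetical, the 8-character limit is the SPSS 11 rule stated above, and treating names as case-insensitive when looking for duplicates is an assumption.

```python
def check_names(names, max_len=8):
    """Return a list of problems found in proposed variable names."""
    problems = []
    seen = set()
    for name in names:
        # Rule 1: no longer than max_len characters (8 in SPSS 11)
        if len(name) > max_len:
            problems.append(f"{name}: longer than {max_len} characters")
        # Rule 2: must begin with a letter
        if not name[:1].isalpha():
            problems.append(f"{name}: must begin with a letter")
        # Rule 3: must be unique (compared case-insensitively, an assumption)
        if name.lower() in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(name.lower())
    return problems


# 'suggest' is a hypothetical short name for the Suggestions question
print(check_names(["gender", "age", "income", "suggest"]))
print(check_names(["1stage", "suggestions", "age", "age"]))
```

The first call should report no problems; the second breaks each of the three rules once.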
Variable characteristics are listed along the top.

The Data Editor window has two views, Data View and
Variable View, which can be selected at the bottom of the
screen. Click Variable View to define your variables.
Phase 1
• Under Type –
 This option represents the type of variable or data
you are entering. There are two common variable
types you should be familiar with: 1) Numeric and 2)
String. Don’t worry about the others!
 The ‘numeric type’ is used with 1) questions that
are closed-ended (i.e. have categories or response
options from which to choose – in SPSS, a number
is assigned to a category or response option on
these types of questions/variables) and 2)
questions/variables that elicit (seek to obtain)
numerical data (interval/ratio variables or continuous
or scale variables). ‘Continuous’ and ‘scale’ are other
terms for interval or ratio variables, that is, numerical
questions.
Phase 1 (Under Type)
 Gender is a categorical (nominal) variable because it
places people into two distinct groups/categories: Male
or Female. It is also a closed question, since respondents
have two categories to choose from. You must choose
numeric for our first question measuring gender.
 The ‘string type’ is used only for questions that seek
to obtain textual data (data in the form of words).
These questions are open-ended.
 You can activate the ‘Type’ cell by clicking the grey
box to the right of the cell
 Numeric is the default in SPSS, so you don’t have to
change this option
Click on the grey box to open the dialogue box containing
variable type options.
Phase 1
• Under Width –
 The width is the character or digit span. The
default value for width is 8, which permits up to 8
characters or digits when entering data (e.g. 8-digit
numbers). This width can accommodate most data;
however, if you have a continuous variable with very
large values (e.g. $7,000,000,000), you would have to
increase the width. Otherwise, leave it at 8. For
string variables, increase the width to the maximum
(255 characters in SPSS 11), since you will be
accommodating words instead of numbers.
Phase 1
• Under Decimals –
 The default setting for decimals is 2. However, you
should set decimals to 0 (zero) unless your variable
requires decimal places (e.g. income as a
continuous variable - $500.28), in which case you can
adjust the decimals to suit.
• Under Label –
 The label column allows you to give a longer description
for your variable than the 8 characters that are
allowed under Variable Name. For example, you can
type in “Gender of Respondent” or “What is your
Gender” in the space. When you conduct statistical
tests, the label appears over your generated
statistical output so you can match it to the question
you analysed.
Phase 1
• Under Values –
 This column allows you to assign numbers to
represent categories or labels on your variable. It
should be used only for closed-ended items.
 Using the grey box on the right of the cell, you can
open a dialogue box in which you can enter
the value (number) and the corresponding value
label (category or response option). Click Add to move
each pair into the lower field of the dialogue box, and
repeat until you have assigned a number to each
label. Then click OK.
 Remember that all value labels and corresponding
values must be added to the lower field of the
box before clicking OK.
Click Add to register your coding
scheme so that it goes in the lower field
of the box.
‘1=Male’ has been registered in lower
box already. You must click add to
register ‘2=Female’ in the lower box.
You can click ‘OK’ when you are done.
Phase 1
• Under Missing –
 You can assign specific values to indicate missing
values in your data. For example, if you arrive at a
missing response on gender, you can specify a
number that SPSS would register as a missing
response. Alternatively, when entering data, you can
leave the field or cell blank if you
encounter a missing response; this is
preferred. SPSS registers a blank field as a
missing response. If you plan to use a
value to represent a missing response,
you can activate the Missing column using the
grey box on the right of the cell; otherwise you
can leave it alone.
1. The grey box activates the ‘Missing’ column.
2. Unclick ‘No Missing Values’ and click ‘Discrete missing values’. You can type a value that would represent a missing response, like ‘9999’, for example.
3. Click OK after you have specified your value.
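To see why a code such as 9999 has to be declared as missing, it helps to compare an average computed with and without it. The following is a rough Python sketch rather than SPSS output; the three ages are hypothetical and the function name recode_missing is an illustration only.

```python
MISSING_CODE = 9999  # the example missing-value code used in this guide


def recode_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-value code with None (a 'blank' cell)."""
    return [None if v == missing_code else v for v in values]


ages = [24, 9999, 31]  # hypothetical data: the second person gave no age
cleaned = recode_missing(ages)
valid = [v for v in cleaned if v is not None]

print(cleaned)                  # the 9999 entry is now blank (None)
print(sum(ages) / len(ages))    # misleading average: 9999 counted as a real age
print(sum(valid) / len(valid))  # average of the valid ages only
```

If the code is not declared as missing, it is treated as a real answer and distorts every statistic computed on that variable.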
Phase 1
• Under Columns and Align – You do not have to
adjust or change these settings. Move on to
Measure.
• Under Measure – You can specify the level of
measurement (nominal, ordinal or scale) your
variable assumes. For gender, you can specify a
nominal measure. Measure refers to the types
we discussed earlier: nominal, ordinal or scale.
Remember that scale covers both interval and ratio
(continuous) variables.
• Congratulations, you have successfully coded
gender into SPSS.
• Later versions of SPSS (e.g. 19 and 20) have
another variable tool known as ‘Role’ (after
‘Measure’). You can leave this tool at ‘Input’, the
default selection.
Phase 1
• For the income variable, you should follow the
same procedure:
 Name – income; Type – numeric (closed question
with categories to choose from); Width – 8;
Decimals – 0; Label – Income of Respondent;
Values – four categories: 1 = 10-20, 2 = 21-30,
3 = 31-45, 4 = Over 45; Missing – 9999; Measure –
ordinal.
• For the age question, the same procedure is used,
but when you arrive at “Values”, you do not have
to assign numbers because age is a
numerical/continuous variable, that is, the data are
already in the form of numbers. Leave Values as
‘None’. The measure is specified as ‘Scale’.
Phase 1
• For the final question on our survey (Suggestions),
the respondent is asked to offer suggestions in a
space provided. This question is open-ended and
requires textual data (i.e. words). When you reach
‘Type’, change the option to String, which allows you to
enter text or words into the field or cell during the
data entry stage. Increase the width to
255 characters (the maximum character span). The Decimals,
Values and Missing columns are inactive. Type in
the label, e.g. Suggestions from Respondents.
You are done!
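The finished codebook for the sample questionnaire can be summarised as a small data structure. The following Python sketch is only a paper-style summary of Phase 1, not something SPSS reads. The variable name 'suggest', the label for age and the measure for the string variable are assumptions made for illustration; the remaining settings follow the steps above.

```python
# Codebook for the four-question sample questionnaire (Phase 1 settings)
CODEBOOK = {
    "gender": {
        "type": "numeric", "width": 8, "decimals": 0,
        "label": "Gender of Respondent",
        "values": {1: "Male", 2: "Female"},
        "missing": None, "measure": "nominal",
    },
    "age": {
        "type": "numeric", "width": 8, "decimals": 0,
        "label": "Age of Respondent",  # hypothetical label
        "values": None,                # numerical item: no codes needed
        "missing": None, "measure": "scale",
    },
    "income": {
        "type": "numeric", "width": 8, "decimals": 0,
        "label": "Income of Respondent",
        "values": {1: "10-20", 2: "21-30", 3: "31-45", 4: "Over 45"},
        "missing": 9999, "measure": "ordinal",
    },
    "suggest": {                       # hypothetical variable name
        "type": "string", "width": 255, "decimals": None,
        "label": "Suggestions from Respondents",
        "values": None,
        "missing": None, "measure": "nominal",  # assumed measure
    },
}


def value_label(variable, code):
    """Return the value label for a code, or the code itself if none is defined."""
    values = CODEBOOK[variable]["values"]
    if values is None:
        return code
    return values.get(code, code)


print(value_label("gender", 1))
print(value_label("income", 2))
print(value_label("age", 24))
```

Writing the codebook down like this before opening SPSS makes it easier to keep the coding scheme consistent when several people enter data.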
DON’T FORGET TO SAVE YOUR FILE

You can click the ‘Data View’ tab to enter your data (Phase 2).

Variables are listed across the top of the data sheet in Data View.
You enter your data in a row from left to right.
Phase 2 – Data Entry
• Let’s enter this completed questionnaire:
• Gender Male X Female _
• Age 24
• Income 10-20 _ 21-30 X
31-45 _ Over 45 _
• Suggestions
Eat more healthy
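Entering this questionnaire means translating each ticked option into the number assigned to it in Phase 1. The following is a minimal Python sketch of that translation; the function name encode_response and the variable name 'suggest' are hypothetical, and a blank or unrecognised answer is stored as None, mirroring a cell left blank.

```python
# Coding scheme defined in Phase 1
GENDER_CODES = {"Male": 1, "Female": 2}
INCOME_CODES = {"10-20": 1, "21-30": 2, "31-45": 3, "Over 45": 4}


def encode_response(gender, age, income, suggestions):
    """Turn one completed questionnaire into one row of coded data."""
    return {
        "gender": GENDER_CODES.get(gender),  # None if left blank
        "age": age,                          # already a number
        "income": INCOME_CODES.get(income),  # None if left blank
        "suggest": suggestions,              # text is entered as written
    }


row = encode_response("Male", 24, "21-30", "Eat more healthy")
print(row)
```

The resulting row holds 1 for gender, 24 for age, 2 for income and the suggestion text, which is what you would type from left to right in Data View.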
1. You can click the icon (with a label) on the menu bar to see your value labels instead of raw numbers.
2. You can see male (instead of 1) and 21-30 (instead of 2) when your value labels have been requested.
Phase 3 – Analysis of Data
• After entering your data, you can now move on to
analyse your data.
• Descriptive statistics are a branch of statistics
used to describe the characteristics of your
sample. Descriptive statistics include frequencies
and percentages, means and standard deviations,
and crosstabulations.
• For categorical variables (nominal or ordinal), that is,
those questions that have a number of categories or
response options attached, you can use
frequencies/percentages to obtain the
number/proportion of persons that fall into the
different categories. For example, gender and
income are categorical and must be analyzed using
frequencies.
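What a frequency table contains can be shown with a few lines of Python. This is a sketch of the arithmetic only, not of SPSS itself; the ten gender codes (five males, five females) are illustrative, and the function name frequency_table is hypothetical.

```python
from collections import Counter


def frequency_table(codes, value_labels):
    """Count each category and express it as a percentage of valid cases."""
    valid = [c for c in codes if c is not None]  # blanks are missing
    counts = Counter(valid)
    table = {}
    for code, label in value_labels.items():
        n = counts[code]
        pct = 100 * n / len(valid) if valid else 0.0
        table[label] = (n, pct)
    return table


gender = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]  # illustrative codes: 1 = Male, 2 = Female
for label, (n, pct) in frequency_table(gender, {1: "Male", 2: "Female"}).items():
    print(f"{label}: {n} ({pct:.1f}%)")
```

Percentages are based on valid cases only, so blank (missing) responses do not inflate the denominator.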
Phase 3 - Analysis of Data (2)
• For continuous or scale variables such as
age (numerical data), you have to use means
and standard deviations. These statistics (like
frequencies for gender) give you a descriptive
account of your sample in terms of age. The
mean is the same as the average. You use the
mean and standard deviation for
summarising numerical or continuous data
(age).
Descriptive Statistics
• Main descriptive statistics include:
• Frequencies (and percentages)
• Means and standard deviations
• Crosstabulations
1. We are going to analyse ‘gender’ (a categorical variable) using the frequencies/percentages command.
2. Click the ‘Analyse’ tab at the top of the menu bar.
3. Move to ‘Descriptive Statistics’, then click Frequencies in the next box.
1. A dialogue box appears marked ‘Frequencies’. In the left box field, you have a list of the variables you have defined in your Variable View.
2. To run frequencies on a variable, choose the variable from the left box field and, using the arrow in the middle, transfer it to the right box field, under Variable(s).
1. Gender is transferred to the right box field. We want to analyse gender.
2. We can choose different charts to accompany our frequency command on gender. Click on ‘Charts’ to see options.
1. A chart dialogue box appears. You can choose bar charts or pie charts. Let’s choose pie charts.
2. Click Continue and then you can click OK on the first dialogue box.
SPSS opens a separate window called the ‘Output’ Viewer, where the statistical output is presented.

This box tells you the number of valid cases (those who have filled in information for gender) and missing cases (those who have not indicated their gender). There are ten valid cases.

This table shows the frequencies and percentages for males and females in the sample.
To analyse continuous variables such as age, you must use mean and standard deviation statistics. Go to Analyse, ‘Descriptive Statistics’ and then ‘Descriptives’.
Again, use the arrow here to move age into the right box field.
Click Options to see the various descriptive statistics you can conduct on age.
1. Ensure that the mean and standard deviation options are ticked.
2. Click Continue and then click OK on the first dialogue box to see the statistical results.
N = number of persons who indicated their age (sample
size).
Minimum = Smallest value on age (youngest person= 13).
Maximum = Largest value on age (oldest person= 53)
Mean = Average = The average (mean) age of the sample is
26.60 years (SD=12.39). Remember the standard deviation
(SD) is the spread of the scores around the mean.
Remember, mean and standard deviation are used to
summarise continuous data (age, number of children,
weight, height).
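The same five summary figures can be reproduced by hand to check your understanding. The following is a short Python sketch using the standard statistics module; the ages are hypothetical (they are not the sample shown above), the function name describe is an illustration, and the standard deviation is the sample version (dividing by N - 1).

```python
import statistics


def describe(values):
    """Return N, minimum, maximum, mean and sample SD of the valid values."""
    valid = [v for v in values if v is not None]  # blanks are missing
    return {
        "N": len(valid),
        "Minimum": min(valid),
        "Maximum": max(valid),
        "Mean": statistics.mean(valid),
        "SD": statistics.stdev(valid),  # sample SD (N - 1 in the denominator)
    }


ages = [20, 22, None, 24, 26, 28]  # hypothetical ages; None = no answer given
summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here N is 5 rather than 6 because the blank response is excluded before anything is calculated.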
Descriptive Statistics
• Crosstabulations are descriptive tools that are used
to analyse two variables at a time. These variables
must be categorical. Crosstabulations can
summarise a relationship between two variables.
• For example, let’s say we want to analyse income
and gender to determine the number (and
percentage) of males and females that fall into
different income categories. We have to use a
crosstabulation because we are looking to analyse
two variables which are categorical.
To analyse two variables using a crosstabulation, go to Analyse, ‘Descriptive Statistics’, then go to ‘Crosstabs’.
1. Using the arrows, you can transfer the income variable from the left to the right box field under ‘Row’ and transfer the gender variable to the field below income, under ‘Column’. You are analysing gender and income, so one variable must go into the row box and the other must go into the column box.
2. Before conducting the crosstabulation, click on the ‘Cells’ tab at the bottom to request percentages.
1. Click ‘Column’ under Percentages to find out the percentage of males and females in different income categories.
2. Click ‘Continue’ and then click OK on the first dialogue box to run the analysis.
This table tells you the valid number and proportion of cases.

Income categories are in the rows.
Male and Female categories of gender are in the columns.
This table is referred to as a contingency table.
Interpretation of Contingency Table
• The table in the prior slide tells you that 100%
(5 out of 5) of females fell into the
$10-$20 income bracket, whereas all
males fell into the $21-$30 income
bracket.
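The column percentages in a contingency table can be worked out as follows. This is a minimal Python sketch of the calculation, not SPSS output. The ten cases reproduce the pattern described above (five females in the 10-20 bracket, and five males in the 21-30 bracket, the count of five males being an assumption drawn from the ten valid cases); the function name crosstab_column_pct is hypothetical.

```python
from collections import Counter

GENDER_LABELS = {1: "Male", 2: "Female"}
INCOME_LABELS = {1: "10-20", 2: "21-30", 3: "31-45", 4: "Over 45"}

# Each case is (gender code, income code)
cases = [(1, 2)] * 5 + [(2, 1)] * 5


def crosstab_column_pct(cases):
    """Rows = income, columns = gender; each cell holds (count, column %)."""
    cell_counts = Counter(cases)
    column_totals = Counter(gender for gender, _ in cases)
    table = {}
    for income_code, income_label in INCOME_LABELS.items():
        row = {}
        for gender_code, gender_label in GENDER_LABELS.items():
            n = cell_counts[(gender_code, income_code)]
            total = column_totals[gender_code]
            pct = 100 * n / total if total else 0.0
            row[gender_label] = (n, pct)
        table[income_label] = row
    return table


table = crosstab_column_pct(cases)
for income_label, row in table.items():
    print(income_label, row)
```

Column percentages divide each cell by the total for its gender, which is why each column adds up to 100%.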
ENJOY SPSS
Remember, SPSS is a comprehensive statistical
programme that is used to conduct various
statistical analyses on data derived from
surveys. It can be used to generate charts,
tables and other statistics on your data.
You can use it with any course that involves a
research project, thesis or dissertation.
