
© Sanjay Singh

SPSS For Uninitiated:


A Visual Odyssey for Mortals

"The road to every heaven goes through a hell. Bear this in mind."
- Swami Vivekanand, Complete Works

Sanjay Singh

© Sanjay Singh.
Email: sanjay.singh3210@gmail.com

Images and screenshots of IBM SPSS Statistics software are used in this book for
learning purposes only. The source/credit of any other illustration, image or
resource is duly acknowledged wherever applicable. This book is a "work in
progress" and I will keep it continuously updated. Please do not feel offended if
you find typos or errors anywhere, and kindly do not rush to an assessment of the
IQ & EQ of the author based on some unintentional mistakes that he, as a mortal,
may commit. I am yet to organize content for many chapters, and proofreading
the book is a distant dream. This book is like an experimental release, and it will
take time to give it a final shape. Kindly behave and do not redistribute the content
in an unauthorized manner. Any positive comment or suggestion to improve the book
is welcome.

2020 © Sanjay Singh



Chapter Importing Data in SPSS

Research involves working with data that come in different file types. IBM SPSS Statistics is a
powerful software package capable of handling multiple types of files. No matter if you are a social
researcher comfortable with Excel files or an analytics or design professional working with
advanced database files, SPSS offers a straightforward and intuitive interface to import various
types of data files. In the following, we will see the available file import options and how to work
with them.

TYPES OF DATA FILES IN SPSS STATISTICS

If you want to import any file in SPSS, go to File > Open > Data {Figure 1 (a)}.

Alternatively, you can click the folder icon in the toolbar, which directly opens the dialog box
for locating your data file {Figure 1 (b)}.

Figure 1 (a): Opening Data Files

Figure 1 (b): Opening Data Files


Once you click on this option, you have to select where your data file is located. There is a
range of options available here. By default, the .sav file type is selected in Files of type; it is
the standard file extension type in SPSS. Standard means the current version of SPSS works with it
and it is the most commonly used file format in SPSS. Apart from this, if you click on the Files of
type drop-down list, a range of other file formats is available in SPSS (Figure 3). Let us
understand the use of these file formats.

Figure 3 : File formats available in SPSS

Available File Types

1. SPSS Statistics (.sav) - Standard (commonly used) file extension type in SPSS.

2. SPSS Statistics Compressed (.zsav) - A compressed version of the standard .sav file type.
Whenever we are working with a large data set and have to save disk space, we would want to
compress our data set file. Just as software like WinZip or WinRAR compresses a large file into a
smaller one, the .zsav format stores the data set in compressed form, and SPSS can open such a file
directly.

3. SPSS/PC+ (*.sys) - The extension of this file format is .sys. This file format is not
commonly used these days. A person is most likely to use it if she is working with
DOS (Disk Operating System). These files used to be compatible
with the old IBM PCs, which are no longer in use today. Nevertheless, IBM has maintained
support for some file formats that are not very commonly used these days.

4. Portable (.por) - It stands for the portable file format. It is an important file format that
you can use when you want to share your data files across various operating systems and various
versions of SPSS. The .por format ensures that your file opens easily if somebody is using an older
version of IBM SPSS (say version 15 or 9) or an operating system different from yours.

5. Excel (*.xls, *.xlsx, *.xlsm) - These extensions refer to the Excel file formats. Excel is the
most popular file format, other than the SPSS standard .sav, with which we work in SPSS. Most
researchers collect data in Microsoft Excel or another spreadsheet program and then import the data
into SPSS for analysis.

6. Lotus file format (*.w*) - Lotus 1-2-3 was a spreadsheet program that came under IBM when IBM
acquired Lotus, and it was one of the most popular spreadsheet programs of its time. If you have
your data in the Lotus spreadsheet format, you can use this extension type to open the Lotus file.

7. Sylk (*.slk) - Sylk stands for the symbolic link format. It is a Microsoft file format used to
exchange data between spreadsheets. Sylk and Lotus files are not very common with analysts these
days; Excel is far more popular than either of them.

8. dBase (*.dbf) - The extension of this file format is .dbf. It is used for opening old database
management files. dBase was one of the first database management systems for microcomputers, and
it was very popular in earlier times, but not today. We can again club it with the older formats
that are not very popular these days but are still available in SPSS.


9. SAS file format (*.sas7bdat, *.sd7, *.sd2, *.ssd01, *.ssd04, *.xpt) - This option clubs together
various SAS file types. SAS is a very popular and important analytics program used for data
analysis and modeling. You can open SAS files in SPSS by selecting this file format.

10. Stata (*.dta) - Stata is another software program for data analysis, and it is especially
popular with economists. You can open Stata files, which have a .dta extension, in SPSS. Though
SYSTAT is not listed here, you can also open files from SYSTAT, which is another software program
for quantitative analysis.

11. Text file formats (*.txt, *.dat, *.csv, *.tab) - There are various kinds of text file
formats, with extensions like .txt, .dat, .csv, and .tab. Most of us have a fairly good amount of
experience working with text files.

• .txt is a useful file format used with text-based programs like MS Notepad or MS WordPad.
• .dat is another text-based file format.
• If you compare the file formats .txt, .dat, .csv, and .tab, these are distinguished by the kind
  of delimiter or separator they use for separating the values. .txt is generally the raw file
  format, while for the .dat extension, the values are delimited by a tab.
• .csv is another popular file format, in which the values are separated by commas. You can
  directly create .csv files from MS Excel. Other software packages like SmartPLS typically work
  with the .csv file format.
• .tab is another kind of text-based file format in which the delimiter is the tab.
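
To make the idea of a delimiter concrete, here is a minimal sketch in Python. Python is used only for illustration and is not part of SPSS; the sample values and the helper name read_delimited are made up for this example. It shows that the same two records can be stored with a comma or with a tab as the separator, and that reading them back correctly depends on naming the right delimiter.

```python
import csv
import io

# Hypothetical sample content: the same two records, once comma-delimited
# (as in a .csv file) and once tab-delimited (as in a .tab file).
CSV_TEXT = "age,income\n41,176\n27,31\n"
TAB_TEXT = "age\tincome\n41\t176\n27\t31\n"


def read_delimited(text, delimiter):
    """Split every line of text into values using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_delimited(CSV_TEXT, ",")
tab_rows = read_delimited(TAB_TEXT, "\t")

# Both files carry the same data once the right delimiter is used.
print(csv_rows == tab_rows)

# With the wrong delimiter, each line stays one undivided value.
print(read_delimited(CSV_TEXT, "\t")[0])
```

This is why a text import procedure needs to be told, or needs to detect, which character separates the values.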

In the following, we will learn how to open these various types of file formats in SPSS.

OPENING VARIOUS FILE TYPES IN SPSS

OPENING AN EXCEL FILE IN SPSS

Excel is one of the most popular file types for working with SPSS. Often, as researchers, we
collect our data in MS Excel or another spreadsheet program and then import that data into SPSS.
An Excel data file looks as follows (Figure 4):


Figure 4: Sheets in Excel file

Now, we will open a sample Excel dataset, ‘survey_sample’. The file contains three Excel
sheets: 'survey_sample' data, 'smokers' data, and 'poll' data.
The three different data sets have been combined for demonstration purposes. When you
collect your data, you might save it in several worksheets, and you can import that data
either from different worksheets (like the 3 sheets we have here) or from a single sheet.

To get practice dataset click the web here:


Let us see how we can import data in SPSS.


To import an Excel sheet, you can click on the Folder icon at the top left in the SPSS Data
Editor. This is a shortcut to import data, but if you don't want to use it, you can go to File
> Open > Data (Figure 5). Next, you need to locate your file.

Figure 5: Opening an Excel file in SPSS

You may not see any Excel file in the dialog box, even if there is one in the folder. This occurs
because the default selection in Files of type is SPSS Statistics (.sav). You need to change the
file type from SPSS Statistics (.sav) to Excel in the Files of type drop-down list to see the
Excel files stored there.

Figure 6: Excel files become visible when chosen in 'Files of Type'


Once you select Excel as the file type, the Excel files become visible (Figure 6). Click on the
Excel file and then click Open.


You will notice that SPSS prompts you to select the worksheet you want to import from the
chosen Excel file. When you open the Worksheet drop-down list, you will see the sheet names
available in the Excel file selected. In the current file, there are three worksheets,
‘survey_sample,’ ‘smokers,’ and ‘poll data.’ Currently, we want to import only the ‘survey_sample’
data, so choose only that worksheet (Figure 7).

Figure 7: Dialog box for Opening Excel Data Source


You can specify a particular range of data you want to import. Suppose that, in the ‘survey_sample’
datasheet, you only want the first three columns, A to C. Then type A1:C16 in the Range box to
select data from Excel cell A1 to C16. If you want to import the entire worksheet, you can leave
Range blank and click OK, and your entire data will be imported. Below, we are importing only
cells A1 to C16 (Figure 8).


Figure 8: Excel Data imported in SPSS

Challenge: Suppose we do not want to import the entire worksheet and instead want to
import only the first ten rows of the data. How will you accomplish it?

Solution: Click the Folder icon, locate the file, select the file type as Excel, and click Open.
In this case, the complete survey data spans cells A1 to D17. To import the data only for the
first ten observations in columns A to D, define the range as ‘A1:D11’. You may wonder, why D11
and not D10? Because the first row holds only the variable names, and we require data for the
first ten individuals, we need rows 2 to 11. That is why we take D11. Click OK. That's how you
can import your entire Excel sheet or just a selected part of it (Figure 9).
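
The arithmetic behind the range can be checked with a small sketch in Python. This is only an illustration of the counting logic, not something SPSS requires; the function names column_index and describe_range are hypothetical, and the sketch assumes the range starts at the row that holds the variable names.

```python
import re


def column_index(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def describe_range(cell_range, header_rows=1):
    """Return (number of columns, number of data rows) for a range like 'A1:D11'."""
    match = re.fullmatch(r"\s*([A-Za-z]+)(\d+)\s*:\s*([A-Za-z]+)(\d+)\s*", cell_range)
    if match is None:
        raise ValueError(f"not a valid cell range: {cell_range!r}")
    first_col, first_row, last_col, last_row = match.groups()
    n_cols = column_index(last_col) - column_index(first_col) + 1
    n_rows = int(last_row) - int(first_row) + 1
    # The header row(s) hold variable names, so they do not count as observations.
    return n_cols, n_rows - header_rows


print(describe_range("A1:D11"))  # (4, 10): four variables, ten observations
print(describe_range("A1:C16"))  # (3, 15): three variables, fifteen observations
```

The same reasoning works for any number of observations: the last row of the range is the number of observations wanted plus the one header row.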

Figure 9: Importing selective ranges of data from Excel


OPENING A COMMA SEPARATED OR CSV FILE TYPE IN SPSS

Let us learn how to import a CSV file in SPSS. CSV stands for comma-separated values, and it is
an important file format with which we often work in SPSS. Apart from this, if you are working
with some specialized software programs like SmartPLS, which is used for partial least squares
structural equation modeling (PLS-SEM), you generally use the CSV file format. We will use a CSV
file, bankloan.csv, for demonstration purposes. I have created this CSV file from the
bankloan.sav sample data set that comes built in with SPSS.

To locate the ‘bankloan.sav’ dataset, visit C:\Program Files\IBM\SPSS\Statistics\25\Samples\English

Figure 10: Locating a dataset in SPSS root folder


Now, let us import the file in SPSS.


CSV files are imported into SPSS by using the Text Import wizard. Click on the Folder icon at
the top left of SPSS. Change the file type to Text (.txt, .dat, .csv) (Figure 11) so we can see
the bankloan.csv file. Click Open.

Figure 11: Opening CSV file via Folder Icon in SPSS

It generates a preview for you. Now, the Text Import wizard starts working, and it performs the
process in six steps. It will ask you certain questions, and you just have to follow the steps.

Step 1

1. Does your text file match a predefined format? We are not using any predefined format. Click
No > Next. If you do select a predefined format, you first have to Browse and identify your file
format.

Figure 12: Step 1 of importing CSV in SPSS


Step 2

2. How are your variables arranged? This checks the arrangement of variables, whether they are
delimited or have a fixed width. Here, our file is delimited with commas, so we select Delimited,
not Fixed width.
3. Are variable names included at the top of your file? Variable names are included at the top of
our file, like, say, gender, age, etc. Click Yes. The line number that contains the variable names
is line number one. If they are located in line number 2 or some other line, which is very
unusual, you can indicate that line number instead.
4. What is the decimal symbol? The decimal symbol is Period in our case. If it is Comma, you can
select Comma. You can see the preview of your data in the dialog box below this question. Click
Next.

Figure 13: Step 2 of importing a CSV File in SPSS
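
What the questions in Step 2 mean for the data can be sketched in Python. This is an illustration only and does not show how SPSS works internally; the sample text and the function names read_with_header and to_number are assumptions made for this example.

```python
import csv
import io

# Hypothetical file content: line 1 holds the variable names, data start on line 2.
TEXT = "age,income\n41,176.5\n27,31.0\n"


def read_with_header(text, header_line=1):
    """Return (variable names, data rows); header_line is counted from 1."""
    rows = list(csv.reader(io.StringIO(text)))
    names = rows[header_line - 1]
    data = rows[header_line:]
    return names, data


def to_number(value, decimal_symbol="."):
    """Convert text to a number, treating decimal_symbol as the decimal mark."""
    return float(value.replace(decimal_symbol, "."))


names, data = read_with_header(TEXT)
print(names)                    # the variable names from line 1
print(to_number(data[0][1]))    # a value written with a period as decimal symbol
print(to_number("176,5", ","))  # the same value written with a comma as decimal symbol
```

Note that a file using the comma as its decimal symbol would normally use a different delimiter, such as the semicolon, so that the two roles of the comma do not collide.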


Step 3

5. The first case of data begins on which line number? It is line number 2 because line number 1
holds the variable names.
6. How are your cases represented? Each line represents a case is true in our file. A case is an
observation, so the question asks how observations are laid out across the lines.
7. How many cases do you want to import? The correct option here is All of the cases, as our
interest is in importing all the cases instead of the first few cases or a random percentage of
cases. If you want to import only the first few rows of your data, like 100 cases, 1,000 cases or
2,000 cases, you can select the second option. If you want only a random part or percentage of
your data, you can select the third option. I find this third option very useful because it helps
in doing a sort of random sampling. Here, select All of the cases. Again, you can see the preview
and proceed to Next.

Figure 14: Delimited Step 3 of importing CSV in SPSS
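
The difference between importing the first few cases and importing a random percentage of cases can be sketched in Python. This is one simple way to express the two ideas, under my own assumptions; it is not a description of how SPSS draws its sample, and the function names are hypothetical.

```python
import random
from itertools import islice


def first_n_cases(cases, n):
    """Keep only the first n cases, in their original order."""
    return list(islice(cases, n))


def random_percentage(cases, percent, seed=None):
    """Keep each case independently with probability percent/100.

    The size of the result is therefore only approximately percent% of the data.
    """
    rng = random.Random(seed)
    return [case for case in cases if rng.random() < percent / 100]


cases = list(range(1, 1001))  # 1,000 hypothetical case numbers

print(len(first_n_cases(cases, 100)))             # exactly 100 cases
print(len(random_percentage(cases, 10, seed=1)))  # roughly 100 cases, not exactly
```

The first option always returns the top of the file, which may not be representative if the file is sorted; the random option spreads the selection over the whole file.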

Step 4


8. What delimiter appears between the variables? The types of delimiters available here are Tab,
Space, Comma, Semicolon, and Other. Our delimiter is Comma. Note that a different delimiter may
be checked by default; if so, you need to change it to Comma.
9. What is the text qualifier? Select None, as there is no text qualifier in the current case.
10. Leading and trailing spaces: This asks about the presence of any leading or trailing spaces
at the beginning or the end of values. If there are any, you can select the options to remove the
leading and trailing spaces. It is not applicable in our case, so we will not select either
option here. Click Next.

Figure 15: Delimited Step 4 of importing CSV in SPSS

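
The three questions of Step 4 can be illustrated with a short Python sketch. It is an illustration under my own assumptions, not SPSS code; the sample line and the function name parse are made up. In the sample, the address contains a comma, so it is wrapped in double quotes (the text qualifier), and the last value carries stray spaces.

```python
import csv
import io

# Hypothetical content: a quoted value that contains the delimiter,
# and a value padded with leading and trailing spaces.
TEXT = 'id,address,income\n1,"12 Main St, Springfield",  176  \n'


def parse(text, delimiter=",", qualifier='"', strip_spaces=True):
    """Split text into rows of values.

    qualifier=None means the file has no text qualifier.
    strip_spaces removes leading and trailing spaces from every value.
    """
    if qualifier is None:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quotechar=qualifier)
    return [[v.strip() if strip_spaces else v for v in row] for row in reader]


print(parse(TEXT)[1])                       # three values; the address stays whole
print(len(parse(TEXT, qualifier=None)[1]))  # four values; the address is split
```

When a file has no quoted values and no stray spaces, as in our bankloan example, these settings make no difference, which is why we leave them alone.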
Step 5

11. Specifications for variable selected in the data preview: In Variable name, we are taking all
the variables the way they have been defined in the original CSV data set, so we are not making
any changes. However, if you want, you can make changes. In Data format, we are selecting the
Automatic option because SPSS automatically determines what kind of data format each variable
has. For instance, if a variable is numeric and you select Automatic, it will be defined as a
numeric variable. If you want to change it to any other format, you can do that, but it would
require spending much time identifying and defining each and every variable, so we are not going
to do that. Keep this option as Automatic, the default. When in doubt while working with SPSS, I
suggest choosing the default options in most cases. Further, click on Next.

Figure 16: Step 5 of importing CSV in SPSS
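
To give a feel for what an automatic choice of data format involves, here is a minimal sketch in Python. The rule used here (a variable is numeric if every non-empty value can be read as a number) is my own simplifying assumption and is not the actual rule SPSS applies; the function name infer_format is hypothetical.

```python
def infer_format(values):
    """Guess 'Numeric' if every non-empty value reads as a number, else 'String'."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        # Nothing to judge from, so fall back to the safest choice.
        return "String"
    for v in non_empty:
        try:
            float(v)
        except ValueError:
            return "String"
    return "Numeric"


print(infer_format(["41", "27.5", ""]))  # Numeric
print(infer_format(["41", "n/a"]))       # String
```

A single stray text entry in a column, such as "n/a", is enough to turn the whole variable into a string under this rule, so it is worth cleaning such entries in the source file before importing.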

Step 6

12. Would you like to save this file format for future use? Currently, we are not saving it for
future use, so select No.
13. Would you like to paste the syntax? If you want to paste the syntax in the syntax editor, you
can select Yes. If you don't want to paste the syntax, you can click on No. Click Finish.

Figure 17: Step 6 of importing CSV in SPSS

Figure 18: The imported CSV dataset in SPSS


The CSV data has been imported into SPSS. If you select another kind of text file format, like a
tab-delimited or space-delimited file, the procedure is the same: you again use the Text Import
wizard.
