RESEARCH METHODOLOGY LAB


Atisha Jain 41314901718

MODULE 1
INTRODUCTION TO SPSS

1.1 MEANING OF SPSS:


SPSS stands for Statistical Package for the Social Sciences.

SPSS Statistics is a software package used for interactive or batched statistical analysis. Long
produced by SPSS Inc., it was acquired by IBM in 2009. The current versions (2015) are named
IBM SPSS Statistics.

The software name originally stood for Statistical Package for the Social Sciences (SPSS)
reflecting the original market, although the software is now popular in other fields as well,
including the health sciences and marketing.

1.2 ABOUT SPSS:
SPSS is a widely used program for statistical analysis in social science. It is also used by market
researchers, health researchers, survey companies, government, education researchers, marketing
organizations, data miners and others. The original SPSS manual (Nie, Bent & Hull, 1970) has been
described as one of "sociology's most influential books" for allowing ordinary researchers to do
their own statistical analysis. In addition to statistical analysis, data management (case selection,
file reshaping, creating derived data) and data documentation (a metadata dictionary is stored in
the data file) are features of the base software.

SPSS Statistics places constraints on internal file structure, data types, data processing, and
matching files, which together considerably simplify programming. SPSS datasets have a two-
dimensional table structure, where the rows typically represent cases (such as individuals or
households) and the columns represent measurements (such as age, sex, or household income).
Only two data types are defined: numeric and text (or "string"). All data processing occurs
sequentially case-by-case through the file (dataset). Files can be matched one-to-one and one-to-
many, but not many-to-many. In addition to that cases-by-variables structure and processing,
there is a separate Matrix session where one can process data as matrices using matrix and linear
algebra operations.
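
The cases-by-variables layout and one-to-many matching described above can be sketched in plain Python. This is an illustration only, not SPSS syntax; the variable names (household_id, income, age, sex), the values, and the helper match_one_to_many are all hypothetical.

```python
# Illustrative only: a dataset as rows (cases) by columns (variables).
# All names and values here are hypothetical, not taken from SPSS.

households = [  # one row per household (the "one" side)
    {"household_id": 1, "income": 52000},
    {"household_id": 2, "income": 31000},
]

individuals = [  # one row per person (the "many" side)
    {"household_id": 1, "age": 34, "sex": "F"},
    {"household_id": 1, "age": 36, "sex": "M"},
    {"household_id": 2, "age": 58, "sex": "F"},
]


def match_one_to_many(lookup_rows, case_rows, key):
    """Attach the single matching lookup row to every case that shares its key."""
    lookup = {row[key]: row for row in lookup_rows}
    merged = []
    for case in case_rows:  # processed sequentially, case by case
        merged.append({**lookup.get(case[key], {}), **case})
    return merged


merged = match_one_to_many(households, individuals, "household_id")
for row in merged:
    print(row)
```

Each individual keeps its own row and gains the household's income, which is why a one-to-many match is unambiguous while a many-to-many match is not.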


Several variants of SPSS Statistics exist. SPSS Statistics Grad Packs are highly discounted
versions sold only to students. SPSS Statistics Server is a version of SPSS Statistics with a
client/server architecture. Add-on packages can enhance the base software with additional
features (examples include complex samples, which can adjust for clustered and stratified
samples, and custom tables, which can create publication-ready tables). SPSS Statistics is
available under either an annual or a monthly subscription license.

SPSS Statistics can read and write data from ASCII text files (including hierarchical files), other
statistics packages, spreadsheets and databases. SPSS Statistics can read and write to external
relational database tables via ODBC and SQL.

1.3 FEATURES OF SPSS:
Completely redesigned web reports: Version 23 brings with it the new Web Report with a lot
more interactivity. And because it is web based, you don’t have to worry about the recipient
having a copy of SPSS.

A wider range of R programming options: The combination is really proving powerful, so
SPSS now allows you to call SPSS from R.

Compare Subgroups Plot: Another bit of big news in this release is that there are many new
programmability plug-ins in the menus. IBM has written these for you, so you don’t have to
know any Python. In fact, you don’t need to know where they came from, except that you
have to select Install Python when you install Version 23. As an example, there is a plot in
the Graphs menu that automatically chooses an appropriate graphic based on the Level of
Measurement of the variables.

Split into Files: Another of the Python plug-ins. It makes it easy to create a separate file
for each category of a categorical variable. For instance, you may want one file for new
customers and a separate file for established customers.

Create Dummy Variables: Another Python plug-in. This one creates true/false variables
for each category in a categorical variable. Regression requires categorical predictors to be
coded this way. Many people have been doing this manually for years, but this plug-in makes
it easier.

Styling Output: A Version 22 feature that you may not be using yet and that has not received
enough attention. You can conditionally format your pivot tables. For instance, all
percentages above 80 percent could be highlighted.

Generalized Spatial Association Rule (GSAR): One of the new Geo Spatial Modeling Wizard
options allows you to build a Time Series model using geo mapping information. The idea is to
map events taking place in space over slices of time. For instance, a lot of urban crime is at night,
but suburban breaking-and-entering crimes tend to happen during the workday.
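
To make the Create Dummy Variables feature described above more concrete, here is a minimal sketch of dummy (indicator) coding in plain Python. It is not the SPSS plug-in itself; the helper make_dummies and the customer data are hypothetical.

```python
# Illustrative sketch of dummy (indicator) coding; the data are hypothetical.

def make_dummies(values, prefix):
    """Return one 0/1 indicator column per distinct category, in sorted order."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }


customer_type = ["new", "established", "new", "established", "new"]
dummies = make_dummies(customer_type, "customer")
for name, column in dummies.items():
    print(name, column)
```

In a regression, one of the indicator columns is normally left out as the reference category, because the full set is perfectly collinear with the intercept.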


1.4 ROLE OF COMPUTERISED DATA ANALYSIS:


Software packages are available for the analysis of quantitative and qualitative data. Each package
has different features and the researcher needs to choose carefully. The aim of all of the packages
is to assist in the categorization and matching process. The packages can save time, but there is
still a great deal of time required to set them up and input the data and check through the process.

1.5 ELEMENTS OF SPSS:
Name: is the variable's machine readable name. This is the name used to refer to the variable
in SPSS's underlying code and, if no "Label" is defined, the name that will appear at the top of
the column in the "Data View."

Type: indicates the type of data that can be stored in the variable's column. The most
frequently used types are "String" (for text) and "Numeric." SPSS uses the type to know what
rules can be applied to a specific variable. It won't do arithmetic on a string variable, for
example.

Width: indicates the allowed number of characters per instance.

Decimals: sets the number of decimal places allowed in variable instances.

Label: sets the name that will be displayed at the top of the column in the Data Editor, allowing
for a human readable representation of the variable name.

Values: sets names given to coded values (e.g. if the variable contains survey responses where
a "0" represents "no" and "1" represents a "yes" this field can be used to tell SPSS to display the
text values instead of the numerical raw data).

Missing: sets the values that will be encoded as "Missing."

Columns: sets the displayed column length.

Align: sets the displayed alignment (right, left, or center).

Measure: sets the statistical level of measurement. SPSS distinguishes between "Scale"
(variables that represent a continuous scale like population or temperature), "Ordinal" (variables
that can be rank ordered but do not represent precisely measured values), and "Nominal"
(variables that cannot be ranked such as those that represent labels or classifications).

Role: is used by some SPSS dialogues to distinguish between the variable's intended usage in
some predictive applications (e.g. regression, clustering, and classification). For most dialogues
the role won't be significant.
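
The Values and Missing properties described above can be pictured as two small lookup structures attached to a variable. The following is a rough sketch in plain Python, not SPSS's internal representation; the codes, the labels, and the missing-value code 9 are hypothetical.

```python
# Illustrative sketch; the codes, labels, and missing-value code 9 are hypothetical.

value_labels = {0: "no", 1: "yes"}   # the "Values" property
user_missing = {9}                   # the "Missing" property

responses = [1, 0, 9, 1]             # raw coded data as stored in the column

# Codes declared as missing are left out of the analysis.
valid = [code for code in responses if code not in user_missing]

# Remaining codes are displayed with their labels; unlabeled codes are shown as-is.
labelled = [value_labels.get(code, code) for code in valid]
print(labelled)
```

The raw codes stay in the data; the labels only change how the values are displayed.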


1.6 SCALE OF MEASUREMENT:
The types of scale of measurement are:

• Nominal variables
• Ordinal variables
• Interval variables
• Ratio variables

NOMINAL:

A variable can be treated as nominal when its values represent categories with no intrinsic
ranking. For example, the department of the company in which an employee works.

ORDINAL:

A variable can be treated as ordinal when its values represent categories with some intrinsic
ranking. For example, levels of service satisfaction from highly dissatisfied to highly satisfied.

INTERVAL:

The interval scale is a quantitative measurement scale on which the difference between two
values is meaningful. It is the third level of measurement. In other words, values are
measured in actual units rather than relative to one another, but the zero point is
arbitrary.

RATIO:

Ratio scales are the most informative of the measurement scales: they tell
us about the order, they tell us the exact difference between units, and they also have an
absolute zero, which allows a wide range of both descriptive and inferential statistics to be
applied. Everything said above about interval data applies to ratio scales, plus
ratio scales have a clear definition of zero.
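
The practical difference between an interval and a ratio scale shows up as soon as values are divided. A small worked example in Python, using temperature as an assumed illustration (Celsius has an arbitrary zero, Kelvin an absolute one):

```python
# Ratios are meaningful only on a scale with a true zero.
celsius_a, celsius_b = 20.0, 10.0

# Interval scale: differences are meaningful...
difference = celsius_a - celsius_b          # 10 degrees

# ...but the ratio depends on the arbitrary zero point.
naive_ratio = celsius_a / celsius_b         # 2.0, which does not mean "twice as hot"

# Kelvin has an absolute zero, so it behaves as a ratio scale.
kelvin_a, kelvin_b = celsius_a + 273.15, celsius_b + 273.15
true_ratio = kelvin_a / kelvin_b            # only slightly above 1

print(difference, naive_ratio, round(true_ratio, 3))
```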

1.7 TYPES OF DATA VIEW:

Data View: is where we inspect our actual data.

• The data editor has tabs for switching between Data View and Variable View. For now,
  make sure you're in Data View.
• Columns of cells are called variables. Each variable has a unique name (“gender”) which is
  shown in the column header.
• Rows of cells are called cases. Oftentimes, each respondent in a study is represented as a
  single case.
• In SPSS, values refer to cell contents.
• The status bar may give useful information on the data.


Variable View: is where we see additional information about our data.

• In the bottom-left corner we find tabs for switching between Variable View and Data View.
  For now, select Variable View.
• In Variable View, variables are shown as rows of cells.
• The first column shows the variable name for each variable.
• The fifth column may or may not contain a variable label. This describes the exact meaning
  of each variable.
• The sixth column shows value labels: descriptions of the meaning of one, many or
  all values that a variable may contain.

1.8 BENEFITS OF SPSS:

1. Effective data management:

It is true that a spreadsheet program offers more control over how data are organized, but
this can also be a drawback. In SPSS, by contrast, you cannot move blocks of data around,
because the program enforces a fixed layout: a row represents one case and a column
represents one variable. SPSS makes data analysis quicker because the program
knows the location of the cases and variables. When using a spreadsheet, users must manually
define this relationship in every analysis.

2. Wide range of options:

SPSS is made specifically for analyzing statistical data and thus offers a great range of
methods, graphs and charts. General-purpose programs may offer other functions such as
invoicing and accounting forms, but a specialized program is better suited to statistical
analysis. SPSS also comes with more techniques for screening or cleaning data in preparation
for further analysis. Furthermore, ordinary spreadsheet programs may support only basic data
analysis immediately following installation, with extra plug-ins required for more intricate
techniques.

3. Better output organization:

SPSS is designed to make certain that the output is kept separate from data itself. In fact, it stores
all results in a separate file that is different from the data. However, in programs like Excel,
results of an analysis are placed in one worksheet and there is a likelihood of overwriting other
information by accident.

1.9 USES OF SPSS:

• Data Collection and Organization:

SPSS is often used as a data collection tool by researchers. The data entry screen in SPSS looks
much like any other spreadsheet software. You can enter variables and quantitative data and save
the file as a data file. Furthermore, you can organize your data in SPSS by assigning properties to
different variables.

• Data Output:

Once data is collected and entered into the data sheet in SPSS, you can create an output file from
the data. For example, you can create frequency distributions of your data to determine whether
your data set is normally distributed. The frequency distribution is displayed in an output file.
You can export items from the output file and place them into a research article you're writing.

• Statistical Tests:

The most obvious use for SPSS is to run statistical tests. SPSS has all of the
most widely used statistical tests built into the software. Therefore, you won't have to do any
calculations by hand. Once you run a statistical test, all associated outputs are
displayed in the output file.


MODULE 2
MANAGING DATA IN SPSS
2.1 FINDING OUT THE CASE SUMMARY
Case summaries are used to understand the nature of the data.

2.1.1 ON THE BASIS OF GENDER


1. Click Analyze>>Reports>>Case Summaries.

2. A dialogue box named “Summarize Cases” will appear. Add “Final Marks” to
the “Variables” column and “Gender” to the “Grouping Variables” column.

3. Then go to Statistics, select Mean in the Cell Statistics column and press “Continue”.

4. In the output viewer, the case summaries appear, showing the marks obtained in the
final exam grouped by “Gender”, with the “Mean” for each group.

Case Processing Summary

                                             Cases
                              Included         Excluded         Total
                              N    Percent     N    Percent     N    Percent
Final Marks * What is your    20   100.0%      0    0.0%        20   100.0%
gender ?
a. Limited to first 100 cases.

Case Summaries

What is your gender ?    Case           Final Marks
Female                   1              72
                         2              69
                         3              65
                         4              60
                         5              70
                         6              61
                         7              73
                         8              56
                         9              61
                         10             64
                         11             57
                         12             55
                         Total  Mean    63.58
                                N       12
Male                     1              62
                         2              58
                         3              67
                         4              63
                         5              60
                         6              59
                         7              28
                         8              59
                         Total  Mean    57.00
                                N       8
Total                    Mean           60.95
                         N              20
a. Limited to first 100 cases.
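
The group means reported in the table above can be checked with a short calculation. This is a minimal sketch in Python rather than SPSS syntax; the marks are copied from the table, and the variable names are illustrative.

```python
# Final marks copied from the Case Summaries table above.
final_marks = {
    "Female": [72, 69, 65, 60, 70, 61, 73, 56, 61, 64, 57, 55],
    "Male": [62, 58, 67, 63, 60, 59, 28, 59],
}

# Mean per group, rounded to two decimals as in the SPSS output.
group_means = {
    group: round(sum(marks) / len(marks), 2)
    for group, marks in final_marks.items()
}

# Overall mean across all 20 cases.
all_marks = [mark for marks in final_marks.values() for mark in marks]
overall_mean = round(sum(all_marks) / len(all_marks), 2)

print(group_means)    # {'Female': 63.58, 'Male': 57.0}
print(overall_mean)   # 60.95
```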

2.1.2 ON THE BASIS OF CASTE


1. Click Analyze>>Reports>>Case Summaries.
2. A dialogue box named “Summarize Cases” will appear. Add “Final Marks” to
the “Variables” column and “To which caste do you belong ?” to the “Grouping Variables”
column.
3. Then go to Statistics, select Mean in the Cell Statistics column and press “Continue”.
4. In the output viewer, the case summaries appear, showing the marks
obtained in the final exam grouped by “Caste”, with the “Mean” for each group.

Case Processing Summary

                                             Cases
                              Included         Excluded         Total
                              N    Percent     N    Percent     N    Percent
Final Marks * To which caste  20   100.0%      0    0.0%        20   100.0%
do you belong?
a. Limited to first 100 cases.

Case Summaries

To which caste do you belong?    Case           Final Marks
SC                               1              69
                                 2              61
                                 Total  Mean    65.00
                                        N       2
ST                               1              67
                                 2              57
                                 Total  Mean    62.00
                                        N       2
OBC                              1              73
                                 2              59
                                 Total  Mean    66.00
                                        N       2
MINORITY                         1              62
                                 2              64
                                 Total  Mean    63.00
                                        N       2
GENERAL                          1              72
                                 2              65
                                 3              60
                                 4              58
                                 5              70
                                 6              61
                                 7              63
                                 8              60
                                 9              56
                                 10             28
                                 11             59
                                 12             55
                                 Total  Mean    58.92
                                        N       12
Total                            Mean           60.95
                                 N              20
a. Limited to first 100 cases.

2.2 COMPUTING NEW VARIABLE


1. Click Transform>>Compute Variable.

2. In the window that opens, enter “Mean” as the Target Variable in the top-left corner.

3. Select “Statistical” as the Function group and “Mean” under Functions and Special Variables.

4. Move Midterm 1, 2, 3, 4 and 5 from the Type & Label list into the expression.

5. Click OK.

The computation is now executed and a new variable named Mean has been added to the Data View
and Variable View.
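
What the steps above compute is a row-wise mean: one new value per case, taken across the five midterm variables. A minimal sketch in Python, assuming hypothetical scores and hypothetical variable names (midterm1 to midterm5); like the SPSS MEAN function, the helper averages only the non-missing values.

```python
# Hypothetical midterm scores for two students; None marks a missing value.
students = [
    {"midterm1": 6, "midterm2": 7, "midterm3": 5, "midterm4": 8, "midterm5": 9},
    {"midterm1": 4, "midterm2": None, "midterm3": 6, "midterm4": 5, "midterm5": 5},
]
midterm_vars = ["midterm1", "midterm2", "midterm3", "midterm4", "midterm5"]


def row_mean(case, variables):
    """Mean of the non-missing values in one case, or None if all are missing."""
    valid = [case[v] for v in variables if case[v] is not None]
    return sum(valid) / len(valid) if valid else None


for case in students:
    case["Mean"] = row_mean(case, midterm_vars)  # new variable added to each case

print([case["Mean"] for case in students])
```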

MODULE 3
CODING AND RECODING IN SPSS
3.1 RECODING INTO DIFFERENT VARIABLES: OLD & NEW VALUES
1. Click Transform>>Recode into Different Variables.

FIG. 1

2. Move Final Marks to the Numeric Variable box, define the Output Variable name as “Grade”
and click on Change.

FIG. 2

3. Go to Old & New Values.

4. Then define the ranges and new values as shown and select the “Output variables are
strings” option in the bottom-right corner.

5. Click on Continue>>OK.

6. The recode is now executed and a new variable named Grade has been added to the
Data View and Variable View, showing grade A, B or C according to the marks in the final
exam.
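
The logic of recoding into a different variable can be sketched as a range lookup that leaves the original marks untouched. This is an illustration in Python, not SPSS syntax. The cut-offs below are hypothetical, because the ranges actually entered in the dialog are not stated in the text.

```python
# Hypothetical cut-offs: the ranges used in the dialog are not given in the text.
GRADE_BANDS = [      # (lowest mark, highest mark, grade)
    (70, 100, "A"),
    (60, 69, "B"),
    (0, 59, "C"),
]


def recode_to_grade(mark):
    """Return the grade whose range contains the mark, or None if none matches."""
    for low, high, grade in GRADE_BANDS:
        if low <= mark <= high:
            return grade
    return None


final_marks = [72, 69, 65, 60, 28]                    # a few marks from Module 2
grades = [recode_to_grade(m) for m in final_marks]    # new variable; marks unchanged
print(list(zip(final_marks, grades)))
```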


3.2 RECODING INTO SAME VARIABLE: OLD & NEW VALUES

1. Go to Transform>>Recode into Same Variables.

2. A dialogue box named “Recode into Same Variables” will appear.

3. Select Mean [Midterm] as the numeric variable to recode.

4. Select the Old & New Values option.

5. A dialogue box named “Recode into Same Variables: Old & New Values” appears.
Define the ranges and their new values, i.e. 0 to 5 as “3”; 5.1 to 6 as “2”; 6.1 to 10 as “1”.
Then press Continue.

6. Once “OK” is pressed, the column named MIDTERM reappears in the Data Editor
showing the new values.
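
The ranges from step 5 can be written out as a small lookup to see what the recode does to each value. A minimal sketch in Python, not SPSS syntax; the input values are hypothetical. A value that falls in none of the ranges is kept as it is, which is also how SPSS treats unmatched values when recoding into the same variable.

```python
RANGES = [           # (low, high, new value), as listed in step 5
    (0, 5, 3),
    (5.1, 6, 2),
    (6.1, 10, 1),
]


def recode(value):
    """Return the new code for the first matching range; otherwise keep the value."""
    for low, high, new in RANGES:
        if low <= value <= high:
            return new
    return value


midterm_mean = [4.2, 5.6, 7.0, 5.05]                 # hypothetical values
midterm_mean = [recode(v) for v in midterm_mean]     # original values are overwritten
print(midterm_mean)
```

The last value, 5.05, lies in the gap between "0 to 5" and "5.1 to 6" and so is left unrecoded. With means that carry more than one decimal place, ranges that share their boundaries may be a safer choice.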

MODULE 4
SELECTING, SORTING, AND ANALYSING THE DATA IN SPSS
4.1 SELECT CASES
1. Go to Data>>Select Cases from the drop-down menu.

2. A dialogue box named “Select Cases” will appear; select “If condition is
satisfied”.

3. Click “If”, type “Gender=1” in the expression box and then press Continue.

4. In Data View, the data appear with some changes. Unselected cases (male students) are
marked with a slash through their row numbers, while female students remain unmarked
since they are the selected cases.
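
Selecting cases with a condition amounts to flagging each case and then analysing only the flagged ones. A minimal sketch in Python, not SPSS syntax; the case ids and the field name "selected" are hypothetical, and Gender is assumed to be coded 1 = Female, 2 = Male as described in section 4.3.

```python
# Hypothetical cases; Gender is coded 1 = Female, 2 = Male as in section 4.3.
cases = [
    {"id": 1, "Gender": 1, "FinalMarks": 72},
    {"id": 2, "Gender": 2, "FinalMarks": 62},
    {"id": 3, "Gender": 1, "FinalMarks": 69},
    {"id": 4, "Gender": 2, "FinalMarks": 58},
]

# Flag each case instead of deleting it, so unselected cases stay in the file.
for case in cases:
    case["selected"] = 1 if case["Gender"] == 1 else 0

# Later analyses use only the flagged cases.
selected = [case for case in cases if case["selected"] == 1]
print([case["id"] for case in selected])
```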

4.2 CASE SUMMARIES
1. Go to Analyze>>Reports>>Case Summaries.

2. The “Summarize Cases” dialogue box appears; select “FINAL MARKS” in the Variables
column and “What is your Gender” in the Grouping Variables column.

3. Go to “STATISTICS”; a dialogue box named “Summary Statistics Report” appears.
Choose “Mean” in the Cell Statistics column and click Continue.

4. The output appears in the output viewer, showing the marks obtained in the
final exam by females.

Case Processing Summary

                                             Cases
                              Included         Excluded         Total
                              N    Percent     N    Percent     N    Percent
Final Marks * What is your    12   100.0%      0    0.0%        12   100.0%
gender?
a. Limited to first 100 cases.

Case Summaries

What is your gender ?    Case           Final Marks
Female                   1              72
                         2              69
                         3              65
                         4              60
                         5              70
                         6              61
                         7              73
                         8              56
                         9              61
                         10             64
                         11             57
                         12             55
                         Total  N       12
                                Mean    63.58
Total                    N              12
                         Mean           63.58
a. Limited to first 100 cases.

4.3 SORT CASES
1. Go to Data>>Sort Cases from the drop-down menu.

2. Select the variable by which you want to sort. Here we selected “What is your
Gender”.

3. Select “Ascending” as the sort order and press “OK”.

4. In Data View, the data are arranged on the basis of gender. Since females
were coded 1 and males were coded 2, females appear before males.
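
The effect of sorting on the gender code can be reproduced in a few lines. A minimal sketch in Python, not SPSS syntax; the case ids are hypothetical, and Gender is coded 1 = Female, 2 = Male as stated above.

```python
# Hypothetical cases; Gender is coded 1 = Female, 2 = Male.
cases = [
    {"id": 1, "Gender": 2},
    {"id": 2, "Gender": 1},
    {"id": 3, "Gender": 2},
    {"id": 4, "Gender": 1},
]

# Python's sorted() is stable: cases with the same Gender keep their original order.
ascending = sorted(cases, key=lambda case: case["Gender"])
descending = sorted(cases, key=lambda case: case["Gender"], reverse=True)

print([case["id"] for case in ascending])    # females (code 1) first
print([case["id"] for case in descending])   # males (code 2) first
```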
