
PRACTICAL FILE

ON
“RESEARCH METHODOLOGY”
GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY

BACHELOR OF BUSINESS
ADMINISTRATION
BATCH 2020-2023
SUBMITTED BY : SAKSHI GARG
SUBMITTED TO:-
DR. MADHU ARORA
CERTIFICATE

This is to certify that the practical titled “RESEARCH METHODOLOGY LAB”,
submitted by SAKSHI GARG to NEW DELHI INSTITUTE OF MANAGEMENT,
Guru Gobind Singh Indraprastha University, in partial fulfilment of the
requirements for the award of the Bachelor of Business Administration degree, is an
original piece of work carried out under my guidance and may be submitted for
evaluation.

The assistance rendered during the study has been duly acknowledged.
No part of this work has been submitted for any other degree.

Place : New Delhi


Faculty guide
DR. MADHU ARORA
Date :
ACKNOWLEDGMENT

Any accomplishment requires the effort of many people, and this work is no different.
Regardless of the source, I wish to express my gratitude to those who may have
contributed to this work, even anonymously.

I would like to offer my sincere thanks to my RESEARCH METHODOLOGY LAB
faculty, DR. PREETI AGGARWAL, under whose guidance I was able to
complete my practical successfully. I have been fortunate enough to get all the
support, encouragement and guidance from her that I needed to explore, think afresh
and take initiative.

My final thanks go out to my parents, family members, teachers and friends, who
encouraged me countless times to persevere through this entire process.
TABLE OF CONTENTS
S.No. CONTENTS Page No.

Chapter 1 Introduction to SPSS

1.1 Introduction to SPSS

1.2 How to install SPSS

Chapter 2 Layout of SPSS

2.1 Layout of SPSS

2.2 Components of SPSS

2.3 Graphs

Chapter 3 SPSS Lab Exercise

3.1 Exercise 1- Descriptive Statistics

3.2 Exercise 2 – Histogram

3.3 Exercise 3 – Crosstabs

3.4 Exercise 4 – Chi Square

3.5 Exercise 5 – T Test

3.6 Exercise 6 – Independent Sample Test

3.7 Exercise 7 – One-Way ANOVA

3.8 Exercise 8 – Correlations

3.9 Exercise 9 – 3D Bar graph

3.10 Exercise 10 – Pie Chart

Annexure Questionnaire
EXERCISE 1
INTRODUCTION TO SPSS
SPSS - What Is It?
SPSS stands for “Statistical Package for the Social Sciences” and
was first launched in 1968. Since SPSS was acquired by IBM in
2009, it has officially been known as IBM SPSS Statistics, but most
users still just refer to it as “SPSS”.
SPSS is software for editing and analysing all sorts of data.
These data may come from basically any source: scientific
research, a customer database, Google Analytics or even the
server log files of a website. SPSS can open all file formats that
are commonly used for structured data, such as
• spreadsheets from MS Excel or OpenOffice;
• plain text files (.txt or .csv);
• relational (SQL) databases;
• Stata and SAS.
SPSS Data View
After opening data, SPSS displays them in a spreadsheet-like
fashion, as shown in the screenshot below from freelancers.sav.
This sheet, called Data View, always displays our data values. For
instance, our first record seems to contain a male respondent
from 1979, and so on. A more detailed explanation of the exact
meaning of our variables and data values is found in a second
sheet, shown below.

SPSS Variable View


An SPSS data file always has a second sheet called Variable
View. It shows the metadata associated with the data. Metadata
is information about the meaning of variables and data values.
This is generally known as the “codebook”, but in SPSS it's
called the dictionary. For non-SPSS users, the look and feel of
SPSS' Data Editor window probably comes closest to an Excel
workbook containing two different but strongly related sheets.
Data Analysis
Right, so SPSS can open all sorts of data and display them, and
their metadata, in two sheets in its Data Editor window. So how
do we analyse data in SPSS? Well, one option is using SPSS'
elaborate menu options. For instance, if our data contain a
variable holding respondents' incomes over 2010, we can
compute the average income by navigating to Descriptive
Statistics as shown below. Doing so opens a dialog box in which
we select one or many variables and one or several statistics
we'd like to inspect.

SPSS Output Window


After clicking OK, a new window opens up: SPSS' Output Viewer
window. It holds a nice table with all statistics on all variables
we chose. The screenshot below shows what it looks like. As we
see, the Output Viewer window has a different layout and
structure than the Data Editor window we saw earlier. Creating
output in SPSS does not change our data in any way; unlike
Excel, SPSS uses different windows for data and research
outcomes based on those data. For non-SPSS users, the look and
feel of SPSS' Output Viewer window probably comes closest to
a PowerPoint slide holding items such as blocks of text, tables
and charts.
SPSS Reporting
SPSS Output items, typically tables and charts, are easily copy-
pasted into other programs. For instance, many SPSS users use a
word processor such as MS Word, OpenOffice or Google Docs
for reporting. Tables are usually copied in rich text format,
which means they'll retain their styling such as fonts and
borders. The screenshot below illustrates the result.
SPSS Syntax Editor Window
The output table we showed was created by running Descriptive
Statistics from SPSS' menu. Now, SPSS has a second option for
running this (or any other) command: we can open a third
window, known as the Syntax Editor window. Here we can type
and run SPSS code, known as SPSS syntax. For instance, running
descriptives income_2010.
has the exact same result as running this command from SPSS'
menu like we did earlier. Besides typing commands into the
Syntax Editor window, most of them can also be pasted into it by
clicking through SPSS' menu options. This way, SPSS users
unfamiliar with syntax can still use it. But why use syntax if SPSS
has such a nice menu? The basic point is that syntax can be saved,
corrected, rerun and shared between projects or users. Your
syntax makes your SPSS work replicable. If anybody raises any
doubts regarding your outcomes, you can show exactly what you
did and, if needed, correct and rerun it in seconds. For non-SPSS
users, the look and feel of SPSS' Syntax Editor window probably
comes closest to Notepad: a single window basically just
containing plain text.
SPSS - Overview Main Features
Now that we have a basic idea of how SPSS works, let's take a
look at what it can do. Following a typical project workflow,
SPSS is great for
• opening data files, either in SPSS' own file format or many
others;
• editing data, such as computing sums and means over columns
or rows of data (SPSS has outstanding options for more complex
operations as well);
• creating tables and charts containing frequency counts or
summary statistics over (groups of) cases and variables;
• running inferential statistics such as ANOVA, regression and
factor analysis;
• saving data and output in a wide variety of file formats.
We'll now take a closer look at each one of these features.

Opening Data Files


SPSS has its own data file format. Other file formats it easily
deals with include MS Excel, plain text files, SQL, Stata and
SAS.
Web analytics data -often downloaded as Excel files- can easily
be opened and further analysed in SPSS.
Editing Data
In real-world research, raw data usually need some editing
before they can be properly analysed. Typical examples are
creating means or sums as new variables, restructuring data or
detecting and removing unlikely observations. SPSS performs
such tasks, and more complex ones, with amazing efficiency. For
getting things done fast, SPSS contains many numeric functions,
string functions, date functions and other handy routines.
Tables and Charts
All basic tables and charts can be created easily and fast in
SPSS. Typical examples are demonstrated under Data Analysis.
A real weakness of SPSS is that its charts tend to be ugly and
often have a clumsy layout. A great way to overcome this
problem is developing and applying SPSS chart templates.
Doing so, however, requires a fair amount of effort and expertise.
SPSS clustered bar chart with chart template applied.

Inferential Statistics
SPSS contains all basic statistical tests and multivariate analyses,
such as
• t-tests;
• chi-square tests;
• ANOVA;
• correlations and other association measures;
• regression;
• nonparametric tests;
• factor analysis;
• cluster analysis.
Some analyses are available only after purchasing additional
SPSS options on top of the main program. An overview of all
commands and the options to which they belong is presented in
Overview All SPSS Commands.
SPSS One Sample T-Test Output Example
Saving Data and Output
SPSS data can be saved in a variety of file formats, including
• MS Excel;
• plain text (.txt or .csv);
• Stata;
• SAS.
The options for output are even more elaborate: charts are often
copy-pasted as images in .png format. For tables, rich text format
is often used because it retains the tables' layout, fonts and
borders. Besides copy-pasting individual output items, all output
items can be exported in one go to PDF, HTML, MS Word and
many other file formats. A terrific strategy for writing a report is
creating an SPSS output file with nicely styled tables and charts,
then exporting the entire document to Word and inserting
explanatory text and titles between the output items.
Right, I hope that gives at least a basic idea of what SPSS is and
what it does. Let's now explore SPSS in some more detail,
starting off with the Data Editor window. We'll present many
more examples in the next couple of tutorials as well.

APPLICATIONS OF SPSS
Statistical Package for the Social Sciences (SPSS) is a window-based
program first launched in 1968. In 2009, SPSS was acquired
by IBM. Hence, it is officially known as IBM SPSS Statistics.
SPSS is widely used in the social and behavioural sciences. It is
also used by health researchers, market researchers, survey
companies, education researchers, government, etc. Various
windows can be opened when using SPSS such as data editor,
output navigator, pivot table editor, chart editor, text output
editor, and syntax editor. The data editor is a spreadsheet in
which variables can be defined and entered into the data. Each
row corresponds to a case while each column represents a
variable. This window opens automatically when SPSS is
started. The output navigator window displays the statistical
results, tables, and charts from the analysis. Output displayed in
pivot tables can be modified in many ways with the pivot table
editor. It is possible to modify and save high-resolution charts
and plots by invoking the chart editor for a certain chart in an
output navigator window. Text output not displayed in pivot
tables can be modified with the text output editor. SPSS contains
all basic statistical tests and multivariate analyses such as t-tests,
chi-square tests, ANOVA, correlations and regressions, non-parametric
tests, cluster analysis, etc. IBM SPSS Statistics 26
continues to increase accessibility to advanced analytics through
improved tools, integration, output, and ease-of-use features.
This release mainly focuses on increasing the analytic
capabilities of the software through quantile regression, ROC
analysis, Bayesian statistics, one sample binomial and Poisson
enhancements, reliability analysis, and command enhancements.
SPSS software is used for editing and analysing all sorts of data
available from scientific research, clinical studies, customer
databases, Google Analytics, etc. SPSS can open all file formats
that are commonly used for structured data, such as spreadsheets
from MS Excel, plain text files, relational databases, Stata, SAS,
etc. SPSS Statistics can read and write data from ASCII text
files, other statistics packages, spreadsheets, and databases.
Statistical output is saved in a proprietary file format, which
can be exported to text, Microsoft Word, PDF, Excel, and
other formats. The typical workflow of SPSS software is as
follows:
• Opening data files in SPSS file format or others.
• Editing data such as computing sums and means over
columns or rows of data.
• Creating tables and charts containing frequency counts or
summary statistics over cases and variables.
• Running inferential statistics such as one-way ANOVA,
two-way ANOVA, regression, correlation, factor analysis, etc.
• Saving data and output in different file formats.

It will look like this after you apply the IF condition in the Excel sheet.
EXERCISE 2
LAYOUT OF SPSS

Opening SPSS Data


When SPSS is launched, a pop-up window with a few options (the IBM SPSS welcome screen)
will appear. If the goal is to analyze a data set, one can select New Dataset, or open
a recently used file or another file under Recent Files, and then click OK. The other panes
show What’s New, Modules and Programmability, and Tutorials, which help one to navigate
SPSS.

Sometimes you have already entered the SPSS session as
described above, worked on a data set for a while, and then want to open and work on another
data set. You do not have to quit the current SPSS session to do this. Simply click on the
File menu, follow Open then Data… and find your file.

OPEN SPSS

SPSS looks like this when you start it


SPSS Windows

The SPSS program has three main types of windows: the data editor, output window and syntax
window. The data editor window is open by default, and contains the data set. It consists of two
views, the Data View and the Variable View. This window is described in more detail in the
section on Working With Data and Variables. Data files are saved with a file type of .sav.

The output window holds the results of analyses. This window will open automatically once an
analysis is requested. The tables of the Output Viewer are saved (click File, Save or Save As)
with a file type of .spv, which can only be opened with SPSS software.

The syntax window contains written commands corresponding to each menu command and its
options. Syntax can be created by clicking {Paste} instead of {OK} in the main dialog for each
procedure. Using {Paste} will not cause the procedure to be performed. To run procedures from
the syntax window, click the Run Selection button (a green triangle) in the toolbar.

The syntax window will only open if a syntax file is opened by the user, or if the paste option is
used when executing a command. Output and syntax files can be saved and opened using the File
menu. Multiple output and syntax files can be open at the same time. Syntax files are saved as
plain text with a file extension of .sps, so almost any text editor can open them.

Dialogue Boxes

Although each dialog box is unique, they have many common features. A fairly typical example
is the dialog box for producing frequency tables (tables with counts and percents). To bring up
this dialog box from the menus in the data window, click on Analyze → Descriptive Statistics →
Frequencies.
Working with Data and Variables

Viewing data and variables

Data in SPSS can be viewed in two different ways: data view and variable view. The data view
allows the user to look at the entire data set, with each row showing a different observation, and
each column representing a different variable. Another way to view the data is to use the variable
view. This shows the variable names and general properties for each variable. The user can
alternate between these views using the tabs at the bottom left hand side of the SPSS data editor
window. Figure 8 below shows the data view.
Define variable properties

To define or change the attributes of variables, change to “Variable View” to see a list of all the
variables with their properties from the current data file. Click or double click the variable you
would like to specify or change. Descriptions of each attribute are shown below.
Name is the name of the variable. Rules for establishing variable names can be found in IBM
SPSS help → Command Syntax Reference → Universals → Variables → Variable Names.

Type is the type of a variable. Common options are Numeric for numbers, Date for dates, and
String for character strings. The string option allows the user to type in any set of characters
including punctuation marks and blank spaces. It is ideal for entering answers to open-ended
questions which are not coded.

Width is the maximal number of characters or digits allowed for a variable. Generally a width
large enough to accommodate all the possible values of the variable should be chosen; otherwise
any values with length greater than the specified value will be truncated.

Decimals is valid for numeric variables only. It specifies the number of decimal places displayed
for a variable. Values with more decimals are rounded for display only; the complete value is
stored and used in all computations, so this setting does not affect the precision of the analysis.

Label is the descriptive label for a variable. One can assign descriptive variable labels up to
256 characters long, and variable labels can contain spaces and reserved characters not allowed in
variable names.
Values is the descriptive value labels for each value of a variable. This is particularly useful if
the data file uses numeric codes to represent non-numeric categories (for example, codes of 1 and
2 for male and female).

Missing specifies some data values as user-missing values. Refer to the Missing Values section
for more detail.

Columns is the column width for a variable. Column formats affect only the display of values
in the Data Editor. Changing the column width does not change the defined width of a variable. If
the defined and actual width of a value are wider than the column, asterisks (*) are displayed in
the Data view. Column widths can also be changed in the Data view by clicking and dragging
the column borders.

Align controls the display of data values and/or value labels in the Data view. The default
alignment is right for numeric variables and left for string variables. This setting affects only the
display in the Data view.

Measure is the level of measurement as scale (numeric data on an interval or ratio scale),
ordinal, or nominal. Nominal and ordinal data can be either string (alphanumeric) or numeric.
Nominal and ordinal are both treated as categorical. For example, the variable origin (Country of
Origin) is measured on a nominal scale, as the cars are distinguished on the basis of a name or
label, i.e. American, European, and Japanese, whilst the variable gallon (miles per gallon) is
measured on a scale, specifically a ratio scale, because both differences and ratios between
measurements are meaningful and the variable has a true zero value.

The Cars data file used in these examples is an SPSS file (i.e. with the .sav extension) with all
variable attributes edited as in the example.

Missing Values

Missing values are a topic that deserves special attention. This section explains why they arise
and how to define them. In SPSS there are two types of missing values: user defined missing
values and system missing values. By default in SPSS, both types of missing values will be
disregarded in all statistical procedures, except for analyses devoted specifically to missing
values, for example, replacing missing values. In frequency tables, missing values will be shown,
but they will be marked as such and will not be used in computation.
User Defined Missing Values

User defined missing values indicate data values that are either missing, due to reasons like
nonresponse, or not desired to be used in most analyses (e.g. “Not Applicable”). By default
SPSS uses “.” to represent missing values. In some cases, there might be a need to distinguish
between data missing because a respondent refused to answer and data missing because the
question did not apply to that respondent, so more than one code for missing values is
required. One can achieve this by setting up the “Missing” property of the corresponding
variable to specify some data values as missing values. These options allow one to enter up to
three discrete missing values, a range of missing values, or a range plus one discrete missing
value. All string values, including null or blank values, are considered valid values unless they
are explicitly defined as missing. To define null or blank values as missing for a string variable,
enter a single space in one of the fields for discrete missing values. You will notice missing
values denoted by “.” for the variable mpg in observations 11-15. The example below shows
how to specify user defined missing values for the variable mpg by setting up its “Missing” property.
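
Outside SPSS, the effect of declaring user defined missing values can be sketched in a few lines of Python. This is only an illustration of the idea, not SPSS behaviour reproduced exactly: the codes 98 and 99, their meanings, and the mpg values are hypothetical, and None stands in for the system missing “.”.

```python
from statistics import mean

# Hypothetical codes: 98 = "refused to answer", 99 = "not applicable".
USER_MISSING = {98, 99}

def valid_values(values, user_missing=USER_MISSING):
    """Drop system-missing entries (None) and user defined missing codes."""
    return [v for v in values if v is not None and v not in user_missing]

# Hypothetical mpg responses; None plays the role of the system missing ".".
mpg = [18.0, 15.0, 98, 16.0, None, 99, 17.0]

valid = valid_values(mpg)
print(len(valid), "valid of", len(mpg))  # 4 valid of 7
print(mean(valid))                       # 16.5
```

Without the filter, the codes 98 and 99 would be averaged in as if they were real mileage figures, which is the mistake the “Missing” property is meant to prevent.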
System Missing Values

System missing values occur when no value can be obtained for a variable during data
transformations. For example, if there are two variables, one indicating a person’s gender and the
other whether she or he is married and you create a new variable that tells whether (a) a person is
male and married, (b) female and married, (c) male and not married, all females that are not
married will have a system missing value (“.”) instead of a real value.
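
The gender and marriage example can be mimicked with a small Python sketch. The function name, the group codes 1 to 3, and the sample cases are hypothetical; None is used here as the analogue of a system missing value.

```python
def marital_group(gender, married):
    """Return a code for one of the three defined groups, or None if no rule applies."""
    if gender == "male" and married:
        return 1  # (a) male and married
    if gender == "female" and married:
        return 2  # (b) female and married
    if gender == "male" and not married:
        return 3  # (c) male and not married
    return None   # no rule matched: the analogue of a system missing "."

# Hypothetical cases covering all four combinations.
people = [("male", True), ("female", True), ("male", False), ("female", False)]
groups = [marital_group(g, m) for g, m in people]
print(groups)  # [1, 2, 3, None]
```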

Modifying and creating new variables

Insert

The easiest way to manually input a new variable is to scroll through the data view spreadsheet
horizontally until the first empty column is encountered, and then enter the data. The new
variable can be named appropriately in the variable view spreadsheet. Alternatively, selecting the
“Insert Variable” option under the “Data” menu allows insertion of a new variable at other
locations in the table. By default, this inserts the new variable in the first column of the
spreadsheet, but this can be changed by highlighting the column to the right of the desired
location.

Recode

The recode function can be used to collapse ranges of data into categorical variables, and
to reassign existing values to other values. To create a new variable as a function of another (log,
sin, etc.), use “Compute” (described in the next section).

1. Select Recode into Different Variables under the Transform menu. Recoding into Same
Variables is not recommended, since it will change existing variables and you will lose the
original values.

2. Select each variable to be transformed, and move it into the section on the right hand side
using the arrow button. Note that the same transformation will be applied to all of these variables.
If different types of transformations are required, each transformation needs to be done separately.

3. If new variables are being created, define a name for the output variable on the right hand side.
If desired, a label can be entered as well, though it is not required. Once the desired name and label
are entered, you must click the Change button.
4. Select the Old and New Values button and the window below will appear. In the Old Value
side of the window, select the appropriate original values to be recoded.

a) By selecting Value one can specify a value to replace (e.g. “male” or “1”). It is case sensitive,
so “A” and “a” are considered to be different values.

b) System-Missing and System- or user-missing allow missing values to be replaced by actual
values. It is not recommended to recode missing values using this method, unless there is a strong
reason to do so. Missing values should be handled with care, using techniques such as multiple
imputation.

c) The three range options partition numeric variables into categories. The above figure
demonstrates how a range of values of a continuous variable can be condensed into a category.
Rather than running any procedures to find out the range of a variable, the range options LOWEST
through value and value through HIGHEST can be used to catch every point in the data set.

d) All other values can be used to pick up values not specifically referenced elsewhere.
5. On the New Value side, type in the new value. Then click the Add button to add it to the
Old --> New list. When recoding into different variables, one has the option of changing numbers to
strings, or converting numbers saved as strings to numbers. Unless otherwise specified, the new
variable will be saved in the same format as the original variable. Click Continue to close the
window. On the main screen, click OK or Paste to finish.
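
The recoding steps above can be sketched in Python to make the range logic concrete. This is a minimal illustration, not the SPSS implementation: the function name recode, the cut points, and the mpg values are hypothetical, and infinities stand in for LOWEST and HIGHEST.

```python
import math

def recode(value, rules, else_value=None):
    """Return the new value of the first (low, high, new) rule that contains value."""
    if value is None:
        return None  # leave missing values missing rather than recoding them
    for low, high, new in rules:
        if low <= value <= high:
            return new
    return else_value  # plays the role of "All other values"

LOWEST, HIGHEST = -math.inf, math.inf

# Hypothetical cut points for miles per gallon.
mpg_rules = [
    (LOWEST, 19.99, 1),  # LOWEST through 19.99 -> 1 (low)
    (20, 29.99, 2),      # 20 through 29.99     -> 2 (medium)
    (30, HIGHEST, 3),    # 30 through HIGHEST   -> 3 (high)
]

mpg = [18.0, 24.5, 36.1, None, 9.0]
mpg_cat = [recode(v, mpg_rules) for v in mpg]  # a new list, the original is untouched
print(mpg_cat)  # [1, 2, 3, None, 1]
```

Writing the result to a separate list mirrors the advice in step 1: the original values stay available if the cut points need to change later.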

Compute

Suppose you want to create a new variable measuring the ratio of the vehicle’s weight to its
horsepower. You define the new variable as weihorse, for the weight per unit horsepower. To
create a new variable as a function of one or more existing variables, select Compute from the
Transform menu. Enter the name of the new variable, weihorse, in the Target Variable box. In the
Numeric Expression box, use the keypad, function list, and the variable list to write out the
equation used to compute the new variable (in this example: weight/horse). Click OK or Paste to
close the window.
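
As a rough Python analogue of this Compute step, the sketch below derives weihorse from weight and horse for each case. The three cars and their values are hypothetical, and treating a missing or zero horsepower as a missing result is an assumption made for the illustration.

```python
# Hypothetical cars: weight in lbs and horsepower; None marks a missing value.
cars = [
    {"weight": 3504, "horse": 130},
    {"weight": 3693, "horse": 165},
    {"weight": 2372, "horse": None},
]

for car in cars:
    w, h = car["weight"], car["horse"]
    # A missing or zero input leaves the computed value missing.
    car["weihorse"] = w / h if w is not None and h else None

print([car["weihorse"] for car in cars])
```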
CREATING GRAPHS

Graphs in SPSS may be generated using one of two options. The first option is the Legacy
Dialogs, which allows one to create basic charts and graphs. The second option is to use the Chart
Builder which allows one to generate charts either from a predefined gallery or by specifying
individual parts (for example, axes and bars). The steps to create a few common graphs are
shown below. However, SPSS has the ability to produce many other graphs such as population
pyramid, error bar, and 3-D bar chart. The Chart Builder allows more flexibility in creating
graphs. For any graph generated in SPSS, one can double click on the graph to invoke a Chart
Editor window, inside which one can double click any part of the graph to edit it.

Scatterplot

Suppose we seek to investigate the linear relationship between miles per gallon and vehicle
weight. We first draw a scatterplot to see the direction in which they are related. We will introduce
the Simple Scatterplot. In the “Graphs” menu, choose Legacy Dialogs → Scatter/Dot. Select
Simple Scatter and click on the Define button to get the window shown below. Select a variable for
the Y-axis and a variable for the X-axis. These variables must be numeric and not in date format.
One can also select a categorical variable to define rows of panels and another categorical
variable to define columns of panels. Using the “Title” button one may specify the title, subtitles
and the footnotes for the plot. In the following example we are plotting mpg against vehicle
weight, using model year to define rows of panels.
Histogram

A histogram shows the distribution of a single numeric variable. By selecting Legacy Dialogs →
Histogram in the Graphs menu, one can generate a histogram. One can check the Display normal
curve option to have an estimated normal curve displayed over the histogram. Suppose you
want to draw histograms of the miles per gallon based on the origin of the vehicle. In the Panel by
box, either in the rows or columns, you can put the variable origin, as shown below.
Q-Q Plot

The Q-Q Plot (quantile-quantile plot) procedure plots the quantiles of a variable's distribution
against the quantiles of a variable from a test distribution. Q-Q plots are generally used to
investigate whether the distribution of a variable is consistent with a proposed distribution.
Specifically, Q-Q plots can be used to investigate whether a variable (e.g. residuals in a
regression model) follows a Normal distribution. If the distribution of the variable and the
proposed distribution are the same, points in the Q-Q plot follow a straight line. If the
distributions are not similar, points in the Q-Q plot deviate from the straight line. Suppose you
want to generate a Q-Q plot with a Normal distribution as the test distribution. Select Descriptive
Statistics → Q-Q Plots in the Analyze menu. Enter the variables you want to plot into the
Variables box, and select Normal by clicking Test Distribution. Click OK to generate the plot.
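
To see what a Normal Q-Q plot actually pairs up, the following Python sketch computes the plotted points by hand. It is an illustration under stated assumptions: the mpg sample is hypothetical, the Normal distribution is fitted with the sample mean and standard deviation, and Blom's plotting position (i - 3/8)/(n + 1/4) is used as one common convention.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(sample):
    """Pair each sorted observation with the matching quantile of a fitted Normal."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    ordered = sorted(sample)
    # Blom's plotting position, one common choice (an assumption in this sketch).
    return [
        (fitted.inv_cdf((i - 0.375) / (n + 0.25)), x)
        for i, x in enumerate(ordered, start=1)
    ]

# Hypothetical miles-per-gallon sample.
mpg = [18.0, 15.0, 16.0, 17.0, 24.0, 22.0, 21.0, 27.0, 26.0, 25.0]

for expected, observed in normal_qq_points(mpg):
    print(f"{expected:6.2f}  {observed:6.2f}")
```

If the sample were close to Normal, the two printed columns would rise together along a roughly straight line, which is the pattern one looks for in the plot.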
Syntax File

Here is an example of the syntax for the Q-Q plot in Figure 18. After selecting Descriptive
Statistics → Q-Q Plots in the Analyze menu, enter the variable you want to plot, mpg (Miles per
Gallon), into the Variables box, and select Normal by clicking Test Distribution. Then
click Paste to generate the syntax, shown below.

You can save the syntax as a .sps file for later access in running the analysis.
DATA VIEW of SPSS
VARIABLE VIEW of SPSS
EXERCISE 1

DESCRIPTIVE STATISTICS
In the Analyze menu, the option Descriptive Statistics produces a submenu with the
choices Frequencies, Descriptives, Explore, Crosstabs, and Ratio. Of these,
Crosstabs and Descriptives have some particularly useful features which this
manual will cover. More information on the other three can be found in the SPSS
help menu, which is discussed in the section “Help in SPSS” of this manual.

Descriptives

The Descriptives procedure calculates univariate statistics for selected variables.
In addition, it provides the option of creating a standardized variable for the
selected variables. Simply check the box at the bottom of the window to save
the standardized variable. The options menu provides a list of univariate
statistics available. For more statistics or computing statistics by group, see the
Means procedure under Compare Means.
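
A standardized variable is a z-score: each value minus the mean, divided by the standard deviation. The Python sketch below shows the calculation on hypothetical income codes, assuming the sample standard deviation (n - 1 denominator) is the one intended.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1 denominator)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

income = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
z = standardize(income)
print([round(v, 2) for v in z])
```

By construction the standardized values have mean 0 and standard deviation 1, which makes variables measured in different units directly comparable.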
Frequencies

[DataSet1]

Statistics

Variable                                                                   N Valid   N Missing
gender                                                                     9         1
age                                                                        9         1
martial status                                                             9         1
what is your income ?                                                      9         1
how often do you typically use the beer shampoo ?                          9         1
how did the product perform ?                                              9         1
how was the purchasing experience ?                                        9         1
how was the after purchasing services(warranty, customer services, etc.)   9         1
how was our brand is better than other brands ?                            9         1

Frequency Table

Gender

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      5           50.0      55.6            55.6
         2.0      4           40.0      44.4            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0
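
The three percentage columns differ only in their base: Percent divides by all cases, Valid Percent by the non-missing cases, and Cumulative Percent accumulates the valid percentages. The Python sketch below rebuilds the Gender rows from the counts in the table (five cases coded 1.0, four coded 2.0, one missing); the function name is hypothetical and None stands in for the missing case.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent)."""
    n_total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows, cumulative = [], 0
    for value in sorted(counts):
        freq = counts[value]
        cumulative += freq
        rows.append((
            value,
            freq,
            round(100 * freq / n_total, 1),           # Percent: base is all cases
            round(100 * freq / len(valid), 1),        # Valid Percent: base is valid cases
            round(100 * cumulative / len(valid), 1),  # Cumulative Percent
        ))
    return rows

# Gender codes matching the counts in the table: five 1.0, four 2.0, one missing.
gender = [1.0] * 5 + [2.0] * 4 + [None]
for row in frequency_table(gender):
    print(row)
# (1.0, 5, 50.0, 55.6, 55.6)
# (2.0, 4, 40.0, 44.4, 100.0)
```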

Age

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      5           50.0      55.6            55.6
         2.0      2           20.0      22.2            77.8
         4.0      2           20.0      22.2            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0

martial status

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      7           70.0      77.8            77.8
         2.0      2           20.0      22.2            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0
what is your income ?

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      4           40.0      44.4            44.4
         2.0      1           10.0      11.1            55.6
         3.0      1           10.0      11.1            66.7
         4.0      3           30.0      33.3            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0

how often do you typically use the beer shampoo ?

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      3           30.0      33.3            33.3
         2.0      4           40.0      44.4            77.8
         3.0      1           10.0      11.1            88.9
         4.0      1           10.0      11.1            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0

how was the after purchasing services(warranty, customer services, etc.)

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      1           10.0      11.1            11.1
         2.0      2           20.0      22.2            33.3
         3.0      2           20.0      22.2            55.6
         4.0      3           30.0      33.3            88.9
         5.0      1           10.0      11.1            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0

how was our brand is better than other brands ?

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1.0      1           10.0      11.1            11.1
         2.0      4           40.0      44.4            55.6
         3.0      3           30.0      33.3            88.9
         5.0      1           10.0      11.1            100.0
         Total    9           90.0      100.0
Missing  System   1           10.0
Total             10          100.0

EXERCISE 2

Histogram

The histogram is a popular graphing tool. It is used to summarize discrete or
continuous data that are measured on an interval scale. It is often used to illustrate
the major features of the distribution of the data in a convenient form. It is also
useful when dealing with large data sets (greater than 100 observations). It can help
detect any unusual observations (outliers) or any gaps in the data.
A histogram divides up the range of possible values in a data set into classes or
groups. For each group, a rectangle is constructed with a base length equal to the
range of values in that specific group and a height equal to the number of
observations falling into that group. A histogram has an appearance similar to a
vertical bar chart, but there are no gaps between the bars.
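
The grouping step described above can be sketched in a few lines of Python. This is only an illustration of how values are counted into equal-width classes, not how SPSS draws the chart. The function name histogram_counts and the ages list are hypothetical and are not taken from the survey data.

```python
def histogram_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width classes.

    Each class covers [start, start + width); the last class also
    includes the upper limit `high`.
    """
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue  # value lies outside the plotted range
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical ages of ten respondents (illustrative data only).
ages = [21, 22, 22, 25, 27, 31, 33, 34, 38, 45]
counts = histogram_counts(ages, low=20, high=50, n_bins=3)

# Print a simple text histogram: one bar per class, no gaps between classes.
for i, count in enumerate(counts):
    start = 20 + i * 10
    print(f"{start}-{start + 10}: " + "#" * count)
```

The height of each bar is the class count, so the three classes here have heights 5, 4 and 1.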
EXERCISE 3

Crosstabs
The Crosstabs procedure forms two-way and multi-way tables and
provides a variety of tests and measures of association for two-way
tables. Multi-way tables are formed using the ‘Layer’ button. Note that
tests are not made across layers. When layers are used, comparisons are
made for the row and column variables at each value of the layer
variable. The Statistics button at the bottom allows various statistics to
be computed, including correlations and Chi-square tests. To help
uncover patterns in the data that contribute to a significant chi-square
test, the Cells button provides options for displaying expected
frequencies and three types of residuals (deviates) that measure the
difference between observed and expected frequencies. Each cell of the
table can contain any combination of counts, percentages, and residuals
selected.

Case Processing Summary

                                                                                  Cases
                                                                  Valid           Missing            Total
                                                                N   Percent     N   Percent     N   Percent
gender * martial status                                         9     90.0%     1     10.0%    10    100.0%
gender * what is your income ?                                  9     90.0%     1     10.0%    10    100.0%
gender * how often do you typically use the beer shampoo ?      9     90.0%     1     10.0%    10    100.0%

gender * martial status

Crosstab

                                             martial status
                                              1.0       2.0     Total
gender   1.0     Count                          3         2         5
                 % within gender            60.0%     40.0%    100.0%
                 % within martial status    42.9%    100.0%     55.6%
         2.0     Count                          4         0         4
                 % within gender           100.0%      0.0%    100.0%
                 % within martial status    57.1%      0.0%     44.4%
Total            Count                          7         2         9
                 % within gender            77.8%     22.2%    100.0%
                 % within martial status   100.0%    100.0%    100.0%
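
The "% within" rows of the crosstab are row and column percentages of the cell counts. A minimal Python sketch, assuming the four counts from the gender by martial status table (3, 2, 4, 0), shows how they are obtained. The function name crosstab_percentages is chosen only for this example.

```python
def crosstab_percentages(table):
    """Return (row percentages, column percentages) for a table of counts.

    Row percentages divide each cell by its row total ("% within" the
    row variable); column percentages divide by the column total.
    Assumes no row or column total is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    row_pct = [
        [round(100 * cell / total, 1) for cell in row]
        for row, total in zip(table, row_totals)
    ]
    col_pct = [
        [round(100 * cell / total, 1) for cell, total in zip(row, col_totals)]
        for row in table
    ]
    return row_pct, col_pct


# Rows: gender 1.0 and 2.0; columns: martial status 1.0 and 2.0.
observed = [[3, 2], [4, 0]]
row_pct, col_pct = crosstab_percentages(observed)
print("% within gender:", row_pct)
print("% within martial status:", col_pct)
```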

Chi-Square Tests

                                 Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                 (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square               2.057(a)    1      .151
Continuity Correction(b)          .394       1      .530
Likelihood Ratio                 2.805       1      .094
Fisher's Exact Test                                               .444         .278
Linear-by-Linear Association     1.829       1      .176
N of Valid Cases                     9

a. 4 cells (100.0%) have expected count less than 5. The minimum expected count is .89.
b. Computed only for a 2x2 table
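
The Pearson Chi-Square value and the minimum expected count in the table above can be reproduced from the gender by martial status counts. The following Python sketch is an illustration under that assumption and is not the SPSS procedure itself. The function name chi_square_independence is hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom, expected counts) for a table.

    Expected count of a cell = row total * column total / N.
    Chi-square = sum over cells of (observed - expected)^2 / expected.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Rows: gender 1.0 and 2.0; columns: martial status 1.0 and 2.0.
observed = [[3, 2], [4, 0]]
chi2, df, expected = chi_square_independence(observed)
min_expected = min(min(row) for row in expected)
print(f"chi-square = {chi2:.3f}, df = {df}, minimum expected count = {min_expected:.2f}")
```

All four expected counts are below 5, which is what footnote a warns about. With such a small sample the chi-square approximation is weak, and the Fisher's Exact Test row is the safer one to read.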

Chi-Square Tests

                                 Value      df   Asymp. Sig.
                                                 (2-sided)
Pearson Chi-Square               3.262(a)    3      .353
Likelihood Ratio                 4.048       3      .256
Linear-by-Linear Association     1.600       1      .206
N of Valid Cases                     9

a. 8 cells (100.0%) have expected count less than 5. The minimum expected count is .44.

EXERCISE 4
CHI – SQUARE

The Chi-Square Test of Independence determines whether there is an association
between categorical variables (i.e., whether the variables are independent or
related). It is a nonparametric test.

This test is also known as:

• Chi-Square Test of Association

This test utilizes a contingency table to analyze the data. A contingency table (also
known as a cross-tabulation, crosstab, or two-way table) is an arrangement in
which data is classified according to two categorical variables. The categories for
one variable appear in the rows, and the categories for the other variable appear in
columns.

The Chi-Square Test of Independence is commonly used to test the following:

• Statistical independence or association between two or more categorical
  variables.

The Chi-Square Test of Independence can only compare categorical variables. It
cannot make comparisons between continuous variables or between categorical and
continuous variables. Additionally, the Chi-Square Test of Independence only
assesses associations between categorical variables, and cannot provide any
inferences about causation.

Chi-Square Tests

                                 Value      df   Asymp. Sig.
                                                 (2-sided)
Pearson Chi-Square               2.250(a)    3      .522
Likelihood Ratio                 3.001       3      .391
Linear-by-Linear Association     1.800       1      .180
N of Valid Cases                     9

a. 8 cells (100.0%) have expected count less than 5. The minimum expected count is .44.
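
The "Asymp. Sig. (2-sided)" value is the upper-tail probability of the chi-square distribution for the given Value and df. As a rough check, the Python sketch below computes it with the standard library only, using the closed-form series for whole-number degrees of freedom. The function name chi_square_p_value is an assumption for this example, and the sketch is not how SPSS itself computes the figure.

```python
import math


def chi_square_p_value(chi2, df):
    """Upper-tail probability of a chi-square statistic with integer df >= 1.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1) with z = chi2 / 2,
    starting from Q(1/2, z) = erfc(sqrt(z)) for odd df
    and Q(1, z) = exp(-z) for even df.
    """
    z = chi2 / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(z))
    else:
        a = 1.0
        q = math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Pearson Chi-Square rows reported in this file: (Value, df).
for value, df in [(2.250, 3), (3.262, 3), (2.057, 1)]:
    print(f"chi-square = {value}, df = {df}, p = {chi_square_p_value(value, df):.3f}")
```

Since every p-value is above 0.05, none of these tests shows a statistically significant association at the 5% level.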
EXERCISE 5
T-Test
A t-test is a type of inferential statistic used to determine whether there is a
significant difference between the means of two groups, which may be related in
certain features. It is mostly used when the data sets are expected to follow a normal
distribution and may have unknown variances, for example the data set recorded as
the outcome from flipping a coin 100 times. A t-test is used as a hypothesis testing
tool, which allows testing of an assumption applicable to a population.

A t-test looks at the t-statistic, the t-distribution values, and the degrees of
freedom to determine the statistical significance. To conduct a test with three or
more means, one must use an analysis of variance.

T-Test

Group Statistics

                                                                            gender   N    Mean    Std. Deviation   Std. Error Mean
how did the product perform ?                                               1.0      5    2.800       1.3038            .5831
                                                                            2.0      4    3.000        .8165            .4082
how was the purchasing experience ?                                         1.0      5    3.200       1.3038            .5831
                                                                            2.0      4    3.000        .8165            .4082
how was the after purchasing services(warranty, customer services, etc.)    1.0      5    3.200       1.3038            .5831
                                                                            2.0      4    3.000       1.4142            .7071
how was our brand is better than other brands ?                             1.0      5    2.800       1.4832            .6633
                                                                            2.0      4    2.250        .5000            .2500
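
The "Std. Error Mean" column in Group Statistics is the standard deviation divided by the square root of the group size. A short Python sketch, using values copied from the table, illustrates the calculation. The function name standard_error_of_mean is chosen only for this example.

```python
import math


def standard_error_of_mean(std_dev, n):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return std_dev / math.sqrt(n)


# (Std. Deviation, N) pairs taken from the Group Statistics table.
for std_dev, n in [(1.3038, 5), (0.8165, 4), (1.4832, 5), (0.5000, 4)]:
    print(f"SD = {std_dev}, N = {n}, SE = {standard_error_of_mean(std_dev, n):.4f}")
```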
EXERCISE 6
INDEPENDENT SAMPLE TEST

The Independent Samples t Test compares the means of two independent groups in
order to determine whether there is statistical evidence that the associated
population means are significantly different. The Independent Samples t Test is a
parametric test.

This test is also known as:

• Independent t Test
• Independent Measures t Test
• Independent Two-sample t Test
• Student t Test
• Two-Sample t Test
• Uncorrelated Scores t Test
• Unpaired t Test
• Unrelated t Test

The variables used in this test are known as:

• Dependent variable, or test variable
• Independent variable, or grouping variable

Independent Samples Test

                                Levene's Test for
                                Equality of Variances   t-test for Equality of Means
                                                                                                     95% Confidence Interval of the Difference
                                       F    Sig.        t      df  Sig. (2-tailed)  Mean Difference  Std. Error Difference     Lower     Upper
how did the product perform ?
  Equal variances assumed          1.031    .344    -.266       7             .798           -.2000                  .7521   -1.9785    1.5785
  Equal variances not assumed                       -.281   6.727             .787           -.2000                  .7118   -1.8971    1.4971
how was the purchasing experience ?
  Equal variances assumed          1.896    .211     .266       7             .798            .2000                  .7521   -1.5785    1.9785
  Equal variances not assumed                        .281   6.727             .787            .2000                  .7118   -1.4971    1.8971
how was the after purchasing services(warranty, customer services, etc.)
  Equal variances assumed           .007    .934     .220       7             .832            .2000                  .9071   -1.9450    2.3450
  Equal variances not assumed                        .218   6.287             .834            .2000                  .9165   -2.0181    2.4181
how was our brand is better than other brands ?
  Equal variances assumed          1.922    .208     .702       7             .505            .5500                  .7835   -1.3028    2.4028
  Equal variances not assumed                        .776   5.080             .472            .5500                  .7089   -1.2636    2.3636
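
The t, df and Std. Error Difference columns can be reproduced from the Group Statistics alone (N, mean and standard deviation of each group). The Python sketch below is an illustration under that assumption. It covers both rows of the output, "equal variances assumed" (pooled variance) and "equal variances not assumed" (Welch's approximation). The function names are hypothetical, and the significance and confidence interval columns are not computed here.

```python
import math


def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """t statistic, df and standard error assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df, std_error


def welch_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """t statistic, df and standard error without assuming equal variances."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / std_error, df, std_error


# "how did the product perform ?" by gender, from the Group Statistics table.
group1 = (5, 2.800, 1.3038)  # gender 1.0: N, Mean, Std. Deviation
group2 = (4, 3.000, 0.8165)  # gender 2.0: N, Mean, Std. Deviation

t, df, se = pooled_t_test(*group1, *group2)
print(f"Equal variances assumed:     t = {t:.3f}, df = {df}, SE = {se:.4f}")
t, df, se = welch_t_test(*group1, *group2)
print(f"Equal variances not assumed: t = {t:.3f}, df = {df:.3f}, SE = {se:.4f}")
```

Levene's test decides which row to read: its Sig. for this variable is .344, which is above 0.05, so the "equal variances assumed" row applies.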

EXERCISE 7
Oneway ANOVA
The one-way analysis of variance (ANOVA) is used to determine whether there are any
statistically significant differences between the means of two or more independent (unrelated)
groups (although it is usually used when there are at least three groups rather than two).
For example, you could use a one-way ANOVA to understand whether exam
performance differed based on test anxiety levels amongst students, dividing students into three
independent groups (e.g., low, medium and high-stressed students). It is important to realize
that the one-way ANOVA is an omnibus test and cannot tell you which specific groups
were statistically significantly different from each other; it only tells you that at least two groups
were different. Since you may have three, four, five or more groups in your study design,
determining which of these groups differ from each other is important. You can do this using a
post hoc test.

The one-way ANOVA is therefore often followed up with a post hoc test. Before running the
procedure in SPSS Statistics, you also need to check the assumptions that your data must meet
in order for a one-way ANOVA to give you a valid result.

ANOVA

                                                                                             Sum of Squares   df   Mean Square       F    Sig.
how did the product perform ?                                              Between Groups             3.189    2         1.594   1.678    .264
                                                                           Within Groups              5.700    6          .950
                                                                           Total                      8.889    8
how was the purchasing experience ?                                        Between Groups             6.689    2         3.344   9.121    .015
                                                                           Within Groups              2.200    6          .367
                                                                           Total                      8.889    8
how was the after purchasing services(warranty, customer services, etc.)   Between Groups             3.189    2         1.594    .986    .426
                                                                           Within Groups              9.700    6         1.617
                                                                           Total                     12.889    8
how was our brand is better than other brands ?                            Between Groups             2.422    2         1.211    .932    .444
                                                                           Within Groups              7.800    6         1.300
                                                                           Total                     10.222    8
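
The Mean Square and F columns of the ANOVA table follow directly from the sums of squares and degrees of freedom. A minimal Python sketch, using the figures reported above, shows the arithmetic. The function name anova_f is an assumption for this example, and the Sig. column is not computed here.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio).

    Mean Square = Sum of Squares / df, and
    F = Mean Square between groups / Mean Square within groups.
    """
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# (label, SS between, df between, SS within, df within) from the ANOVA table.
rows = [
    ("product performance", 3.189, 2, 5.700, 6),
    ("purchasing experience", 6.689, 2, 2.200, 6),
    ("after purchasing services", 3.189, 2, 9.700, 6),
    ("brand vs other brands", 2.422, 2, 7.800, 6),
]
for label, ss_b, df_b, ss_w, df_w in rows:
    ms_b, ms_w, f = anova_f(ss_b, df_b, ss_w, df_w)
    print(f"{label}: MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f:.3f}")
```

Only the purchasing experience row has a Sig. value below 0.05 (.015), so it is the only variable for which the group means differ significantly in this output.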

EXERCISE 8
Correlations
The Bivariate Correlations procedure computes Pearson's correlation coefficient, Spearman's rho,
and Kendall's tau-b with their significance levels. Correlations measure how variables or rank
orders are related. Before calculating a correlation coefficient, screen your data for outliers
(which can cause misleading results) and evidence of a linear relationship. Pearson's correlation
coefficient is a measure of linear association. Two variables can be perfectly related, but if the
relationship is not linear, Pearson's correlation coefficient is not an appropriate statistic for
measuring their association.

The procedure reports, for each variable, the number of cases with nonmissing values, the mean,
and the standard deviation. For each pair of variables it reports Pearson's correlation
coefficient, Spearman's rho, Kendall's tau-b, the cross-product of deviations, and the covariance.
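
To illustrate why Pearson's coefficient measures only linear association, the Python sketch below computes it from the cross-product of deviations. The function name pearson_r and both small data sets are hypothetical and are not part of the survey data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: cross-product of deviations divided by the
    square root of the product of the two sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_product = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_product / math.sqrt(ss_x * ss_y)


# Hypothetical data: a perfectly linear relationship (y = 2x).
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))

# Hypothetical data: a perfect but non-linear relationship (y = x squared).
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))
```

In the second data set y is completely determined by x, yet the coefficient is zero, which is why the data should be screened for a linear relationship before the coefficient is interpreted.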

Correlations

                                                                                                 how did the   how was the    how was the after      how was our
                                                                                                 product       purchasing     purchasing services    brand is better
                                                                                                 perform ?     experience ?   (warranty, customer    than other
                                                                                                                              services, etc.)        brands ?
how did the product perform ?                                              Pearson Correlation   1             .800**         .571                   .478
                                                                           Sig. (2-tailed)                     .010           .108                   .193
                                                                           N                     9             9              9                      9
how was the purchasing experience ?                                        Pearson Correlation   .800**        1              .644                   .466
                                                                           Sig. (2-tailed)       .010                         .061                   .206
                                                                           N                     9             9              9                      9
how was the after purchasing services(warranty, customer services, etc.)   Pearson Correlation   .571          .644           1                      .474
                                                                           Sig. (2-tailed)       .108          .061                                  .197
                                                                           N                     9             9              9                      9
how was our brand is better than other brands ?                            Pearson Correlation   .478          .466           .474                   1
                                                                           Sig. (2-tailed)       .193          .206           .197
                                                                           N                     9             9              9                      9

**. Correlation is significant at the 0.01 level (2-tailed).

EXERCISE 9
3D BAR GRAPH

Three-dimensional graphs are rarely used in practice except for didactic purposes.
They are kind of cool though and especially helpful for visualizing the idea of the
regression plane in a two-predictor multiple regression. Good luck visualizing a
four-dimensional graph for three predictors, however! The 3d orientation of the
plots for the various plotting methods below seems to vary considerably. I like the
orientation used on the R scatterplot3d package the best.

SPSS
The GGRAPH command is used and there are a number of options for appearances that I did not
employ. The order of the dimensions under the GUIDE statements is dimension 1 (x-width),
dimension 2 (y-depth), and dimension 3 (z-height). The dependent variable is typically put on
the vertical axis (z dimension). The name "graphdataset" appearing on the NAME keyword is an
arbitrary name and it can be any name you choose. It names the data set read out and used in the
later SOURCE command, so these two names must match exactly. Note that the Years Since PhD
axis values are descending rather than ascending.
EXERCISE 10
PIE CHART
A Pie Chart is a type of graph that displays data in a circular graph. The pieces of the graph are
proportional to the fraction of the whole in each category. In other words, each slice of the pie is
relative to the size of that category in the group as a whole. The entire “pie” represents 100
percent of a whole, while the pie “slices” represent portions of the whole.
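
The proportionality described above can be worked out by hand: each slice takes the same share of the 360 degree circle as its category takes of the total. The Python sketch below is an illustration only, using the valid income counts from Exercise 1 (4, 1, 1 and 3). The function name pie_slices is hypothetical.

```python
def pie_slices(counts):
    """Return {category: (percent of whole, slice angle in degrees)}."""
    total = sum(counts.values())
    return {
        category: (round(100 * count / total, 1), round(360 * count / total, 1))
        for category, count in counts.items()
    }


# Valid answers to "what is your income ?" (categories 1.0 to 4.0).
income_counts = {"1.0": 4, "2.0": 1, "3.0": 1, "4.0": 3}
for category, (percent, angle) in pie_slices(income_counts).items():
    print(f"category {category}: {percent}% of the pie, {angle} degrees")
```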

Pie charts give you a snapshot of how a group is broken down into smaller pieces. The
following chart shows what New Yorkers throw in their trash cans. You could read that New
Yorkers (perhaps surprisingly) throw a lot of recyclables into their trash, but a pie graph gives a
clear picture of the large percentage of recyclables that find their way into the trash.

IBM SPSS Statistics is software specifically designed for statistics, especially in the social sciences.
The software is capable of creating a large number of graph types with a huge variety of options.
Unlike simpler programs like Excel, SPSS gives you a lot of options for creating pie charts.
Questionnaire
