Data Analysis and Interpretation

Quantifying Data
Data Entry
• Define variables, enter case data, conduct runs
• Coding and recoding
• If numeric values are not pre-assigned, decide on a coding system
• If there are open-ended data, decide how to deal with the responses
• Defining your variables
Data Cleaning
• Reread each set of responses back (immediately) to confirm accuracy
• “Possible-code cleaning”
• The easiest way to check is to run a frequency distribution
• Contingency cleaning
• On the “if” questions
• “Sort” by response
• e.g., “Do you recycle?”, then check the “What do you recycle?” variable
• Can also run cross tabs and make sure the cells that should be empty are empty
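The two cleaning checks above can be sketched in Python with only the standard library. This is a minimal illustration, not a prescribed procedure: the records, the variable names (`recycles`, `what_recycled`), and the set of valid codes are all hypothetical, chosen to mirror the recycling example.

```python
from collections import Counter

# Hypothetical coded survey records: 1 = yes, 2 = no; None = item skipped.
records = [
    {"id": 1, "recycles": 1, "what_recycled": "paper"},
    {"id": 2, "recycles": 2, "what_recycled": None},
    {"id": 3, "recycles": 7, "what_recycled": None},     # 7 is not a legal code
    {"id": 4, "recycles": 2, "what_recycled": "glass"},  # contingency error
]

VALID_RECYCLES = {1, 2}  # assumed codebook values for this variable

# Possible-code cleaning: a frequency distribution exposes illegal codes.
freq = Counter(r["recycles"] for r in records)
illegal_codes = sorted(code for code in freq if code not in VALID_RECYCLES)

# Contingency cleaning: only respondents who recycle (code 1) should have
# an answer to the follow-up question.
contingency_errors = [
    r["id"] for r in records
    if r["recycles"] != 1 and r["what_recycled"] is not None
]

print(freq)
print("Illegal codes:", illegal_codes)
print("Contingency errors (case ids):", contingency_errors)
```

Flagged cases would then be checked against the original questionnaires rather than corrected automatically.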
Basic Analysis – Measures of Central Tendency
• Mean: sum of values divided by the number of cases
• simple average
• Median: middle attribute in an ordered list of observed attributes
• not distorted by extreme cases
• Mode: most frequently occurring attribute
• used with nominal variables, e.g., sex
• most respondents were women
• usually reported with a percentage, e.g., 60% were women
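As a small illustration of the three measures, the sketch below uses Python's `statistics` module. The ages and sex codes are hypothetical; the age list includes one extreme case to show why the median can be the safer summary.

```python
import statistics

# Hypothetical data: ages of seven respondents, with one extreme case (80).
ages = [21, 22, 22, 23, 24, 25, 80]
# Hypothetical nominal variable for five respondents.
sexes = ["F", "F", "M", "F", "M"]

mean_age = statistics.mean(ages)      # sum of values / number of cases
median_age = statistics.median(ages)  # middle value of the sorted list
modal_sex = statistics.mode(sexes)    # most frequently occurring attribute
pct_modal = 100 * sexes.count(modal_sex) / len(sexes)

print("Mean age:", mean_age)
print("Median age:", median_age)
print(f"Mode: {modal_sex} ({pct_modal:.0f}% of respondents)")
```

Here the single 80-year-old pulls the mean up to 31, while the median stays at 23.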
Cross Tabs
• Often used with bivariate data
• Convention usually places
• “independent variables” across the top in columns
• “dependent variables” in rows below
Coding and data entry options
• Transfer sheets are special forms ruled off in 80 columns
• Edge coding involves recording code numbers in the margins of questionnaires
• Direct data entry involves entering data directly into the computer, eliminating transfer sheets
• Data entry by interviewer (CATI)
• Optical scan sheets

Coding
• What is it?
• It is the assignment of numerical values to information or responses gathered by a research instrument
• Codebook: describes the locations of variables and lists the codes assigned to the attributes of the variables
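One way to picture a codebook is as a lookup table. The sketch below is an assumption-laden illustration: the variables, column positions, code values, the missing-data code 9, and the helper `code_response` are all hypothetical.

```python
# Hypothetical codebook: variable name -> column position and value codes.
CODEBOOK = {
    "sex": {"column": 1, "codes": {"male": 1, "female": 2}},
    "recycles": {"column": 2, "codes": {"yes": 1, "no": 2}},
}
MISSING = 9  # assumed code for blank or unrecognized answers


def code_response(variable, answer):
    """Return the numeric code for a raw answer, or MISSING if not listed."""
    codes = CODEBOOK[variable]["codes"]
    key = answer.strip().lower() if isinstance(answer, str) else answer
    return codes.get(key, MISSING)


# One raw case as written on a questionnaire; the second item was left blank.
raw_case = {"sex": "Female", "recycles": ""}
coded_case = {var: code_response(var, ans) for var, ans in raw_case.items()}
print(coded_case)
```

Keeping the codes in one structure means the same table can drive both data entry and later possible-code cleaning.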
Coding
Data Management Process
• Concerned with the process by which raw data gathered by some instrument are converted into numbers for analysis purposes
• Collect information with a data-gathering instrument
• Use the codebook to transfer this information to a transfer sheet or code sheet (optional)
• Create a data file from the information on the code sheet by entering data from a computer keyboard
• Check/clean up the data file for accuracy
• Data cleaning done by
• Computer edit programs
• Examining distributions
• Contingency cleaning
• What about open-ended items?
• Read through the responses and create a preliminary code based on them
• If more than 10% of responses fall into the "other" category, the code needs to be revised to include many of these responses
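The 10% rule for open-ended items can be checked mechanically once a preliminary code exists. The sketch below assumes a simple keyword-based code; the keywords, the code 8 for "other", and the sample responses are hypothetical, and real open-ended coding is normally done by a human reader.

```python
from collections import Counter

# Hypothetical preliminary code for an open-ended item, built from keywords.
PRELIMINARY_CODE = {"cost": 1, "time": 2}
OTHER = 8  # assumed code for responses that fit no category


def code_open_ended(text):
    """Assign the first matching category code, else OTHER."""
    lowered = text.lower()
    for keyword, code in PRELIMINARY_CODE.items():
        if keyword in lowered:
            return code
    return OTHER


responses = ["Cost too much", "No time", "Too far away", "Time and cost", "Forgot"]
counts = Counter(code_open_ended(r) for r in responses)

# Revise the code if more than 10% of responses land in "other".
share_other = counts[OTHER] / len(responses)
needs_revision = share_other > 0.10

print(counts)
print(f"Share coded 'other': {share_other:.0%}; revise code: {needs_revision}")
```

In this example two of five responses fall into "other", so new categories (for instance distance or forgetting) would be added and the responses recoded.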
Elementary Quantitative Analyses
• To understand the meaning of univariate, bivariate, and multivariate analysis
• To become familiar with the meaning of several univariate and bivariate statistics
Analysis Strategies
• Why do we have to have them?
• People who read our ‘research’ are interested in the highlights
• Should try to communicate findings in an understandable and ‘painless’ fashion
Three types of analysis
• Univariate analysis
• the examination of the distribution of cases on only
one variable at a time (e.g., college graduation)
• Bivariate analysis
• the examination of two variables simultaneously
(e.g., the relation between gender and college
graduation)
• Multivariate analysis
• the examination of more than two variables
simultaneously (e.g., the relationship between
gender, race, and college graduation)
“Purpose”
• Univariate analysis
• Purpose: description
• Bivariate analysis
• Purpose: determining the empirical relationship
between the two variables
• Multivariate analysis
• Purpose: determining the empirical relationship
among the variables
Types of Statistics
• Techniques that summarize and describe characteristics of a group or make comparisons of characteristics between groups are known as descriptive statistics.
• Inferential statistics are used to make generalizations or inferences about a population based on findings from a sample.
• The choice of a type of analysis is based on the evaluation questions, the type of data collected, and the audience who will receive the results.
Univariate Analysis
• Involves examination of the distribution of cases on only ONE variable at a time
• Frequency distributions are listings of the number of cases in each attribute of a variable
• Ungrouped frequency distribution
• Grouped frequency distribution
• Proportions express the number of cases of the criterion variable as part of the total population: frequency of criterion variable divided by N
• Percentages are simply 100 × proportion
• Or [100 × (frequency of criterion variable divided by N)]
• Rates make comparisons more meaningful by controlling for population differences
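The univariate summaries above follow directly from the counts. The sketch below is illustrative only: the degree data, the two towns, and the helper `rate_per_1000` are hypothetical.

```python
from collections import Counter

# Hypothetical variable: highest degree reported by ten respondents.
degrees = ["HS", "BA", "HS", "BA", "MA", "HS", "BA", "HS", "HS", "BA"]
n = len(degrees)

frequencies = Counter(degrees)                               # cases per attribute
proportions = {k: f / n for k, f in frequencies.items()}     # frequency / N
percentages = {k: 100 * p for k, p in proportions.items()}   # 100 x proportion


def rate_per_1000(events, population):
    """Events per 1,000 people, so populations of different size compare."""
    return 1000 * events / population


# Hypothetical towns: B has more events in total but the lower rate.
rate_a = rate_per_1000(50, 10_000)
rate_b = rate_per_1000(120, 60_000)

for attribute, freq in frequencies.most_common():
    print(f"{attribute}: n={freq}, {percentages[attribute]:.0f}%")
print("Rate per 1,000, town A:", rate_a, "town B:", rate_b)
```

The rate comparison shows the point of the last bullet: raw counts favor town B, but once population size is controlled, town A has the higher rate.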
Measures of Central Tendency
• Measures of central tendency summarize the typical or central value of a distribution
• Mode reflects the attribute with the greatest frequency
• Median reflects the attribute that cuts the distribution in half
• Mean reflects the average: sum of attributes divided by number of cases
Measures of Dispersion
• Measures of dispersion reflect the spread or variability of a distribution
• Range is the difference between the largest and smallest scores: high – low
• Variance is the average of the squared differences between each observation and the mean
• Standard deviation is the square root of the variance
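A short worked example of the three dispersion measures, using hypothetical scores. It assumes the definition given above, where the variance is the plain average of squared deviations (dividing by N), which corresponds to `statistics.pvariance`.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical test scores

score_range = max(scores) - min(scores)   # high minus low
variance = statistics.pvariance(scores)   # average squared deviation from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print("Range:", score_range)
print("Variance:", variance)
print("Standard deviation:", std_dev)
```

When the data are a sample used to estimate a population value, many texts divide by N - 1 instead; `statistics.variance` and `statistics.stdev` implement that version.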
Types of Variables
• Continuous: increases steadily in tiny fractions
• Discrete: jumps from category to category

Subgroup Comparisons
• Somewhere between univariate and bivariate analysis are subgroup comparisons
• Present descriptive univariate data for each of several subgroups
• Ratios: compare the number of cases in one category with the number in another
Bivariate Analysis
• Bivariate analysis focuses on the relationship between two variables
Contingency Tables
• Format: attributes of the independent variable are used as column headings and attributes of the dependent variable are used as row headings
• Guidelines for presenting and interpreting contingency tables
• Contents of the table described in the title
• Attributes of each variable clearly described
• Base on which percentages are computed should be shown
• Norm is to percentage down and compare across
• Table should indicate the number of cases omitted from the analysis
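The "percentage down, compare across" convention can be sketched as follows. The cases are hypothetical and reuse the gender and college graduation example from earlier slides; gender is treated as the independent (column) variable.

```python
from collections import Counter

# Hypothetical cases: (independent variable, dependent variable).
cases = [
    ("men", "graduated"), ("men", "graduated"),
    ("men", "did not graduate"), ("men", "did not graduate"),
    ("women", "graduated"), ("women", "graduated"),
    ("women", "graduated"), ("women", "did not graduate"),
]

cell_counts = Counter(cases)
column_totals = Counter(iv for iv, _ in cases)  # base N for each column
columns = sorted(column_totals)
rows = sorted({dv for _, dv in cases})

# Percentage down: each cell as a percentage of its column total.
column_pct = {
    (iv, dv): 100 * cell_counts[(iv, dv)] / column_totals[iv]
    for iv in columns
    for dv in rows
}

# Title describes the contents; base N is shown beneath the table.
print("College graduation by gender (column percentages)")
for dv in rows:
    cells = "  ".join(f"{iv}: {column_pct[(iv, dv)]:.0f}%" for iv in columns)
    print(f"{dv:<18}{cells}")
print("Base N:", dict(column_totals))
```

Reading across the "graduated" row then compares the subgroups directly: 50% of the men versus 75% of the women in this made-up data.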
Multivariate Analysis
• Multivariate analysis allows the separate and combined effects of the independent variables to be examined
Editing Data
Data have to be edited, especially when they relate to responses to open-ended
questions in interviews and questionnaires, or to unstructured observations. In other
words, information that may have been noted down by the interviewer, observer, or
researcher in a hurry must be clearly deciphered so that it may be coded
systematically in its entirety. Lack of clarity at this stage will result in confusion later.
Handling Blank Responses
Answers may have been left blank because the respondent did not understand the
question, did not know the answer, was not willing to answer, or was simply indifferent
to the need to respond to the entire questionnaire. In the last situation, the respondent
is likely to have left many of the items blank. If a substantial number of questions—say,
25% of the items in the questionnaire—have been left unanswered, it may be a good
idea to throw out the questionnaire and not include it in the data set for analysis.
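The screening rule described above can be expressed as a simple filter. This is a sketch under stated assumptions: the 25% cutoff is the example figure from the text, and the respondent ids, answers, and the helper `blank_share` are hypothetical.

```python
BLANK_THRESHOLD = 0.25  # share of unanswered items, the text's example figure


def blank_share(answers):
    """Fraction of items left unanswered (None or empty string)."""
    blanks = sum(1 for a in answers if a is None or a == "")
    return blanks / len(answers)


# Hypothetical questionnaires, each with eight items.
questionnaires = {
    "R001": [1, 2, 2, 1, 3, 4, 1, 2],
    "R002": [1, None, 2, 1, 3, 4, 1, 2],         # 1 of 8 blank: keep
    "R003": [None, None, 2, "", 3, None, 1, 2],  # 4 of 8 blank: drop
}

kept = {
    rid: answers
    for rid, answers in questionnaires.items()
    if blank_share(answers) < BLANK_THRESHOLD
}
dropped = sorted(set(questionnaires) - set(kept))

print("Kept:", sorted(kept))
print("Dropped:", dropped)
```

The number of questionnaires dropped this way should be reported alongside the results.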
Entering Data
If questionnaire data are not collected on scanner answer sheets,
which can be directly entered into the computer as a data file, the
raw data will have to be manually keyed into the computer. Raw
data can be entered through any software program. For instance,
the SPSS Data Editor, which looks like a spreadsheet, can be used to
enter, edit, and view the contents of the data file. Each row of the editor
represents a case, and each column represents a variable. All
missing values will appear with a period (dot) in the cell. It is
possible to add, change, or delete values easily after the data have
been entered.
Table 2.1 consists of data, or scores, for the number of psychology courses taken by ten students,
five men and five women.
DATA ANALYSIS
A frequency distribution of the nominal variables of interest should
be obtained. Visual displays thereof through histograms/bar charts,
and so on, can also be provided through programs that generate
charts. In addition to the frequency distributions and the means and
standard deviations, it is good to know how the dependent and
independent variables in the study are related to each other. For
this purpose, an intercorrelation matrix of these variables should
also be obtained.
It is always prudent to obtain (1) the frequency distributions for the
demographic variables, (2) the mean, standard deviation, range,
and variance on the other dependent and independent variables,
and (3) an intercorrelation matrix of the variables, irrespective of
whether or not the hypotheses are directly related to these analyses.
These statistics give a feel for the data. In other words, examination of
Types of Measurement    Types of Descriptive Analysis
Nominal                 Frequency table, proportion (percentage), mode
Ordinal                 Median, quartiles, percentiles, rank-order correlation
Interval                Arithmetic mean, correlation coefficient
Ratio                   Index numbers, geometric mean, harmonic mean
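The intercorrelation matrix recommended in the Data Analysis section can be sketched for interval-level variables as below. The helper `pearson_r` computes the Pearson correlation coefficient by hand; the three variables and their values are hypothetical, and a statistics package would normally be used for real data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical interval-level variables for five cases.
data = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 80],
    "absences": [9, 7, 6, 4, 1],
}

names = list(data)
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

# Print one row per variable; the diagonal is each variable with itself.
for a in names:
    print(f"{a:<14}" + "  ".join(f"{matrix[a][b]:6.2f}" for b in names))
```

Each off-diagonal entry shows how strongly two variables move together, which gives the "feel for the data" the text mentions before any hypothesis is tested.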
