
Ph.D. in Nursing

RESEARCH METHODS IN NURSING

Module No: 7 DATA PREPARATION

Name of the subtopic


7.3 Data Verification and Qualitative vs.
Quantitative Data Analysis Techniques
Faculty Name
Date:
Subject Code:
School of Nursing
Learning Objectives
 At the end of this module, the students will be able to:
 Understand the meaning of data verification
 List the errors in data
 Explain the steps in data verification
 Describe the importance of data cleaning and maintaining
consistency
 Discuss the data verification procedures using WinDEM
 Prepare data for analysis
 Select a data analysis strategy
 Explain the quantitative and qualitative analytical techniques

2
List of contents
 Introduction
 Meaning of data verification
 Errors in data
 Steps in data verification
 Importance of data cleaning and maintaining consistency
 Data verification procedures using WinDEM
 Getting ready for analysis
 Selecting a Data Analysis Strategy
 Quantitative and qualitative analytical techniques
 Summary
 References
3
Introduction
Nursing science is the body of knowledge that supports
evidence-based practice.
After data entry and data verification have been completed, the
data are often not yet in the best format for use in data analyses.
In order to manipulate, analyze and report the information
collected in a convenient and efficient way, the data need to be
organized in a database system. Such a database system is a
structured aggregation of data elements which can be related and
linked to each other through specified relationships. The data
elements can then be accessed through specified criteria and
through a set of transaction operations, which are usually
implemented through a data-retrieval language.
4
Meaning of Data verification
 A critical step in the management of survey data is the
verification of the data. It must be ensured that the data are
consistent, conform to the definitions in the codebook, and
are ready for analytical use.

5
Causes of data errors
 Whenever data are collected, errors are almost inevitably
introduced from various sources. Substantial delays occur when
these errors are not anticipated and safeguards are not built into
the procedures.
 Errors in the data may be caused:
 (a) by faulty preparation of the field monitoring and survey
tracking instruments,
 (b) by the assignment of incorrect identifications to the
respondents,
 (c) during the field administration,
 (d) during the preparation of the instruments including the
coding of the responses, and
 (e) during transcription of the data.
6
Causes of data errors-contd
 Questions may have been skipped or answered in a way not
intended because of misunderstandings during translation and/or
ambiguities in the question. Such deviations can result in:
 (a) variables which have been skipped but which should have
been included; or, variables which have been included when
they should not have been (e.g. when they were misprinted);
(b) incorrectly coded variables;
 (c) variables which have a content different from that specified
by the codebook.

7
Causes of data errors-contd

 The amount of work involved in resolving these problems,
often called "data cleaning", can be greatly reduced by using
well-designed instruments, qualified field administration and
coding personnel, and appropriate transcription mechanisms.

8
Data verification steps
 The steps that must be undertaken to verify the data are
implied in the quality standards that have been defined for the
corresponding survey. Procedures must be implemented for
checking invalid, incorrect and inconsistent data, which may
range from simple deterministic univariate range checks to
multivariate contingency tests between different variables and
different respondents. (A minimal range-check sketch follows.)

9
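A minimal Python sketch of such a univariate range check, assuming a
pandas DataFrame and a codebook-style dictionary of valid ranges; the
variable names and ranges below are invented for illustration.

import pandas as pd

# Hypothetical codebook: variable -> (minimum, maximum) valid values
codebook = {"AGE": (18, 99), "PAIN_VAS": (0, 10)}

df = pd.DataFrame({"AGE": [34, 150, 27], "PAIN_VAS": [3, 8, 12]})

for var, (low, high) in codebook.items():
    # Flag every record whose value falls outside the valid range
    invalid = df[(df[var] < low) | (df[var] > high)]
    if not invalid.empty:
        print(f"{var}: {len(invalid)} value(s) outside [{low}, {high}]")

Values flagged this way would then be inspected and either corrected or
recoded to a special missing code.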
Data verification steps
 The criteria on which these checks are based depend, on the
one hand, on variable type (i.e. different checks may apply to
data variables, control variables, and identification variables)
and, on the other hand, on the manner and sequence in which
questions are asked.
 For some questions a certain number of responses are required,
or responses must be given in a special way, due to a
dependency or logical relationship between questions.

11
Data verification steps
 Depending on the data collection procedures used, it must be
defined when and at what stage data verification steps are
implemented.
 For example, some range validation checks may be applied
during data entry, whereas more complex checks for the
internal consistency of data or for outliers may be more
appropriately applied after the completion of data entry.
Problems that have been detected through verification
procedures need to be resolved efficiently and quickly.

12
Data verification steps-(validation check)

Substruction: Operational System

Pain Intensity
 Instrument: VAS, 10 cm scale (low to high pain)
 Scale: continuous or discrete?

Functional Status
 Instrument: 1-5 Likert scale, 1 = low & 5 = high function
 Scale: continuous or discrete?
13
Data verification steps-(validation
check)
Scaling
Discrete: non-parametric tests (e.g. Chi-square)
 Nominal: gender
 Ordinal: low, medium, high income
Continuous: parametric tests (t or F tests)
 Interval: Likert scale, 1-5 functionality
 Ratio: money, age, blood pressure
(A sketch of matching the test to the scale follows.)

14
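A small sketch, using scipy, of how the measurement scale drives the
choice of test: a chi-square test for discrete (nominal) counts and a
t test for continuous values. All numbers are invented for illustration.

from scipy import stats

# Nominal data, e.g. gender by outcome: a hypothetical 2x2 table
observed = [[30, 20], [25, 25]]
chi2, p_chi, dof, expected = stats.chi2_contingency(observed)

# Continuous (ratio) data, e.g. blood pressure in two groups
group_a = [120, 125, 130, 118, 122]
group_b = [135, 140, 128, 138, 133]
t_stat, p_t = stats.ttest_ind(group_a, group_b)

print(f"chi-square p = {p_chi:.3f}, t-test p = {p_t:.3f}")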
Rigor in Quantitative Research

 Theoretical grounding: axioms & postulates;
substruction; validity of hypothesized relationships
 Design validity (internal & external) of the research design;
instrument validity and reliability
 Statistical assumptions met (scaling, normal curve, linear
relationship, etc.)

(Note: Polit & Beck: reliability, validity, generalizability,
objectivity)

15
Data verification steps-(validation
check)

Design Validity
 Statistical conclusion validity
 Construct validity of Cause & Effect (X & Y)
 Internal validity
 External validity

16
Data verification steps-(validation
check)

Design Validity

 Statistical Conclusion Validity (rxy: is the observed X-Y relationship real?)
 Type I error (alpha = 0.05)
 Type II error (beta); Power = 1 - beta; inadequate power,
i.e. low sample size, undermines conclusions
 Reliability of measures

Can you trust the statistical findings? (A brief power sketch follows.)
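A sketch of the Power = 1 - beta relationship using the statsmodels
power module; the effect size of 0.5 and the group size of 20 are
assumed values, not drawn from any particular study.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed for 80% power at alpha = 0.05
n_needed = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)

# Power actually achieved with only 20 subjects per group
achieved = analysis.power(effect_size=0.5, nobs1=20, alpha=0.05)

print(f"n per group for 80% power: {n_needed:.0f}")
print(f"power achieved at n = 20: {achieved:.2f}")

With too few subjects, power falls and the risk of a Type II error
rises, which is exactly the threat to statistical conclusion validity
described above.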


Data verification steps-(validation
check)

Design Validity

 Construct Validity of Putative Cause & Effect (X → Y?)
 Theoretical basis linking constructs and concepts
(substruction)
 Outcomes sensitive to nursing care
 Link intervention with outcome theoretically

Is there any theoretical rationale for why X and Y should be
related?
Data verification steps-(validation
check)

Design Validity

Internal Validity
 Threat of history (intervening event)
 Threat of maturation (developmental change)
 Threat of testing (instrument causes an effect)
 Threat of instrumentation (reliability of measure)
 Threat of mortality (subject drop out)
 Threat of selection bias (poor selection of subjects)

Are any Z variables causing the observed changes in Y?


Data verification steps-(validation
check)

Design Validity

External Validity
 Threat of low generalizability to people, places, & time

 Can we generalize to others?


Data verification steps-(validation
check)

Building Knowledge

 The goal is to have confidence in our descriptive, correlational, and
causal data.
 Rigor means following the required techniques and strategies
for increasing our trust and confidence in the research findings.
Data verification -contd

 Some problems can, under certain assumptions, be resolved
automatically on the basis of cross-checks in the datafiles.
Other problems will require further inspection and manual
resolution. Where problems cannot be resolved, the problematic
data values must be recoded to special missing codes.
 The criteria on which the checking is based depend, on the
one hand, on the type of variables used to code the
information (for example, different criteria apply to data
variables,

22
Data verification -contd

 identification variables, control variables, filter variables, and
dependent variables) and, on the other hand, on the way and
sequence in which questions were asked (for example, for
some questions a certain number of responses are required, or
responses must be given in a special way, or there is a
dependency or logical relationship between certain
questions).

23
Data verification steps

 Usually, data verification is undertaken through a series of
steps which, for each survey, need to be established and
sequenced in accordance with the quality requirements, the
type of data collected, and the field operations and data entry
procedures applied.

24
Data verification steps-contd
Common data verification steps are:
 • the verification of returned instruments for completeness and
correctness;
 • the verification of identification codes against field
monitoring and survey tracking instruments;
 • a check for the file integrity, in particular, the verification of
the structure of the datafiles against the specifications in the
codebook;

25
Data verification steps-contd
 the verification of the identification codes for internal
consistency;
 • the verification of the data variables for each student or
teacher against the validation criteria specified in the
codebook;
 • the verification of the data variables for each student and
teacher against certain control variables in the datafiles;

26
Data verification steps-contd

 the verification of the data obtained for each respondent for
internal consistency (for example, where questions were
coded through split variables, the answers to these can be
cross-validated);
 • the cross-validation of related data between respondents;
 • the verification of linkages between related datafiles,
especially in the case of hierarchically structured data; and
 • the verification of the handling of missing data.

27
Data verification steps-contd
 1. Verification of file integrity
 As a first step it needs to be ensured that the overall structure
of the datafiles conforms to the specifications in the codebook.
For example, if raw datafiles are used, then each record should
have the same length in correspondence with the codebook.
Often it is useful to introduce column-control variables into the
codebook at regular intervals, which the coder should code with
blanks.

28
Data verification steps-contd
Special recodings
 Sometimes it is necessary to recode certain variables before
they can be used in the data analyses. Examples of such
situations are:
 • The sequence of questions or items has been changed for
some reason and is no longer in accordance with the
codebook;
 • A question or item may have been asked of some respondents
in a different way or format than was intended;
 • A question or item has not been asked of some respondents,
but the missing information could be derived from other
variables.

29
Data verification steps-contd

 Value validation
 Background questions and test items to which a fixed set of
codes rather than open-ended values applies need to be
checked against the range validation criteria defined in the
codebook. Variables with open-ended values (e.g. "Student
age") need to be checked against theoretical ranges.

30
Data verification steps-contd
Treatment of duplicate identification codes
 Each datafile should be checked for duplicate identification
codes. It is often useful to distinguish between the
following two cases (a sketch follows):
 • Respondents who have identical identification codes but
different values for a number of key data variables. These
respondents have probably been assigned invalid identification
codes.
 • Respondents who have identical identification codes and, at
the same time, identical data values for the key data
variables. These respondents are most likely duplicate entries,
in which case the second occurrence of the respondent's data
should be removed from the datafile.

31
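A minimal pandas sketch (not WinDEM itself) distinguishing the two
duplicate-ID cases above; the ID values and key data variables are
invented for illustration.

import pandas as pd

df = pd.DataFrame({"ID": [1, 1, 2, 2],
                   "AGE": [30, 30, 41, 25],
                   "SEX": ["F", "F", "M", "M"]})
key_vars = ["AGE", "SEX"]  # hypothetical key data variables

# Same ID and same key values -> true duplicate, drop the second
true_dupes = df.duplicated(subset=["ID"] + key_vars, keep="first")
df_clean = df[~true_dupes]

# Same ID but different key values -> probably an invalid ID,
# to be resolved manually against the tracking instruments
id_clashes = df_clean[df_clean.duplicated(subset=["ID"], keep=False)]
print(id_clashes)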
Data verification steps-contd
Internal validation of a hierarchical identification system
 If a hierarchical identification system is used for identifying
respondents at different levels, then the structure of this system
can be verified for internal consistency. The number of errors
in identification codes, which are often a serious threat to the
use of the data, can thus be dramatically reduced. Often
inconsistencies can then be resolved automatically or even
avoided during the entry of data. (An illustrative sketch follows.)

32
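An illustrative sketch of such an internal-consistency check, assuming
a purely hypothetical scheme in which a six-digit student ID is
composed of a two-digit school code, a two-digit class code, and a
two-digit student number.

student_ids = ["010203", "010204", "020101", "01X205"]

for sid in student_ids:
    if len(sid) != 6 or not sid.isdigit():
        print(f"Malformed ID: {sid}")
        continue
    school, cls, num = sid[:2], sid[2:4], sid[4:6]
    # Each component could now be checked against the field monitoring
    # and survey tracking instruments for that school and class.
    print(f"ID {sid}: school {school}, class {cls}, student {num}")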
Data verification steps-contd

Verification of the linkages between datafiles
 Often data are collected at different levels; for example, data
are collected for students and for the teachers who teach the
classes in which the students are enrolled. It is then important
to verify the linkages between such levels of data collection.

33
Data verification steps-contd
 There may be cases where, for a teacher in the teacher datafile,
there are no students linked to him or her in the student datafile
(that is, for a certain Teacher ID in the teacher datafile there is no
matching Teacher ID in the student datafile);
 • There may be cases in the student datafile which do not have a
match in the teacher datafile even though they are associated
with a Teacher ID; and
 • There may be cases where the Class IDs of students who
are linked to a teacher differ from the Class ID of that
teacher in the teacher datafile. (A linkage-check sketch follows.)

34
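A hedged pandas sketch of these linkage checks, using an outer merge
with an indicator column; the file contents below are invented.

import pandas as pd

teachers = pd.DataFrame({"TEACHER_ID": [10, 11], "CLASS_ID": [1, 2]})
students = pd.DataFrame({"STUDENT_ID": [100, 101, 102],
                         "TEACHER_ID": [10, 12, 11],
                         "CLASS_ID": [1, 3, 9]})

merged = students.merge(teachers, on="TEACHER_ID", how="outer",
                        indicator=True, suffixes=("_stu", "_tch"))

# Students whose Teacher ID has no match in the teacher datafile
print(merged[merged["_merge"] == "left_only"])

# Teachers with no students linked to them in the student datafile
print(merged[merged["_merge"] == "right_only"])

# Linked records whose Class IDs disagree between the two files
linked = merged[merged["_merge"] == "both"]
print(linked[linked["CLASS_ID_stu"] != linked["CLASS_ID_tch"]])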
Data verification steps-contd

 Verification of participation indicator variables against data
variables. For example, in the first situation we would base the
student score only on the answers given in the first testing
session and exclude the items in the second testing session
from scoring, whereas in the second situation we would
score all items in the second testing session as wrong.
 To allow this to be verified, the datafile should, besides the
variables with the questions and test items, also contain
information about the participation status of the respondents in
the different

35
Data verification steps-contd

 testing sessions. It is often useful to group the variables in the
codebook into blocks. Each of these blocks can then begin with
a variable whose code indicates the participation of the
student in the respective testing session. It can then be verified
whether the participation indicator variables match the data
in the corresponding data variables.

36
Data verification steps-contd

 Verification of exclusions of respondents
 Checking for inconsistencies in the data

37
Data verification procedures using
WinDEM

 WinDEM offers some simple data verification procedures. All
of these procedures are found under the "Verify" menu. The
following section describes some of the fundamental ones:

38
Data verification procedures using
WinDEM
1. Unique ID check
 There must be only one record within a file for each unit of
analysis surveyed. This verification procedure checks whether
each record has been allocated a unique identification code.
2. Column check
 When a series of similar variables exists in a file, it is possible
that the data enterer skips a variable or enters a variable twice, and
consequently a column shift occurs. This can be avoided if you
introduce variables into the datafile at regular positions in the
codebook, into which the data entry personnel must enter a
blank value.

39
Data verification procedures using
WinDEM-contd

2. Column check-contd
 In order to be recognized by the automatic checking routines
of the WinDEM program, the names of these variables must
have the prefix "CHECK". A column shift should not occur if
the data enterers followed these directions for entering the blank
values. You can also verify that data entry proceeded correctly
by looking at the "Table entry" from the "View" menu.
(An illustrative sketch of this idea follows.)

40
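This is not WinDEM code, but the same idea can be sketched in Python:
any variable whose name carries the "CHECK" prefix must contain only
blank values, otherwise a column shift has probably occurred.

import pandas as pd

df = pd.DataFrame({"Q1": [1, 2], "CHECK1": ["", "3"], "Q2": [4, 5]})

for col in df.columns:
    if col.startswith("CHECK"):
        # Any non-blank value in a CHECK column signals a likely shift
        bad = df[df[col].astype(str).str.strip() != ""]
        if not bad.empty:
            print(f"Possible column shift: {col} is non-blank in rows "
                  f"{list(bad.index)}")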
Data verification procedures using
WinDEM-contd
3. Validation check
 As mentioned before, WinDEM ensures that values are
within the range specified in the structure file unless the data
enterer explicitly confirms the out-of-range values entered.
This validation check will show all the variables of all
the cases that have been "confirmed" to contain out-of-range
values. This can be especially useful when many data enterers
are involved in the survey study.

41
Data verification procedures using
WinDEM-contd

4. Merge check
 WinDEM allows you to check the consistency between
variables. This check detects records in a datafile that do not
have matches in a related datafile for a higher level of data
aggregation.

42
Data verification procedures using
WinDEM-contd

 Using “Merge check” from the “Verify” menu, you can select
the variables (or variable combinations) by which the records
in the selected data file are matched against the records in the
higher-level aggregated data file. The software will ask you to
specify the datafile against which to check the merge of the
current datafile in the “File Open Dialog”.
 The program will notify you if errors are found, and the
software will ask whether you want to open the data verification
report.

43
Data verification procedures using
WinDEM-contd

5. Double coding check
 In order to produce high-quality data, it is sometimes
recommended to enter the data on two different computers
(requiring two different data enterers). This allows you to
examine whether the two files have exactly the same structure
and the same values on all records. In order to check this, you
will have to have the two files under different names, together
with the two corresponding codebooks, on one computer.
(A comparison sketch follows.)

44
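A sketch of such a double-entry comparison in pandas, assuming the two
independently keyed files share the same structure and an ID variable;
the file names are hypothetical.

import pandas as pd

entry1 = pd.read_csv("entry1.csv").set_index("ID").sort_index()
entry2 = pd.read_csv("entry2.csv").set_index("ID").sort_index()

# DataFrame.compare lists every cell where the two entries disagree,
# pointing the data enterers to the records that need rechecking
differences = entry1.compare(entry2)
print(differences if not differences.empty else "Files match exactly.")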
GETTING DATA READY FOR
ANALYSIS

 After data are obtained through questionnaires, interviews,
observation, or secondary sources, they need to be
edited. The blank responses, if any, have to be handled in some
way, the data coded, and a categorization scheme set
up. The data will then have to be keyed in, and a software
program used to analyze them. (A minimal coding sketch follows.)

45
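A minimal sketch of editing blank responses and applying a coding
scheme; the column names and the missing-data codes (99 and 9) are
assumptions for illustration only.

import pandas as pd

df = pd.DataFrame({"Q1": [1, None, 3], "Q2": ["yes", "no", None]})

df["Q1"] = df["Q1"].fillna(99)         # numeric missing-data code
df["Q2"] = df["Q2"].fillna("missing")  # label the blank response first
df["Q2"] = df["Q2"].map({"yes": 1, "no": 2, "missing": 9})  # coding scheme

print(df)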
Selecting a Data Analysis Strategy
Earlier Steps (1, 2, 3) of the Marketing Research Process
↓
Known Characteristics of Data
↓
Properties of Statistical Techniques
↓
Background & Philosophy of the Researcher
↓
Data Analysis Strategy


Quantitative techniques of data analysis
A Classification of Univariate Techniques

Univariate Techniques
 Metric Data
  One Sample: t test; Z test
  Two or More Samples
   Independent: two-group t test; Z test; one-way ANOVA
   Related: paired t test
 Non-numeric Data
  One Sample: frequency; Chi-square; K-S; runs; binomial
  Two or More Samples
   Independent: Chi-square; Mann-Whitney; median; K-S; K-W ANOVA
   Related: sign; Wilcoxon; McNemar; Chi-square
A Classification of Multivariate Techniques

Multivariate Techniques
 Dependence Techniques
  One Dependent Variable: cross-tabulation; analysis of variance
   and covariance; multiple regression; conjoint analysis
  More Than One Dependent Variable: multivariate analysis of
   variance and covariance; canonical correlation; multiple
   discriminant analysis
 Interdependence Techniques
  Variable Interdependence: factor analysis
  Interobject Similarity: cluster analysis; multidimensional scaling
Types of Statistical Analysis Used in Research
 Descriptive Analysis:
 Mean, Mode, Median and Standard deviation.
 Inferential Analysis:
 Hypothesis testing and estimation of true population values.
 Differences Analysis:
 Determination of whether significant differences exist in the
population.
 Associative Analysis:
 Investigation of how two or more variables are related.
 Predictive Analysis:
 Used to enhance the prediction capabilities of the marketing
researcher, e.g. regression analysis.
Understanding Data Via Descriptive Analysis

 Measures of Central Tendency:
 Mode
 The most frequently occurring value in a set of values.
 Median
 The middle value in an ordered set of values.
 Mean
 The arithmetic average of a set of numbers.
Understanding Data Via Descriptive Analysis (cont.)

 Measures of Variability:
 Frequency Distribution:
 The number of times that each different value appears.
 Range:
 The distance between the lowest and the highest
value in an ordered set of values.
 Standard Deviation:
 The degree of variation or diversity in the values,
interpreted relative to a normal bell-shaped distribution.
Understanding the Data Via Descriptive
Statistics (cont.)
 Other Descriptive Measures:
 Measure of Skewness:
 Reveals the degree and direction of asymmetry in a
distribution. A value of 0 indicates a symmetric distribution,
a negative value indicates a distribution with a tail to the left,
and a positive value indicates a distribution with a tail to the right.
 Kurtosis:
 How pointed and peaked a distribution appears. A value of 0
indicates the distribution is bell-shaped, a negative value
indicates a flatter distribution, and a positive value
indicates a distribution more peaked than the bell-shaped
curve. (A combined sketch of these measures follows.)
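A combined sketch of the descriptive measures above, using the Python
standard library and scipy; the data values are invented.

import statistics
from scipy import stats

values = [120, 125, 125, 130, 118, 122, 160]

print("mean:", statistics.mean(values))
print("median:", statistics.median(values))
print("mode:", statistics.mode(values))
print("range:", max(values) - min(values))
print("std dev:", statistics.stdev(values))
print("skewness:", stats.skew(values))      # > 0: tail to the right
print("kurtosis:", stats.kurtosis(values))  # 0 is bell-shaped (excess)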
Other Quantitative techniques of data analysis

Programming techniques
a) Linear programming
b) Inventory control
c) Game theory
d) Network analysis
e) Non-linear programming
Limitation of quantitative technique

 The inherent limitations of mathematical expressions.
 High costs are involved in the use of quantitative techniques.
 Quantitative techniques do not take into consideration
intangible factors, i.e. non-measurable human factors.
 Quantitative techniques are just tools of analysis, not
the complete decision-making process.
Qualitative data analysis techniques

 The qualitative researcher, however, has no system for pre-
coding; therefore a method of identifying and labelling or
coding data needs to be developed that is bespoke to each
research project. This method is called content analysis.
 Content analysis can be used when qualitative data has been
collected through:
 Interviews
 Focus groups
 Observation
 Documentary analysis

56
Qualitative data analysis techniques

 The analysis of qualitative research aims to uncover
and/or understand the big picture, by using the data to
describe the phenomenon and what it means. Both
qualitative and quantitative analysis involve labelling and
coding all of the data so that similarities and differences
can be recognised. Responses from even an unstructured
qualitative interview can be entered into a computer so that
they can be coded, counted and analysed.

57
Qualitative data analysis techniques

 Content analysis is '...a procedure for the categorisation of
verbal or behavioural data, for purposes of classification,
summarisation and tabulation.'
 The content can be analysed on two levels:
 Basic or manifest level: a descriptive account of the
data, i.e. this is what was said, with no comments or theories as
to why or how.
 Higher or latent level of analysis: a more interpretive
analysis that is concerned with the response as well as what
may have been inferred or implied.

58
Qualitative data analysis techniques

 Content analysis involves coding and classifying data, also
referred to as categorising and indexing. The aim of content
analysis is to make sense of the data collected and to highlight
the important messages, features or findings. (A small tallying
sketch follows.)

59
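An illustrative sketch of tallying content-analysis codes once segments
of a transcript have been labelled; the segments and code labels below
are invented.

from collections import Counter

coded_segments = [
    ("I felt ignored by staff", "communication"),
    ("Nobody explained the medication", "communication"),
    ("The ward was very clean", "environment"),
]

code_counts = Counter(code for _, code in coded_segments)
for code, n in code_counts.most_common():
    print(f"{code}: {n} segment(s)")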
Six Steps in Qualitative Data Analysis

1. Assign codes to the notes.

2. Note personal reflections in the margin.

3. Sort and sift the notes to identify similar phrases, patterns, and
relationships.


Six Steps in Qualitative Data Analysis

4. Identify these patterns, similarities and differences.

5. Elaborate a small set of generalizations that cover the

consistencies.

6. Examine those generalizations and form grounded theory.


Grounded Theory Analysis
Strategies
 Grounded theory:
 A process of constructing theory from various data.
 An inductive process of collecting, analyzing and comparing data
systematically.
 Theory is grounded in the data to explain the phenomena.
 The main purpose is to develop theory through understanding
concepts that are related by means of statements of
relationships.
Grounded Theory Analysis
Strategies-contd

 Recursive: the researcher moves back and forth through the data,
analyzing, collecting more data, and analyzing some more until
reaching conclusions.

 An interactional method of theory building by comparing and
analyzing the data.


Grounded Theory Analysis
Strategies-contd

 Three steps in the grounded theory analytic process:

1. Open coding:

Break the data into small parts, compare them for similarities and
differences, and explain the meanings of the data by focusing on
"who, when, where, what, how much, why" (ask questions to
get a clear story).
Grounded Theory Analysis
Strategies-contd

2. Axial coding:

After open coding, make connections (sort) between categories
and confirm or disconfirm your hypotheses.

3. Selective coding:

Select the core category (matching the hypotheses) and explain the
minor categories (against the hypotheses) with additional supporting
data.
 Coding process:

 Open coding

 Axial coding

 Selective coding
Interpretation Issues in
Qualitative Data Analysis

A. Triangulating Data
 Use multiple methods and data sources to support the
strength of interpretations and conclusions.

Ex) semi-structured interviews, consent forms, grounded
theory
Interpretation Issues in
Qualitative Data Analysis-contd

B. Audits
 Questions to examine the data for interpretations and conclusions:

1. Is the sampling appropriate to ground the findings?

2. Are coding strategies applied correctly?

3. Is the categorization process appropriate?

4. Do the results link to the hypotheses? (examine the literature review)

5. Are the negative cases explained? (the minority's voice)


Suggestions

 Four steps of negative case testing

1. Make a rough hypothesis

2. Conduct a thorough search

3. Discard or reformulate the hypothesis

4. Examine all relevant cases


Interpretation Issues in
Qualitative Data Analysis-contd
C. Cultural bias
 Discuss cultural differences with different groups of
participants
 To see whether divergence is based on culturally different
interpretations
Interpretation Issues in
Qualitative Data Analysis-contd
D. Generalization
 Not appropriate for qualitative research
 Two perspectives for generalization

1. Case-to-case translation (transferability):

providing thick description so the findings can apply to another setting

2. Analytic generalization:

generalizing from a particular set of results to a broader theory

Ex) use deviant cases


Summary

 This module explained data verification and the quantitative and
qualitative analysis techniques in detail, including their validity,
strengths and weaknesses. In the next module we will examine data
analyses and parametric tests in detail.

72
References
 Macnee, C. L. (2008). Understanding Nursing Research: Using
Research in Evidence-Based Practice. Lippincott Williams &
Wilkins. ISBN 9780781775588.
 Polit, D. F., et al. (2013). Nursing Research: Principles and
Methods (revised edition). Philadelphia: Lippincott.
 https://researchrundowns.wordpress.com/quantitative-
methods/instrument-validity-reliability/
 http://www.nationaltechcenter.org/index.php/products/at-
research-matters/validity/
 http://ocw.jhsph.edu/courses/hsre/PDFs/HSRE_lect7_weiner.pdf
 http://ccnmtl.columbia.edu/projects/qmss/measurement/validity_
and_reliability.html
Thanks

Next Topic>>
PARAMETRIC
TEST

74
