
NCM 115 • Nursing Research 2 SJMN

Measurement & Assessment of Quantitative Data

• The research that you are undertaking at present is called quantitative research.
• Quantitative researchers gather empirical evidence.

EMPIRICAL EVIDENCE
- evidence that is rooted in objective reality and gathered directly or indirectly through the senses (sight, hearing, taste, touch, or smell)
- is grounded in reality rather than the researcher’s personal belief

• Usually, in a quantitative study, the information gathered is numeric and is a result of some type of formal measurement.
• To study a phenomenon, quantitative researchers attempt to measure information that is analyzed with statistical procedures.
• Although there are accurate measures of PHYSIOLOGIC PHENOMENA (like blood pressure and body temperature), the comparably accurate measures of PSYCHOLOGICAL PHENOMENA (like patient morale, pain, or self-esteem) that have been developed are not yet standardized and widely used.
• Nursing research tends to focus on human beings, and traditional quantitative methods typically focus on a relatively small portion of the human experience (e.g., weight gain, depression, chemical dependency) in a single study.

MEASUREMENT

• Quantitative studies derive data through the measurement of variables.

MEASUREMENT
- involves the assignment of numbers to represent the amount of an attribute present in an object or person, using a specified set of rules
- involves assigning numbers to objects according to rules, rather than haphazardly

• Rules for measuring temperature, weight, blood pressure, and other physical attributes are familiar to us.

VARIABLE (or quantitative characteristic)
- one in which the alternative forms are numerical values, e.g., age, income, height

ATTRIBUTE (or qualitative characteristic)
- one in which the alternative forms are qualitative categories, e.g., sex, occupation, religion

ADVANTAGES OF MEASUREMENT

1. Measurement removes subjectivity and guesswork.
2. Measurement also makes it possible to obtain reasonably precise information.
3. Measurement is a language of communication.

LEVELS OF MEASUREMENT

4 commonly used scales in classifying data:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

NOMINAL MEASUREMENT

• Measurement is at its weakest level when words or other symbols are used to classify a person’s or an object’s characteristic.

NOMINAL MEASUREMENT
- the lowest level of measurement; used primarily for grouping or categorizing data

Sometimes, numbers are assigned to nominal measurements, but these numbers are used only to classify characteristics into categories.

Example: If we classify subjects as males and females, and assign 1 for males and 2 for females, the numbers have no inherent meaning.

• The numbers used in nominal measurement cannot be treated mathematically.
• There is no sense in calculating the average gender, for example, of a group of people.
• The numeric codes assigned in nominal measurement do not convey quantitative information.

Examples: Student number; Social Security No.; PIN

• Nominal measures must have categories that are mutually exclusive and collectively exhaustive.

MUTUALLY EXCLUSIVE
- means that each observed characteristic must be classifiable into one, and only one, category

COLLECTIVELY EXHAUSTIVE
- means that all the possible categories have been considered (including the most-used ‘Others’ category)

ORDINAL MEASUREMENT
- involves sorting objects on the basis of their relative standing on an attribute

• The attributes are ordered or ranked according to some criterion.

Examples: First, second, third;
Small, medium, large;
Strongly agree, agree, somewhat agree, disagree, and strongly disagree

• Ordinal measurement does not, however, tell us anything about how much greater one rank is than another.
• What we know is the order, but not the difference.
• As with nominal scales, the types of mathematical operations permissible with ordinal-level data are limited.
• Averages are usually meaningless with rank-order measures.

APPROPRIATE FOR ANALYZING ORDINAL-LEVEL DATA:
• Frequency counts
• Percentages
• Other statistical procedures

INTERVAL MEASUREMENT
- one in which equivalent distances between the values of the variables or attributes can be assumed to exist
- more informative than ordinal measures, but interval measures do not give information about absolute magnitude

• In an interval scale, numbers can be meaningfully added, subtracted, and averaged. These operations cannot be done with ordinal measures.

RATIO MEASUREMENT
- the highest level of measurement
- has a rational, meaningful zero value

• Because ratio scales have an absolute zero, all arithmetic operations are permissible.
• The four levels of measurement constitute a hierarchy, with ratio scales at the top and nominal measurement at the bottom.
• Moving from a higher to a lower level of measurement results in information loss.
• A HIGHER LEVEL OF MEASUREMENT yields more information and is amenable to more powerful analyses than a lower level.

ERRORS OF MEASUREMENT

• An observed (or obtained) score almost always contains an error of measurement.
• When researchers measure an attribute, they are also measuring attributes that are not of interest.

The most common factors that contribute to errors of measurement are:

1. SITUATION CONTAMINANTS. Scores can be affected by the conditions under which they are produced.
Examples:
A respondent’s awareness of an observer’s presence.
Other environmental factors, such as temperature, lighting, and time of day.

2. TRANSITORY PERSONAL FACTORS.
Example: Temporary personal states like fatigue, hunger, anxiety, or mood can influence a person’s score.

3. RESPONSE-SET BIASES.
Example: Relatively enduring characteristics of respondents, such as social desirability and acquiescence, are potential problems.

4. ADMINISTRATION VARIATIONS.
Example: Interviewers may improvise question wording (especially during translation).
5. INSTRUMENT CLARITY.
Example: Different respondents may interpret the same question differently.

6. ITEM SAMPLING.
Example: The choice of items to be used in the measure.

7. INSTRUMENT FORMAT. Technical characteristics of an instrument can influence measurement.
Example: Open-ended questions may yield different information than closed-ended ones.

• Rules for measuring many other variables for nursing research studies, however, have to be invented.
• The most widely used of these scales is the Likert scale.

LIKERT SCALE
- very useful when you need data on emotions, perceptions, opinions, and belief systems, whereby the responses are lined up on a 3-, 5-, or 7-point scale from Agree to Disagree

• Some bipolar scales might not always be ‘agree/disagree’ but might be ‘true/never true,’ ‘extremely likely/extremely unlikely,’ ‘always/never,’ and so on.

Reliability & Validity of Research Instruments

RESEARCH INSTRUMENTS
- scientific and systematic tools which are designed to help the researcher collect, measure, and analyze data related to the topic of the research
- can include questionnaires, interviews, tests, surveys, scales, or checklists

Two primary criteria for assessing a quantitative research instrument are:
• Reliability
• Validity

RELIABILITY
- the degree of consistency or accuracy with which an instrument measures an attribute
- the extent to which an experiment, test, or any measuring procedure gives the same result on repeated trials, every time
- (or precision) refers to the repeatability of a measure, i.e., the degree of closeness between repeated measurements of the same value
- addresses the question: “If the same thing is measured several times, how close are the measurements to each other?”

• The higher the reliability of an instrument, the lower the amount of error in obtained results.

3 key aspects of reliability:
• Stability
• Internal consistency
• Equivalence

STABILITY (Test-Retest Reliability)
- an aspect of reliability that indicates if a test is stable over time, i.e., that the results do not change over time
- answers the question: “Will the scores be stable over time?”

• A test or measure is administered.
• Some time later the same test or measure is re-administered to the same or a highly similar group.
• One would expect the two sets of scores to be highly correlated, i.e., the reliability coefficient will be high.

INTERNAL CONSISTENCY
- refers to the extent to which all the instrument’s items are measuring the same attribute
- answers the question: “How well does each item measure the content or construct under consideration?”
- an indicator of reliability for a test or measure which is administered once

• An instrument may be said to be internally consistent to the extent that its items measure the same trait.

EQUIVALENCE (Parallel Forms Reliability)
- refers to the accuracy of observing ratings and classifications
- answers the question: “Are the two forms of the test or measure equivalent?”

• If different forms of the same test or measure are administered to the same group, one would expect that the reliability coefficient will be high.

VALIDITY
- the degree to which an instrument measures what it is supposed to measure
- refers to the degree of closeness between a measurement and the true value of what is being measured
- addresses the question: “How close is the measured value to the true value?”

Key aspects of validity:
• Face validity
• Content validity
• Criterion-related validity
• Construct validity

FACE VALIDITY
- refers to whether the instrument appears or looks as though it is measuring the appropriate construct

CONTENT VALIDITY
- concerns the degree to which an instrument has an appropriate sample of items for the construct being measured

CRITERION-RELATED VALIDITY
- involves determining the relationship between an instrument and an external criterion

CONSTRUCT VALIDITY
- an instrument’s adequacy in measuring the focal construct
- addresses the key questions: “What is this instrument really measuring?” “Does it really measure the abstract concept of interest?”

REMEMBER!
• Reliability and validity are not independent qualities of an instrument.
• A measuring device that is unreliable cannot possibly be valid.
• A measure cannot be valid without being reliable.
• An instrument cannot validly measure an attribute if it is inconsistent and inaccurate.

SENSITIVITY AND SPECIFICITY
- criteria that are important in evaluating instruments designed as screening instruments or diagnostic aids
- measures of a test’s ability to correctly classify a person as having a disease or not having a disease

SENSITIVITY
- the ability of an instrument to identify a ‘case’ correctly, that is, to screen or diagnose a condition correctly
- the ability of a test to correctly identify patients with a disease, or true positives

TRUE POSITIVE - the person has the disease and the test is positive

SPECIFICITY
- the instrument’s ability to identify ‘non-cases’ correctly, that is, to screen out those without the condition correctly
- the ability of a test to correctly identify people without the disease
- the rate of yielding ‘true negatives’

SCREENING
- refers to the application of a medical procedure or test to people who, as yet, have no symptoms of a particular disease, for the purpose of determining their likelihood of having the disease
- The screening procedure itself does not diagnose the illness.

Analysis of Likert Scale Data

• The results of a single item answered on the Likert response format were never meant to be analyzed in isolation.
• A common practice of researchers is to ask dozens of unrelated Likert response items in a questionnaire and then analyze each question individually.
• Unfortunately, this is not really acceptable from a statistical standpoint.
• Instead, LIKERT RESPONSE QUESTIONS are designed to be analyzed as a grouped Likert score.
• A good solution when analyzing Likert data is to design a survey containing a number of Likert questions that all point in the same direction.
• The group of Likert questions is then analyzed together.

LIKERT SCORE
- formed by a grouping of these Likert questions with others of a similar topic

• This is the original use of the Likert scale.
• Combined, the items are used to provide a quantitative measure of a character or personality trait.
• Typically, the researcher is only interested in the composite score that represents the character/personality trait.
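The composite-score calculation can be sketched in Python. The tallies, point values, and interpretation bands below are the ones given in the Valentine's Day example in these notes; the function and variable names are illustrative assumptions, not part of the handout.

```python
# Tally of responses per statement from the Valentine's Day example:
# counts for HD, SD, N, SA, HA (worth 1 to 5 points), 50 respondents each.
TALLIES = [
    [9, 12, 0, 14, 15],
    [3, 7, 1, 17, 22],
    [4, 6, 0, 11, 29],
    [13, 12, 1, 10, 14],
    [11, 16, 2, 12, 9],
]
POINTS = [1, 2, 3, 4, 5]

# Interpretation bands from the modified 5-point scale: (lower bound, label).
BANDS = [
    (4.20, "Highly agree"),
    (3.40, "Slightly agree"),
    (2.60, "Neither agree nor disagree"),
    (1.80, "Slightly disagree"),
    (1.00, "Highly disagree"),
]


def weighted_mean(counts, points=POINTS):
    """Sum of (count x points) divided by the number of respondents."""
    return sum(c * p for c, p in zip(counts, points)) / sum(counts)


def interpret(score):
    """Return the label of the band that contains the score."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise ValueError("score is below the lowest band (1.00)")


weighted_means = [weighted_mean(row) for row in TALLIES]
composite_mean = sum(weighted_means) / len(weighted_means)

print([round(m, 2) for m in weighted_means])
print(round(composite_mean, 2), interpret(composite_mean))
```

The composite mean is the plain average of the per-statement weighted means, which matches the "sum of all the weighted means ÷ no. of statements" rule in the example.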
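The sensitivity and specificity definitions given earlier in these notes are usually computed as proportions from screening results. A minimal sketch follows, assuming the standard formulas sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the notes define only the true positive, so the other three cell names and all counts below are assumptions made for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Share of people WITH the disease whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of people WITHOUT the disease whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening results, invented for illustration only.
tp, fn = 45, 5      # 50 people who have the disease
tn, fp = 180, 20    # 200 people who do not have the disease

print(sensitivity(tp, fn))
print(specificity(tn, fp))
```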
LIKERT SCALE DATA
- created and analyzed by calculating a composite score (sum or mean) from the questions or statements

Example: Suppose we wanted to find out how significant Valentine’s Day is to our respondents. We decided to measure the significance using a 5-point Likert scale and assigned points as follows:

Highly disagree (HD) ---------------- 1 point
Slightly disagree (SD) --------------- 2 points
Neither agree nor disagree (N) -- 3 points
Slightly agree (SA) ------------------ 4 points
Highly agree (HA) -------------------- 5 points

The questionnaires were answered by 50 respondents and their responses were tallied, resulting in the following data:

Number of Respondents
HD       SD       N        SA       HA       Total
(1 pt)   (2 pts)  (3 pts)  (4 pts)  (5 pts)
9        12       0        14       15       50
3        7        1        17       22       50
4        6        0        11       29       50
13       12       1        10       14       50
11       16       2        12       9        50

Weighted mean
[(9)(1) + (12)(2) + (0)(3) + (14)(4) + (15)(5)] ÷ 50 = 3.28
[(3)(1) + (7)(2) + (1)(3) + (17)(4) + (22)(5)] ÷ 50 = 3.96
[(4)(1) + (6)(2) + (0)(3) + (11)(4) + (29)(5)] ÷ 50 = 4.10
[(13)(1) + (12)(2) + (1)(3) + (10)(4) + (14)(5)] ÷ 50 = 3.00
[(11)(1) + (16)(2) + (2)(3) + (12)(4) + (9)(5)] ÷ 50 = 2.84

COMPOSITE MEAN
- sum of all the weighted means ÷ no. of statements
  = (3.28 + 3.96 + 4.10 + 3.00 + 2.84) ÷ 5 = 3.44

The next step is to interpret the composite mean based on the Likert scale that was created. However, the Likert scale has to be modified first into:

Interpretation                  Composite Weighted Average
Highly disagree                 1.00 - 1.79
Slightly disagree               1.80 - 2.59
Neither agree nor disagree      2.60 - 3.39
Slightly agree                  3.40 - 4.19
Highly agree                    4.20 - 5.00

Therefore, the composite mean of 3.44 is interpreted as “Slightly agree.”

There are also modified scales for the interpretation of scores from the 3-point, 4-point, and 7-point Likert scales. These are available from the Internet, or may be constructed manually.

Descriptive Data Analysis

DESCRIPTIVE ANALYSIS (of quantitative data)
- involves the production and interpretation of frequencies, tables, graphs, etc., that describe the data

FREQUENCY COUNTS

• From the data in the questionnaire, simple tables can be made with frequency counts for each variable.

FREQUENCY COUNT
- an enumeration of how often a certain measurement or a certain answer to a specific question occurs

Example: [table not preserved in source]

• If numbers are large enough, it is better to calculate the frequency distribution in percentages (relative frequencies).

For instance, (51/144) × 100 ≈ 35% are smokers and (93/144) × 100 ≈ 65% are nonsmokers.

• This makes it easier to compare groups than when only absolute numbers are given.
• In other words, PERCENTAGES standardize the data.

CROSS-TABULATIONS

• Further analysis of the data usually requires the combination of information on two or more variables in order to describe the problem or to arrive at possible explanations for it.
• For this purpose, it is necessary to design cross-tabulations (or ‘cross-tabs’ for short).

Example: [table not preserved in source]

• If a dependent and an independent variable are cross-tabulated, the headings of the dependent variable are usually placed horizontally, and the headings of the independent variable, vertically.

Example: [table not preserved in source]

Some practical hints when constructing tables:

• All tables should have a clear title and clear headings for all rows and columns.
• All tables should have a separate row and a separate column for totals to enable you to check if your totals are the same for all variables and to make further analysis easier.
• All tables related to a certain objective should be numbered and kept together so the work can be easily organized and the writing of the final report will be simplified.

Inferential Statistics

INFERENTIAL STATISTICS
- provide a means for drawing conclusions about a population, given data from a sample
- researchers estimate population parameters from sample statistics

In quantitative research: [figure not preserved in source]

TEST OF HYPOTHESIS
- a statistical procedure for deciding whether to accept or to reject the hypothesis based on the sample observations

• There are many different tests for the many different kinds of data. A way to get started is to understand what kind of data you have.
• Tests of hypothesis answer the questions:
1. Is there a difference between two or more samples, or two or more sets of data?
2. Is there a relationship between one variable and another variable?

The test of hypothesis, however, is not necessary in a research study when:
1. There was no sampling done (i.e., the data gathered are for the whole population)
2. The research is descriptive in nature (e.g., a study to determine the factors affecting a certain phenomenon)

General rule of thumb for sample size:
• At least 20% of the size of the population, or
• At least 30 subjects or observations (or 30 pairs), whichever is higher.

SMALL SAMPLE
- historically, any sample which is composed of fewer than 30 subjects (or pairs)

LARGE SAMPLE
- a sample of 30 or more

• Most quantitative research studies use large samples.
• While each test of hypothesis uses a different procedure for a small and a large sample, what concerns us more in research are the methods for large samples.

Some well-known statistical tests and procedures for research observations are:
1. Student’s t-test
2. Analysis of variance (ANOVA)
3. Pearson product-moment correlation coefficient
4. Chi-square test
5. Mann-Whitney U test
6. Wilcoxon matched-pair signed-rank test
7. Kruskal-Wallis test
8. Spearman rank correlation

Hypothesis Testing Procedure

• The overall process of testing hypotheses is basically the same.

The steps are as follows:
1. State your hypotheses
2. Establish the level of significance
3. Select a test statistic
4. Compute the selected test statistic; calculate the degrees of freedom (if needed)
5. Select a one-tailed or two-tailed test
6. State the test criterion
7. Obtain a table value for the statistical test
8. Compare the test statistic with the table value
9. State your conclusion

STEP 1. STATE YOUR HYPOTHESIS

a. State your null hypothesis (symbolized by H0 and read ‘H sub zero’).

NULL HYPOTHESIS
- states that no difference (or no relationship) exists between research variables, and that any observed difference (or relationship) is due to chance
- a statement of ‘no difference’ or ‘no change’ or ‘no effect’

b. Then, state your alternative hypothesis (symbolized by HA and read ‘H sub A’).

ALTERNATIVE HYPOTHESIS
- a statement of inequality; the opposite of the null hypothesis
- H0 is not proven, but accepted or rejected
- Accepting H0 automatically rejects HA
- Conversely, rejecting H0 automatically accepts HA.

STEP 2. ESTABLISH THE LEVEL OF SIGNIFICANCE

LEVEL OF SIGNIFICANCE (referred to as alpha or α)
- the maximum risk or probability of making a Type I error
- signifies the probability of rejecting a true null hypothesis
- The two most frequently used significance levels are 0.05 and 0.01.

The 0.05 level means that in only 5 out of every 100 samples (or only 5%) would the null hypothesis be rejected when it should have been accepted.

α = 0.05
- is frequently used in research
- sometimes referred to as the ‘default alpha’
Meaning, if the level of significance is not specified, then it is taken and understood to be equal to 0.05.

STEP 3. SELECT A TEST STATISTIC

• Should a parametric or a nonparametric test be used?
• Which level of measurement will be used?
• How many groups are being compared?
• Are you interested in establishing a difference or a relationship?

PARAMETRIC TESTS

Researchers more often use parametric tests, and these are characterized by three attributes:

• They involve the estimation of a parameter (population mean; population standard deviation).
• They require measurements on an interval or ratio scale.
• They assume that the variables are normally distributed in the population.

NONPARAMETRIC TESTS
- do not estimate parameters
- usually applied when data have been measured on a nominal or ordinal scale
- do not assume a normal distribution, and for this reason, they are sometimes called DISTRIBUTION-FREE STATISTICS

• Parametric tests are preferred because they are more powerful; that is, they are more likely to reject an H0 that is not true.

STEP 4. COMPUTE THE SELECTED TEST STATISTIC

• Identify the appropriate computational formula and compute using the traditional longhand method
or
• Use a computer-calculated statistic using the Statistical Package for the Social Sciences (SPSS).

• Calculate the degrees of freedom (if needed)

DEGREES OF FREEDOM (symbolized as df)
- a concept that refers to the number of observations that are free to vary about a parameter

STEP 5. SELECT A ONE-TAILED OR TWO-TAILED TEST

STEP 6. STATE THE TEST CRITERION

STEP 7. OBTAIN A TABLE VALUE FOR THE STATISTICAL TEST

• There are theoretical distributions for all test statistics.
• Researchers have only to select the appropriate table where a critical value can be obtained for a specific df and α.

STEP 8. COMPARE THE TEST STATISTIC WITH THE TABLE VALUE

• For most tests, if the computed test statistic is greater than the table value, the H0 is rejected, and the HA is automatically accepted, meaning the results are statistically significant.
• The computer software SPSS usually summarizes the result of a hypothesis test into one particular value: THE P-VALUE.
• Research reports often state that the results were statistically significant (p < 0.05) or make some similar statement.
• Researchers and statisticians generally agree on the following conventions for interpreting p-values.

p-Value      Interpretation
p > 0.05     Result is not significant; usually indicated by no asterisk or an ‘ns’ superscript to the right of the p-value (ns).
p < 0.05     Result is significant; usually indicated by one asterisk (*).
p < 0.01     Result is highly significant; usually indicated by two asterisks (**).
p < 0.001    Result is very highly significant; usually indicated by 3 asterisks (***).

STEP 9. STATE YOUR CONCLUSION

Was there a difference (or relationship) between the variables?

CONCLUSION
- usually stated in terms of the accepted hypothesis, using such statements as:
• There is evidence to support the claim that…
• There is enough reason to believe that there exists…
• We have sufficient reason to believe that…
• Sample data suggest that…
• Statistical evidence at hand indicates that…
• Statistical evidence suggests that…
• There is sufficient statistical data to support that there is…

t-Test for Independent Groups

PARAMETRIC TESTS

• A common research situation involves comparing two groups of subjects.
• Usually, the arithmetic means of the groups are compared.

TESTING THE DIFFERENCES BETWEEN TWO GROUP MEANS

2 common statistical tests used to test the significant difference between group means:
• Student’s t-test
• Analysis of Variance (ANOVA)

T-TEST FOR INDEPENDENT GROUPS

t-TEST
- the most powerful parametric test for calculating the significant difference between sample means
- reliable not only for large samples, but for small samples as well



For independent groups, the test statistic, t0, is calculated using the general formula:

[formula not preserved in source]

with df = n1 + n2 - 2

where:
x1 = mean of group 1
x2 = mean of group 2
|x1 - x2| = absolute value of the difference between the two means
s1 = standard deviation of group 1 observations
s2 = standard deviation of group 2 observations
n1 = number of subjects in group 1 (sample size of group 1)
n2 = number of subjects in group 2 (sample size of group 2)
n1 and n2 may or may not be equal

Example: Given the gains in weight (in lbs.) of two groups of infants. One group of 10 infants was breast-fed, while the other group of 10 infants was given powdered infant formula.

[data not preserved in source]

a. Perform a test of hypothesis to see whether the two feeding methods were the same at a significance level of 0.05.

Solution:

H0: There is no significant difference between the mean gain in weights of the breast-fed infants and the infants given powdered infant formula. (or, µ1 = µ2)

HA: There is a significant difference between the mean gain in weights of the breast-fed infants and the infants given powdered infant formula. (or, µ1 ≠ µ2)

b. Test at α = 0.05
c. Use a two-tailed test.
d. Reject H0 if the computed t0 > table value of tα.
e. From the table of t-values: tα = 2.101 for α = 0.05 and df = 18

Table of critical t-values at α = 0.05 (two-tailed): [table not preserved in source]

Test statistic:
t0 = 0.9162
df = 10 + 10 - 2 = 18

Since 0.9162 < 2.101, accept H0.

Conclusion: There is no significant difference between the mean gain in weights of the breast-fed infants and the infants given powdered infant formula.
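The general formula for t0 is not reproduced in these notes. One common form that uses exactly the listed symbols and gives df = n1 + n2 - 2 is the pooled-variance t statistic, sketched below. This is an assumption about which formula was intended; the pooled variance s_p^2 is a name introduced here and does not appear in the notes.

```latex
t_0 = \frac{\lvert \bar{x}_1 - \bar{x}_2 \rvert}
           {\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

When n1 = n2, as in the infant example (10 and 10), the denominator reduces to the square root of (s1²/n1 + s2²/n2), so the pooled and unpooled forms give the same t0.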
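The decision step of the worked example can be sketched in Python. The values 0.9162 and 2.101 come from the example above; the pooled-variance formula inside `pooled_t` is an assumption (the notes do not reproduce their formula), and the summary statistics passed to it are hypothetical, since the infants' raw data are not available here.

```python
import math


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Return (t0, df) for two independent groups.

    Assumes the pooled-variance form of the t statistic, which is
    consistent with df = n1 + n2 - 2 but is not confirmed by the notes.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return abs(mean1 - mean2) / std_error, df


def decide(t0, table_value):
    """Apply the stated criterion: reject H0 when t0 exceeds the table value."""
    return "reject H0" if t0 > table_value else "accept H0"


# Values reported in the infant weight-gain example.
print(decide(0.9162, 2.101))

# Hypothetical summary statistics, invented only to exercise pooled_t.
t0, df = pooled_t(mean1=10.0, mean2=8.0, sd1=2.0, sd2=2.0, n1=10, n2=10)
print(round(t0, 3), df, decide(t0, 2.101))
```

The wording "accept H0" follows the convention used in these notes.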

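The p-value conventions listed under Step 8 can be written as a small lookup. This is a sketch that assumes the thresholds are p < 0.05, p < 0.01, and p < 0.001; the function name is illustrative, and treating a p-value of exactly 0.05 as not significant is an assumption, since the conventions do not cover that boundary.

```python
def significance_label(p_value):
    """Map a p-value to the asterisk convention used in research reports."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "very highly significant (***)"
    if p_value < 0.01:
        return "highly significant (**)"
    if p_value < 0.05:
        return "significant (*)"
    # Assumption: p exactly equal to 0.05 falls here as well.
    return "not significant (ns)"


for p in (0.2, 0.03, 0.005, 0.0005):
    print(p, significance_label(p))
```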