A Mixed Methods Model of Scale Development and Validation Analysis

Measurement: Interdisciplinary Research and Perspectives, 2019, Vol. 17, No. 1, 38–47. https://doi.org/10.1080/15366367.2018.1479088

Yuchun Zhou
Gladys W. and David H. Patton College of Education, Ohio University, Athens

ABSTRACT
Using mixed methods to develop new scales is not a new idea; it has been discussed since the 2000s. However, there is inadequate literature discussing scale development using mixed methods, with steps including how to design the study, how to implement the process, and how to conduct validation. This study proposes a hands-on model for using mixed methods to develop new scales and for using multiple approaches to conduct validation analysis. The proposed model consists of five steps that highlight both mixed methods integration techniques and psychometric methods. The model of scale development and validation analysis is practical and useful for researchers who desire to develop a reliable scale.

KEYWORDS
Scale development; validation analysis; mixed methods

Introduction
In the social and behavioral sciences, scholars often need to develop new scales because existing instruments in their fields are lacking. Traditionally, researchers referred to the psychometric literature on reliability and validity for item generation (Rowan & Wulff, 2007) and to survey methodology for questionnaire design (May, 2001) when developing new scales. In recent years, researchers have recognized that scale development is not merely a procedure within a research project but a systematic study from the research design level through implementation, and accordingly, mixed methods has been applied to scale development (Bryman, 2006; Collins, Onwuegbuzie, & Sutton, 2006; Greene, Caracelli, & Graham, 1989).
The assumption of mixed methods research is that mixing qualitative and quantitative methods yields a better understanding of a research phenomenon than either method by itself (Creswell & Plano Clark, 2011; Johnson, Onwuegbuzie, & Turner, 2007). The development and evolution of mixed methods has passed through six stages over the last 30 years: the formative stage (1980s and before), the paradigm stage (1980s to 1990s), the procedural stage (1980s to present), the advocacy stage (early 2000s to present), the reflective stage (2000s to present), and the expansion stage (2010s to present) (Creswell & Plano Clark, 2011; Tashakkori & Teddlie, 1998; Teddlie & Tashakkori, 2009). In more recent years, research on the expansion of mixed methods has focused on its applications in and adaptations to specific disciplines or topics, such as experimental interventions, program evaluation, longitudinal research, and instrument development (Glogowska, 2011; Karasz & Singelis, 2009; Kettles, Creswell, & Zhang, 2011; Tewksbury, 2009; Plano Clark et al., 2015; Plano Clark & Wang, 2010; Shaw, Connelly, & Zecevic, 2010).
Using mixed methods to develop new scales is not a new idea; methodologists have discussed the rationale for using mixed methods in scale development since 2006 (Collins et al., 2006; Creswell & Plano Clark, 2011; Onwuegbuzie, Bustamante, & Nelson, 2010; Smolleck, Zembal-Saul, & Yoder, 2006). For instance, Collins et al. (2006) explicated the necessity and appropriateness of using mixed methods to “assess the appropriateness and/or utility of existing instrument(s); create new instrument(s) and assess its appropriateness and/or utility” (p. 76).
Although researchers have argued for the advantages of using mixed methods to develop new scales, literature providing systematic instructions on how to do so has been scarce and incomplete. For
instance, Onwuegbuzie et al. (2010) published a 10-phase IDCV instrument development framework
for developing and assessing the fidelity of a quantitative instrument. Their framework contained the
idea of mixing approaches; however, it remained at the implementation level and did not give
instructions on the research design level. Therefore, this study integrated mixed methods research
into psychometrics and proposed a hands-on model of using a sequential mixed methods design to
develop scales.
This study should improve researchers' understanding of using mixed methods for scale development and help them apply it appropriately. Hopefully, mixed methods will then be adopted in practice by researchers from a wide range of disciplines.

Literature on scale development using mixed methods


The study is a methodological discussion on scale development using mixed methods. The proposed
model and its methodological discussion were based on a systematic literature review. The relevant
literature was garnered from multiple databases including EBSCO, JSTOR, PsycINFO, and Google
Scholar. Search terms included a combination of “mixed methods*”, “instrument/scale development”, “instrument/scale construction”, “instrument/scale validation”, and “validation analysis”. The
search showed that most of the relevant articles were published in the past 15 years. As a result, 35 articles were selected and reviewed to answer two questions. First, why and how have researchers
used mixed methods to develop new scales? Second, why and how have researchers used mixed
methods to conduct validation analysis for new scales? Findings were summarized in the following
three categories: (1) needs of using mixed methods for instrument development, (2) appropriateness
of using mixed methods for instrument validation, and (3) specific validation techniques for scale
development. More details were presented as follows.

Needs of using mixed methods for instrument development


Since the early 2000s, methodologists have discussed why using mixed methods can help in developing a reliable instrument, including the comprehensiveness of item generation and the rigorous validation of new items (Bryman, 2006; Collins et al., 2006). Researchers such as Creswell and Plano Clark (2011) described a sequential mixed methods research design for scale development, the exploratory instrument design, which consists of three phases: a qualitative phase for defining the construct of the instrument, an instrument development phase including item generation and revision, and a confirming quantitative phase to test the instrument.
Following Creswell and Plano Clark's (2011) exploratory instrument design, a few researchers have used mixed methods in scale construction (Crede & Borrego, 2013; Durham, Tan, & White, 2011; Hitchcock et al., 2006; Nastasi et al., 2007). For instance, Durham et al. (2011) developed a
scale to assess the impact of clearance on livelihood assets in Laos through three stages: a qualitative
stage including the literature review (etic perspective) and focus group discussions (emic perspective), an instrument development stage including item writing, and a quantitative stage regarding
scale testing. Durham et al.’s (2011) study described the process of developing a scale using mixed
methods and discussed the challenges in the process.
Besides Creswell and Plano Clark's (2011) exploratory instrument design, Onwuegbuzie et al.'s (2010) IDCV framework was another reference for developing and assessing the fidelity of a quantitative instrument. Yet Onwuegbuzie et al.'s (2010) 10-phase framework separated the development and validation procedures. It did not contain instructions on mixed methods research designs, nor instructions on specific validation strategies. To address the need for detailed instructions for
researchers aspiring to design a mixed methods study for instrument development, this study proposed a mixed methods model that provides clear guidance on research design, integration phases, validation techniques, and psychometric considerations involved in scale development and validation.

Appropriateness of using mixed methods for instrument validation analysis


Many researchers mistakenly assume that validation analysis is required only after the scale development phase. In fact, instrument validation analysis starts at the beginning of instrument development and runs through the whole process. For instance, the content analysis of the literature and the panel reviews conducted for item generation usually assist in formulating the systematized concept and thus provide content evidence of validity for the instrument (Luyt, 2012). Why does validation happen as early as item writing? To answer this question, we shall review the concepts of validity and validation.
According to Messick (1989), validity was defined as “the degree to which empirical evidence and
theoretical rationales support the adequacy and appropriateness of interpretations and actions based
on test scores” (p. 6). Validation referred to an ongoing process of “developing a scientifically sound
validity argument to support the intended interpretation of test scores and their relevance to the
proposed use” (AERA/APA/NCME, 1999, p. 9). In other words, validity was viewed as a property
from an ontology perspective; whereas validation was the process of gathering evidence through
philosophical, experimental, and statistical means to evaluate such property (Borsboom,
Mellenbergh, & Heerden, 2004; Messick, 1989; Sireci & Sukin, 2013). To gather evidence for validity,
multiple types of data should be collected and mixed, including subjective judgment and statistical
analysis (Hubley & Zumbo, 2011; Sireci, 2009). Accordingly, using different methods to collect and
integrate different types of data could help with validation. That said, using mixed methods is
appropriate to enhance the quality of instrument validation.
According to Onwuegbuzie et al. (2010), mixed methods could be used to provide content-related
evidence for face validity, item validity, and sampling validity and construct-related evidence for
substantive validity, outcome validity, and generalizability. Taking the validation analysis for content
validity as an example, early in the 2000s, researchers discussed multiple methods in instrument
validation. For instance, Brod, Tesler, and Christensen (2009) stated that qualitative approaches (e.g., grounded theory) were the preferred methods to support content validity, whereas Meurer, Rubio, Counte, and Burroughs (2002) preferred a standardized, statistical method for content evidence of validity, such as surveying a panel of experts and calculating inter-rater agreement as well as the content validity index.
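
To make this kind of panel-based calculation concrete, the following is a minimal sketch in Python of how an item-level content validity index (I-CVI) and a simple agreement figure might be computed from expert relevance ratings. The rating matrix, the 1–4 rating scale, and the .78 retention threshold are illustrative assumptions, not values taken from Meurer et al. (2002).

```python
import numpy as np

# Hypothetical ratings: rows = items, columns = experts.
# Each expert rates item relevance on a 1-4 scale (4 = highly relevant).
ratings = np.array([
    [4, 3, 4, 4, 3],
    [2, 3, 2, 1, 2],
    [4, 4, 3, 4, 4],
    [3, 4, 4, 3, 4],
])

# Item-level content validity index (I-CVI): proportion of experts
# rating the item 3 or 4 (i.e., judging it content-relevant).
relevant = ratings >= 3
i_cvi = relevant.mean(axis=1)

# Scale-level CVI (averaging approach): mean of the item-level indices.
s_cvi_ave = i_cvi.mean()

# Simple percent agreement between the first two experts on relevant vs. not.
pairwise_agreement = (relevant[:, 0] == relevant[:, 1]).mean()

for i, cvi in enumerate(i_cvi, start=1):
    flag = "retain" if cvi >= 0.78 else "revise or drop"  # commonly cited rule of thumb
    print(f"Item {i}: I-CVI = {cvi:.2f} ({flag})")
print(f"S-CVI/Ave = {s_cvi_ave:.2f}; agreement (experts 1 and 2) = {pairwise_agreement:.2f}")
```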
More recently, researchers discussed mixed methods as the most suitable method for instrument
validation. For instance, Newman, Lim, and Pineda (2013) advocated an interactive model that
allowed the integration of the qualitative and quantitative methods to collect content evidence of
validity. They presented a table of specifications, which was used to collect experts’ views of the
accuracy and sufficiency of a specific concept and to calculate the agreement between judges.
According to Newman et al. (2013), when researchers attempted to estimate the agreement with
the alignment of these concepts (qualitative) empirically (quantitative), the process was inherently
mixed methods.
Moreover, mixed methods has been used to collect different types of validity evidence. For
instance, Lee and Greene (2007) presented how mixing qualitative and quantitative approaches
provided evidence for the predictive validity of a test. Morell and Tan (2009) demonstrated how
mixed methods could be implemented to gather internal validity evidence and thus help to form the
validation argument. Hitchcock et al. (2006) used mixed methods to collect cross-cultural evidence
for construct validity through ethnographic and factor analysis techniques. Ungar and Liebenberg
(2011) mixed qualitative and quantitative approaches in the development of culturally relevant measures for translation validity.

Furthermore, previous literature has also addressed how to properly use mixed methods for validation in specific fields, including the health sciences (Hitchcock et al., 2006), psychology (Luyt, 2012; Ungar & Liebenberg, 2011), education (Burton & Mazerolle, 2011; Crede & Borrego, 2013; Nassar-McMillan, Wyer, Oliver-Hoyo, & Ryder-Burge, 2010; Smolleck et al., 2006), childhood trauma (Boeije, Slagt, & Wesel, 2013), and language testing (Lee & Greene, 2007).
In addition to the multiple methods deployed in the validation analysis of an instrument, Onwuegbuzie and Johnson (2006) also paid attention to legitimation in validation analysis using mixed methods. The types of legitimation included sampling integration, inside-outside, weakness minimization, sequential, conversion, paradigmatic mixing, commensurability, multiple validities, and political legitimation. Following Onwuegbuzie and Johnson (2006), Dellinger and Leech (2007) and Leech, Dellinger, Brannagan, and Tanaka (2010) further discussed how to evaluate a mixed methods study for validation purposes. Their framework emphasized the quality of the mixed methods research design, legitimation considerations, interpretive rigor, inferential consistency, and utilization and consequential elements. In sum, mixed methods is an appropriate methodology for collecting various types of validity evidence for a new instrument; however, mixed methods alone does not guarantee the quality of the validation analysis. Only when researchers use mixed methods with rigor can they properly test their instrument. Therefore, the discussion in this study on how
to integrate a specific mixed methods research design into psychometrics is extremely important to
researchers who need to develop and analyze new scales.

Specific validation techniques in instrument development


In the past 5 years, many researchers have reported the validation strategies used in instrument development (Agarwa, 2011; Agarwa, Xu, & Poo, 2011; Burton & Mazerolle, 2011; Dahodwala et al., 2012; Melka, Lancaster, Bryant, Rodriguez, & Weston, 2011; Miller, Kim, Chen, & Alvarez, 2012; Luyt, 2012; Nassar-McMillan et al., 2010). In general, instrument development consists of the following
phases, namely, defining the construct and content domain, generating items, pilot testing the scale,
revising the scale, and finalizing the scale (Burton & Mazerolle, 2011). In the early phases of
instrument development, qualitative approaches (e.g. interviews with experts, literature review)
were widely used to define the construct and to provide the content and face evidence of validity.
Researchers consulted a panel of experts to discuss the content evidence of validity, reviewed the
items for clarity and comprehension, and revised the items (Burton & Mazerolle, 2011; Dahodwala
et al., 2012; Onwuegbuzie et al., 2010; Smolleck et al., 2006). Consulting experts in the target domains helped with item generation, and the panel's comments provided evidence of content validity and face validity for the items.
The panel review could be conducted in a variety of ways, including individual interviews (Dahodwala et al., 2012), focus groups (Durham et al., 2011; Holsapple, Finelli, Carpenter, Harding, & Sutkus, 2009; Luyt, 2012; Nassar-McMillan et al., 2010; Ungar & Liebenberg, 2011; Vogt, King, & King, 2004), think-aloud protocols (Morell & Tan, 2009), and two-stage sorting procedures (Agarwa, 2011; Agarwa et al., 2011; Moore & Benbasat, 1991; Moore & Benbasat, 2001). In addition, as Smolleck et al. (2006) suggested, the panel review should be conducted iteratively over several rounds to provide content evidence of validity. In Smolleck et al.'s (2006) 13-step process of instrument development, the first nine steps were devoted to defining and enhancing the collection of content evidence of validity. Besides the panel review, some advanced qualitative designs, such as ethnography (Crede & Borrego, 2013; Hitchcock et al., 2006; Nastasi et al., 2007), were also widely used in developing an instrument (Worthington & Whittaker, 2011).
In contrast, at the later stages of instrument development, such as pilot testing and survey administration, quantitative approaches (e.g., factor analysis) were largely used to collect construct evidence of validity, such as convergent and discriminant validity. For instance, statistical item analysis and factor analysis were used to collect evidence for construct validity (Onwuegbuzie et al., 2010). Exploratory factor analysis with principal components extraction was used to retrieve
the factors, reduce the items, and examine the factor structure (Burton & Mazerolle, 2011;
Dahodwala et al., 2012; Martin & Sass, 2010; Melka et al., 2011; Miller et al., 2012). Confirmatory
factor analysis was another commonly used strategy to cross-validate the factor structure following
the exploratory factor analysis (Burton & Mazerolle, 2011; Dahodwala et al., 2012; Melka et al., 2011;
Miller et al., 2012; Weiss & Smith, 1999; Worthington & Whittaker, 2011). When the model fit indices were acceptable, evidence of construct validity was provided, indicating that the items measured the intended scale construct.
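
As a rough illustration of this quantitative stage, the sketch below (Python, with simulated data standing in for real pilot responses) performs a principal-components-style exploratory step: it examines the eigenvalues of the item correlation matrix, retains components by the common eigenvalue-greater-than-one rule, and reports their loadings. It is a simplified stand-in for a full exploratory factor analysis, not the procedure of any particular study cited above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical item responses: 300 respondents x 8 Likert items
# (in practice these would be loaded from the pilot-survey data).
responses = rng.integers(1, 6, size=(300, 8)).astype(float)

# Correlation matrix of the items.
corr = np.corrcoef(responses, rowvar=False)

# Eigen-decomposition; components with eigenvalue > 1 are retained
# (the Kaiser criterion often used with principal components extraction).
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
n_retained = int((eigenvalues > 1.0).sum())

# Unrotated loadings of the retained components on each item.
loadings = eigenvectors[:, :n_retained] * np.sqrt(eigenvalues[:n_retained])

print("Eigenvalues:", np.round(eigenvalues, 2))
print(f"Components retained (eigenvalue > 1): {n_retained}")
print("Loadings:\n", np.round(loadings, 2))
```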
To sum up, the existing literature indicated the need for and appropriateness of using mixed methods for scale development and validation. However, no study provided a comprehensive model that encompasses mixed methods designs and psychometric considerations in scale development. Therefore, this study proposed a mixed methods model with five steps to construct new scales and concurrently gather multiple sources of validity evidence. The recommended steps not only emphasize mixed methods design but also focus on specific mixing and validation techniques to achieve appropriate psychometric properties.

Proposed model of scale development and validation analysis


The recommended model of scale development and validation analysis connects development procedures and validation procedures (see Figure 1). Moreover, the model highlights the mixing phases. As Figure 1 indicates, ovals are used to distinguish the mixing steps from the square boxes that represent single-method steps.

Figure 1. Mixed Methods Model of Scale Development and Validation Analysis.
At the research design level, the mixed methods model of scale development and validation analysis (MSDVA) adapts Creswell and Plano Clark's (2011) exploratory instrument design and embraces validation phases. It consists of five steps: (1) qualitatively investigating the scale construct, which is also a qualitative validation process to collect content-validity evidence; (2) converting qualitative findings to scale items, which is an integration strategy in mixed methods research; (3) conducting mixing validation to review the items' content-based validity; (4) administering the test items and collecting item responses; and (5) conducting quantitative validation to analyze item properties and to examine construct-validity evidence. Items with poor psychometric properties should be revised and sent back to Step 3 for another round of validation and pilot testing until they pass the item analysis in Step 5. More details and suggestions for each step in the model are presented as follows.

Step 1: Qualitatively investigating the scale construct


Besides the literature review, a qualitative investigation is necessary for exploring the scale construct and collecting detailed information. The central phenomenon of the qualitative study should be defined as the scale construct itself, and all research questions should relate to it. For instance, when
developing a new scale to measure adolescents’ cognitive wisdom, a qualitative study could be
conducted to explore cognitive wisdom as the central phenomenon. A purposeful sample of young
people should be selected to provide their perspectives on cognitive wisdom. Research questions
should be designed to explore details about cognitive wisdom, such as “what issues are regarded as
cognitive-related wisdom?” and “how does cognitive wisdom differ from other types of wisdom?”
The reasons why the scale construct of interest should be defined as the central phenomenon of the qualitative investigation are twofold. First, it provides the foundation for comparing and connecting the qualitative findings with the quantitative item responses, because the research purpose, research questions, and samples remain comparable throughout the whole process of scale development. Second, it provides evidence for the content validity of the new scale, because the items are written based on the qualitative findings. Only when the qualitative study is designed to explore the scale construct as the central phenomenon can the findings provide the most relevant and rich information for defining the construct and writing items. Accordingly, the initial qualitative investigation involves a validation strategy that provides content-related evidence for the scale items of interest. In all, the first step consists of both a qualitative investigation and a qualitative validation.

Step 2: Converting qualitative findings to scale items


How can qualitative findings be converted to scale items? First, make sure the qualitative findings consist of quotes, codes, and themes at different layers. Second, convert quotes and codes to items, and convert themes to scales. For instance, a code of "life no change" can be converted to an item such as "I do not think life is changing all the time" with a Likert-style response format. Then, go further to determine the format of the item responses and the visual display of the scale. The response format will determine what statistical analysis can be run in the later quantitative stage; for instance, if the responses are nominal or ordinal, IRT models may be more appropriate than CFA for the data analysis. In short, transforming qualitative data into measurable items is a mixing strategy that indicates how qualitative and quantitative data are integrated.
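
A minimal sketch of this conversion step is shown below, assuming a simple in-memory representation in Python; the second code, the item wordings, and the theme labels are hypothetical examples rather than findings from an actual qualitative study.

```python
from dataclasses import dataclass

@dataclass
class ScaleItem:
    source_code: str      # qualitative code the item was converted from
    theme: str            # qualitative theme, converted to a (sub)scale
    text: str             # item stem shown to respondents
    response_format: str  # determines later analysis (e.g., ordinal -> IRT)

LIKERT_5 = "1 = strongly disagree ... 5 = strongly agree"

items = [
    ScaleItem("life no change", "perceived stability",
              "I do not think life is changing all the time.", LIKERT_5),
    ScaleItem("learning from mistakes", "reflective thinking",
              "I usually think about what my mistakes can teach me.", LIKERT_5),
]

# Group items by theme so that each theme becomes a candidate subscale.
subscales = {}
for item in items:
    subscales.setdefault(item.theme, []).append(item.text)

for theme, stems in subscales.items():
    print(theme, "->", stems)
```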

Step 3: Conducting mixing validation to review items’ content-based validity


Mixing validation indicates that both qualitative approaches and quantitative methods are used to
review the items, and the results will be mixed to provide validity evidence for the new scale. In this
stage, the evidence is mainly about content-based validity, namely, whether the items comprehensively represent the scale construct to be measured.
The main qualitative approaches include reflection, debriefing, and panel review. The common quantitative methods include sorting and calculation. Reflection refers to defining the scale construct to be measured: the researcher should attempt to answer questions such as what the scale is intended to measure and how the construct is defined. This practice of reflection can provide evidence for the content validity of the new scale (Burton & Mazerolle, 2011; Dahodwala et al., 2012; Onwuegbuzie et al., 2010; Smolleck et al., 2006).
Debriefing is a discussion of the relationship between the items and the construct. By discussing the structure of the new scale, a hypothesized model linking the items and the latent construct is generated (Burton & Mazerolle, 2011; Dahodwala et al., 2012; Melka et al., 2011; Miller et al., 2012; Weiss & Smith, 1999; Worthington & Whittaker, 2011). The hypothesized model will be tested via factor analysis in later steps.
Panel review means having a panel of experts review the items for representativeness and completeness. The experts' comments provide evidence for the content validity of the new scale (Dahodwala et al., 2012; Durham et al., 2011; Morell & Tan, 2009; Smolleck et al., 2006). Afterwards, the items should be revised according to the panel's feedback.
Sorting is a quantitative method for calculating experts' agreement on the content of the items. A group of experts is asked to review the items and determine which items should be grouped together. They sort the items into different categories and then define the categories. The sorting procedure provides evidence of content-based and construct validity because it reveals why certain items should be categorized under one construct (Agarwa, 2011; Agarwa et al., 2011; Moore & Benbasat, 1991; Moore & Benbasat, 2001). All in all, multiple approaches can be used to examine the new items' accuracy and comprehensiveness, and the results should be integrated to provide evidence of content-based validity.
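
To illustrate the quantitative side of the sorting procedure, the sketch below (Python, with hypothetical sorting results) computes two experts' agreement on category assignments as simple percent agreement and as Cohen's kappa, which corrects for chance agreement; the category labels and data are illustrative only.

```python
import numpy as np

# Hypothetical sorting results: category assigned to each of 10 draft items
# by two experts working independently.
expert_a = np.array(["cognitive", "cognitive", "affective", "affective", "cognitive",
                     "behavioral", "behavioral", "affective", "cognitive", "behavioral"])
expert_b = np.array(["cognitive", "affective", "affective", "affective", "cognitive",
                     "behavioral", "cognitive", "affective", "cognitive", "behavioral"])

# Observed percent agreement.
p_observed = (expert_a == expert_b).mean()

# Chance agreement: probability that both experts pick the same category at random,
# based on each expert's category frequencies.
categories = np.unique(np.concatenate([expert_a, expert_b]))
p_chance = sum((expert_a == c).mean() * (expert_b == c).mean() for c in categories)

# Cohen's kappa corrects the observed agreement for chance agreement.
kappa = (p_observed - p_chance) / (1 - p_chance)

print(f"Percent agreement = {p_observed:.2f}, Cohen's kappa = {kappa:.2f}")
```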

Step 4: Administering the scale to the target population


A quantitative survey is the primary method for administering the new instrument (May, 2001). For valid scores, a sample of the target population for the new scale should be recruited to complete the questionnaire. Random sampling is best for collecting unbiased data. Another issue is sample size: for advanced statistical analyses such as factor analysis, the number of responses per item should be in the hundreds (Kline, 2011). The numeric item responses are collected and then passed to Step 5 for statistical analysis.

Step 5: Conducting quantitative validation to examine items’ construct-based validity


Item responses are analyzed via a series of statistical analyses for evidence of construct-based validity, including internal consistency reliability (Cronbach's alpha, conventionally higher than .70), inter-item correlations (conventionally higher than .50), scale variance (which should be high), corrected item-total correlations (which should be positive and conventionally higher than .20/.30) (Meurer et al., 2002), factor analysis, and regressions. Taking confirmatory factor analysis as an example, the results indicate whether the scale construct can explain the item variance (Burton & Mazerolle, 2011); in other words, whether the items measure what they should measure, which is the items' construct-based validity. Multiple indices are used to evaluate model fit, including the chi-square value, the comparative fit index, the standardized root mean square residual, and the root mean square error of approximation (Kline, 2011). If the model fit indices are acceptable, the standardized factor loadings are further examined to indicate how strong the relationship is between each item and its scale (conventionally higher than .30/.40) (Kline, 2011). The strength of the factor loadings indicates which items are the best, good, or merely acceptable ones in the scale.
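
As a sketch of what part of this quantitative validation might look like in code, the Python example below computes Cronbach's alpha and corrected item-total correlations on simulated item responses; the data are artificial, and the .70 and .30 cut-offs are simply the conventional values cited above, hard-coded for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical item responses: 400 respondents x 6 Likert items.
# A common latent factor is added so that the items correlate, as real scale items would.
latent = rng.normal(size=(400, 1))
responses = np.clip(np.round(3 + latent + rng.normal(scale=0.8, size=(400, 6))), 1, 5)

n_items = responses.shape[1]
total = responses.sum(axis=1)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total score).
item_vars = responses.var(axis=0, ddof=1)
alpha = n_items / (n_items - 1) * (1 - item_vars.sum() / total.var(ddof=1))

# Corrected item-total correlation: each item against the total of the *other* items.
corrected_r = np.array([
    np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
    for j in range(n_items)
])

print(f"Cronbach's alpha = {alpha:.2f} (conventional target > .70)")
for j, r in enumerate(corrected_r, start=1):
    flag = "ok" if r >= 0.30 else "review"
    print(f"Item {j}: corrected item-total r = {r:.2f} ({flag})")
```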

After combining the results from the different analyses, the researchers should revise and/or delete poor items and send the revised items back to Step 3 for iterative rounds of mixing validation and pilot testing until the statistical validation results in Step 5 are acceptable. The whole process of the model is represented in Figure 1.

Conclusion
In conclusion, this study summarized the existing literature on using mixed methods to develop instruments. The review of the literature indicated why mixed methods is the most appropriate research design for scale development. Adapting Creswell and Plano Clark's (2011) exploratory instrument design, the author proposed a five-step mixed methods model of scale development and validation analysis (MSDVA) that integrates mixed methods research design with psychometric considerations. As Figure 1 shows, the scale development procedures and validation procedures are integrated throughout the whole process. Mixing occurs at both the research design level and the research implementation level. For instance, at some steps, development and validation are conducted concurrently; at certain steps, specific mixing strategies are highlighted based on mixed methods research design considerations. In brief, the model provides comprehensive and detailed instructions for researchers to develop new scales using a sequential mixed methods design.
Besides the discussion of scale development, the paper also aims to improve readers' understanding of mixed methods and its applications. In the future, more mixed methods studies are needed to present empirical examples of scale development in the social sciences. Hopefully, mixed methods will be well recognized and applied in practice across a wide range of disciplines.

References
Agarwa, N. (2011). Verifying survey items for construct validity: A two-stage sorting procedure for questionnaire
design in information behavior research. ASIST, 10, 9–13.
Agarwa, N., Xu, Y., & Poo, D. (2011). A context-based investigation into source use by information seekers. Journal of
American Society for Information Science and Technology, 62(6), 1087–1104.
American Educational Research Association, American Psychological Association, & National Council on
Measurement in Education. (AERA/APA/NCME). (1999). Standards for educational and psychological testing.
Washington, DC: American Educational Research Association.
Boeije, H., Slagt, M., & Wesel, F. (2013). The contribution of mixed methods research to the field of childhood trauma:
A narrative review focused on data integration. Journal of Mixed Methods Research, 7(4), 347–369.
Borsboom, D., Mellenbergh, G., & Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–
1071.
Brod, M., Tesler, L., & Christensen, T. (2009). Qualitative research and content validity: Developing best practices
based on science and experience. Quality of Life Research, 18, 1263–1278.
Bryman, A. (2006). Integrating quantitative and qualitative research: How is it done. Qualitative Research, 6, 97–113.
Burton, L., & Mazerolle, S. (2011). Survey instrument validity part I: Principles of survey instrument development and
validation in athletic training education research. Athletic Training Education Journal, 6(1), 27–35.
Collins, K., Onwuegbuzie, A., & Sutton, I. (2006). A model incorporating the rationale and purpose for conducting
mixed-methods research in special education and beyond. Learning Disabilities: A Contemporary Journal, 4(1), 67–
100.
Crede, E., & Borrego, M. (2013). From ethnography to items: A mixed methods approach to developing a survey to
examine graduate engineering student retention. Journal of Mixed Methods Research, 7(1), 62–80.
Creswell, J. W., & Plano Clark, V. L. (2011). Designing and conducting mixed methods research (2nd ed.). Thousand
Oaks, CA: Sage.
Dahodwala, N., Karlawish, J., Shea, J., Zubritsky, C., Stern, M., & Mandell, D. (2012). Validation of an instrument to
measure older adults’ expectations regarding movement. PLoS One, 7(8), e43854.
Dellinger, A., & Leech, N. (2007). Toward a unified validation framework in mixed methods research. Journal of Mixed
Methods Research, 1(4), 309–332.
Durham, J., Tan, B., & White, R. (2011). Utilizing mixed research methods to develop a quantitative assessment tool:
An example from explosive remnants of a war clearance program. Journal of Mixed Methods Research, 5(3), 212–226.
Glogowska, M. (2011). Paradigms, pragmatism and possibilities: Mixed-methods research in speech and language
therapy. International Journal of Language Communication Disorder, 46(3), 251–260.
Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation
designs. Educational Evaluation and Policy Analysis, 11, 255–274.
Hitchcock, J., Sarkar, S., Nastasi, B., Burkholder, G., Varjas, K., & Jayasena, A. (2006). Validating culture- and gender-
specific constructs: A mixed-method approach to advance assessment procedures in cross-cultural settings. Journal
of Applied School Psychology, 22(2), 13–33.
Holsapple, M., Finelli, C., Carpenter, D., Harding, T., & Sutkus, J. (2009). Work in progress – A mixed-methods
approach to developing an instrument measuring engineering students’ positive ethical behavior. 39th ASEE/IEEE
Frontiers in Education Conference, Session T3E-1, San Antonio, TX.
Hubley, A., & Zumbo, B. (2011). Validity and the consequences of test interpretation and use. Social Indicators
Research, 103, 219–230.
Johnson, R. B., Onwuegbuzie, A. J., & Turner, L. (2007). Toward a definition of mixed methods research. Journal of
Mixed Methods Research, 1(2), 112–133.
Karasz, A., & Singelis, T. (2009). Qualitative and mixed methods research in cross-cultural psychology: Introduction to
the special issue. Journal of Cross Cultural Psychology, 40(6), 909–916.
Kettles, A., Creswell, J., & Zhang, W. (2011). Mixed methods research in mental health nursing. Journal of Psychiatric
and Mental Health Nursing, 18, 535–542.
Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.
Lee, Y., & Greene, J. (2007). The predictive validity of an ESL placement test: A mixed methods approach. Journal of
Mixed Methods Research, 1(4), 366–389.
Leech, N., Dellinger, A., Brannagan, K., & Tanaka, H. (2010). Evaluating mixed research studies: A mixed methods
approach. Journal of Mixed Methods Research, 4(1), 17–31.
Luyt, R. (2012). A framework for mixing methods in quantitative measurement development, validation, and revision:
A case study. Journal of Mixed Methods Research, 6(4), 294–316.
Martin, N., & Sass, D. (2010). Construct validation of the behavior and instructional management scale. Teaching and
Teacher Education, 26(5), 1124–1135.
May, T. (2001). Social research: Issues, methods and process. Berkshire, UK: Open University Press.
Melka, S., Lancaster, S., Bryant, A., Rodriguez, B., & Weston, R. (2011). An exploratory and confirmatory factor
analysis of the affective control scale in an undergraduate sample. Journal of Psychopathology Behavior Assessment,
33, 501–513.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational
Researcher, 18, 5–11.
Meurer, S., Rubio, D., Counte, M., & Burroughs, T. (2002). Development of a healthcare quality improvement
measurement tool: Results of a content validity study. Hospital Topics: Research and Perspectives on Healthcare,
80(2), 7–13.
Miller, M., Kim, J., Chen, G., & Alvarez, A. (2012). Exploratory and confirmatory factor analyses of the Asian
American racism-related stress inventory. Assessment, 19(1), 53–64.
Moore, G., & Benbasat, I. (1991). Development of an instrument to measure the perceptions of adopting an
information technology innovation. Information Systems Research, 2(3), 192–222.
Moore, G., & Benbasat, I. (2001). Development of an instrument to measure the perceptions of adopting an
information technology innovation. Information Systems Research, 2(3), 192–222. [Copyright, 1991, The Institute
of Management Sciences].
Morell, L., & Tan, R. (2009). Validating for use and interpretation: A mixed methods contribution illustrated. Journal
of Mixed Methods Research, 3(3), 242–264.
Nassar-McMillan, S., Wyer, M., Oliver-Hoyo, M., & Ryder-Burge, A. (2010). Using focus groups in preliminary
instrument development: Expected and unexpected lessons learned. The Qualitative Report, 15(6), 1621–1634.
Nastasi, B., Hitchcock, J., Sarkar, S., Burkholder, G., Varjas, K., & Jayasena, A. (2007). Mixed methods in intervention
research: Theory to adaptation. Journal of Mixed Methods Research, 1(2), 164–182.
Newman, I., Lim, J., & Pineda, F. (2013). Content validity using a mixed methods approach: Its application and
development through the use of a table of specifications methodology. Journal of Mixed Methods Research, 7(3), 243–260.
Onwuegbbuzie, A., & Johnson, R. (2006). The validity issue in mixed research. Research in the Schools, 13(1), 48–63.
Onwuegbuzie, A., Bustamante, R., & Nelson, J. (2010). Mixed research as a tool for developing quantitative instru-
ments. Journal of Mixed Methods Research, 4(1), 56–78.
Plano Clark, V., Anderson, N., Wertz, J., Zhou, Y., Schumacher, K., & Miaskowski, C. (2015). Conceptualizing
longitudinal mixed methods designs: A methodological review of health sciences research. Journal of Mixed
Methods Research, 9(4), 297–319.
Plano Clark, V. L., & Wang, S. C. (2010). Adapting mixed methods research to multicultural counseling. In J. G.
Ponterotto, J. M. Casas, L. A. Suzuki, & C. M. Alexander (Eds.), Handbook of multicultural counseling (3rd ed., pp.
427–438). Thousand Oaks, CA: Sage.
Rowan, N., & Wulff, D. (2007). Using qualitative methods to inform scale development. The Qualitative Report, 12(3),
450–466.
Shaw, J., Connelly, D., & Zecevic, A. (2010). Pragmatism in practice: Mixed methods research for physiotherapy.
Physiotherapy Theory and Practice, 26(8), 510–518. doi:10.3109/09593981003660222
Sireci, S. (2009). Packing and unpacking sources of validity evidence: History repeats itself again. In R. Lissitz (Ed.), The
concept of validity: Revisions, new directions, and applications (pp. 19–37). Charlotte, NC: Information Age
Publishing, Inc.
Sireci, S. G., & Sukin, T. (2013). Test validity. In K. Geisinger et al. (Eds.), APA handbook of testing and assessment in
psychology (Vol. 1, pp. 61–84). Washington, DC: American Psychological Association.
Smolleck, L., Zembal-Saul, C., & Yoder, E. (2006). The development and validation of an instrument to measure
preservice teachers’ self-efficacy in regard to the teaching of science as inquiry. Journal of Science Teacher
Education, 17, 137–163.
Tashakkori, A., & Teddlie, C. (1998). Mixed methodology: Combining qualitative and quantitative approaches.
Thousand Oaks, CA: Sage.
Teddlie, C., & Tashakkori, A. (2009). Foundations of mixed methods research. Thousand Oaks, CA: Sage.
Tewksbury, R. (2009). Qualitative versus quantitative methods: Understanding why qualitative methods are superior
for criminology and criminal justice. Journal of Theoretical and Philosophical Criminology, 1(1), 38–58.
Ungar, M., & Liebenberg, L. (2011). Assessing resilience across cultures using mixed methods: Construction of the
child and youth resilience measure. Journal of Mixed Methods Research, 5(2), 126–149.
Vogt, D., King, D., & King, L. (2004). Focus groups in psychological assessment: Enhancing content validity by
consulting members of the target population. Psychological Assessment, 16(3), 231–243.
Weiss, M., & Smith, A. (1999). Quality of youth sport friendships: Measurement development and validation. Journal
of Sport & Exercise Psychology, 21, 145–166.
Worthingon, R., & Whittaker, T. (2011). Scale development research: A content analysis and recommendations for
best practice. The Counseling Psychologist, 34(6), 806–838.
