Assessing Students' Critical Thinking Performance: Urging For Measurements Using Multi-Response Format


Thinking Skills and Creativity 4 (2009) 70–76


Keynote

Assessing students' critical thinking performance: Urging for measurements using multi-response format

Kelly Y.L. Ku
Department of Psychology, The Chinese University of Hong Kong, Sino Building, Shatin, New Territories, Hong Kong, China
E-mail address: kylku@psy.cuhk.edu.hk

Article info

Article history:
Received 24 October 2008
Received in revised form 28 January 2009
Accepted 2 February 2009
Available online 26 February 2009

Keywords:
Critical thinking
Assessment
Response format
Higher education
Thinking skills

Abstract

The current paper discusses ambiguities in critical thinking assessment. The paper first reviews the components of critical thinking. It then discusses the features and issues of commonly used critical thinking tests and the extent to which they are compatible with the conceptualization of critical thinking. The paper argues that critical thinking tests utilizing a single multiple-choice response format measure only recognition or level of knowledge and do not adequately capture the dispositional characteristics of test-takers. The multiple-choice response format reveals neither test-takers' underlying reasoning for choosing a particular answer nor their ability to think critically in unprompted situations. In contrast, measurement that allows for responses in both multiple-choice and open-ended formats makes it possible to assess individuals' spontaneous application of thinking skills in addition to their ability to recognize a correct response. Assessment consisting of a multi-response format should be pursued for effective evaluation of students' critical thinking performance.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Teaching for critical thinking is an important goal of modern education, as it equips students with the competency
necessary to reason about social affairs in a rapidly changing world. To develop such competency, students must go beyond
absorbing textbook knowledge and learn to build up skills involved in judging information, evaluating alternative evidence
and arguing with solid reasons. These skills in critical thinking are not only vital for students to perform well in school, but are also needed in future workplaces and in social and interpersonal contexts where sound decisions are to be made carefully and
independently on a daily basis.
The importance being accorded to critical thinking is now a worldwide phenomenon. In education reports of countries
such as the United States, United Kingdom and Australia, critical thinking has been listed as a key area to be cultivated and
assessed in higher education (Association of American Colleges and Universities, 2005; Australian Council for Educational
Research, 2002; Higher Education Quality Council, 1996). In addition to Western countries, recent educational reforms in
Asian countries such as Hong Kong (Education Bureau, 2003) and Japan (see discussion in Atkinson, 1997) have likewise advocated the development of critical thinking so that students can participate in a liberal society. Despite the consensus
of scholars and educators on the significance of nurturing students to become critical thinkers, teaching for critical thinking
has not been a simple task. For instance, the Association of American Colleges and Universities’ (2005) report stated that
as few as 6% of college seniors were considered proficient in critical thinking. Although a range of programs designed to enhance students' critical thinking ability can now be found in many educational institutions, educators have commented
that critical thinking instruction has not been carried out systematically and explicitly in most schools (Paul, Elder, & Bartell, 1997; Pithers & Soden, 2000), and particularly not in traditional teacher-centered classrooms (Howe, 2004). The difficulties involved in critical thinking education are multifold. One of the obstacles is the lack of proper assessment that effectively and objectively measures students' strengths and weaknesses in critical thinking (Ennis, 2003; Halpern, 2003; Norris, 2003). Without appropriate assessment that allows the growth of students' critical thinking ability to be demonstrated, it would be difficult to examine the effectiveness of any program that aims to enhance skills in critical thinking.
Valid assessment is crucial as it helps to identify instructional needs, foster student learning, and provide feedback for helping students to make progress and for instructors to devise teaching plans (Ennis, 2003). It has been widely recognized that traditional school examinations do not favor the growth of critical thinking, as they are often highly selective and put
much emphasis on retention of content knowledge. In order to acquire as much textbook information as needed for such
retention-based tests, students often learn through memorization as opposed to critical inquiry. In order to achieve the goal
of educating students to become critical thinkers, change in assessment practices has been recommended. For instance, a
report of the recent educational reforms in Hong Kong highlighted the need “to put more emphasis on the assessment of
[students’] ability to apply what they have learnt to solve problems” (Education Bureau, 2003, p. 31). However, as the nature
of critical thinking is complex and multivariate, developing a proper measurement can be difficult (Ennis, 2003). Given that the basis of a meaningful assessment is a clear definition of what is to be measured, the following section begins by addressing how critical thinking is conceptualized and proceeds to examine how its conception is related to its assessment.

1.1. The cognitive and disposition components of critical thinking

The conceptualization and assessment of critical thinking are interdependent issues that must be discussed together:
how critical thinking is defined determines how it is best measured. Earlier definitions of critical thinking emphasized the
cognitive component: that critical thinking is a skill, a set of skills, a mental procedure, or simply rationality (Baron, 1985;
Ennis, 1962; McPeck, 1981). These definitions typically concern thinking methods and rules of formal logic instead of the
implications of thoughts. In recent years, a broader perspective has been sought. For instance, Ennis’ definition of critical
thinking has changed over the years from the “correct assessing of statements” (Ennis, 1962, p. 81) to a “reasonable reflective
thinking that is focused on deciding what to believe and do” (Ennis, 1987, p. 10). In his later definition, an intentional
and motivational aspect of critical thinking is emphasized, which has been termed by other scholars as “critical thinking
disposition” (e.g., Facione, 1990a; Halpern, 1998; Perkins, Jay, & Tishman, 1993). The disposition to think critically includes
the motivation of a person (i.e., a matter of choosing to engage in effortful thinking or not) and it accounts for how critical
thinking is triggered, “good timing—attempting the right kind of thinking at the right moment” (Perkins & Ritchhart, 2004,
p. 352). In other words, what makes a good thinker is now a question that “must be answered as much in terms of people’s
attitudes, motivations, commitments, and habits of mind as in terms of their cognitive abilities” (Perkins & Ritchhart, 2004,
p. 352).
The changes in how theorists define critical thinking reflect the emergence of a more holistic view of the conceptualization
of critical thinking: besides the ability to engage in cognitive skills, a critical thinker must also have a strong intention
to recognize the importance of good thinking and have the initiative to seek better judgment. In other words, cognitive
component and dispositional component together determine a person’s actual thinking performance (Ennis, 1987; Facione,
Sanchez, Facione, & Gainen, 1995; Halpern, 1998). Given that, assessment of critical thinking must also be reexamined in accordance with this more recent understanding of critical thinking. In particular, the need for critical thinking measurement to account for individuals' inclination to use appropriate thinking skills in appropriate situations ought to be emphasized
(Norris, 2003).

1.2. Relationships between the conceptualization of critical thinking and its assessment

Although there has been limited research investigating the unique and relative contributions of the cognitive component and the disposition component to individuals' critical thinking performance, the two components have been found to be associated with different response formats of critical thinking tests (e.g., Clifford, Boufal, & Kurtz, 2004; Macpherson & Stanovich, 2007; Sá, West, & Stanovich, 1999; Taube, 1997; Toplak & Stanovich, 2003).
In a study by Taube (1997), confirmatory factor analysis showed that a two-factor model for critical thinking provided a
better fit with the data from a group of university students than did a single-factor model. On the one hand, there was an
ability factor indexed by SAT scores, GPA, as well as scores on the Watson–Glaser Critical Thinking Appraisal (WGCTA; Watson
& Glaser, 1980), a multiple-choice test of critical thinking (factor loading = .59). On the other hand, there was a disposition
factor represented by measures of need for cognition, tolerance of ambiguity, and dualistic/relativistic thinking. Scores on
the Ennis–Weir Critical Thinking Essay Test (Ennis & Weir, 1985), an open-ended test of critical thinking in which test-takers
were asked to generate and evaluate arguments, loaded significantly on both factors (i.e., .41 on the ability factor and .20 on
the disposition factor). These findings suggested that cognitive and dispositional components were differentiable aspects of
critical thinking. More importantly, the fact that WGCTA scores (a multiple-choice test) loaded solely on the ability factor whereas Ennis–Weir scores (an open-ended essay test) loaded on both the ability and disposition factors had significant implications for the measurement of critical thinking: it suggested that open-ended response formats would capture more of the disposition aspect of critical thinking than multiple-choice response formats would.
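As a rough illustration of the kind of loading pattern Taube reported, the sketch below runs an exploratory two-factor extraction (not the confirmatory model used in the original study) with scikit-learn's FactorAnalysis on a placeholder matrix of indicator scores; the variable names and data are hypothetical and are not Taube's data.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical indicators loosely mirroring Taube (1997): ability-type measures
# (SAT, GPA, WGCTA) alongside disposition-type measures and the Ennis-Weir score.
indicators = ["SAT", "GPA", "WGCTA", "NeedCog", "TolAmb", "EnnisWeir"]
rng = np.random.default_rng(0)
scores = rng.normal(size=(300, len(indicators)))  # placeholder data, not real test scores

# Exploratory two-factor extraction with varimax rotation
fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(scores)

# Rows = indicators, columns = the two factors; an indicator that loads strongly on
# one factor and weakly on the other is read as tapping mainly that factor.
loadings = fa.components_.T
for name, row in zip(indicators, loadings):
    print(f"{name:10s} factor1 = {row[0]: .2f}   factor2 = {row[1]: .2f}")
```

With random placeholder data the loadings are of course uninterpretable; the point is only the form of output that loadings such as .59 (WGCTA on the ability factor) or .41 and .20 (Ennis–Weir on the two factors) summarize.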

In more recent studies, Sá et al. (1999) found that cognitive ability (scores on selected subtests of the WAIS-R) accounted for
about 20–25% variance in participants’ performance on three reasoning or evaluation tasks, whereas disposition composite
scores (i.e., actively open-minded thinking) contributed a further 2.8% or 8.6% unique variance in their performance on
two of the three tasks. Clifford and colleagues (2004) found that cognitive factors (the Verbal Comprehension Index of
the WAIS-III) explained 11% of the variance in WGCTA scores, whereas the personality dimension of openness to experience
explained a further 5.5% of unique variance. Similar analyses in another study by Toplak and Stanovich (2003) revealed that in
addition to the 15.5% contributed by cognitive ability, there was a further unique 11.8% contributed by thinking dispositions
(i.e., reflectivity and need for cognition) to the variance in performance on five disjunctive tasks. Although results from these studies vary across different measures of cognitive ability, thinking dispositions, and critical thinking performance, there is generally clear evidence that thinking dispositions predict variance in critical thinking performance over and above that already accounted for by cognitive abilities. In other words, unique contributions were found for both cognitive and dispositional components to performance on critical thinking tests.
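The studies cited above all follow the same hierarchical logic: enter cognitive ability first, then ask how much additional variance thinking dispositions explain. A minimal sketch of that incremental-R² computation is given below; it uses synthetic placeholder data and statsmodels, not any of the cited datasets or analyses.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
cognitive = rng.normal(size=n)     # e.g., a cognitive ability composite (placeholder)
disposition = rng.normal(size=n)   # e.g., actively open-minded thinking (placeholder)
# Simulated criterion: a critical thinking score driven by both predictors plus noise
ct_score = 0.5 * cognitive + 0.3 * disposition + rng.normal(scale=0.8, size=n)

# Step 1: cognitive ability only
m1 = sm.OLS(ct_score, sm.add_constant(cognitive)).fit()
# Step 2: cognitive ability plus disposition
X2 = sm.add_constant(np.column_stack([cognitive, disposition]))
m2 = sm.OLS(ct_score, X2).fit()

print(f"R2, cognitive only:           {m1.rsquared:.3f}")
print(f"R2, cognitive + disposition:  {m2.rsquared:.3f}")
print(f"Unique variance added by disposition: {m2.rsquared - m1.rsquared:.3f}")
```

The difference between the two R² values is the "further unique variance" that the cited studies attribute to thinking dispositions.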
As a whole, the discussion in the first section of the paper highlights the importance of developing critical thinking assessment on the basis of its conceptualization. Given the empirical evidence mentioned in the preceding paragraphs on the
unique contributions of both cognitive and dispositional components to individuals’ critical thinking performance, proper
assessment of critical thinking ought to capture and reflect both components. The following section argues that existing
measurements do not seem to adequately reveal the dispositional aspect of critical thinking, because the response format of some tests does not allow for unprompted thinking or self-generated solutions to questions (Halpern, 2007; Norris, 2003). The next section first reviews commonly used critical thinking tests and questions whether tests utilizing only a multiple-choice response format accurately reflect test-takers' critical thinking ability, as individuals' dispositional characteristics may not be well captured.

2. Issues in critical thinking assessment

As the construct of critical thinking is abstract and multifaceted (Paul, 1985), its assessment has likewise remained unsettled.
There has not been a consensus on how critical thinking should be measured. There are a number of popular critical thinking instruments: the Watson–Glaser Critical Thinking Appraisal (WGCTA; Watson & Glaser, 1980), the Ennis–Weir Critical
Thinking Essay Test (EWCTET; Ennis & Weir, 1985), the Cornell Critical Thinking Test (CCTT; Ennis, Millman, & Tomko, 1985),
the California Critical Thinking Skills Test (CCTST; Facione, 1990b), and a more recent test: the Halpern Critical Thinking
Assessment Using Everyday Situations (HCTAES; Halpern, 2007). Previous studies using these different instruments as estimates of individuals' critical thinking competence have rested on the assumption that the chosen measurements of critical
thinking are compatible with the conceptualization of critical thinking. However, despite overlap in some aspects, these
critical thinking tests vary in their purposes, formats, and contexts.
In this section, characteristics and ambiguities of commonly used critical thinking instruments are discussed. In particular, there are two foci: (1) whether existing tests of critical thinking are compatible with both the cognitive and dispositional components of the conceptualization of critical thinking, and (2) whether these tests can differentiate individuals' ability to think critically in both prompted and unprompted contexts.

2.1. Multiple-choice measures of critical thinking

The WGCTA and the CCTST are examples of two widely used instruments that utilize a single multiple-choice response format. The WGCTA, available in three forms, and the CCTST, in two forms, consist of multiple-choice questions placed in
general contexts. All items are intended to be discipline neutral and are based on a variety of topics ranging from everyday
situations to social, economic, and political issues. Both tests measure discrete cognitive skills that require no knowledge of
a specific domain. The standard form (i.e., Forms-A & B) of the WGCTA is composed of 80 items that measure skills in five
aspects of critical thinking: inference, recognition of assumptions, deductions, interpretation, and evaluation of arguments.
The CCTST is composed of 34 items and measures five categories of skills including interpretation, analysis, evaluation,
inference, and explanation. Similarly, the CCTT, consisting of two levels (i.e., X and Z), is a story-based test with only multiple-choice questions. Level X contains 71 items designed for Grade 4 to college students, and Level Z contains 62 items to be used with gifted high school and college students. Altogether, the two forms of the CCTT measure seven aspects of critical thinking, including induction, deduction, credibility, assumptions, semantics, definition, and prediction. The α's of the overall scales ranged from .69 to .85, .61 to .72, and .67 to .90 for the WGCTA, CCTST, and CCTT, respectively (Ennis et al., 1985; Facione, 1990b;
Watson & Glaser, 1980).
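For readers unfamiliar with the coefficient, Cronbach's α summarizes internal consistency from the item-level variance structure of a test. The numpy sketch below computes it for a made-up item-response matrix; the data are invented purely for illustration and are unrelated to the WGCTA, CCTST, or CCTT.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item across respondents
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Made-up 0/1 responses from 10 respondents to 6 items, purely for illustration
responses = np.array([
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```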
The WGCTA, CCTST, and CCTT have been widely used as indicators of individuals' critical thinking competence. However, there have been various reports challenging and questioning their validity and factor structure. For instance, in one study the CCTST was reported to have low internal consistency, with subscale reliabilities varying from only .21 to .51 (Leppa, 1997). It has also been reported to have poor construct validity, unstable reliability, and low comparability between its two forms in other investigations (Bondy, Koenigseder, Ishee, & Williams, 2001; Jacobs, 1999; Leppa, 1997). Likewise, Loo and Thorpe (1999) reported that the reliability of the WGCTA's subscales varied from .17 to .74. In a recent meta-analysis (Bernard et al., 2008) of the psychometric properties of the WGCTA, a single-component solution emerged from datasets of 60 published studies,
indicating that no clear subscale structure was found. Bernard and colleagues also reported that reliability coefficients of the WGCTA averaged .47 across studies.
In addition to the potential drawbacks mentioned above, as a whole, the multiple-choice measurements of critical thinking
have also raised two major concerns. The first concern is the incompatibility of the assessment and the conceptualization
of critical thinking: the multiple-choice tests mainly tap the cognitive component of critical thinking, with the dispositional
component incompletely revealed. The single right-and-wrong answer approach of multiple-choice tests is unable to reflect
test-takers' inclinations to engage in critical thinking (Ennis, 2003; Halpern, 2003; McMillan, 1987). Specifically, Halpern (2003) has criticized the lack of comprehensiveness of multiple-choice tests and concluded that they are basically tests of verbal and quantitative knowledge, since test-takers are not free to determine their own evaluative criteria or generate their own solutions to problems. Likewise, Ennis (2003) has commented that “these easy-to-use tests typically miss, for
example, the critical thinking dispositions. . .existing multiple-choice tests do not directly and effectively test for many
significant dispositional aspects of critical thinking” (p. 305).
A second concern is that scores of multiple-choice critical thinking tests may fail to serve as indicators of test-takers’
ability to think critically in unprompted contexts (Ennis & Norris, 1990; Halpern, 2003; Norris, 2003). It has been argued that
satisfactory performance in prompted-thinking contexts cannot be generalized to contexts where prompts are not given.
Moreover, real-life problems often require the use of several skills at one time and a strategic approach in selecting suitable
skills for different problems (Halpern, 2003), which can be very much unlike the multiple-choice response format that readily
provides test-takers with the answers to choose from (Sternberg, 1985).
In response to the above-mentioned concerns, Facione (1990a,b) has suggested employing the California Critical Thinking
Skills Test (CCTST; Facione, 1990b) and the California Critical Thinking Disposition Inventory (CCTDI; Facione & Facione, 1992)
together when assessing critical thinking. The CCTST was specifically developed to tap the cognitive factor of critical thinking,
and the CCTDI was developed to tap the dispositional factor; thus, administering both has been suggested by their authors to
reflect the two-factor conceptualization of critical thinking. However, measuring each factor of critical thinking using separate
measures is unlikely to fill the gap between what people claim they would do (in self-reported dispositional measures) and
what they actually do (in an actual test of critical thinking skills). More importantly, although the CCTDI and the CCTST
were specifically developed to measure critical thinking skills and dispositions, their reliability and validity have also been
questioned. In a study of 70 nursing students (Leppa, 1997), the α's of the CCTST subscales ranged from only .21 to .51, well below the statistics (α's ranging from .68 to .70) published by the authors of the CCTST. Similarly, in another study of Chinese undergraduates (Ip et al., 2000), weak internal consistency of the CCTDI was reported, with Cronbach's α ranging from .34 to .76 for its subscales. In several other studies (Kakai, 2003; Walsh & Hardy, 1997; Walsh, Seldomridge, & Badros, 2007), the CCTDI's factor structure was found to be unstable: a four-factor structure (as opposed to the original seven-factor structure suggested by the authors) emerged, which indicated that the subscales of the CCTDI are not discrete enough. In general, Walsh et al. (2007) concluded that the CCTDI contained elements worth keeping, but urged the creation of a short form by eliminating weakly loading items.

2.2. Open-ended measures of critical thinking

Open-ended tests are preferred by several researchers (Halpern, 2003; Norris & Ennis, 1989; Taube, 1997). The Ennis–Weir
Critical Thinking Essay Test (EWCTET; Ennis & Weir, 1985) is a popular essay test of the general critical thinking ability of high
school or college students. It is a highly structured test that examines students’ ability to identify built-in reasoning flaws in
an argumentative passage, as well as their ability to defend their own arguments (Ennis, 2003). In one scenario, test-takers
are presented with a letter to the editor in which the writer argues for a ban on parking between 2 am and 6 am. Test-takers
are to evaluate the logic of the letter and write a response.
The EWCTET specifically measures test-takers’ ability to analyze and respond to arguments and debates in authentic
situations. Although the open-ended format of the EWCTET allows test-takers to demonstrate their cognitive skills as well
as their inclination to engage in careful thinking (Norris & Ennis, 1989), its highly specific context and strict structure have been criticized as restricting test-takers' responses, and thus the effects of disposition on thinking performance may not be adequately revealed (Taube, 1997). Additionally, despite author-reported inter-rater reliability of .82 to .86, concerns have
been raised regarding the subjective scoring process and potential biases in favor of test-takers who are more proficient in
writing (Adams, Whitlow, Stover, & Johnson, 1996).

2.3. New trend: critical thinking measure of multi-response format

In short, both the multiple-choice and open-ended tests of critical thinking have their respective limitations. The current
trend is to combine the two response formats into one test. The Halpern Critical Thinking Assessment Using Everyday
Situations (HCTAES; Halpern, 2007) is a recent attempt to address the above-mentioned issues by incorporating both
multiple-choice and open-ended response formats into a single measurement tool. Unlike the EWCTET, the HCTAES is
less structured and presents more life-like situations. The HCTAES measures critical thinking ability using questions set in
authentic and believable contexts. The test consists of 25 scenario-based questions; each asks for open-ended responses
as well as multiple-choice responses, totaling 50 questions. The multiple-choice part of each question tests recognition of correct responses from a list of alternatives, whereas the open-ended part tests strategic use of thinking skills
as well as the ability to self-construct solutions without hints. Test-takers are required to answer the open-ended part
first.
There is evidence that multiple-choice and open-ended responses are measuring separate cognitive abilities (Bridgeman
& Moran, 1996; Halpern, 2007). The open-ended part of the HCTAES attempts to reveal more of the dispositional component
of thinking, as it allows test-takers to demonstrate whether they are inclined to apply the appropriate skills (Halpern, 2007).
Essentially, the open-ended format measures “free recall” as there are few constraints on the type of response that the
test-taker may generate, whereas the multiple-choice format measures “recognition memory” (Halpern, 2007). The former
requires test-takers to consciously search and select appropriate knowledge and skills from their own memory in constructing
an answer, whereas the latter requires test-takers to identify the appropriate response from a given list of alternatives.
The questions of the HCTAES represent five categories of skills: verbal reasoning (e.g., recognizing the use of pervasive
or misleading language), argument analysis (e.g., recognizing reasons and conclusions in arguments), hypothesis testing
(e.g., understanding sample size, generalizations), using likelihood and uncertainty (e.g., applying relevant principles of
probability, base rates), as well as decision making and problem solving (e.g., identifying the problem goal, generating and
selecting solutions among alternatives).
The following is a hypothetical item similar in length and presentation to those in the HCTAES:
Results from a recent study indicated that female adolescents who perceive themselves as being unpopular among
peers are more likely to be overweight. The researchers suggested that running social skills training programs for
female adolescents who are overweight would help solve their weight problems.
Open-ended question: Based on this information, would you support this idea as a way of solving overweight problems
for female adolescents? Type “yes” or “no” and explain why or why not.
Forced-choice question: Based on this information, which of the following is the best answer? (Four choices provided)
Sample choice: Social skills training will probably reduce overweight problems among female adolescents because
the researchers found that girls who perceive themselves as being unpopular among peers are more likely to be
overweight.
This question examines whether a test-taker recognizes the distinction between correlation and cause and effect. Test-
takers are given marks if they show an understanding that correlation does not imply causation in their self-constructed
answer in the open-ended part or in their statement selection in the multiple-choice part.
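Purely as a toy illustration of what crediting such an open-ended answer might involve (this is not the HCTAES scoring rubric; the phrase list and function below are invented), a keyword-based check could look like the following. In practice, scoring follows Halpern's (2007) scoring standards rather than simple keyword matching.

```python
# Toy scorer for the hypothetical item above; NOT the actual HCTAES rubric.
# The phrase list is an invented stand-in for a rater's judgment.
CAUSAL_CAUTION_PHRASES = [
    "correlation", "does not imply caus", "not necessarily cause",
    "third variable", "reverse caus", "could be the other way",
]

def score_open_ended(response: str) -> int:
    """Give 1 mark if the answer signals awareness that the finding is correlational."""
    text = response.lower()
    return int(any(phrase in text for phrase in CAUSAL_CAUTION_PHRASES))

answer = ("No. The study only shows a correlation; being overweight might cause "
          "unpopularity rather than the reverse, or a third variable may drive both.")
print(score_open_ended(answer))  # -> 1
```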
The HCTAES has been used with different samples of American students, with convergent and divergent validity evidence reported, including a .32 positive correlation with the Arlin Test of Formal Reasoning, high positive correlations (in the .50 to .70 range) with a number of achievement or ability tests (e.g., SAT-Verbal, SAT-Math, GRE-Analytic), and moderate correlations in the .30 to .40 range with measures of need for cognition and conscientiousness. Cronbach's α reliability coefficients of .81 and .82 were reported with different groups of students (Halpern, 2007).
A preliminary study (Hau et al., 2006) examining cross-cultural construct validation of the HCTAES with Chinese and American undergraduates (N = 295) showed correlated yet quite distinguishable subscales of the HCTAES that were similar for the Chinese and U.S. college students. The open-ended and multiple-choice items showed unique characteristics, suggesting that they probably assess different aspects of participants' competence and disposition in critical thinking, and the five categories of critical thinking skills (verbal reasoning, argument analysis, hypothesis testing, likelihood and uncertainty, decision making/problem solving) were confirmed across both cultures. A more recent study (Ku & Ho, 2009) examined the relative and combined effects of cognitive and dispositional factors on the HCTAES performance of 137 Chinese undergraduates and found that thinking dispositions exerted greater effects when participants were giving unprompted responses. This result provided preliminary support for the argument that, in order for a critical thinking test to assess both the related ability and disposition, tasks allowing for unprompted thinking are necessary.
The popular instruments of critical thinking discussed in this section mostly adopt either the multiple-choice or the open-ended response format. Various concerns regarding each format have been reviewed, and the suggestion to combine the two response formats into one test has been put forth. Although more studies are needed to further establish the validity and
applicability of the HCTAES, its use of parallel questions that allow for both multiple-choice and open-ended responses seems
to carry some advantages over traditional critical thinking tests that utilize a single response format.

3. Future implications

There has been worldwide demand for higher-education curricula to engage students in learning activities that nurture critical thinking skills. To understand whether we have been successful in answering this demand, the development of suitable assessment of critical thinking is crucial.
Given the current consensus of viewing critical thinking as a synthesis of cognitive ability and disposition, the discussion in the preceding sections drew attention to the importance of making critical thinking assessment reflect both of its theoretical components as closely as possible, so that the test-taker's cognitive ability as well as the related dispositions can be properly and accurately revealed. It has been emphasized that measurements of critical thinking that utilize a single response format are insufficient for reflecting students' true critical thinking ability and are incompatible with the conceptualization
of critical thinking. As a result, it has been argued that measurement that elicits both open-ended and multiple-choice
response formats should be pursued.
The current paper has highlighted the importance of considering the right response format in critical thinking assessment. It has pointed to the need for refining and expanding existing measures to capture the thinking processes involved in critical thinking performance more comprehensively. Future research comparing tests in single-response and multi-response formats would be crucial for establishing empirical support for the argument of this paper. In addition, a recent study (Renaud & Murray, 2008) has suggested that improvements in critical thinking skills are more clearly detected with items focusing on specific course content rather than on general content. The issue of whether critical thinking is best assessed with general or subject-specific questions should be further examined in relation to the response format of measurement.
It must also be noted that any single measure of critical thinking is unlikely to be comprehensive enough to capture all
relevant aspects involved in human thinking. Thus, when teaching critical thinking, it is necessary for teachers to consider
“a set of multiple measures of critical thinking that can be used to triangulate the results” (McMillan, 1987, p. 15). When
evaluating students' ability to think, teachers should adopt different assessment methods, such as exercises that allow students to self-construct answers, assignments that facilitate the practice of strategic use of thinking skills in everyday contexts, and, when adopting multiple-choice exercises, follow-up questions that probe students' underlying reasoning (Ennis, 2003; Norris & Ennis, 1989).

Acknowledgement

This paper was partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative
Region (Project no. CUHK 4118/04H, 2004, Humanities, Social Science & Business Studies Panel).

References

Adams, M. H., Whitlow, J. F., Stover, L. M., & Johnson, K. W. (1996). Critical thinking as an educational outcome: An evaluation of current tools of measurement.
Nurse Education, 21, 23–32.
Association of American Colleges and Universities. (2005). Liberal education outcomes: A preliminary report on student achievement in college. Washington, DC: AAC&U.
Atkinson, D. (1997). A critical approach to critical thinking. TESOL Quarterly, 31, 71–94.
Australian Council for Educational Research. (2002). Graduate skills assessment. Australia: Commonwealth of Australia.
Baron, J. (1985). Rationality and intelligence. New York, NY: Cambridge University Press.
Bernard, R., Zhang, D., Abrami, P., Sicoly, F., Borokhovski, E., & Surkes, M. (2008). Exploring the structure of the Watson–Glaser Critical Thinking Appraisal:
One scale or many subscales? Thinking Skills and Creativity, 3, 15–22.
Bondy, K., Koenigseder, L., Ishee, J., & Williams, B. (2001). Psychometric properties of the California Critical Thinking Tests. Journal of Nursing Measurement,
9, 309–328.
Bridgeman, B., & Moran, R. (1996). Success in college for students with discrepancies between performance on multiple choice and essay tests. Journal of
Educational Psychology, 88, 333–340.
Clifford, J. S., Boufal, M. M., & Kurtz, J. E. (2004). Personality traits and critical thinking skills in college students: Empirical tests of a two-factor theory.
Assessment, 11, 169–176.
Education Bureau. (2003). Progress Report on the Education Reform (3). Hong Kong: Education Bureau.
Ennis, R. H. (1962). A concept of critical thinking. Harvard Educational Review, 32, 81–111.
Ennis, R. H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. B. Baron & R. J. Sternberg (Eds.), Teaching thinking skills: Theory and practice.
Ennis, R. H. (2003). Critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning (pp. 293–310). Cresskill, NJ: Hampton Press.
Ennis, R. H., Millman, J., & Tomko, T. N. (1985). Cornell critical thinking tests (3rd ed.). Pacific Grove, CA: Midwest Publications.
Ennis, R. H., & Norris, S. P. (1990). Critical thinking evaluation: Status, issues, needs. In J. Algina & S. M. Legg (Eds.), Cognitive assessment of language and math
outcomes (pp. 1–42). Norwood, NJ: Ablex.
Ennis, R. H., & Weir, E. (1985). The Ennis–Weir critical thinking essay test. Pacific Grove, CA: Midwest Publications.
Facione, P. A. (1990a). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction. Millbrae, CA: California Academic Press.
Facione, P. A. (1990b). The California Critical Thinking Skills Test. Millbrae, CA: California Academic Press.
Facione, P. A., & Facione, N. C. (1992). California Critical Thinking Disposition Inventory. Millbrae, CA: California Academic Press.
Facione, P. A., Sanchez, C. A., Facione, N. C., & Gainen, J. (1995). The disposition toward critical thinking. The Journal of General Education, 44, 1–25.
Halpern, D. F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American Psychologist, 53, 449–455.
Halpern, D. F. (2003). The “how” and “why” of critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory and
practice. Cresskill, NJ: Hampton Press.
Halpern, D. F. (2007). Halpern critical thinking assessment using everyday situations: Background and scoring standards. Claremont, CA: Claremont McKenna
College.
Hau, K. T., Halpern, D., Marin-Burkhart, L., Ho, I. T., Ku, K. Y. L., Chan, N. M., & Lun, V. M. C. (2006). Chinese and United States students' critical thinking: Cross-cultural construct validation of a critical thinking assessment. Paper presented at the American Educational Research Association Annual Conference, San Francisco, 7–11 April.
Higher Education Quality Council, Quality Enhancement Group. (1996). What are graduates? Clarifying the attributes of “graduateness”. London: HEQC.
Howe, E. R. (2004). Canadian and Japanese teachers’ conceptions of critical thinking: A comparative study. Teachers and Teaching: Theory and Practice, 10,
505–525.
Ip, W. Y., Lee, D. T. F., Lee, R. F. K., Chau, J. P. C., Wooton, R. S. Y., & Chang, A. M. (2000). Disposition towards critical thinking: A study of Chinese undergraduate nursing students. Journal of Advanced Nursing, 32, 84–90.
Jacobs, S. S. (1999). The equivalence of forms A and B of the California critical thinking skills test. Measurement and Evaluation in Counseling and Development,
31, 211–222.
Kakai, H. (2003). Re-examining the factor structure of the California Critical Thinking Disposition Inventory. Perceptual and Motor Skills, 96,
435–438.
Ku, K. Y. L., & Ho, I. T. (2009). An examination of the two-factor theory of critical thinking among Chinese students. Unpublished manuscript.
Leppa, C. J. (1997). Standardized measures of critical thinking: Experience with the California critical thinking tests. Nurse Education, 22, 29–33.
Loo, R., & Thorpe, K. (1999). A psychometric investigation of scores on the Watson–Glaser critical thinking appraisal new forms. Educational and Psychological
Measurement, 59, 995–1003.
Macpherson, R., & Stanovich, K. E. (2007). Cognitive ability, thinking dispositions, and instructional set as predictors of critical thinking. Learning and
Individual Differences, 17, 115–127.
McMillan, J. (1987). Enhancing college student’s critical thinking: A review of studies. Research in Higher Education, 26, 3–29.
McPeck, J. E. (1981). Critical thinking and education. New York, NY: St. Martin's Press.
Norris, S. P. (2003). The meaning of critical thinking test performance: The effects of abilities and dispositions on scores. In D. Fasko (Ed.), Critical thinking
and reasoning: Current research, theory and practice. Cresskill, NJ: Hampton Press.
Norris, S. P., & Ennis, R. H. (1989). Evaluating critical thinking. Pacific Grove, CA: Critical Thinking Books and Software.
Paul, R. W. (1985). The critical thinking movement. National Forum, 65, 2–3.
Paul, R. W., Elder, L., & Bartell, T. (1997). California teacher preparation for instruction in critical thinking: Research findings and policy recommendations.
Sacramento, CA: California Commission of Teacher Credentialing.
Perkins, D. N., Jay, E., & Tishman, S. (1993). Beyond abilities: A dispositional theory of thinking. Merrill-Palmer Quarterly, 39, 1–21.
Perkins, D. N., & Ritchhart, R. (2004). When is good thinking? In D. Y. Dai & R. J. Sternberg (Eds.), Motivation, emotion, and cognition: Integrative perspectives
on intellectual functioning and development. Mahwah, NJ: Erlbaum.
Pithers, R. T., & Soden, R. (2000). Critical thinking in education: A review. Educational Research, 42, 237–249.
Renaud, R. D., & Murray, H. G. (2008). A comparison of a subject-specific and a general measure of critical thinking. Thinking Skills and Creativity, 3, 85–93.
Sá, W. C., West, R. F., & Stanovich, K. E. (1999). The domain specificity and generality of belief bias: Searching for a generalizable critical thinking skill. Journal
of Educational Psychology, 91, 497–510.
Sternberg, R. J. (1985). Beyond IQ: A triarchic theory of human intelligence. New York, NY: Cambridge University Press.
Taube, K. T. (1997). Critical thinking ability and disposition as factors of performance on a written critical thinking test. Journal of General Education, 46,
129–164.
Toplak, M. E., & Stanovich, K. E. (2003). Associations between myside bias on an informal reasoning task and amount of post-secondary education. Applied
Cognitive Psychology, 17, 851–860.
Walsh, C. M., & Hardy, R. C. (1997). Factor structure stability of the California Critical Thinking Disposition Inventory across sex and various students’ majors.
Perceptual and Motor Skills, 85, 1211–1228.
Walsh, C. M., Seldomridge, L. A., & Badros, K. K. (2007). California Critical Thinking Disposition Inventory: Further factor analytic examination. Perceptual
and Motor Skills, 104, 141–151.
Watson, G., & Glaser, E. M. (1980). Watson–Glaser critical thinking appraisal. Cleveland, OH: Psychological Corporation.
