
The Journal of Educational Research

ISSN: (Print) (Online) Journal homepage:

Evaluating scientific thinking among Shanghai’s students of high and low performing schools

Irfan Ahmed Rind & Bo Ning

To cite this article: Irfan Ahmed Rind & Bo Ning (2020): Evaluating scientific thinking among
Shanghai’s students of high and low performing schools, The Journal of Educational Research,
DOI: 10.1080/00220671.2020.1832430


Published online: 20 Oct 2020.


Irfan Ahmed Rind (a, b) and Bo Ning (c)
(a) Department of Education, Sukkur IBA University, Sukkur, Pakistan; (b) Teachers’ College, Columbia University, New York City, New York, USA; (c) Research Institute for International and Comparative Education, Shanghai Normal University, Shanghai, China


Shanghai is dubbed a role model for science and mathematics education, as its fifteen-year-olds have been outperforming all others in the Program for International Student Assessment (PISA) since 2009. Shanghai’s achievements are attributed to its interest in adopting innovative international trends in education and implementing them equally effectively in its high and low-performing schools. One such trend in science education is based on the univariable and Control of Variable (CoV) strategies. This model is also used in constructing higher-order thinking items in science and mathematics assessments in PISA. Our first objective was to understand whether students of Shanghai have mastered the CoV strategies. Beyond CoV models, the emerging trend in science education promotes multivariable thinking among young adolescents. Our second objective was to understand whether Shanghai has adopted this emerging trend and prepared its students for multivariable thinking. Using specially designed and previously validated assessments, we measured and compared the CoV and multivariable thinking skills of fifteen-year-olds representing one high and one low-performing school. Our results highlighted the equally exceptional performance of both schools in the CoV tasks and comparatively poor performance in the multivariable causal reasoning and prediction tasks. These findings may offer one way to understand Shanghai’s performance in PISA and, at the same time, highlight the weaknesses in its contemporary science education.

Received 22 April 2020; Revised 29 August 2020; Accepted 1 October 2020

Keywords: Multivariable thinking; control-of-variables; scientific thinking; science education

Introduction

The 2009 results of the Program for International Student Assessment (PISA) caught the attention of the world when Shanghai outperformed all the major countries in science and mathematics. Considering that PISA assesses higher-order thinking skills, including critical thinking and problem-solving abilities of fifteen-year-old school students in science and mathematics (OECD, 2015), it was surprising for both national and international critics to comprehend Shanghai’s performance (for a detailed review, see Zhang & Akbik, 2012). Nevertheless, Shanghai has maintained its top position in science and mathematics in the PISA rounds of 2012, 2015, and 2018. This continuous educational development of Shanghai attracted different countries, which looked toward Shanghai for teaching styles of science and mathematics (Coughlan, 2014; Gibb, 2015; OECD 2010, 2011).

Since China has historically cherished the traditional standardized examinations that promote drills and rote memorization (Tan & Ng, 2018), how students of Shanghai excelled in higher-order thinking skills became a topic of debate in academic circles. Some criticized the education system of Shanghai as ‘elitist,’ one that rewards the high performers and abandons the low-performing students, thus producing ‘test-taker experts’ who perform exceptionally well in examinations of any sort, including PISA, without having mastery over higher-order thinking skills (Hongbing, 2010; Ringmar, 2013). Others criticized the students’ high scores as being achieved at the cost of a heavy workload, low social skills, and insufficient development of creativity (Xie, 2010; Zhang & Akbik, 2012; Zhao, 2011). The most highlighted criticism, which is also relevant to this study, was related to the representativeness of the PISA sample, questioning whether Shanghai chose the high-performing schools or students for the test, and suggesting that the low-performing schools of suburban areas would struggle in such international tests (Dillon, 2010).

Liang et al. (2016) have presented a comprehensive response to these criticisms and highlighted different reform initiatives behind Shanghai’s success. They stressed the leading position Shanghai has taken in China’s socioeconomic and educational reforms since 1978, which made it sensitive to global trends. Shanghai has adopted an open-door policy and opened itself up to innovative international trends in education, most of which are mastered with trial and error (Zhang & Akbik, 2012). One such contemporary model of science education is based on the univariable

CONTACT Bo Ning, Research Institute for International and Comparative Education, Shanghai Normal University, Guilin Road 100, Building Jiaoyuanlou, Room 1106, Shanghai 200234, China.
© 2020 Taylor & Francis Group, LLC

models, particularly the Control of Variable (CoV), a strategy to hold the influence of other variables constant so that the effect of a focal variable can be identified (Kuhn et al., 2008). Such CoV strategies are embedded in the national curricula of science education for the development of scientific reasoning among students (Australian Curriculum, 2015; National Curriculum of England, 2013; Next Generation Science Standards, 2013). China also adopted a similar approach to science education (Guan & Meng, 2007; Hongbing, 2010; Huang et al., 2017; Jiaoshou, 2004). Interestingly, these univariable and CoV models are also used in constructing many higher-order thinking items in the science assessments of PISA (OECD, 2015; Wasis, 2014). Have students of Shanghai mastered the CoV strategies and therefore performed well in PISA? This is an interesting question that has never been raised in evaluating Shanghai’s performance in PISA.

Although CoV is an important strategy, in the real world of science and scientific thinking, multiple variables continue to be a major presence. This limitation of CoV models raised the importance of multivariable thinking, the coordination of the effects of multiple variables on an outcome (Kuhn et al., 2008). This emerging trend in science education is gaining its place in the West. Has Shanghai already adopted this emerging trend? Most importantly, are these scientific thinking skills, i.e., CoV and multivariable thinking, equally promoted among the students of high and low-achieving schools around the greater Shanghai region? All these unanswered questions reflect a gap in the growing body of literature around Shanghai’s educational reforms and its performance in science. This study attempts to bridge this gap by measuring fifteen-year-old students’ CoV and multivariable thinking skills using specially designed and previously validated assessments. Focusing on a high and a low-performing school located in the greater Shanghai region, this study offers a new perspective on students’ scientific thinking through their CoV and multivariable thinking skills.

Literature review

Much of scientific thinking and scientific theory building pertains to the development of causal models between variables of interest. The most common approach used in science education for the causal models is referred to as Control of Variable (CoV), which, as mentioned earlier, is embedded in the national curricula of science education of many developed countries.

The concept of a variable and its primary role in doing science is usually introduced in early grades, through a simple experiment: ‘An outcome O occurs when a positive level (e.g. a flame) of an antecedent variable A, such as heat, is introduced, but O fails to occur when this positive level of A is absent (e.g. no flame or other heating agent is present)’ (Kuhn, 2016, p. 393). This experiment is based on single-variable logic, where all other variables are removed or kept constant except the focal one to assess its influence on an outcome (Zimmerman, 2000, 2007). This strategy is usually referred to as CoV and was first introduced by Inhelder and Piaget (1958).

These univariable experiments help students in constructing data that influence their conceptual representations of a single cause and effect (Koerber et al., 2005; Rind et al., 2019; Rind & Kadiwal, 2016).

However, this model has its limitations. For example, it has been argued that construal of mechanism and processing of evidence very often function together to support causal inference (Keil, 2006; Klahr, 2002; Koslowski, 1996). With this assumption, students’ data-influenced conceptual representations of causal effects should be facilitated by generating plausible mechanisms that might underlie such effects (Kuhn, 2016). However, if a student’s explanation and available evidence conflict, it is most probable that the student’s explanation will overpower the evidence (Williams et al., 2013). Therefore, students must not only learn to construct data-based explanations but also learn to evaluate whether their explanations are correct. These limitations of the univariable model have generated concerns and criticisms among educators and researchers who encourage science as inquiry and argue that scientific practice should be enriched by contextualization through the use of a more authentic multivariable model (Duschl, 2008; Kuhn, 2016, 2018; Williams et al., 2013).

Multivariable model

The advocates of the multivariable model argue that the purpose of science education is not only to prepare students to deal with the problems presented in the controlled environment of laboratories but also to prepare them for real-life problems. In real-world situations, however, single variables rarely produce an outcome on their own. A combination of multiple variables jointly influences an outcome, additively or interactively, a concept contrary to the CoV approach. The ability to recognize the influence of multiple variables on an outcome enhances students’ deep analytical and predictive skills (Kuhn, 2016; Kuhn et al., 2015). Moreover, students need to comprehend that ‘covariation between two variables need not be perfect in order for a relationship to exist between them because effects of other factors, as well as measurement error, are likely to play a role’ (Kuhn et al., 2017, p. 234). This understanding of multivariable causality is a move toward authentic scientific thinking (Bo et al., 2020; Rind, in press).

Method

This study is mainly guided by the following research questions:

1. To what extent are the CoV skills developed among the students of High Performing School (HPS) and Low

Performing School (LPS), and is there any significant difference in the performance of both schools?
2. To what extent are the multivariable causal reasoning skills developed among the students of HPS and LPS, and is there any significant difference in the performance of both schools?
3. To what extent are the multivariable analysis and prediction skills developed among the students of HPS and LPS, and is there any significant difference in the performance of both schools?

Figure 1. An example of a question asked in the CoV assessment.

Participants

Participants were 26 fifteen-year-old 9th-grade students (with an equal representation of males and females) drawn from one HPS located in downtown Shanghai and 29 students of the same age and grade (female 58%) from one LPS located on the outskirts of the Jinshan district of Shanghai. Schools in Shanghai are usually tagged as high or low-performing by the school district administrators using broader evaluation and accountability practices (see Jensen & Farmer, 2013 for a detailed review). Since we were interested in students’

performance, we collected data on students’ scores of the last three grades. On average, the students of the HPS secured scores of 86% in earlier grades, whereas the students of the LPS secured 67%. All the students of the former school received privately paid at-home tutoring in the last three years, compared to only 32% of the students in the latter school (see Zhang & Bray, 2018 for a detailed review of the impact of private tutoring on students’ performance). Parents’ household income may explain the unequal access to private tutoring. According to the data collected from schools and students, it is estimated that roughly 85% of the students of the former school belonged to upper- and upper-middle-class families, and thus the majority of them had access to expensive private tutoring. In contrast, most of the students in the latter school belonged to lower-middle-class families, and therefore only about one-third could afford private tutoring.

For data collection, we approached the management of both schools via email followed by a face-to-face meeting. After we assured them that the identities of the schools and students would be kept anonymous, the management of both schools agreed to grant us access to the students. At our request, one science teacher from each school volunteered to facilitate the data collection. After explaining the nature of the tasks to them in great detail, we requested the facilitating teachers to present these tasks as graded activities to students to get their maximum input. After reviewing the assessments and consulting with school management, the facilitating teachers agreed. However, to align our tasks with the pre-scheduled class activities, the facilitating teachers proposed to conduct the three assessments (see 3.2 for details) on different days. The facilitating teacher of the LPS also agreed to let one of us conduct the same assessments on selected students of grade 8 for pilot testing (see 3.2.4 for details).

Three assessments used in this study were prepared by Kuhn and colleagues for students of the same age group. The first task assessed students’ CoV skills. The next two tasks assessed students’ multivariable thinking, particularly their multivariable causal reasoning skills, and multivariable analysis and prediction skills.

Assessment of CoV skills

This assessment was adopted from Kuhn et al. (2017), which reads as follows:

The city of New York wants to get new trains for the subway. They can make trains with short or long Car Size and they can make trains with four or six Number of Wheels

This description was followed by three pairs of multiple-choice items. Each pair presents a case for students to evaluate regarding different car designs and offers different justifications to students for their choice. The first item of each pair had four options, each option having two images of car designs with different sizes and numbers of wheels. The second item of the pair had four options, each option presenting a different justification related to the first item (see Figure 1 for an example). Students were required to evaluate and identify the correct design and justification for each pair. The aim was to assess how many students from each group consistently constructed a controlled comparison across the three cases as well as provided appropriate justifications.

Assessment of multiple causal reasoning skills

The second test was adopted from Kuhn et al. (2015) to assess students’ multiple causal reasoning skills. It was based on a single, unelaborated scenario to generate initial responses from the students. The responses were assessed to determine if students made a causal claim and proposed a means of evaluating it. The scenario reads as follows:

The Public Health department of Portland Ohio has noticed that the percentage of residents diagnosed with cancer is much higher in the inner city than in the outlying neighborhoods. The department is undertaking a study to find out why. You have been assigned the job. Describe how you would conduct this study.

The responses generated from this task were analyzed and coded to pinpoint if students identified the potential cause(s), and how they related different causes to one another, as well as to examine if students compared two kinds of contexts (i.e. inner city and outlying neighborhoods). Using the analytical model of Kuhn et al. (2015, p. 99), students’ responses were coded into one of the following three levels and sub-levels:

1. Responses that didn’t find any causal variables
   1.1. No-variable without group comparison
   1.2. No-variable with group comparison
2. Responses that identified a single variable
   2.1. Univariable without group comparison (i.e. only A is the cause of an outcome)
   2.2. Univariable with group comparison
3. Responses that identified two or more variables and the way the relationship between these variables was conceptualized
   3.1. Multivariable-alternative with group comparison (i.e. either A or B is the cause of an outcome)
   3.2. Multivariable-alternative without group comparison
   3.3. Multivariable-additive with group comparison (i.e. both A and B and possibly C are the causes of an outcome)
   3.4. Multivariable-additive without group comparison
   3.5. Multivariable-interactive with group comparison (i.e. A and C affect B, and B is the cause of an outcome)
   3.6. Multivariable-interactive without group comparison

The responses of each student were taken as a whole and were coded into one of the above levels. One of the authors and a native Chinese data analyst coded all responses independently, yielding a percentage agreement of 89%. The differences were resolved by discussion.
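The coding scheme and the agreement check described above can be illustrated with a minimal Python sketch. It is not the authors’ procedure: the function names, the argument names, and the example codes are hypothetical and introduced here only for illustration, and the sketch assumes that a coder has already judged how many causal variables a response names, how they are related, and whether the two neighborhoods are compared.

```python
from typing import Optional, Sequence

# Offsets of the "with group comparison" sub-level inside level 3,
# following the order of the list above (3.1/3.2, 3.3/3.4, 3.5/3.6).
_LEVEL3_WITH_GROUP = {"alternative": 1, "additive": 3, "interactive": 5}


def code_response(n_variables: int,
                  relation: Optional[str] = None,
                  group_comparison: bool = False) -> str:
    """Map a coder's judgments about one response to a sub-level label.

    Hypothetical helper: the arguments are the coder's judgments,
    not something extracted automatically from the student's text.
    """
    if n_variables < 0:
        raise ValueError("n_variables must be non-negative")
    if n_variables == 0:
        return "1.2" if group_comparison else "1.1"
    if n_variables == 1:
        return "2.2" if group_comparison else "2.1"
    if relation not in _LEVEL3_WITH_GROUP:
        raise ValueError("relation must be alternative, additive or interactive")
    with_group = _LEVEL3_WITH_GROUP[relation]
    # In level 3 the "without" sub-level directly follows the "with" one.
    return f"3.{with_group if group_comparison else with_group + 1}"


def percent_agreement(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Share of responses (in percent) given the same code by both coders."""
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("both coders must code the same non-empty set")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


if __name__ == "__main__":
    # Made-up codes for four responses, for illustration only.
    coder_1 = ["2.1", "3.1", "1.1", "2.2"]
    coder_2 = ["2.1", "3.2", "1.1", "2.2"]
    print(code_response(2, "additive", group_comparison=True))
    print(percent_agreement(coder_1, coder_2))
```

Simple percentage agreement, as reported in the study, does not correct for agreement expected by chance; a chance-corrected index would be a separate design choice and is not implied by the source.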

Figure 2. Life expectancy scenario followed by items to assess the multivariable analysis and prediction.

Assessment of multivariable analysis and prediction skills

This slightly more sophisticated test was used to assess students’ multivariable analysis and prediction skills. Developed by Kuhn et al. (2015) and further validated by other studies (Arvidsson, 2018; Kuhn et al., 2017), this task presented authentic data from the World Bank 2010 on two highly effective contributing factors (Employment and Family Size), two moderately effective contributing factors (Education and Home Climate), and one noncontributing factor (Country Size) on the average Life Expectancy (LE) across different countries. The data were simplified and presented in graphs (see Figure 2). The task started with a description (see Figure 2). Following an introduction,

Table 1. Type of students’ responses (in percentage) identifying the cause(s) of an outcome, with and without group comparison.
Response type               Group comparison   HPS (N = 26)   LPS (N = 29)
No-variable                 Without            –              14
No-variable                 With               8              7
Univariable                 Without            46             52
Univariable                 With               19             17
Multivariable-alternative   Without            8              10
Multivariable-alternative   With               12             –
Multivariable-additive      Without            4              –
Multivariable-additive      With               –              –
Multivariable-interactive   Without            4              –
Multivariable-interactive   With               –              –
Note: HPS and LPS refer to high-performing school and low-performing school respectively.

students were asked to predict the LE of nine countries, based on their status on the five identified variables (i.e. Employment, Family Size, Education, Climate, and Country Size).

The first question asked students to highlight the factor(s) (i.e. Employment, Family Size, Education, Climate, and Country Size) that they would need to know to predict the LE of a country. The aim was to assess if students were able to recognize the need to take into account all four contributing factors, but not the noncontributing factor, in predicting the LE of the country.

The second question presented the levels of the five factors (for example, Employment = high; Family Size = small; Education = higher; Climate = hot; and Country Size = small) and asked the students to highlight which factor(s) they do not need to predict the LE of a country. Here, the aim was to assess if students were able to identify the noncontributing effect of “Country Size” on the LE of the country.

The final question asked students to predict the LE (i.e. High, Medium, Low, or Very Low) of the country based on the information provided. The aim was to assess if students considered all relevant factors and appropriately weighed moderately and highly effective factors to construct a valid prediction. As seen in Figure 2, the two moderately effective contributing factors (Education & Home Climate) contribute less to the outcome than do the two highly effective contributing factors (Employment & Family Size). The same sequence was repeated for all nine countries. The responses on the nine countries were analyzed to find out the number of correct predictions, the attributions to contributing and noncontributing factors in the prediction, and the consistency of the attribution across the nine countries.

In order to ensure that students understood the task, the facilitating teachers of each school read the description aloud and explained each graph to the class. The teachers then asked one random student from each group to attempt the first set of questions aloud, just to ensure that the students understood the activity.

Validation and pilot testing

All three assessments were translated into Mandarin and pilot tested on a separate group of students. To check the validity of the translated assessments, we followed different procedures Kuhn and colleagues used in their studies. First, we selected 10 students from the LPS. One of the researchers, in the presence of the facilitating teacher, instructed students on the first task and allowed them sufficient time to complete it. In the end, the facilitating teacher asked students to point out any word or phrase in the task or in the oral instruction of the task that they couldn’t understand. Meanwhile, the researcher evaluated the scores of three randomly selected students. The researcher then conducted an oral examination of each of the three students separately. During the oral examination, students were asked 3 to 4 questions constructed on the CoV principle. Their responses were scored and then compared with their scores on the CoV task. The same process was repeated for task 2 and task 3. The comparative analysis of scores on the tasks and oral examinations suggested that all three tasks assessed what they were intended to assess.

Based on students’ feedback on the tasks, some words and phrases were modified for better comprehension. Students viewed the researcher as an outsider and thus experienced fear and anxiety. Therefore, it was decided that facilitating teachers from each school would conduct the real tests for the ease and comfort of the students.

Analysis and results

Findings on CoV skills

This task attempted to answer the first research question, i.e., how students of HPS and LPS performed on the CoV assessment and if there is any significant difference in the performance of both schools. This was done by analyzing students’ skills to consistently construct a controlled comparison and provide appropriate justification across all three cases presented in the task. The results show that 24 of 26 students (92%) of HPS and 26 of 29 students (90%) of LPS consistently constructed a controlled comparison across all three cases, with no significant difference in the performance of both groups, χ²(1) = .525, p = .469.

Similarly, 22 students (84%) of HPS and 25 students (86%) of LPS consistently constructed a controlled comparison and also provided appropriate justifications across all three cases, with no significant difference in the performance of both groups, χ²(1) = .035, p = .853.

These findings suggest that students of both schools have a good command of the problems based on CoV principles and that there is no significant difference in their performance on the CoV task.

Findings on multiple causal reasoning skills

This task attempted to answer the second research question, i.e., how students of HPS and LPS perform on the multivariable causal reasoning assessment and if there is any significant difference in the performance of both schools. To

Table 2. Correct predictions across nine countries.
Predictions        HPS (N = 26) # (%)   LPS (N = 29) # (%)
Nine countries     6 (23)               5 (17)
Eight countries    4 (15)               4 (14)
Seven countries    8 (31)               10 (34)
Six countries      4 (15)               5 (17)
Five countries     3 (12)               1 (4)
Four countries     0                    4 (13)
Three countries    1 (4)                0
Note: HPS and LPS refer to high-performing school and low-performing school respectively.

Table 3. Average scores of contributing and noncontributing factors influencing students’ prediction across nine countries.
Factors                                 HPS Average (SD)   LPS Average (SD)   t       p
Contributing Factor-1 (Employment)      6.3 (1.22)         5.9 (1.83)         2.47    .173
Contributing Factor-2 (Family Size)     6.6 (1.33)         6.1 (1.00)         1.104   .275
Contributing Factor-3 (Education)       4.2 (.94)          4.8 (.75)          1.129   .258
Contributing Factor-4 (Climate)         3.6 (1.36)         3.5 (1.30)         .355    .735
Noncontributing Factor (Country Size)   2.7 (.80)          3 (1.30)           -.958   .354
Note: HPS and LPS refer to high-performing school and low-performing school respectively.

answer this question, the given task assessed two skills of the students. First, it assessed if students were able to identify the potential cause(s) of ‘cancer in the inner city and outlying neighborhoods’, and the ways in which students related different causes to one another. Second, it assessed if students compared two kinds of contexts (i.e. inner city and outlying neighborhoods). Table 1 summarizes the results, showing that most of the students of both schools identified either no variable or single-variable causes for the outcome (cancer).

In the multivariable categories, most of the responses fell in the multivariable-alternative category, with no response of the LPS students and only two responses of HPS students falling into the multivariable-additive and interactive categories. Collectively, 7 responses (27%) of HPS and 5 responses (17%) of LPS fell into the multivariable category, with no significant difference in the performance of both groups (p = .517, Fisher’s exact test).

Similarly, 10 of 26 students (38%) of HPS and 7 of 29 students (24%) of LPS realized the need to compare the contexts (i.e. the number of cancer cases in the inner city vs. outlying neighborhoods), which also shows no significant difference in the performance of both groups (p = .381, Fisher’s exact test).

These results show that most of the students of both HPS and LPS tend to consider single variables in determining the cause of an outcome. Very few students of either school were multivariable thinkers who felt the need to identify multiple factors to determine an outcome and to compare two contexts to draw their conclusions.

Findings on the assessment of multivariable analysis and prediction skills

This task attempted to answer the third research question, i.e., how students of HPS and LPS perform on the multivariable analysis and prediction assessment and if there is any significant difference in the performance of both schools. To answer this question, the task assessed (a) whether students correctly predicted the LE across nine countries, (b) whether students were able to identify the contributing and noncontributing factors in their prediction, and (c) the consistency in identifying the four contributing factors in their predictions across the nine countries.

a. Students’ predictions were compared to a model of correct prediction (weighting the two moderate factors as contributing half as much to the outcome as the two strong factors) for each of the nine countries (Kuhn et al., 2017). Each correct prediction was given 1 point. As measured against this model, the students of HPS scored an average of 7.8 (SD = 1.51; Shapiro-Wilk Test > .05) points, whereas the students of LPS scored an average of 7.2 (SD = 1.78; Shapiro-Wilk Test > .05) points. There is no significant difference in the performance of both groups in this task, t = .322, p = .714.

Table 2 shows the consistency of correct predictions across different countries. Only 6 of 26 students (23%) of HPS correctly predicted the LE of all nine countries, as compared to 5 of 29 students (17%) of LPS, showing no significant difference (p = .739, Fisher’s exact test).

b. For each country, students had four contributing factors to choose from, and each contributing factor carried one point. It was expected that multivariable-thinking students would choose all four contributing factors and thus score 4 points for each country, and 36 points for all nine countries. The average scores on all the contributing factors for the students of HPS and LPS are 21.7 (SD = 5.46; Shapiro-Wilk Test > .05) and 22.5 (SD = 5.78; Shapiro-Wilk Test > .05), respectively, with no significant difference, t = -.539, p = .592. Table 3 shows the average scores on each contributing factor across the nine countries for the students of HPS and LPS. It shows that there is no significant difference in the average scores of both schools’ students for the two highly effective and two moderately effective contributing factors.

In this task, it was also expected that students would not consider the noncontributing factor (i.e. Country Size) to predict the LE of the countries due to its neutralizing effects (see Figure 2). For each country, students were given one point if they did not select the noncontributing factor, for a total of nine points. Table 3 shows that the students of HPS scored an average of 2.7 (SD = .80; Shapiro-Wilk Test > .05) on the noncontributing factor, whereas students of LPS scored an average of 3 (SD = 1.30; Shapiro-Wilk Test > .05), with no significant difference, t = .958, p = .354.

Moreover, Table 3 also shows a pattern in the average scores of students, from high to low, along the spectrum from highly effective contributing factors to the noncontributing factor. This suggests that students were tempted to focus

Table 4. Consistency of students’ attributions.
Attributions                                                                           HPS (N = 26) # (%)   LPS (N = 29) # (%)
Chose four contributing factors consistently across 9 countries                       7 (27)               5 (17)
Chose multiple consistent (but not all four contributing) factors across 9 countries  7 (27)               12 (41)
Chose multiple but inconsistent factors across 9 countries                            11 (42)              7 (24)
Chose only one consistent causal factor across 9 countries                            1 (4)                5 (17)
Chose only one but inconsistent factor across 9 countries                             0                    0

more on the highly effective contributing factors as influencing their predictions.

c. Table 4 shows students’ consistency in identifying the four effective contributing factors in their predictions across the nine countries. Only 7 of 26 HPS students (27%) chose all four effective contributing factors consistently across all nine predictions, compared with 5 of 29 LPS students (17%), a difference that was not significant (p = .517, Fisher’s exact test). These results show that only a few students consistently recognized that multiple factors contribute to an outcome. The majority were inconsistent in recognizing the need to consider multiple factors in their predictions, which reflects weak multivariable thinking skills.

Discussion and conclusion

The present study offers a new perspective on scientific thinking skills among Shanghai students of high- and low-performing schools. The findings show that students of both schools performed well on the CoV task and that there is no significant difference in performance between the two schools. As mentioned earlier, the CoV task was adopted from Kuhn and colleagues, who used it to measure CoV skills among US students of the same age. They found exceptional results on this task among students who had been specially trained in CoV skills for weeks (Jewett & Kuhn, 2016; Kuhn et al., 2017). In this study, however, Shanghai students of both high- and low-performing schools scored exceptionally well on the same task without any training. This suggests that Shanghai schools, whether high- or low-performing, have been teaching CoV strategies effectively through the regular science curriculum. These findings may also provide a new perspective to explain Shanghai’s good performance in PISA, considering that most of the higher-order items in the science and mathematics sections of PISA are based on the CoV principle (OECD, 2015).

The good performance of the students on the CoV task may offer a glimpse of the impact of recent reforms in science and mathematics education in Shanghai that have specifically focused on developing scientific thinking through inquiry curriculum and teaching (Liang et al., 2016). A review of the science and mathematics curriculum suggests that many activities are based mainly on the CoV principle (Ma, 2016). Teachers have also trained themselves accordingly. The multi-layered training offered to teachers at different stages of their careers reinforces the use of CoV strategies in their teaching (Yao & Guo, 2018). In addition, each school is allowed to develop contextually relevant, inquiry-based courses and to introduce at least one such course from grades 1 to 5, and two courses from grades 6 to 9. Such freedom in course design and implementation may allow teachers to be more creative in their teaching (Liang et al., 2016).

At the same time, the equally good performance of LPS students on the CoV task in this study may offer a glimpse of the educational reforms Shanghai introduced to deal with regional disparity in teaching quality. Using different financing, management, and twinning strategies, Shanghai has improved its low-performing schools in many ways. For example, the municipal government transfers half of education surplus taxes to economically weak districts and mandates that top secondary schools reserve some spaces for middle school graduates from poor districts (Liang et al., 2016). The most important initiative has been the introduction of an ‘entrusted school’ management system (Jensen & Farmer, 2013), which allows additional financing, management, and professional support to flow from HPSs to LPSs. This marriage between HPSs and LPSs generates mutual professional development training, teacher and student exchanges, and joint goals (Liu, 2016).

Despite the remarkable performance on the CoV task, students of both schools performed poorly on the multivariable causal reasoning and prediction tasks. These findings highlight that although the students showed well-developed scientific thinking skills based on the CoV principle, there is room to achieve what Kuhn et al. (2017, p. 247) call ‘authentic scientific practice’, which incorporates multivariable thinking. Kuhn et al. (2017) argue that the execution of CoV-based experiments presents a ‘narrow slice of authentic scientific inquiry and arguably needs to be enriched by contextualization in a more authentic multivariable model’ (p. 233). Authentic scientific practice prepares students for real-life situations. Everyone makes inferences about causality as a part of thinking (Bo et al., 2020; Rind, 2014, 2015, 2016; Rind et al., 2016, 2019). However, single-cause thinking only prepares students for the controlled environment of laboratories. Although contemporary research on developing scientific thinking skills involves the management of multiple variables, the focus is on neutralizing the effects of additional variables to ascertain the effect of a single variable on an outcome. This trend in science education has attracted Shanghai, which adopted this model and mastered it, as reflected in its PISA scores as well as in the findings of the current study.

The findings of this study have wider implications beyond Shanghai, as this study attempted to highlight one of the core principles of scientific thinking, i.e., CoV, used in national science curricula around the world, and showed its limitations. The emerging research in scientific thinking argues that mastery over the use of CoV
strategies does not guarantee mastery over multivariable causal reasoning or prediction skills (Kuhn, 2007; Kuhn et al., 2009, 2015; Kuhn & Pease, 2008). Students trained in CoV skills tend to adopt a univariable approach in determining causal relationships, which is reflected in the performance of the students of both schools on the multivariable causal reasoning task. Most of the students pointed either to a single variable or to no variable in determining the outcome. Similarly, on the multivariable analysis and prediction task, students made faulty predictions and were inconsistent in considering contributing and noncontributing factors in their predictions. Kuhn and Dean (2004) found similar results among adolescents and some adults while assessing their multivariable analysis and prediction skills. They found that, apart from making faulty predictions, their research participants often made inconsistent causal attributions across consecutive predictions. Moreover, they often failed to implicate as many variables as influencing their predictions as they had earlier identified as causal when asked to make explicit judgments of the causal roles of each variable (in a multivariable context in which a CoV method must be used to identify causal and non-causal effects).

Kuhn et al. (2008) explained these errors by conceptualizing the relationship between CoV and multivariable analysis. They argued that both CoV and multivariable analysis and prediction can be regarded in an analysis of variance (ANOVA) framework. The first assumption of ANOVA is that causes have consistent effects under the same conditions, and the second is that multiple effects may operate jointly on an outcome, in either an additive or an interactive manner. However, the errors made by Kuhn and Dean’s (2004) participants, as well as by the students in this study, show a pattern that violates the ANOVA assumptions of consistency and additivity of effects. Kuhn and Dean (2004) characterized these patterns as reflecting an immature mental model of multivariable causality. This study, as well as earlier research (Dean Jr & Kuhn, 2007; Kuhn & Dean Jr, 2004; Rind, in press; Zohar, 1995), makes it clear that children and even many adults cannot readily conceptualize additive main effects, much less interaction effects. This calls for upgrading the current approach, which conceptualizes scientific thinking using univariable models. There is a need to move to multivariable models to achieve more authentic scientific thinking.

We conclude with an acknowledgment that the conclusions drawn from this study are not representative of all of Shanghai. We used a very small sample of students from only two schools to make our case. However, what we have tried to show here is that Shanghai has great potential to bring highly effective change to its education system. It adopted the univariable and CoV models in its science education and has surprised the world with its extraordinary performance in PISA since 2009. However, the contemporary discourse in science education is moving slightly away from univariable models toward multivariable models. This may be something that Shanghai and the rest of the world need to consider in future educational reforms.

Funding

This work was part of the study entitled ‘An empirical study of principals’ living condition in primary and secondary schools’, sponsored by the Chinese National Social Sciences Fund Education Youth Project (No. CHA180269).

Irfan Ahmed Rind
Bo Ning

References

Arvidsson, T. S. (2018). Individualized scaffolding of scientific reasoning development-complementing teachers with an auto-agent. Columbia University.
Australian Curriculum. (2015). Australian senior secondary curriculum for science. lum/f-10?layout=1
Bo, N., Rind, I. A., & Asad, M. (2020). Influence of teacher educators on the development of prospective teachers’ personal epistemology and tolerance. Sage Open, 10(1), 215824402091463–215824402091414. https://
Coughlan, S. (2014). Shanghai teachers flown in for maths. BBC News.
Dean, D., Jr., & Kuhn, D. (2007). Direct instruction vs. discovery: The long view. Science Education, 91(3), 384–397.
Dillon, S. (2010). Top test scores from Shanghai stun educators. The New York Times. 07education.html
Duschl, R. (2008). Science education in three-part harmony: Balancing conceptual, epistemic, and social learning goals. Review of Research in Education, 32(1), 268–291.
Gibb, N. (2015). The maths teachers of Shanghai have the perfect formula for learning. The Guardian. commentisfree/2015/nov/26/maths-teachers-shanghai-china-uk
Guan, Q., & Meng, W. (2007). China’s new national curriculum reform: Innovation, challenges and strategies. Frontiers of Education in China, 2(4), 579–604.
Hongbing, J. (2010). Using International Standards to Examine Chinese Education. 国际标准“体检”中国教育. Section 12. People’s Daily 人民日报.
Huang, X., Ding, L., & Hu, B. (2017). Science curriculum and implementation in senior secondary school. In L. L. Liang, X. Liu, & G. W. Fulmer (Eds.), Chinese science education in the 21st century: Policy, practice, and research (pp. 101–132). Springer.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence: An essay on the construction of formal operational structures (Vol. 22). Psychology Press.
Jensen, B., & Farmer, J. (2013). School turnaround in Shanghai: The empowered-management program approach to improving school performance. Center for American Progress.
Jewett, E., & Kuhn, D. (2016). Social science as a tool in developing scientific thinking skills in underserved, low-achieving urban students. Journal of Experimental Child Psychology, 143, 154–161.
Jiaoshou, G. L. B. (2004). China’s science curriculum reform. Asia-Pacific Forum on Science Education, 5(2). apfslt/v5_issue2/foreword/cindex.htm
Keil, F. C. (2006). Explanation and understanding. Annual Review of Psychology, 57, 227–254. 102904.190100
Klahr, D. (2002). Exploring science: The cognition and development of discovery processes. MIT Press.
Koerber, S., Sodian, B., Thoermer, C., & Nett, U. (2005). Scientific reasoning in young children: Preschoolers’ ability to evaluate
covariation evidence. Swiss Journal of Psychology, 64(3), 141–152.
Koslowski, B. (1996). Theory and evidence: The development of scientific reasoning. MIT Press.
Kuhn, D. (2007). Reasoning about multiple variables: Control of variables is not the only challenge. Science Education, 91(5), 710–726.
Kuhn, D. (2016). What do young science students need to learn about variables? Science Education, 100(2), 392–403. 1002/sce.21207
Kuhn, D. (2018). A role for reasoning in a dialogic approach to critical thinking. Topoi, 37(1), 121–128. 9373-4
Kuhn, D., Arvidsson, T. S., Lesperance, R., & Corprew, R. (2017). Can engaging in science practices promote deep understanding of them? Science Education, 101(2), 232–250.
Kuhn, D., & Dean Jr, D. (2004). Connecting scientific reasoning and causal inference. Journal of Cognition and Development, 5(2),
Kuhn, D., Iordanou, K., Pease, M., & Wirkala, C. (2008). Beyond control of variables: What needs to develop to achieve skilled scientific thinking? Cognitive Development, 23(4), 435–451. 1016/j.cogdev.2008.09.006
Kuhn, D., & Pease, M. (2008). What needs to develop in the development of inquiry skills? Cognition and Instruction, 26(4), 512–559.
Kuhn, D., Pease, M., & Wirkala, C. (2009). Coordinating the effects of multiple variables: A skill fundamental to scientific thinking. Journal of Experimental Child Psychology, 103(3), 268–284. 10.1016/j.jecp.2009.01.009
Kuhn, D., Ramsey, S., & Arvidsson, T. S. (2015). Developing multivariable thinkers. Cognitive Development, 35, 92–110. 1016/j.cogdev.2014.11.003
Liang, X., Kidwai, H., & Zhang, M. (2016). How Shanghai does it: Insights and lessons from the highest-ranking education system in the world. The World Bank.
Liu, P. (2016). Transforming turnaround schools in China: Strategies, achievements, and challenges. Frontiers of Education in China, 11(3), 374–414.
Lu, J., & Xiaohu, L. (2011). How to view 2009 Shanghai PISA result? The review of the reflection on Shanghai middle school students’ first participation in the international test. 如何看待上海2009年PISA测评结果—中国上海中学生首次参加国际测评结果反响述评. Shanghai Research on Education 上海教育科研, 11–19.
Ma, M. (2016). The development and characteristics of elementary science curriculum in China. US-China Education Review, 6(11), 642–649.
National Curriculum of England. (2013). National curriculum of England: Science programmes of study. ment/publications/national-curriculum-in-england-science-programmes-of-study
Next Generation Science Standards. (2013). Next generation science standards.
OECD. (2010). PISA & TALIS 2008 technical report. OECD Pub.
OECD. (2011). Successful reformers in education: Lessons from PISA for the United States. OECD.
OECD. (2015). Summary description of the seven levels of proficiency in science in PISA 2015. Organisation for Economic Co-Operation and Development. seven-levels-of-proficiency-science-pisa-2015.htm
Rind, I. A. (2014). Investigating students’ experiences in an ESL programme at the University of Sindh, Jamshoro, Pakistan. Compare: A Journal of Comparative and International Education, 44(6),
Rind, I. A. (2015). Gender identities and female students’ learning experiences in studying English as Second Language at a Pakistani University. Cogent Education, 2(1), 1115574. 2331186X.2015.1115574
Rind, I. A. (2016). Conceptualizing students’ learning experiences in English as Second Language in higher education from structure and agency. Cogent Social Sciences, 2(1), 1191978. 1080/23311886.2016.1191978
Rind, I. A. (In press). Developing adult students’ multivariable thinking capabilities. International Journal of Science and Mathematics Education, 19.
Rind, I. A., & Kadiwal, L. (2016). Analysing institutional influences on teaching–learning practices of English as second language programme in a Pakistani university. Cogent Education, 3(1), 1160606.
Rind, I. A., Mari, M. A., & Heidari-Shahreza, M. A. (2019). Analysing the impact of external examination on teaching and learning of English at the secondary level education. Cogent Education, 6(1).
Rind, I. A., Shahriar, A., & Fatima, S. (2016). Rural-ethnic identities & students’ learning experiences in English as second language programme in a public sector university of Pakistan. The Sindh University Journal of Education, 45(1), 1–20.
Ringmar, S. (2013). Here’s the truth about Shanghai schools: They’re terrible. The Guardian. free/2013/dec/28/shanghai-china-schools-terrible-not-ideal
Tan, C., & Ng, C. S. L. (2018). Assessment reform in Shanghai: Issues and challenges. International Journal of Educational Reform, 27(3), 291–309.
Wasis. (2014). Analyzing physics items of UN, TIMSS, and PISA based on higher-order thinking and scientific literacy. In International Conference on Research, Implementation and Education of Mathematics and Sciences (pp. 147–154). https://www.academia.edu/download/54606793/17-Wasis.pdf
Williams, J. J., Lombrozo, T., & Rehder, B. (2013). The hazards of explanation: Overgeneralization in the face of exceptions. Journal of Experimental Psychology: General, 142(4), 1006–1014. 10.1037/a0030996
Xie, X. (2010). What does Shanghai PISA test tell us? The Balanced development of education is not impossible. PISA测试上海夺冠回答了什么—教育均衡并非可望不可. 中国青年报. China Youth.
Yao, J.-X., & Guo, Y.-Y. (2018). Core competences and scientific literacy: The recent reform of the school science curriculum in China. International Journal of Science Education, 40(15), 1913–1933.
Zhang, W., & Bray, M. (2018). Equalising schooling, unequalising private supplementary tutoring: Access and tracking through shadow education in China. Oxford Review of Education, 44(2), 221–238.
Zhang, C., & Akbik, A. (2012). PISA as a legitimacy tool during China’s education reform: Case study of Shanghai. TranState Working Papers No. 166, Universität Bremen.
Zhao, X. (2011). Development under stress: The culture of academic competition and adolescent friendship participation in China’s secondary school. Harvard Graduate School of Education.
Zimmerman, C. (2000). The development of scientific reasoning skills. Developmental Review, 20(1), 99–149. 1999.0497
Zimmerman, C. (2007). The development of scientific thinking skills in elementary and middle school. Developmental Review, 27(2), 172–223.
Zohar, A. (1995). Reasoning about interactions between variables. Journal of Research in Science Teaching, 32(10), 1039–1063. https://
