
Received: 12 April 2017 | Revised: 30 November 2017 | Accepted: 3 March 2018
DOI: 10.1002/tea.21457

RESEARCH ARTICLE

From professional knowledge to professional performance: The impact of CK and PCK on teaching quality in explaining situations

Christoph Kulgemeyer1 | Josef Riese2

1 Physics Education Department, Institute of Science Education, University of Bremen, Bremen, Germany
2 Physics Education Department, Institute of Physics IA, RWTH Aachen University, Aachen, Germany

Correspondence
Christoph Kulgemeyer, Physics Education Department, Institute of Science Education, University of Bremen, Otto-Hahn-Allee 1, Bremen 28359, Germany.
Email: Kulgemeyer@physik.uni-bremen.de

Funding information
Bundesministerium für Bildung und Forschung, Grant/Award Number: 01PK11001B

Abstract
In recent years, many studies have researched the impact of teachers’ professional knowledge on teaching quality. Those findings are still ambiguous, providing unclear evidence for supporting teacher education programs that aim at developing professional knowledge. In this study, we followed a new approach and used a “performance test” to take a closer look at the impact of professional knowledge on teaching quality. We simulated one particular teaching situation in a controlled, standardized setting in which student teachers (N = 109) enrolled in physics teacher training courses at five German universities had to explain given phenomena to high-school students. These high-school students were trained to behave in a standardized way and to ask standardized questions. Videos of these situations were analyzed using an established model of explaining physics. The validity, reliability, and objectivity of these performance tests for explaining physics were examined in previous studies. In this article, we report on the analysis and interpretation of the results of our study concerning the impact of physics content knowledge (CK), pedagogical content knowledge (PCK), and two groups of beliefs (specific aspects of (i) self-efficacy and (ii) teaching and learning) that mediate the effect of CK and PCK on student teachers’ explaining performance. Using path analysis, we can show that student teachers’ PCK mediates the influence of their CK on explaining performance, in that CK only has a positive influence if PCK has also increased. Our findings stress the key role of PCK. For one particular teaching situation, we can show the positive influence of the CK and PCK that student teachers acquired in academic teacher education on their teaching quality.


KEYWORDS
explanation, instructional quality, pedagogical content knowledge, physics
education, teaching quality

1 | THEORETICAL BACKGROUND

1.1 | Teaching quality


Various researchers have described teaching quality (sometimes also referred to as instructional quality
or teaching effectiveness) since the 1960s (e.g., Carroll, 1963). Many of these studies aimed at finding
links between teaching practices and student outcomes. Brophy (2000) called this kind of research
“process/product research” with the goal to find components of or structures in teaching that promote
student outcomes. Different meta-analyses have been published since the late 1980s (e.g., O’Neill,
1988) with some more recent ones (e.g., Hattie, 2009; Seidel & Shavelson, 2007). Over the years, three
characteristics have emerged that describe teaching quality with a positive impact on student achievement. In English-speaking countries, they are often referred to as “emotional support,” “classroom organization,” and “instructional support,” following the “Classroom Assessment Scoring System” (Pianta, Hamre, & Mintz, 2012). In the European tradition, the three characteristics are more frequently called “constructive support,” “classroom management,” and “cognitive activation,” but the meanings differ only slightly (Dorfner, Förtsch, & Neuhaus, 2017). “Emotional support” or “constructive support” comprises, for example, the sensitivity of a teacher. “Classroom organization” or “classroom management” means, for example, an effective use of time on task. “Instructional support” or “cognitive activation” consists of, for example, a coherent development of concepts or the request for students’ questions. In particular, cognitive activation has recently been researched extensively, often with the aim of identifying an assumed impact of teachers’ professional knowledge on teaching quality (e.g., Keller, Neumann, & Fischer, 2016). For different teaching situations, different means of realizing the three characteristics have been developed in science education research. Practical work and demonstrations are good examples of such teaching situations, with much research on practices to enhance their effectiveness (Abrahams & Millar, 2008; Hofstein & Kind, 2012; Hofstein & Lunetta, 2004). In the study described in this article, we conducted an in-depth investigation of one particular teaching situation: teacher explanations. The effectiveness of teacher explanations has been examined in many prior studies, and we use the results of these studies to operationalize teaching quality in explaining situations.

1.2 | Teachers’ professional knowledge


Teachers’ professional knowledge has been assumed to be a key factor affecting teaching quality (e.g.,
Abell, 2007; Fischer, Borowski, & Tepner, 2012; Fischer, Neumann, Labudde, & Viiri, 2014a, 2014b;
Peterson, Carpenter, & Fennema, 1989; Van Driel, Verloop, & De Vos, 1998). Also, the research on
professional knowledge has a long tradition in science education. Since Shulman’s (1986) fundamental
considerations about teachers’ professional knowledge, many studies have been conducted. Shulman
(1987) described seven categories of professional knowledge (content knowledge [CK]; general peda-
gogical knowledge [PK]; curriculum knowledge; pedagogical content knowledge [PCK]; knowledge
of learners and their characteristics; knowledge of educational contexts; and knowledge of educational
ends, purposes, and values). Recent studies, however, point to three areas having a large impact on
teaching quality and student outcome: CK, PCK, and pedagogical knowledge (e.g., Baumert et al.,
2010; Cauet, Liepertz, Kirschner, Borowski, & Fischer, 2015; Hill, Rowan, & Ball, 2005). In these
studies, the three areas are sometimes described as including the seven categories. Our views accord
with those of these studies and we focus on the unclear evidence about the relationship between these
three areas and teaching quality. In this sense, CK is the knowledge of the content selected for teach-
ing, PCK is the domain-specific knowledge of how to teach this content, and PK is the knowledge of how to act in teaching situations in general. The focus on these three areas is a limitation of our study: we cannot make claims about research on professional knowledge in general but only about these aspects.

1.3 | The impact of teachers’ professional knowledge on teaching quality


The development of professional knowledge is a goal of teacher education, and it is assumed to enable
teachers to use appropriate actions in teaching situations, which should result in students’ achievement
(Terhart, 2012). However, it is disputed whether CK, PCK, and PK are the key factors that influence teaching quality, or even whether these three knowledge areas are relevant at all for teaching (for the domain of physics, see Vogelsang, 2014). Among other goals, science teacher education also aims at achieving learning outcomes related to these three areas. Therefore, the role of professional
knowledge in improving teaching quality is important for science teacher education in general. It is
important to mention that professional knowledge usually comprises not only declarative knowledge
but also procedural knowledge. This distinction refers to research in cognitive psychology: declarative
knowledge can be articulated and is explicit, whereas procedural knowledge refers to how to do some-
thing and often is implicit (Anderson, 1976). All of the studies discussed in the following claim to test
for both declarative and procedural knowledge. They use knowledge tests but also items in which this
knowledge is used to analyze authentic classroom situations.
There is a lack of empirical evidence for professional knowledge affecting instruction and also a
lack of related empirical studies, especially concerning science teacher education. In mathematics
teacher education, Hill et al. (2005) showed that a joint knowledge of CK and PCK (“content knowl-
edge for teaching mathematics”) predicts the quality of instruction (e.g., responding to students appro-
priately) and students’ learning. In contrast, Delaney (2012) found no significant correlation between
the amount of knowledge and the quality of instruction despite using the same test instruments. The
Competence of Teachers, Cognitively Activating Instruction and Development of Students’ Mathematical Literacy (COACTIV) study reported a positive impact of PCK on students’ achievement, which is mediated by cognitive activation—one of the three characteristics of teaching quality described above. Moreover, it was found that CK has no direct impact on teaching quality; its effect is mediated by PCK (Baumert et al., 2010).
Cauet et al. (2015) described the results of the Professional Knowledge in Science (ProwiN) project
which aimed at finding a relationship between physics teachers’ professional knowledge (measured by
written tests), the cognitive activation in the classroom (measured with video studies), and students’ learning (measured by written tests). Their findings were contrary to expectations: “Neither teachers’ content
knowledge (CK) nor their pedagogical content knowledge (PCK) correlated significantly with their
support of students’ cognitive activation in the classroom; nor did their professional knowledge explain
any variance of student learning gains” (Cauet et al., 2015, p. 463).
The Quality of Instruction in Physics (QuIP) project came to similar findings (Fischer et al., 2014a, 2014b; Keller et al., 2016). The goal of the QuIP project was to explain the gap between the PISA performances of students from Germany, Switzerland, and Finland. The project also measured teachers’ professional knowledge, their teaching quality (with a focus on cognitive activation), as well as students’ achievement. Ergöneç, Neumann, and Fischer (2014) reported a low but positive correlation between PCK and learning outcome (r = 0.270, p < 0.05) and also stated that Finnish students had the greatest knowledge gains while their teachers had the lowest PCK compared to their colleagues in Switzerland and Germany.
All of these tests for professional knowledge were developed with a focus on curricular validity; therefore, the test results were a good representation of the science-specific content of teacher education (PCK and CK). Two possible explanations emerged: either the content of teacher education (at least the facets covered by the test instruments) is not relevant for teaching quality and students’ learning, or the measurement using written tests is not sufficiently valid. Aufschnaiter and Blömeke (2010), for example, argued that written tests, especially those with multiple-choice items, are not appropriate for measuring the relevant knowledge, namely procedural knowledge, but only for measuring declarative knowledge focused on facts.
In our study, we wanted to contribute to researching the unclear relations between CK, PCK, and
teaching quality. We did not focus on student outcomes.

1.4 | Beliefs about self-efficacy, beliefs about teaching and learning, and their
impact on teaching quality
Prior research suggests that not only knowledge but also beliefs might determine teaching quality (e.g.,
Fang, 1996; Stipek, Givvin, Salmon, & MacGyvers, 2001), most importantly self-efficacy beliefs (e.g.,
Retelsdorf, Butler, Streblow, & Schiefele, 2010). Ross (1998), for instance, reported in a review that
teachers with high self-efficacy use more activating instructional methods and their behaviors are more
adaptive and supportive. Being adaptive and supportive is crucial for teaching quality in general and
for explaining situations in particular. In fact, adaptation is the most important strategy for effective
explanation (Wittwer & Renkl, 2008). In this regard, self-efficacy is important for the present study.
However, self-efficacy has been measured with very different instruments and notions; sometimes,
researchers followed the original distinction of Bandura (1977) between self-efficacy and outcome
expectancy (e.g., Enochs & Riggs, 1990) and sometimes they assumed that both cannot be distin-
guished clearly (e.g., Williams, 2010). In our study, we focused on self-efficacy, not outcome expect-
ancy because our focus was on behaviors leading to teaching quality, not beliefs on how likely it is
that these behaviors would result in student outcomes. In this study, we followed the definition of
Schiefele and Schaffner (2015, p. 161): “Teacher self-efficacy represents teachers’ belief that they are
able to perform those teaching behaviors that bring about student learning even when students are diffi-
cult or unmotivated.” We only focused on a small aspect of self-efficacy that is important for explain-
ing situations. Tschannen-Moran and Woolfolk Hoy (2001) differentiated between instructional
strategies, classroom management, and student engagement as essential dimensions of self-efficacy.
Concerning the present study on explaining situations, self-efficacy regarding instructional strategies
might be the most important part, which was our focus.
Besides self-efficacy, beliefs about teaching and learning are also likely to influence teaching qual-
ity (Staub & Stern, 2002). Beliefs about teaching and learning have been described as including very
different dimensions, for example, about the nature of knowledge, the way learning works, the role of
teachers (and students), effective instructional methods, and learning approaches (Chan & Elliott,
2004). Concerning their impact on teaching quality, one aspect seems to be of particular importance: teachers’ beliefs about learning facilitation and knowledge construction. Teachers with a
student-centered view are more likely to teach effectively than are teachers who believe that teaching
means only a transmission of information (Prosser & Trigwell, 1999; Rienties, Lygo-Baker, &
Brouwer, 2013). In particular, teachers with a constructivist view (instead of a transmissive view) on
teaching and learning have a higher teaching quality regarding cognitive activation (Dubberke, Kunter,
McElvany, Brunner, & Baumert, 2008). As we will discuss below, the difference between a constructi-
vist view and a transmissive one is of high importance for effective teacher explanations. Teacher explanations, in general, are often confused with situations that build on a simple transmission of knowledge. Our research, therefore, focused on this aspect of beliefs about teaching and learning.

1.5 | Professional competence and performance


Holding professional knowledge in the three areas CK, PCK, and PK together with beliefs should ena-
ble teachers to solve the problems that are specific to their profession. That is why beliefs and profes-
sional knowledge are sometimes combined in models of so-called “professional competence.” The
understanding of competence has developed over the years and has recently had much influence on the
research in science education as well as that in pedagogy. The very influential model of professional
competence by Baumert and Kunter (2006) comprises professional knowledge (CK, PCK, and PK)
and beliefs, most importantly, beliefs about teaching and motivation. Blömeke et al. (2014) stated that it is a common assumption that competence has a strong relationship to successful performance in teaching situations. Blömeke, Gustafsson, and Shavelson (2015) regarded competence as “the latent cognitive and affective-motivational underpinning of domain-specific performance in varying situations” (p. 3) and stressed that competence has “to be inferred from observable behaviour” (p. 3). Like Blömeke et al. (2014), we accord with Weinert’s (2001) notion of competence. Weinert defined competence as “clusters of cognitive prerequisites that must be available for an individual to perform well in a particular content area” (p. 47). Weinert further stated that, in addition to the cognitive aspects,
an individual’s volition and motivation to use the abilities in concrete situations are also an essential
part of competence. Competence, therefore, enables individuals to solve specific problems. Professio-
nal competence, in particular, enables them to solve problems that are related to their particular profes-
sion. Teachers’ professional competencies are the cognitive prerequisites as well as the volition and
motivation to solve problems that are specific to the profession of a teacher.
In the present study, we measured not only professional knowledge but also two groups of beliefs that served as control variables for the effects of CK and PCK on explaining performance quality.

1.6 | Difficulties in testing teachers’ knowledge and performance


As described above, the relationship between professional knowledge, beliefs, and teaching quality or
even students’ achievement is unclear. One reason for the unclear relationship between professional
knowledge and teaching quality, especially in science education, might also be that most studies—
which aim to link professional knowledge and teaching performance—measure professional knowl-
edge by using written tests and teaching quality by videotaping lessons. For example, the ProwiN pro-
ject videotaped 23 lessons of 23 science teachers and analyzed them for cognitive activation. These
results were connected with results about students’ knowledge achievement over several lessons (12–
59 lessons, depending on which teachers taught these lessons) and the amount of teachers’ professional
knowledge in CK, PCK, and PK (Cauet et al., 2015). The QuIP project videotaped a total of 69 lessons
of the 92 physics teachers and compared cognitive activation in these lessons with teachers’ professio-
nal knowledge in the same areas as in the ProwiN project. Students’ achievement over several lessons
on the same topic was measured as well. Both projects had in common a comparatively low number of lessons that had been videotaped because of the constraints due to the enormous empirical effort
required. Experienced teachers know that the success of a single lesson can be influenced by different
variables beyond teachers’ professional knowledge—from teachers’ humor and patience to classroom
climate and students’ social background (Helmke, 2006). There are context-related factors that differ
greatly between one lesson and another, and that might have a massive influence (e.g., whether it is the last lesson of the day or the first, or whether an examination was administered in the previous
lesson). It is very likely that there are many other factors. For a rather low number of lessons video-
taped in a study, these influencing factors are crucial for the outcome of the data analysis. If the occur-
rence of these factors is assumed to be random, it is likely that for a large number of lessons
videotaped in the study, these factors may cancel each other out and the “real effect” of teachers’ pro-
fessional knowledge can be revealed. Videotaping a large number of lessons and then analyzing them
appropriately, however, is nearly impossible even for very large research projects.
Another reason why it is hard to investigate a relationship between teachers’ professional knowl-
edge and teaching performance is that teachers have to make decisions to solve various problems in a
very short time during instruction (Weinert, 1996). Beginning teachers report that they perceive their
teaching as “action under pressure” (Wahl, 1991). Depending on variables like their experience, skills,
or personality this pressure leads to decisions that are not entirely based on their professional knowl-
edge. It sometimes leads to instruction that imitates the instruction they experienced as students (Brouwer, 2010). A common written test for professional knowledge is less useful for testing knowledge required for authentic instruction compared to a test that demands decision-making under pressure as in real instruction. Blömeke et al. (2014) proposed to use video-based testing (so-called “video-
vignettes”) to simulate an authentic teaching situation but still admitted that testing with video-
vignettes is much closer to testing cognition than to testing performance.
The literature on measuring knowledge and performance suggested one point for testing purposes
we regard as essential for our study: It is important to look for tests for professional performance
beyond written tests and videotaped lessons. We argued that the so-called “performance tests” (as out-
lined in the following) are a good alternative and maybe even a better choice when it comes to
validity.

1.7 | Testing for performance rather than for knowledge


Miller (1990) showed alternatives in assessment by referring to his discipline, medicine. He described
knowledge-oriented ways of testing as different from testing for performance. Testing knowledge
works with paper-and-pencil tests and provides high curricular validity and high reliability.
Performance-oriented testing focuses on validity. In a performance test, actual problems someone has
to solve in a specific profession are simulated. There are many examples of this way of testing in medi-
cine (Cleland, Abe, & Rethans, 2009) or psychology (Eckel, Merod, Vogel, & Neuderth, 2014). These performance tests work for assessment in a complex and challenging discipline like medicine, and there is no reason why simulating standardized and profession-related problems should not work in science education. In medicine, different “standard situations” that someone working in this profession frequently has to face are simulated with performance tests, sometimes also called “objective structured clinical examinations” (e.g., Harden, Stevenson, & Wilson, 1975). Examples range from pelvic examinations (Rochelson, Baker, Mann, Monheit, & Stone, 1985) to psychiatry (McNaughton, Ravitz, Wadell, & Hodges, 2008) or communication training (Nau, Halfens, Needham, & Dassen, 2010). In these tests, actors serve as patients with a precise role description. “Simulated patients” or “programmed patients” have been used successfully in teaching and testing medicine-related skills for more than 50
years (Barrows & Abrahamson, 1964). The core problem in testing for performance is not the simula-
tion itself but the analysis of these situations in order to get test scores. Scoring sheets provide high
construct validity if they are developed accurately (Park et al., 2004). In principle, it is possible to fulfil
high standards of reliability, objectivity, and validity using performance tests (Hodges, Regehr,
Hanson, & McNaughton, 1998; Walters, Osborn, & Raven, 2005).

When applied to teacher education, this would mean that a test for knowledge is a paper-and-pencil
test for declarative or procedural knowledge (which would include dealing with authentic classroom
situations or so-called vignettes [cf. Aufschnaiter & Blömeke, 2010]). A test for action would mean an
observation of teachers’ action in the classroom, which is basically the aim of the video-studies
described above. These ways of testing are common in teacher education. Uncommon is a test for per-
formance, which means an observation of teachers’ performance in simulated, standardized teaching
situations. That could mean that students have to be trained to behave in a specific manner and to chal-
lenge each test teacher with the same realistic problems. Those performance tests are much closer to
real teaching but have the advantages of standardized testing. All the contextual factors described
above that make it hard to research teachers’ skills can be controlled in such a setting. Indeed, the inde-
pendent action in the classroom still is something different. For example, the independent action inte-
grates a diversity of different situations, whereas performance tests can only simulate a particular
teaching situation. In a way, performance tests are a step closer regarding authenticity, but they are not
authentic.
The differentiation between tests for knowledge and performance is important for teacher educa-
tion. Terhart (2012) described a simplified “functional chain” of teacher education. Teacher education
(e.g., at a university) improves teachers’ knowledge (both declarative and procedural). Knowledge is
assumed to allow them to take more appropriate actions in teaching which in turn results in better stu-
dents’ learning. All these assumed causal relationships are not corroborated yet, as described above.
The study reported in this article aimed to contribute to the first two “links” of this chain. We used
knowledge tests that were developed with a focus on curricular validity to ensure that teacher education
potentially affected the development of the knowledge. Many test instruments have been developed in
this way, and it is well known that CK and PCK develop during science teacher education (e.g., Riese
& Reinhold, 2012). Our study adds a new perspective to these studies. We researched whether or not
teachers use this knowledge successfully for enacting specific elements of high-quality instruction. Pro-
fessional knowledge was measured by using paper-and-pencil tests, which were developed with a focus
on curricular validity. The teaching quality was measured by using standardized performance tests. We
decided to choose explaining situations for these performance tests as a start. For science teaching, we can think of many standard situations that someone working in the profession of a science teacher has to face frequently. Explaining science is certainly one such teaching situation.

1.8 | Teacher explanations as a standard situation of physics instruction

1.8.1 | Scientific explanations and science teaching explanations


Among the many duties that science teachers have, “providing explanations is the bread and butter of
the science teacher’s existence” (Osborne & Patterson, 2011, p. 632). Explaining is an important skill
for science teachers (see also Geelan, 2012) and it is a highly valued skill as well. Wilson and Mant
(2011a) asked 5,044 students from the eighth grade about the characteristics and the quality of their
teachers and the attributes that distinguish the exemplary teachers from all teachers. The most fre-
quently mentioned attribute of exemplary teachers was their explaining skills. Interestingly, the teach-
ers themselves did not refer to explaining as important at all (Wilson & Mant, 2011b).
In science education literature, there has been a discussion about the nature of explanation and its
similarities with argumentation (Berland & McNeill, 2012; Osborne & Patterson, 2011). Both explana-
tion and argumentation share commonalities regarding their structure and their epistemic nature. For
example, the most common approach to describing explanations links a phenomenon (“explanandum”)
with an underlying universal law (“explanans”) by considering specific conditions. The links have to
fulfil the terms of logic. This “covering law model” (Hempel & Oppenheim, 1948) is similar to
Toulmin’s (1958) argumentation patterns. However, explaining in the classroom is more than just pre-
senting a logical connection between a phenomenon and an underlying principle. Treagust and Harri-
son (1999) therefore distinguished between scientific explanations on the one hand (a scientific
explanation could follow the covering law model, even though there are alternatives, e.g., Kitcher,
1981) and science teaching explanations (that means an explanation of a scientific phenomenon) on
the other. Explaining in the classroom needs to meet the needs of the students (e.g., their prior knowl-
edge) and sometimes has a reversed structure when compared to scientific explanations. A general law,
for example, Ohm’s law, might be explained by giving different examples that show how the law applies to different phenomena (Kulgemeyer & Tomczyszyn, 2015). Kulgemeyer and
Schecker (2013) stressed that even a well-reasoned claim—that means a good argumentative structure
—might lead to acceptance or conviction, but not necessarily to understanding. Understanding (e.g.,
by achieving a conceptual change) is the main goal of explaining in the classroom (Gage, 1968).
Concerning the present study, we focused on science teaching explanations, that is, explanations of a scientific idea to a particular addressee. They are sometimes also called instructional explanations (e.g., Wittwer & Renkl, 2008). As we wanted to focus on teaching quality in explaining situations, that aspect of explaining seemed more relevant.

1.8.2 | A model of explaining physics


The communicative aspects in explaining have been researched both in science education (e.g., Geelan,
2012; Sevian & Gonsalves, 2008) and educational psychology (e.g., Roelle, Berthold, & Renkl, 2014;
Wittwer & Renkl, 2008). Kulgemeyer and Tomczyszyn (2015) proposed a model of explaining in sci-
ence teaching, which highlights the communicative nature of the process. Concerning the three charac-
teristics of teaching quality (emotional support, classroom organization, instructional support), this
model specifies these three characteristics for explaining situations. The model was
used in several studies and refined based on empirical data (Kulgemeyer & Schecker, 2012, 2013;
Kulgemeyer & Tomczyszyn, 2015).
In this model, the basic explaining-constellation consists of two roles, an explainer and an
addressee. Usually, in science teaching, the explainer is an expert in science, for example, the teacher,
a tutor, or an expert student in the jigsaw method (Berger & Hänze, 2015) and the addressee is a nov-
ice (a student, a tutee, or a peer). In different contexts, the constellation might be different. It is impor-
tant to underline that both the explainer and the addressee participate actively in the communication. A
process of explaining would be modeled as follows (see Figure 1).

FIGURE 1 Model of dialogic explaining

The explainer aims to communicate science content and must consider two main points: (i) “What
is to be explained?” (subject-adequate communication) and (ii) “Who is it to be explained to?”
(addressee-oriented communication). The first point refers to the science that needs to be explained
and the structure in which it is presented—this is the connection to scientific explanations. The “cover-
ing law model” describes a possible structure. The second point deals with the addressee’s needs and
their effects on explaining, for example, by considering their supposed prior knowledge, interests, or
any misconceptions. The explainer cannot be certain about these assumptions and has to evaluate them
during the explanation, for example, by asking questions or giving tasks that require knowledge that
has just recently been explained.
The addressee perceives the actions of the explainer and is free to accept or reject the explainer’s
attempt to explain. Of course, information cannot be transferred directly from the explainer to the
explainee. Explaining is sometimes confused with a strong didactical, teacher-centered way of teach-
ing. That would be a non-constructivist view on explaining. The model, however, describes explaining
as a possible prerequisite for student’s understanding—or as Brown (2006) stated: “an attempt to pro-
vide understanding of a problem to others” (p. 196). High-quality explaining helps students to con-
struct their own knowledge. The model has the characteristic of a probabilistic approach to
communication—the actions of the explainer aim to increase the likelihood that an addressee can con-
struct meaning from the explained information. The addressee plays an active role during this process.
Once he/she decides to take part in the explaining process, he/she will show verbal or non-verbal signs of interest, comprehension, or misunderstanding. An explainer has to perceive and interpret those
signs and decide how to vary the explaining in order to make it more comprehensible. In particular, in
a physics-related explaining situation, there are some physics-specific means that can be varied for this
purpose. These means are: (i) the level of mathematization, (ii) the representational forms (e.g., dia-
grams or realistic pictures), (iii) the contexts and examples (e.g., everyday contexts or physics textbook
examples), and (iv) the code of the verbal language used (e.g., informal language, technical language,
or language of education). For each statement, an explainer has to decide on how to make appropriate
use of these variables. It is at the core of explaining skills to decide on how to choose these four varia-
bles to be both addressee-oriented and subject-adequate and to adapt them according to the addressee’s
needs in order to increase the comprehensibility.
With respect to the three characteristics of teaching quality (emotional support, classroom organiza-
tion, instructional support), adapting the four variables means providing emotional support (as the
adaptation is guided by students’ perspectives) and instructional support (as the adaptation regards con-
cept development and the quality of feedback). Classroom organization is not a part of the model as it
focuses on dialogic explaining.

2 | RESEARCH QUESTIONS

Prior studies could not corroborate the claim that teaching quality benefits from professional knowl-
edge. As this is a core claim in teacher education, our research question addresses this desideratum
concerning one particular situation of teaching: explaining physics. Our research question in this study
is: How do CK and PCK highlighted in academic physics teacher education impact explaining per-
formance? Our study, therefore, has a strong limitation to the quality of teaching in dialogic explaining
situations, the explaining performance quality.

We expected both CK and PCK to be important for explaining performance. However, PCK might
be a mediator for CK (as in the COACTIV study mentioned above). Without PCK, CK might not be
useful for explaining. We tested this hypothesis, but also the possibility that CK might affect explain-
ing performance directly. We decided to focus on the impact of CK and PCK and to leave out PK.
Testing time is limited, and these two are more likely to be of use in explaining situations than PK.
We also tested two groups of important beliefs that are likely to mediate the effects of CK and
PCK on explaining performance quality: (i) beliefs about self-efficacy regarding the instructional strat-
egy of explaining and (ii) beliefs on teaching and learning regarding a constructivist or transmissive
view on explaining. We assumed according to prior research (described above) that someone with a
high self-efficacy explains more effectively, for example, because he or she has enough self-
confidence to use various examples and to explain more independently from the exact knowledge
found in the textbook. We also assumed according to prior research (as described above) that someone
who considers explaining from a constructivist point of view explains more effectively than someone
who regards explaining as simple transmission of knowledge. The two groups of beliefs are used as
control variables for the effect of CK and PCK on explaining performance quality. Testing time is lim-
ited here as well, so we operationalized the two groups of beliefs close to explaining situations, using just two very small aspects of them. That was a limitation of our study.

3 | METHODS

3.1 | Design
We administered the paper-and-pencil tests on CK, PCK, the two groups of beliefs, and demographics to the
participating student teachers. The tests had been split into two separate test booklets, each of which
was a 90-min test. After having tested the student teachers with the paper-and-pencil tests, we tested
them with the performance test on explaining physics (20 min). Various test administrators followed a
clear documentation about the process to ensure objectivity in conducting the paper-and-pencil tests.
Conducting the performance test was more challenging. Thus, the same test administrator conducted
all of the tests at all of the universities over a period of 3 years. Three different high-school students
who had been trained thoroughly participated in performance tests (as described below). Pilot studies
were conducted to ensure that all of these students acted similarly in all the performance tests. All in
all, the testing time for each participating student was 200 min.

3.2 | Sample
German teacher education starts from the first semester with a specific track for future physics teachers. To become a physics teacher in Germany, a master’s degree (10 semesters) is mandatory. We wanted to increase the likelihood of reaching students with a broad range of CK, PCK, and explaining performance quality. Thus, our sample consisted of student teachers enrolled in a physics teacher training
program from all stages of their academic teacher education in German universities. The median of the
number of semesters for which they attended teacher training courses at university was 6, and the range
was from 1 to 20. One hundred ninety-eight of the student teachers participated in the performance
test, but not all of them were present when the other tests were conducted. We collected the data from
109 of these 198 student teachers for all three of the tests for explaining performance quality, PCK,
and CK. Those 109 student teachers came from five different German universities. The final sample
consisted of 68 male and 41 female student teachers. Their age ranged from 17 to 40 years with a
median of 23 years.

3.3 | Instruments

3.3.1 | Demographics
We collected data about the age, semesters in a physics teacher training program, Credit Points in a
physics teacher training program, high-school exam grade, and weeks of internships in schools.

3.3.2 | Explaining performance


In this study, we assessed the explaining performance of student teachers with a process-oriented approach close to real teaching situations. The framework is called “dialogic explaining assessment” (DEA),
which was an adaptation of a method used by Kulgemeyer and Schecker (2013) to assess high-school
students’ competencies of communicating physics. In a DEA, experts have to explain physics phenom-
ena to a single novice, for example, why one feels weightless in a rollercoaster.
The setting consisted of 10 min of preparation time and 10 min of explaining. Ten minutes of preparation time was chosen because we found in prior studies that test persons felt prepared after that time (Kulgemeyer & Tomczyszyn, 2015). Ten minutes of explaining time was chosen because, in the same prior studies, we could see that all aspects of the topics had been touched on after that time. Still, the fixed time was both a standardization in order to compare the results and a limitation of our study because teachers might take more time to prepare an explanation in the classroom.
During the preparation time, an instructor presents the topic of the explanation to the test person.
The instructor adds that the test person needs to prepare to explain this topic to a 10th-grade student
(the novice) with average grades in physics and no prior knowledge about this topic. All the partici-
pants taking the test then receive the same set of materials for preparation and use in the explaining sit-
uation, for example, diagrams of a well-defined complexity. They are told that they are free to take
their notes into the explaining situation and free to use whatever parts of the standardized material they
want (or not). In the study reported in this article, we used two different topics in the explaining situa-
tion: “Why does a car skid out of a sharp turn on a wet road?” (physics: friction, forces, circular
motion); and “Why could it theoretically work to blow up an asteroid approaching earth into two
halves in order to save the planet? (physics: conservation of momentum, superposition principle). We
chose these two topics because some of the students were tested twice (before and after a semester),
however, the data used in the study reported in this paper just refers to the first explanation because
some participants only had the chance to explain once. We ensured that both topics were of a compara-
ble complexity in a pilot study where ten student teachers explained both topics and had comparable
results (Kulgemeyer & Tomczyszyn, 2015). The physics concerning these topics belongs to the high-
school syllabus so that they are realistic topics for explanations. The student teachers (explainers)
should have developed a deeper understanding of the physics background in their university courses
by the time of the study as the topics belong to the curriculum of the courses in the first semester.
During the explaining, the student teachers have to explain the topic to the novices. The explaining
attempts were filmed. The testing took place at the universities; we tested the participants in rooms with a table at which both the explainer and the novice were seated. There were also paper and pencils for making notes and sketches, as well as a whiteboard. During the explaining, just the explainer, the novice, and a camera were in the room. The most important standardization, however, was that the novices had not simply been picked randomly. They had been trained thoroughly to behave in a standardized way during the explaining situation, particularly by giving specific prompts as feedback. Several
prompts occur in every DEA in similar ways. Each of the four variables of the model of dialogic explain-
ing has related prompts. Some of them attempt to simplify the explanation (“Could you explain it again?
It was a bit confusing for me.”), others to make it more sophisticated (“Can you make it sound more like
physics? My teacher always wants me to use the appropriate terms.”). Some simply attempt to use a tool
(“Could you sketch that for me?”). At all of the universities, the same students were used as trained novi-
ces. Three high-school students participated as explainees. Even though only one of them was part of each setting, a pilot study showed that we needed to change the explainee after every five to seven DEAs because they got tired. Also in a pilot study, we ensured that all of them behaved similarly and that the results of the DEAs were as independent of the explainee as possible. Ten test persons explaining the
same topic to two different explainees had comparable results (Kulgemeyer & Tomczyszyn, 2015).
Dialogic situations frequently happen in science teaching, for example, when students are working
in groups or on their own, and the teacher has time to help. Still, we cannot be sure that explaining a
phenomenon to a group of students needs the same skills as classroom explaining. We focus on dia-
logic situations because this setting can also be realized in large-scale studies.
We analyzed the videos using an established instrument and an established process for researching
explaining performance quality. Both the instrument and the process had been researched thoroughly regard-
ing validity, reliability, and objectivity in a previous study (Kulgemeyer & Tomczyszyn, 2015). Kulgemeyer
and Tomczyszyn (2015) used the “model of explaining physics” described above (cf. Theoretical back-
ground) as the starting point. Using this model, initial categories were formed; further categories were added following the process of qualitative content analysis (Mayring, 2000). The final instrument consists of 12 categories for appropriate and non-appropriate explaining (categories are given in
Table 1). Kulgemeyer and Tomczyszyn (2015) describe the process of analysis as follows:

1. Identifying every statement of the explainers and the novices.
2. Linking the categories from the instrument to the statements. A focus of the linking lies on the answers to the novice’s prompts (on raters and objectivity: see below).
3. Counting the number of categories for appropriate explaining (X) and non-appropriate explaining (Y).
4. Calculating the explaining performance index: PI = X − Y.
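To make the scoring rule concrete, the following minimal sketch (in Python, with hypothetical category labels; not the authors’ analysis software) shows how a coded list of statements can be turned into the performance index.

```python
# Minimal sketch of the PI counting rule (hypothetical category labels).
# Each coded statement is tagged with one category from Table 1.

APPROPRIATE = {
    "concrete_numbers", "everyday_language", "connect_nonverbal",
    "use_items", "connect_items_analogy", "small_demonstration",
    "review", "summary", "encouragement",
    "diagnose_understanding", "request_action",
}
NON_APPROPRIATE = {"answering_inadequately"}

def performance_index(coded_statements):
    """PI = X - Y, where X counts appropriate and Y non-appropriate codes."""
    x = sum(1 for c in coded_statements if c in APPROPRIATE)
    y = sum(1 for c in coded_statements if c in NON_APPROPRIATE)
    return x - y

# Example: a short (invented) coding of one explaining attempt
codes = ["everyday_language", "use_items", "diagnose_understanding",
         "answering_inadequately", "summary"]
print(performance_index(codes))  # -> 3
```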
Validity, reliability, and objectivity were examined by Kulgemeyer and Tomczyszyn (2015) as
described below:

Validity
The PI predicted experts’ decision for the better explaining quality when a pair of videos was compared (Cohen’s κ = 0.78). The PI correlated with the rating of the trained students on the explaining quality (Pearson’s r = 0.53, p < 0.001); they were asked to rate the explaining quality on a Likert scale
after each of the 109 DEAs. Both the expert rating and the student rating are arguments for concurrent
validity, even though the ratings of the students are probably confounded by factors such as sympathy.
An argument for content validity is that the categories represent the four “variables” of the model for
explaining physics (e.g., “contexts and examples”). For construct validity, a nomological network with
both aspects of convergent and discriminant validity was analyzed. Interview studies were conducted
to ensure that the test teachers perceived the test situations as authentic.

Reliability
The PI reached a good Cronbach’s α = 0.772. A one-dimensional Rasch model could be used to model the scale, applying the usual criteria (0.8 < Infit MNSQ < 1.2, |T| < 2.0).

Objectivity
Objectivity was examined by measuring inter-rater reliability. In the first step, for all categories, two raters reached an agreement ranging between 73% and 97%, depending on the category. In the second
step, a consensus between the two raters could be reached for all categories and videos.
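As an illustration of how such agreement figures can be obtained, the sketch below computes simple percentage agreement for two raters on invented codings; the chance-corrected Cohen’s kappa (as reported for the expert comparisons above) is shown as an optional extra and assumes scikit-learn is available.

```python
# Sketch: percentage agreement between two raters (invented example data).
# Each list holds one rater's binary coding of a category across video segments.
rater1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater2 = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

agreement = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
print(f"percentage agreement: {agreement:.0%}")  # 80%

# Optional chance-corrected agreement (assumes scikit-learn is installed):
from sklearn.metrics import cohen_kappa_score
print("Cohen's kappa:", cohen_kappa_score(rater1, rater2))
```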

TABLE 1 Twelve categories used to calculate the performance index (PI)

Presenting concrete numbers for formulas: Explainer presents numbers as an example instead of leaving a formula unexplained.

Explaining physics concepts in everyday language: Explainer avoids technical terms by describing the underlying concept with everyday terms (“push” instead of “momentum”; “. . . and the water lifts it a bit” instead of “buoyancy”).

Connecting non-verbal elements: Explainer connects non-verbal elements like diagrams, pictures, or demonstrations by highlighting similarities and differences.

Using items in general: Explainer uses small everyday items (e.g., a paper snarl) to illustrate a process.

Connecting items with the topic by showing analogy: Explainer not only uses small everyday items but connects them to the topic he/she wants to explain (“The paper snarl stands for the asteroid. Look at it when I am moving it”).

Small demonstrations: The explainer conducts small demonstrations with everyday items (e.g., “In which direction does the paper snarl move when I push it?”).

Answering inadequately (scientifically wrong answers do not belong into this category): Explainer does not answer an addressee’s question or ignores the question (A: “Weightlessness, is that the same as in outer space?” – E: “I have never been in outer space, how should I know?”).

Review: The explainer stresses that something has already been explained and is needed now (“You remember that we were talking about friction before, that is exactly what happens here.”).

Summary: The explainer summarizes the explanation briefly.

Encouragement: The explainer praises the explainee for good answers and encourages them to deal with difficult parts of the explanation.

Diagnosing understanding: The explainer diagnoses the success of the explanation by asking questions or giving tasks (NOT just: “Did you understand that?”).

Request action from explainee: The explainer requests the explainee to act (“What do you think how it moves? Could you sketch that for me?”).

Note. “Answering inadequately” decreases the explaining performance index.

3.3.3 | Professional knowledge


To measure CK and PCK of physics student teachers, paper-and-pencil tests were developed focusing
above all on curricular validity. All the tests addressed the content domain mechanics as mechanics is
taught in the first year at university and students’ conceptions in mechanics are well known in science
education research (e.g., Hestenes, Wells, & Swackhamer, 1992; Schecker, 1986). Besides, a test to
measure mathematical skills (terms, trigonometry, vectors, matrices, differentiation, integration) was
developed as well as a questionnaire to gather demographics (e.g., grades, teaching experience). Riese
et al. (2015) described all the steps and the whole process when developing the tests in detail. A brief
overview is given in the following.

FIGURE 2 Sample item to measure PCK (subscale: students’ misconceptions and how to deal with them)

Pedagogical content knowledge


Starting from an analysis of different conceptualizations of PCK in science subjects (e.g., Lee & Luft,
2008; Magnusson, Krajcik, & Borko, 1999; Park & Oliver, 2008; Riese & Reinhold, 2012) and also
considering curricula analyses, a comprehensive model of prospective physics teachers’ PCK was developed. With regard to curricular validity, the focus was on PCK that students could acquire in pre-service teacher training programs in Germany (the university part of teacher education programs in Germany). The curricularly valid model consists of four subscales (cf. Gramzow, Riese, & Reinhold, 2013): (i) instructional strategies, (ii) students’ misconceptions and how to deal with them, (iii) experiments and teaching of an adequate understanding of science, and (iv) PCK-related theoretical concepts (e.g., the model of educational reconstruction (Duit, Gropengießer, Kattman, Komorek, & Parchmann, 2012) or conceptual change (Scott, Asoko, & Driver, 1992)). Additionally, to create items with different requirements and difficulties, researchers added a second dimension called cognitive activities, containing the following aspects: (i) reproduce, (ii) apply, and (iii) analyze.
Based on the PCK-model, a paper-and-pencil test including 91 items with open situational judg-
ment items as well as complex multiple-choice items (multiple select) was developed (Riese et al.,
2015). An example of an item is shown in Figure 2. Concerning the model, it addresses the subscale “students’ misconceptions and how to deal with them.” Finally, the best items (concerning statistical
fit, validity and a balanced distribution of item difficulties) had to be chosen for the main study, so the
test was reduced to 43 items (60 min).
With regard to test-quality-criteria, three experts verified the intended relationship between
the test items and the underlying model (Cohen’s kappa values were between 0.83 and 0.89). A
further argument for content validity is that we ensured that the test covered the topics from the
PCK courses of the participating universities by analyzing the curriculum and expert ratings from
educators at the particular university. Moreover, the instrument and the underlying model were
validated by implementing a think-aloud study. For construct validity, a one-dimensional Rasch model (one global PCK scale: variance = 0.49; EAP reliability = 0.82; 0.8 < MNSQ < 1.2; −1.9 < T < 1.9) was compared with a four-dimensional Rasch model (each dimension representing one of the four subscales mentioned above: variance between 0.50 and 0.61; EAP reliability between 0.55 and 0.73; 0.8 < MNSQ < 1.2; −1.9 < T < 1.9 for 89 of 91 items). A significantly better fit of the four-dimensional model was observed (χ² test: p < 0.001). That means it is probably justified to accept the assumption of having four independent subscales for measuring PCK (the four dimensions of the model). We can,
therefore, analyze data for each of the subscales and treat the subscales as independent tests. Fur-
thermore, a nomological network (aspects of convergent and discriminant validity) was investi-
gated by using correlational analysis (cf. Riese et al., 2015).
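The reported model comparison is, in essence, a likelihood-ratio test between nested models. The sketch below illustrates such a comparison with invented deviance values and parameter counts; the actual Rasch analyses in the study were carried out with specialized software.

```python
# Sketch: likelihood-ratio (chi-square) comparison of two nested models.
# Deviances (-2 log-likelihood) and parameter counts are invented placeholders.
from scipy.stats import chi2

deviance_1dim = 10500.0   # hypothetical deviance of the unidimensional model
deviance_4dim = 10420.0   # hypothetical deviance of the four-dimensional model
extra_params = 9          # hypothetical additional parameters of the 4-dim model

lr_statistic = deviance_1dim - deviance_4dim
p_value = chi2.sf(lr_statistic, df=extra_params)
print(f"LR chi-square = {lr_statistic:.1f}, p = {p_value:.4f}")
```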

Content knowledge
The model for physics teachers’ CK comprises three subscales: (i) physics knowledge from school
textbooks, (ii) deeper understanding of physics knowledge from a school textbook (e.g., knowledge of
different procedures of solution, knowledge of boundary conditions, or knowledge of coherences of
various physical phenomena), and (iii) physics knowledge from a university textbook. To create items
with different requirements and difficulties, a further dimension called complexity was added; that is
made up of: (i) facts, (ii) links between facts, and (iii) dealing with advanced physical concepts.
Based on this CK-model, a paper-and-pencil test with 143 multiple-choice items (single select) was
developed, from which 40 items (for the test of 60 min) were chosen for the main study (with regard
to statistical fit, validity, and a balanced distribution of item difficulties). For content validity, curricular
analyses—of the courses in all participating universities on the one hand and different textbooks
(school textbooks and university textbooks) on the other—were conducted. Furthermore, the matching
between the test items and the underlying model was checked by several physics student teachers. For
construct validity, a one-dimensional Rasch model (with just one global CK scale: variance = 0.90; EAP reliability = 0.83; 0.8 < MNSQ < 1.2; −1.9 < T < 1.9) was compared with a three-dimensional Rasch model (using the three subscales mentioned above: variance between 0.98 and 1.43; EAP reliability between 0.78 and 0.83; 0.8 < MNSQ < 1.2; −1.9 < T < 1.9). A significantly better fit of the three-dimensional model was observed (χ² test: p < 0.001). So, it seems justified to accept the assumption of three independent subscales for measuring CK. Finally, a
nomological network with both aspects of convergent (e.g., physics grades) and discriminant validity
(e.g., mathematical skills) was examined with correlational analysis (cf. Riese et al., 2015).

3.3.4 | Beliefs on aspects of self-efficacy and teaching and learning


Both groups of beliefs were tested using Likert scales in our study. That is a usual way to assess
beliefs, even though interviews might be a more valid alternative (Aikenhead, 1988). For our analysis,
we needed a quantitative measure and testing time was limited because CK, PCK, and explaining per-
formance were in focus. The instruments were newly developed based on Fechner (2009) and Riese
(2009). As mentioned above, we just focused on aspects of the two groups of beliefs: beliefs on self-
efficacy regarding the instructional strategy of explaining and beliefs on teaching and learning regard-
ing a constructivist or transmissive view on explaining. That was a limitation of our study. The items
are presented in the supplementary material (Instruments S1).

Beliefs about self-efficacy regarding the instructional strategy of explaining


Self-efficacy was operationalized close to the setting and focused on explaining situations in physics. The scale consisted of six items (see supplementary material). Internal consistency was moderate (Cronbach’s α = 0.62). Some researchers regard Cronbach’s α > 0.5 as sufficient (Nunnally, 1978), but most recent studies treat Cronbach’s α > 0.7 as more appropriate. The moderate internal consistency might lead to restrictions in the interpretation of the data. We, however, still regard it as suitable. The average inter-item correlation (r = 0.25) fell in the range recommended for internal consistency (0.15 < r < 0.50) (Clark & Watson, 1995).
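For readers unfamiliar with these statistics, the sketch below shows one standard way to compute Cronbach’s alpha and the average inter-item correlation from an item-response matrix; the data are invented and the code is only an illustration, not the study’s analysis script.

```python
# Sketch: Cronbach's alpha and average inter-item correlation (invented data).
import numpy as np

# rows = respondents, columns = the six Likert items of the scale
X = np.array([
    [4, 3, 4, 5, 3, 4],
    [2, 2, 3, 2, 3, 2],
    [5, 4, 4, 4, 5, 5],
    [3, 3, 2, 3, 3, 3],
    [4, 4, 5, 4, 4, 3],
])

k = X.shape[1]
item_vars = X.var(axis=0, ddof=1)
total_var = X.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1 - item_vars.sum() / total_var)

corr = np.corrcoef(X, rowvar=False)
avg_inter_item_r = corr[np.triu_indices(k, k=1)].mean()

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"average inter-item correlation: {avg_inter_item_r:.2f}")
```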

Beliefs on teaching and learning regarding a constructivist or transmissive view on explaining

The scale consisted of seven items (see supplementary material). Internal consistency was moderate (Cronbach’s α = 0.66) (for a discussion see above). The average inter-item correlation (r = 0.23), however, fell in the range recommended for internal consistency (0.15 < r < 0.50) (Clark & Watson, 1995).

TABLE 2  Descriptive statistics of study variables

        M (overall)    SD (overall)    M (male)    M (female)
CK         32.5            11.8           34.5        29.0
PCK        40.2            12.6           40.7        39.3
BT         78.6             7.5           78.6        78.6
SE         55.0            11.7           55.0        55.0
EXP        54.0            14.8           54.5        53.7

CK = content knowledge; PCK = pedagogical content knowledge; SE = self-efficacy; BT = belief on teaching and learning
"explaining means transmission"; EXP = explaining performance; M = mean (in percent); SD = standard deviation (in percent).

3.4 | Data analyses


We used path analysis to answer the research questions. Path analysis is a frequently used
method to estimate relationships between traits that are assumed to be causal; it helps to estimate the
proportion of a trait's variance explained by other variables (Vinzi, Trinchera, & Amato, 2010). The "path
coefficient" between two traits is a standardized regression coefficient. To estimate the path coefficients
in this study, we used robust maximum-likelihood estimation, which also helps to deal with missing
data. We used manifest values for the path model. Latent values could not be used because of the sample
size in this study; this is acceptable, however, because manifest models tend to underestimate
the relationships between traits. With respect to our research questions, this is a rather conservative way to
identify relationships. For model fit, we used the comparative fit index (CFI; very good fit indicated
by CFI > 0.95; cf. Fan, Thompson, & Wang, 1999) and the root mean square error of approximation
(RMSEA; good fit indicated by RMSEA < 0.05). Based on our research questions and hypotheses, we also
tested whether CK had a direct impact on explaining performance or whether its impact was mediated by PCK. We compared
the resulting path models using the quality criteria mentioned above, the explained variance, and the
Bayesian Information Criterion (BIC); this is a common way to compare models.
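The authors do not report the software used for this analysis. As an illustration only, the following sketch shows how a path model with the hypothesized structure described here (CK → PCK, PCK → belief "explaining means transmission", CK → self-efficacy, and PCK, belief, and self-efficacy → explaining performance) could be specified and fitted in Python with the semopy package. The Model/fit/inspect/calc_stats interface and the file name manifest_scores.csv are assumptions for the sketch, not part of the study:

```python
# Minimal sketch (not the authors' code): fitting the hypothesized path model on
# manifest scores. Column names CK, PCK, BT, SE, EXP follow the abbreviations above.
import pandas as pd
import semopy

model_desc = """
PCK ~ CK
BT  ~ PCK
SE  ~ CK
EXP ~ PCK + BT + SE
"""

df = pd.read_csv("manifest_scores.csv")   # hypothetical file with the five manifest scores
model = semopy.Model(model_desc)
model.fit(df)                             # default ML; robust/FIML options depend on the package version
print(model.inspect(std_est=True))        # standardized path coefficients
print(semopy.calc_stats(model).T)         # fit indices such as chi2, CFI, RMSEA, BIC
```

An alternative model without the CK → PCK path but with a direct CK → EXP path can be fitted the same way and compared via CFI, RMSEA, and BIC, as described above.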

4 | FINDINGS

4.1 | Descriptive results


Descriptive results for the measures included in the data analysis are provided in Table 2. These results
showed that the student teachers performed moderately well in the explaining performance test, with a
mean score of 54.0%, as well as in the PCK test instrument, with a mean of 40.2% (correct
answers). In the CK test instrument, they achieved lower scores than in the PCK instrument (32.5%).
To a great extent, they held a rather constructivist perspective on teaching (78.6% for beliefs on teaching
and learning, with 0% meaning a transmissive belief and 100% a constructivist belief), while their
self-efficacy in explaining situations was moderate (55.0%). Male and female student teachers scored
nearly identically on all measured variables; the male student teachers, however, outperformed their female
counterparts in CK, although the difference was small.
Table 3 provides information about the correlations between the measured variables and some selected
demographics that were collected (age, semesters in a physics teacher training program, Credit Points in a
physics teacher training program, high-school exam grade, and weeks of internships in schools).
For the tested student teachers, we could identify significant correlations between their performance
in both the CK and the PCK test and their progress in their studies.

TABLE 3  Manifest correlations of the measured variables and selected demographics (Pearson's r)

            Age        Semesters    Credit points    High-school    Weeks of internships
                       physics      physics          exam grade     in schools
CK         0.069       0.354**      0.369**          0.353**         0.286*
PCK       −0.017       0.436**      0.365**          0.228*          0.299**
BT        −0.093       0.081       −0.048            0.220*          0.048
SE        −0.256*     −0.340**     −0.385**          0.127          −0.234*
EXP        0.049       0.015        0.482**          0.302**        −0.174

CK = content knowledge; PCK = pedagogical content knowledge; SE = self-efficacy; BT = belief on teaching and learning
"explaining means transmission"; EXP = explaining performance.
*p < 0.05; **p < 0.01; ***p < 0.001.

Their progress, in this case, is reflected by the three measures "semesters in a physics teacher training program," "Credit
Points in a physics teacher training program," and "weeks of internships in schools." For the explaining
performance test, we could only identify a correlation with the "Credit Points in a physics teacher
training program." Interestingly, the high-school exam grade seems to be a fair predictor of the performance
in the tests, as it correlates with all measures except the measured aspect of self-efficacy.
In order to take a closer look at the explaining performance test, Table 4 presents data on
how many of the student teachers used each category in their explaining. For example, 78 out of the
109 student teachers paraphrased a technical term in everyday language at least once. Analysis of the
numbers showed that most of the student teachers (N = 89) were able to avoid examples that are non-
fitting from a scientific point of view, but just a few (N = 12) attempted to request actions from their
explainee. In general, the interaction with the explainee was the most difficult main category.

TABLE 4  Number of student teachers who used a particular category during their explaining attempts

Category                                              Explainers who       Explainers who did
                                                      used a category      not use a category
Presenting concrete numbers for formulas                     66                   43
Explaining physics concepts in everyday language             78                   31
Connecting non-verbal elements                               55                   54
Using items in general                                       43                   66
Connecting items with the topic by showing analogy           13                   96
Small demonstrations                                         19                   90
Answering inadequately                                       89                   20
Review                                                       42                   67
Summary                                                      33                   76
Encouragement                                                18                   91
Diagnosing understanding                                     27                   82
Request action from explainee                                12                   97


TABLE 5  Manifest inter-correlations of the measured variables (Pearson's r)

         CK           PCK          BT           SE           EXP
CK       –
PCK      0.564***     –
BT      −0.266*      −0.237*       –
SE       0.153        0.119       −0.079        –
EXP      0.376**      0.376***    −0.339***     0.367***     –

CK = content knowledge; PCK = pedagogical content knowledge; SE = self-efficacy; BT = belief on teaching and learning
"explaining means transmission"; EXP = explaining performance.
*p < 0.05; **p < 0.01; ***p < 0.001.

4.2 | Impact of CK and PCK on explaining performance


In Table 5, we present the manifest inter-correlations between the measured traits, because we want to
highlight the correlations presented in the last row between explaining performance and the other traits.
First, we found significant correlations in the assumed directions. Second, the
correlation between explaining performance and CK was exactly as high as the correlation between
explaining performance and PCK (r = 0.376, p < 0.001).
All those correlations ranged between r = −0.339 (p < 0.001) and r = 0.376 (p < 0.001) and
can be regarded as medium correlations; 0.1 < r < 0.3 is usually regarded as a small effect,
0.3 < r < 0.5 as a medium effect, and r > 0.5 as a large effect (Cohen, 1988). The correlation
between CK and PCK (r = 0.564, p < 0.001) reached a large value. The correlation between
the belief on teaching and learning "explaining means transmission" and PCK was small but negative
(r = −0.237, p < 0.05).
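As a small illustration of how such a manifest correlation matrix and Cohen's (1988) benchmarks can be applied, consider the following sketch. It is not the authors' code; the column names and the file manifest_scores.csv are assumptions:

```python
# Minimal sketch: Pearson correlations between manifest scores and effect-size labels.
import pandas as pd
from scipy.stats import pearsonr

def effect_size_label(r: float) -> str:
    """Cohen's (1988) rough benchmarks for the absolute size of r."""
    r = abs(r)
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"

df = pd.read_csv("manifest_scores.csv")   # hypothetical file with CK, PCK, BT, SE, EXP
r, p = pearsonr(df["PCK"], df["EXP"])
print(f"r = {r:.3f} ({effect_size_label(r)} effect), p = {p:.4f}")
print(df[["CK", "PCK", "BT", "SE", "EXP"]].corr(method="pearson").round(3))
```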
In Figure 3, we present the path model that followed our hypotheses. It had appropriate quality criteria:
CFI = 0.994, RMSEA = 0.030; χ²/df = 1.1, p = 0.363; and BIC = 2574. We tested an alternative
model with a direct path from CK to explaining performance but without any path from CK to PCK.
The alternative model had to be rejected based on the quality criteria used: CFI = 0.801,
RMSEA = 0.172; χ²/df = 4.2, p < 0.001; and BIC = 2586.
FIGURE 3  Path model for explaining performance (robust maximum-likelihood, full-information maximum-likelihood
estimation). CFI = 0.994, RMSEA = 0.030; χ²/df = 1.1, p = 0.363; *p < 0.05, **p < 0.01, ***p < 0.001. Path coefficients
are given. The value at the outcome (explaining performance) represents the explained variance. The dashed line
indicates a non-significant correlation (p > 0.05)

TABLE 6  Manifest correlations between explaining performance (EXP) and particular subscales of PCK and CK, respectively (Pearson's r)

                 PCK                           CK
         StuCon      Concepts      SchKno      DeepSch      PhyUn
EXP      0.34**      0.29**        0.32**      0.34**       0.17

*p < 0.05; **p < 0.01; ***p < 0.001.
StuCon = knowledge about students' misconceptions and how to deal with them; Concepts = PCK-related theoretical concepts;
SchKno = physics knowledge from school textbooks; DeepSch = deeper understanding of physics knowledge from a school
textbook; PhyUn = physics knowledge from university textbooks.

Furthermore, in this alternative model, both paths from CK and PCK to explaining performance were not significant.
This implies that the model in Figure 3 described the empirical data in this study more accurately.
We want to highlight in Figure 3 that PCK had a positive impact on explaining performance, and
so did the self-efficacy "physics in explaining situations." The belief "explaining means transmission,"
however, had a negative impact on explaining performance, and PCK decreased this belief. CK had a small
but non-significant impact on the self-efficacy in explaining situations. The model
explained 29% of the variance, which is large based on Cohen's (1988) suggestions for interpreting R².
An important question for teacher education is which parts of CK and PCK, exactly, are important
for high-quality performance, and in which teaching situations. These parts could then be highlighted in
teacher education. For explaining physics, our study provides first insights because we can differentiate
empirically between subscales of PCK and CK, as described in the Methods section above. In Table 6,
we point out four correlations between the empirically verified subscales of PCK and CK and
explaining performance in our study. Table 6 should be read as presenting first results, and we want to emphasize
that a correlation does not necessarily mean a causal relationship. We discuss these results in the following
section.

5 | DISCUSSION

Regarding the descriptive results, we interpret them as follows: the correlations between both the
CK and the PCK test results and the three aspects of progress in the studies reflect the focus on
curricular validity during test development. Still, there was a limitation because of the focus on
mechanics. There was a medium correlation between the Credit Points achieved in a physics teacher
training program and the explaining performance (r = 0.482, p < 0.01), but not between explaining
performance and the semesters in a physics teacher training program. The reason could be that
Credit Points are a better measure of actual study progress: the number of semesters would
also increase for "passive" students, whereas Credit Points can only be achieved after completing a
course successfully. Interestingly, the practical experience ("internships in schools") did not correlate
with the explaining performance either. One might have expected this, because practical experience
without reflection is unlikely to result in better performance. However, our results are limited: it was
unclear what the student teachers really did during their internships or how much teaching experience
they really had. There was a small negative but significant correlation between the measured aspect of
self-efficacy and the weeks of internships in schools (r = −0.174, p < 0.05). That could mean that
practical experience, or a more elaborate picture of the profession of a science teacher, helps student
teachers to become aware of how difficult it is to be a good explainer. This hypothesis would be supported
by the medium correlations with the semesters (r = −0.340, p < 0.01) and Credit Points

(r = −0.385, p < 0.01) in a physics teacher training program. However, we can only present the correlations
and cannot claim to identify causal relationships.
Analyzing the difficulty of the single categories of the explaining test, we have to highlight that the
explainer's interaction with the explainee was the main category in which the fewest student teachers collected
points. For example, just 12 of them requested actions from their explainee in a way that required the explainee
to participate actively in the explaining process (e.g., "Could you draw the graph in the diagram?"). That shows
how difficult it is to understand an explanation as more than just a presentation.
Concerning our research question, we interpret the findings of our study as follows: Both CK and
PCK were important in improving the quality of explaining. PCK, however, mediated the path from CK to
explaining performance and therefore played the key role in transferring CK into explaining performance.
We could not find a direct path from CK to explaining performance. Beliefs were important in transferring
knowledge into explaining performance. In this particular case, first, the belief about teaching and
learning "explaining means transmission" decreased explaining performance but was itself decreased by
PCK. That means that an appropriate constructivist view on explaining increased the explaining performance
and was itself increased by PCK. Second, a high self-efficacy "physics in explaining situations"
increased explaining performance. Of course, our results are limited to the particular aspects of the
beliefs we measured: beliefs on self-efficacy regarding the instructional strategy of explaining and beliefs
on teaching and learning regarding a constructivist or transmissive view on explaining.
This result is of interest not only for researchers on teacher explanations but also for researchers
on teaching quality in general. For one particular situation of instruction, our study showed that the CK
and PCK highlighted in teacher training programs at universities were useful for high-quality performance.
After all, both the test instruments for CK and PCK had been developed in previous studies
stressing above all curricular validity; they focus on the content of German university courses.
Besides, we could show for explaining situations that PCK, in particular, plays a key role. This result
provides a good argument for teacher training programs that not only focus on the development of CK
but also highlight PCK. On the other hand, beliefs seemed to play an essential role for explaining performance
as well. There was an additional but indirect path from PCK to explaining performance via
the beliefs on teaching and learning. Also, there was a direct path from the measured aspect of self-
efficacy to explaining performance. Analyzing the curricula of German teacher education, we came
across an interesting observation: the curricula highlight cognitive variables such as knowledge, and
therefore focus on PCK and CK as well as pedagogical knowledge. Probably, most examinations
focus on cognitive variables as well, which very much underestimates the influence of beliefs on actual
performance. Knowledge might affect beliefs implicitly, as PCK affects the beliefs on teaching and learning in
our path model. However, teachers' knowledge is certainly not the only trait to develop during teacher
education because, for example, appropriate attitudes towards teaching are also important. In our
case, that could mean that someone might have a lot of knowledge about students' misconceptions and
how to diagnose them (an important part of our PCK test) but still act as if explaining meant
transmission. We, as teacher educators, should value beliefs in teacher education and work on appropriate
ways to assess them. Of course, many teacher training programs already do so.
We want to highlight the correlation between explaining performance and the knowledge about students'
misconceptions and how to deal with them (r = 0.34, p < 0.01) as a particular aspect of PCK. It
was the subscale of PCK with the largest significant correlation with explaining performance. Sadler,
Sonnert, Coyle, Cook-Smith, and Miller (2013) found that teachers who were able to identify misconceptions
in items of a physics assessment instrument that had been administered to their students had
larger classroom gains than teachers who only knew the correct answer. Our study showed similar
results. First of all, the teaching quality in explaining situations correlated with the teachers' knowledge
about students' misconceptions and how to deal with them. We only focused on explaining situations,

but a higher teaching quality in general might explain larger classroom gains. Second, in our study,
PCK mediated the path from CK to the teaching quality in explaining situations. That might explain why,
in the study by Sadler et al. (2013), teachers who were able to identify misconceptions (a part of PCK)
had larger classroom gains compared with teachers who only had the CK to solve the items correctly
on their own. Future studies should further investigate this result. It also yields a second question
for further studies: Can PCK courses at universities better prepare student teachers for explaining situations
by focusing more on the knowledge of students' misconceptions and how to deal with them (e.g.,
conceptual change)?
The correlation between explaining performance and the knowledge of PCK-related
theoretical concepts (e.g., the model of conceptual change) was also significant. Together, the two PCK subscales
cover the knowledge about how to diagnose prior knowledge, how to identify common misconceptions in students'
utterances, and how to deal with them. It is likely that this kind of knowledge helps in explaining
physics.
Both the CK subscales "Physics knowledge from school textbooks" (r = 0.32, p < 0.01) and
"Deeper understanding of physics knowledge from a school textbook" (r = 0.34, p < 0.01) correlated
significantly with the explaining performance, with a medium effect size, whereas there was no such
correlation between explaining performance and the more abstract physics knowledge from university
textbooks (r = 0.17, p > 0.05). Of course, the explained topics are from school physics, and maybe the
knowledge of university physics was simply not needed. Also, both explaining and preparation
were limited to 10 min; maybe the student teachers just wanted to focus on the more basic
aspects. On the other hand, the more abstract knowledge of university physics might be regarded as
including the more applied school-related knowledge. We believe it is possible that some
student teachers are not capable of transferring the more abstract physics knowledge to explaining situations,
whereas the more applied knowledge that relates directly to school textbooks can be used. It
may not be an automatic process that someone who is very capable of solving problems
from university physics is also able to use this knowledge to explain physics at a school level. The
use of "physics knowledge from university textbooks" for teaching physics at a school level might be
a topic for further studies.
We want to highlight the most important limitation of our study: it was a correlational study,
not a study with an experimental design, and therefore we cannot simply interpret these relations
as causal relationships. Further studies are needed. Our sample size was rather small, after all, and
provides results that we can only treat as empirically supported hypotheses. We regard our study merely as a
starting point for research that focuses on performance tests. However, performance tests are limited to
one single situation (in our case, explaining) and, therefore, their results do not give holistic information
on teaching quality. Regarding our tests, there may be biases: maybe some pairs of explainer and
explainee simply worked well because they liked one another, so that the explainer's skills were overestimated;
the measurement may have been confounded by sympathy. Still, the setting was much more
standardized than videotaping whole lessons (see the section "difficulties in testing teachers' knowledge
and performance" above). Further studies should develop performance tests for other "standard
situations" of science education, such as conducting an experiment as a demonstration. The relationship
between PCK, CK, and performance quality might differ substantially, especially on a subscale
level. A goal for science education research should be to find out more about what kind of
knowledge and beliefs are actually useful for improving teaching quality. This could provide very useful
evidence on how to develop better teacher education.

Maybe the most important outcome of our study is that we could show the high potential of performance test
instruments for an empirical orientation of teacher education programs at universities. Performance tests
might help pave the way toward a more evidence-based teacher education.

ORCID
Christoph Kulgemeyer http://orcid.org/0000-0001-6659-8170

REFERENCES
Abell, S. (2007). Research on science teachers’ knowledge. In S. Abell & N. Lederman (Eds.), Handbook of
research on science education (pp. 1105–1149). Mahwah, NJ: Lawrence Erlbaum Associates.
Abrahams, I., & Millar, R. (2008). Does practical work really work? A study of the effectiveness of practical work as
a teaching and learning method in school science. International Journal of Science Education, 30(14), 1945–1969.
Aikenhead, G. S. (1988). An analysis of four ways of assessing student beliefs about STS topics. Journal of
Research in Science Teaching, 25, 607–629.
Anderson, J. (1976). Language, memory, and thought. Hillsdale, NJ: Lawrence Erlbaum Associates.
Aufschnaiter, C. V., & Blömeke, S. (2010). Professionelle Kompetenz von (angehenden) Lehrkräften erfassen –
Desiderata [Measuring teachers' professional competence – desiderata]. Zeitschrift für Didaktik der Naturwissenschaften,
16, 361–367.
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191–215.
Barrows, H., & Abrahamson, S. (1964). The programmed patient: A technique for appraising student performance
in clinical neurology. Journal of Medical Education, 39, 802–805.
Baumert, J., & Kunter, M. (2006). Stichwort: Professionelle Kompetenz von Lehrkräften [Keyword: teachers’ pro-
fessional competence]. Zeitschrift f€ur Erziehungswissenschaft, 9, 469–520.
Baumert, J., Kunter, M., Blum, W., Brunner, M., Voss, T., Jordan, A., . . . Tsai, Y. M. (2010). Teachers’ mathemati-
cal knowledge, cognitive activation in the classroom, and student progress. American Educational Research
Journal, 47(1), 133–180.
Berger, R., & Hänze, M. (2015). Impact of expert teaching quality on novice academic performance in the jigsaw
cooperative learning method. International Journal of Science Education, 37(2), 294–320.
Berland, L. K., & McNeill, K. L. (2012). For whom is argument and explanation a necessary distinction? A
response to Osborne and Patterson. Science Education, 96(5), 808–813.
Blömeke, S., Gustafsson, J., & Shavelson, R. (2015). Beyond dichotomies. Competence viewed as a continuum.
Zeitschrift für Psychologie, 223(1), 3–13.
Blömeke, S., König, J., Busse, A., Suhl, U., Benthien, J., Döhrmann, M., & Kaiser, G. (2014). Von der Lehrerausbildung
in den Beruf – Fachbezogenes Wissen als Voraussetzung für Wahrnehmung, Interpretation und Handeln
im Unterricht [Content related knowledge as a prerequisite for teachers' observance, analysis and action]. Zeitschrift
für Erziehungswissenschaft, 17, 509–542.
Brophy, J. (2000). Teaching. Brussels, Belgium: UNESCO International Academy of Education.
Brouwer, C. (2010). Determining long-term effects of teacher education. In P. Peterson, E. Baker, & B. McGraw
(Eds.), International encyclopedia of education (pp. 503–510). Amsterdam, The Netherlands: Elsevier.
Brown, G. (2006). Explaining. In O. Hargie (Ed.), The handbook of communication skills (pp. 195–228). East Sus-
sex, UK: Taylor & Francis.
Carroll, J. (1963). A model of school learning. Teachers College Record, 64(8), 723–733.
Cauet, E., Liepertz, S., Kirschner, S., Borowski, A., & Fischer, H. E. (2015). Does it matter what we measure? Domain-
specific professional knowledge of physics teachers. Revue Suisse des sciences de l’education, 37(3), 462–479.
Chan, K.-W., & Elliott, R. G. (2004). Relational analysis of personal epistemology and conceptions about teaching
and learning. Teaching and Teacher Education, 20(8), 817–831.

Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychologi-
cal Assessment, 7(3), 309–319.
Cleland, A., Abe, K., & Rethans, J. (2009). The use of simulated patients in medical education: AMEE Guide No
42. Medical Teacher, 31(6), 477–486.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum
Associates.
Delaney, S. (2012). A validation study of the use of mathematical knowledge for teaching measures in Ireland.
ZDM Mathematics Education, 44(3), 427–441.
Dorfner, T., Förtsch, C., & Neuhaus, B. (2017). Die methodische und inhaltliche Ausrichtung quantitativer Videostudien
zur Unterrichtsqualität im mathematisch-naturwissenschaftlichen Unterricht [The methodical and content-related
orientation of quantitative video studies on instructional quality in mathematics and science education].
Zeitschrift für Didaktik der Naturwissenschaften. https://doi.org/10.1007/s40573-017-0058-3.

Dubberke, T., Kunter, M., McElvany, N., Brunner, M., & Baumert, J. (2008). Lerntheoretische Überzeugungen von
Mathematiklehrkräften: Einflüsse auf die Unterrichtsgestaltung und den Lernerfolg von Schülerinnen und
Schülern [Beliefs about teaching and learning of mathematics teachers: impact on teaching quality and achievement].
Zeitschrift für Pädagogische Psychologie, 22, 193–206.
Duit, R., Gropengießer, H., Kattman, U., Komorek, M., & Parchmann, I. (2012). The model of educational reconstruc-
tion – A framework for improving teaching and learning science. In D. Jorde & J. Dillon (Eds.), Science education
research and practice in Europe: Retrospective and prospective (pp. 13–37). Dordrecht, The Netherlands: Springer.
Eckel, J., Merod, R., Vogel, H., & Neuderth, S. (2014). Einsatz von Schauspielpatienten in den "Psych"-Fächern
des Medizinstudiums – Verwendungsmöglichkeiten in der Psychotherapieausbildung? [Simulated patients in
"psych"-subjects of medicine – is there a use in educating psychotherapists?] Psychotherapie, Psychosomatik,
Medizinische Psychologie, 64, 5–11.
Enochs, L. G., & Riggs, I. M. (1990). Further development of an elementary science teaching efficacy belief instru-
ment: A preservice elementary scale. School Science and Mathematics, 90(8), 694–706.
Ergöneç, J., Neumann, K., & Fischer, H. (2014). The impact of pedagogical content knowledge on cognitive activation
and student learning. In H. E. Fischer, P. Labudde, K. Neumann, & J. Viiri (Eds.), Quality of instruction in
physics. Comparing Finland, Germany and Switzerland (pp. 145–160). Münster, Germany: Waxmann.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on
structural equation modeling fit indexes. Structural Equation Modeling, 6, 56–83.
Fang, Z. (1996). A review of research on teacher beliefs and practices. Educational Research, 38(1), 47–65.
Fechner, S. (2009). Effects of context-oriented learning on student interest and achievement in chemistry education.
Berlin, Germany: Logos.
Fischer, H., Borowski, A., & Tepner, O. (2012). Professional knowledge of science teachers. In B. Fraser, K. Tobin,
& C. McRobbie (Eds.), Second international handbook of science education (pp. 435–448). Dordrecht, The
Netherlands: Springer.
Fischer, H., Neumann, K., Labudde, P., & Viiri, J. (2014a). Theoretical framework. In H. E. Fischer, P. Labudde,
K. Neumann, & J. Viiri (Eds.), Quality of instruction in physics. Comparing Finland, Germany and Switzerland
(pp. 13–30). Münster, Germany: Waxmann.
Fischer, H., Neumann, K., Labudde, P., & Viiri, J. (Eds.). (2014b). Quality of instruction in physics. Comparing
Finland, Germany and Switzerland. Münster, Germany: Waxmann.
Gage, N. (1968). The microcriterion of effectiveness in explaining. In N. Gage (Ed.), Explorations of the teacher's
effectiveness in explaining, Technical Report No. 4 (pp. 1–8). Stanford, CA: Stanford Center for Research and
Development in Teaching.
Geelan, D. (2012). Teacher explanations. In B. Fraser, K. Tobin, & C. McRobbie (Eds.), Second international hand-
book of science education (pp. 987–999). Dordrecht, The Netherlands: Springer.
Gramzow, Y., Riese, J., & Reinhold, P. (2013). Modellierung fachdidaktischen Wissens angehender Physiklehrkräfte
[Modelling prospective teachers' knowledge of physics education]. Zeitschrift für Didaktik der Naturwissenschaften,
19(1), 7–30.

Harden, R. M., Stevenson, M., Downie, W. W., & Wilson, G. M. (1975). Assessment of clinical competence using objective structured
examination. British Medical Journal, 1, 447–451.
Hattie, J. A. C. (2009). Visible learning. A synthesis of over 800 meta-analyses relating to achievement. Abingdon,
UK: Routledge.
Helmke, A. (2006). Unterrichtsqualität: Erfassen, Bewerten, Verbessern [Teaching quality: Measurement and
improvement]. Seelze, Germany: Kallmeyersche Verlagsbuchhandlung.
Hempel, C., & Oppenheim, P. (1948). Studies in the logic of explanation. Philosophy of Science, 15(2), 135–175.
Hestenes, D., Wells, M., & Swackhamer, G. (1992). Force concept inventory. The Physics Teacher, 30(3), 141–166.
Hill, H. C., Rowan, B., & Ball, D. L. (2005). Effects of teachers’ mathematical knowledge for teaching on student
achievement. American Educational Research Journal, 42(2), 371–406.
Hodges, B., Regehr, G., Hanson, M., & McNaughton, N. (1998). Validation of an objective structured clinical
examination in psychiatry. Academic Medicine, 73(8), 910–912.
Hofstein, A., & Lunetta, V. (2004). The laboratory in science education: Foundations for the twenty-first century.
Science Education, 88(1), 28–54.
Hofstein, A., & Kind, P. M. (2012). Learning in and from science laboratories. In B. Fraser, K. Tobin, & C.
McRobbie (Eds.), Second International Handbook of Science Education (pp. 189–207). Dordrecht, The Nether-
lands: Springer.
Keller, M., Neumann, K., & Fischer, H. (2016). The impact of teachers’ pedagogical content knowledge and moti-
vation on students’ achievement and interest. Journal of Research in Science Teaching, 54(5), 568–614.
Kitcher, P. (1981). Explanatory unification. Philosophy of Science, 48(4), 507–531.
Kulgemeyer, C., & Schecker, H. (2012). Physikalische Kommunikationskompetenz – Empirische Validierung eines
normativen Modells [Physics communication competence – Empirical validation of a normative model]. Zeitschrift
für Didaktik der Naturwissenschaften, 18, 29–54.
Kulgemeyer, C., & Schecker, H. (2013). Students explaining science – Assessment of science communication com-
petence. Research in Science Education, 43, 2235–2256.
Kulgemeyer, C., & Tomczyszyn, E. (2015). Physik erklären – Messung der Erklärensfähigkeit angehender Physiklehrkräfte
in einer simulierten Unterrichtssituation [Explaining physics – Measuring teacher trainees' explaining
skills using a simulated teaching setting]. Zeitschrift für Didaktik der Naturwissenschaften, 21(1), 111–126.
Lee, E., & Luft, J. A. (2008). Experienced secondary science teachers’ representation of pedagogical content knowl-
edge. International Journal of Science Education, 30(10), 1343–1363.
Magnusson, S., Krajcik, J., & Borko, H. (1999). Nature, sources, and development of pedagogical content knowledge
for science teaching. In J. Gess-Newsome & N. Lederman (Eds.), Examining pedagogical content knowledge
(pp. 95–132). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Mayring, P. (2000). Qualitative Content Analysis. Forum: Qualitative Social Research, 1(2), 1–10.
McNaughton, N., Ravitz, P., Wadell, A., & Hodges, B. (2008). Psychiatric education and simulation: A review of
the literature. The Canadian Journal of Psychiatry, 53(2), 85–93.
Miller, G. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65(9), S63–S67.
Nau, J., Halfens, R., Needham, I., & Dassen, T. (2010). Student nurses’ de-escalation of patient aggression: A
pretest-posttest intervention study. International Journal of Nursing Studies, 47, 699–708.
Nunnally, J. (1978). Psychometric theory. New York, NY: McGraw-Hill.
O’Neill, G. P. (1988). Teaching effectiveness: A review of the research. Canadian Journal of Education, 13(1),
162–185.
Osborne, J. F., & Patterson, A. (2011). Scientific argument and explanation: A necessary distinction? Science Edu-
cation, 95(4), 627–638.
Park, R. S., Chibnall, J. T., Blaskiewicz, R. J., Furman, G. E., Powell, J. K., & Mohr, C. J. (2004). Construct valid-
ity of an objective structured clinical examination (OSCE) in psychiatry: Associations with the clinical skills
examination and other indicators. Academic Psychiatry, 28(2), 122–128.

Park, S., & Oliver, J. S. (2008). Revisiting the conceptualisation of pedagogical content knowledge (PCK): PCK as
a conceptual tool to understand teachers as professionals. Research in Science Education, 38(3), 261–284.
Peterson, P., Carpenter, T., & Fennema, E. (1989). Teachers’ knowledge of students’ knowledge in mathematics
problem solving: Correlating and case analysis. Journal of Educational Psychology, 81(4), 558–569.
Pianta, R. C., Hamre, B. K., & Mintz, S. (2012). Classroom assessment scoring system: Secondary manual. Charlot-
tesville, VA: Teachstone.
Prosser, M., & Trigwell, K. (1999). Understanding learning and teaching: The experience in higher education.
Buckingham, UK: SRHE and Open University Press.
Retelsdorf, J., Butler, R., Streblow, L., & Schiefele, U. (2010). Teachers’ goal orientations for teaching: Association
with instructional practices, interest in teaching, and burnout. Learning and Instruction, 20(1), 30–46. https://
doi.org/10.1016/j.learninstruc.2009.01.001
Rienties, B., Lygo-Baker, S., & Brouwer, N. (2013). The effects of online professional development on higher edu-
cation teachers’ beliefs and intentions towards learning facilitation and technology. Teaching and Teacher Edu-
cation, 29, 122–131.
Riese, J. (2009). Professionelles Wissen und professionelle Handlungskompetenz von (angehenden) Physiklehrkräf-
ten [Professional knowledge and competence of physics student teachers]. Berlin, Germany: Logos.
Riese, J., Kulgemeyer, C., Borowski, A., Fischer, H., Gigl, F., Gramzow, Y., . . . Zander, S. (2015). Modellierung
und Messung des Professionswissens in der Lehramtsausbildung Physik [Modeling and measuring professional
knowledge in physics teacher training]. Beiheft der Zeitschrift für Pädagogik, 61, 55–79.
Riese, J., & Reinhold, P. (2012). Die professionelle Kompetenz angehender Physiklehrkräfte in verschiedenen Ausbildungsformen
[Professional competence of student teachers enrolled in different teacher training programmes].
Zeitschrift für Erziehungswissenschaft, 15(1), 111–143.
Rochelson, B., Baker, D., Mann, W., Monheit, A., & Stone, M. (1985). Use of male and female professional patient
teams in teaching physical examination of the genitalia. The Journal of Reproductive Medicine, 30(11), 864–866.
Roelle, J., Berthold, K., & Renkl, A. (2014). Two instructional aids to optimise processing and learning from
instructional explanations. Instructional Science, 42, 207–228.
Ross, J. A. (1998). The antecedents and consequences of teacher efficacy. In J. Brophy (Ed.), Advances in research
on teaching (pp. 49–73). Greenwich, CT: JAI Press.
Sadler, P. M., Sonnert, G., Coyle, H. P., Cook-Smith, N., & Miller, J. L. (2013). The Influence of Teachers’ Knowl-
edge on Student Learning in Middle School Physical Science Classrooms. American Educational Research Jour-
nal, 50(5), 1020–1049.
Schecker, H. (1986). Schülerinteresse und Schülervorstellungen zur Mechanik (S II) [Students' interests and students'
conceptions in upper-secondary mechanics]. Physica Didactica, 2/3, 21–33.
Schiefele, U., & Schaffner, E. (2015). Teacher interests, mastery goals, and self-efficacy as predictors of instruc-
tional practices and student motivation. Contemporary Educational Psychology, 42, 159–171. https://doi.org/
10.1016/j.cedpsych.2015.06.005
Scott, P., Asoko, H., & Driver, R. (1992). Teaching for conceptual change — A review of strategies. In R. Duit, F.
Goldberg, & H. Niedderer (Eds.), Research in physics learning: Theoretical issues and empirical studies (pp.
310–329). Kiel, Germany: IPN.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and
research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499.
Sevian, H., & Gonsalves, L. (2008). Analysing how scientists explain their research: A rubric for measuring the
effectiveness of scientific explanations. International Journal of Science Education, 30(11), 1441–1467.
Shulman, L. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2), 4–14.
Shulman, L. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Education Review, 57(1), 1–22.
Staub, F. C., & Stern, E. (2002). The nature of teachers’ pedagogical content belief matters for students’ achieve-
ment gains: Quasi-experimental evidence from elementary mathematics. Journal of Educational Psychology, 94,
344–355.

Stipek, D., Givvin, K., Salmon, J., & MacGyvers, V. (2001). Teachers’ beliefs and practices related to mathematics
instruction. Teaching and Teacher Education, 17, 213–226.
Terhart, E. (2012). Wie wirkt Lehrerbildung? Forschungsprobleme und Gestaltungsfragen [What is the impact of
teacher education?]. Zeitschrift für Bildungsforschung, 2(1), 3–21.
Toulmin, S. (1958). The uses of argument. Cambridge, UK: Cambridge University Press.
Treagust, D., & Harrison, A. (1999). The genesis of effective science explanations for the classroom. In J. Loughran
(Ed.), Researching teaching: Methodologies and practices for understanding pedagogy (pp. 28–43). Abingdon,
UK: Routledge.
Tschannen-Moran, M., & Woolfolk Hoy, A. (2001). Teacher efficacy: Capturing an elusive construct. Teaching and
Teacher Education, 17, 783–805.
Van Driel, J. H., Verloop, N., & De Vos, W. (1998). Developing science teachers’ pedagogical content knowledge.
Journal of Research in Science Teaching, 35(6), 673–695.
Vinzi, V. E., Trinchera, L., & Amato, S. (2010). PLS path modeling: From foundations to recent developments and
open issues for model assessment and improvement. In V. E. Vinzi, W. Chin, J. Henseler, & H. Wang (Eds.),
Handbook of partial least squares (pp. 47–82). Berlin, Germany: Springer.
Vogelsang, C. (2014). Validierung eines Instruments zur Erfassung der professionellen Handlungskompetenz von
Physiklehrkräften - Zusammenhangsanalysen zwischen Lehrerkompetenz und Lehrerperformanz [Validating an
instrument to measure physics teachers’ professional competence]. Berlin, Germany: Logos.
Wahl, D. (1991). Handeln unter Druck. Der weite Weg vom Wissen zum Handeln bei Lehrern, Hochschullehrern
und Erwachsenenbildnern [Acting under pressure. The long way from knowledge to action for teachers and
educators]. Weinheim, Germany: Deutscher Studien Verlag.
Walters, K., Osborn, D., & Raven, P. (2005). The development, validity and reliability of a multimodality objective
structured clinical examination in psychiatry. Medical Education, 39, 292–298.
Weinert, F. (1996). "Der gute Lehrer", "die gute Lehrerin" im Spiegel der Wissenschaft: Was macht Lehrende wirksam
und was führt zu ihrer Wirksamkeit? [What makes teachers impactful and what leads to their impact?]. Beiträge
zur Lehrerbildung, 14, 141–151.
Weinert, F. (2001). Concept of competence: A conceptual clarification. In D. Rychen & L. Salganik (Eds.), Defining
and selecting key competencies (pp. 45–65). Göttingen, Germany: Hogrefe & Huber.
Williams, D. (2010). Outcome expectancy and self-efficacy: Theoretical implications of an unresolved contradiction.
Personality and Social Psychology Review, 14(4), 417–425.
Wilson, H., & Mant, J. (2011a). What makes an exemplary teacher of science? The pupils’ perspective. School Sci-
ence Review, 93(342), 121–125.
Wilson, H., & Mant, J. (2011b). What makes an exemplary teacher of science? The teachers’ perspective. School
Science Review, 93(343), 115–119.
Wittwer, J., & Renkl, A. (2008). Why instructional explanations often do not work: A framework for understanding
the effectiveness of instructional explanations. Educational Psychologist, 43(1), 49–64.

SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting information tab for this
article.

How to cite this article: Kulgemeyer C, Riese J. From professional knowledge to professional
performance: The impact of CK and PCK on teaching quality in explaining situations. J Res Sci
Teach. 2018;00:1–26. https://doi.org/10.1002/tea.21457
