Couzijn - 1999 - Learning To Write by Observation of Writing

Learning and Instruction 9 (1999) 109–142
Learning to write by observation of writing and

reading processes: effects on learning and
transfer
Michel Couzijn*
Graduate School of Teaching and Learning, University of Amsterdam, Wibautstraat 4, 1091 GM,
Amsterdam, The Netherlands
Abstract
A traditional writing pedagogy, learning-by-doing exercises, is criticised for its lack of focus
on balancing writing and learning processes. Three variants of learning-by-observation are
investigated as possible alternatives: observing writers as models (OW), observing both writers
and readers as models (OWR), and observing readers as feedback on writing performance
(FW).
Observations were made by means of authentic video-tape recordings of student writers or
readers (model conditions), or by live confrontations between writers and their readers
(feedback condition). Training focused on argumentative text. Participants were pre- and post-
tested on reading skill and writing skill in order to measure learning and transfer effects.
Results show that all observation conditions were more effective than the learning-by-doing
condition: OW and FW showed larger learning effects (on writing skill) and larger transfer
effects (on reading skill). Condition OWR only showed larger transfer effects. It is concluded
that the effective components of learning-by-observation deserve to be studied in more detail.
 1999 Elsevier Science Ltd. All rights reserved.
1. Introduction
Traditional methods for writing instruction rely firmly on the effect of doing exer-
cises. Such a method can be summarised as follows:
* Tel. ⫹ 31-20-525.1588/1288, Fax. ⫹ 31-20-525.1290, E-mail: couzijn@ilo.uva.nl
0959-4752/99/$ - see front matter  1999 Elsevier Science Ltd. All rights reserved.
PII: S 0 9 5 9 - 4 7 5 2 ( 9 8 ) 0 0 0 4 0 - 1
110 M. Couzijn / Learning and Instruction 9 (1999) 109–142
1. presenting subject-matter (by the teacher)

2. doing exercises (by the students)
3. evaluating the products and providing feedback (by the teacher).
Because of the accent on doing exercises, we will call this method learning-by-doing.
Several handicaps of such a method can be identified (Couzijn, 1995; Couzijn &
Rijlaarsdam, 1996). The most important are: there is little stimulation to reflect on the
writing process; these are often unclear criteria for “good” and “weak” performance;
evaluation activities are left to the teacher; students hardly detect or remedy the flaws
in their writing; feedback goes out to the product rather than the process. In sum,
it is very hard for writing students to gather information about the quality of their
writing activities, let alone their learning progress.
Considering the relationship between writing activities and learning activities, I
will argue that the cognitive load on attentional processes is so high during writing
that traditional writing pedagogies fail to help students balance their attention
between the writing and the learning task. Several experimental pedagogies are inves-
tigated here which may offer useful alternatives.
1.1. Writing versus Learning-to-Write
Educational psychologists have called attention to the crucial role of self-obser-

vation. Self-observation is related to self-monitoring (perceiving one’s activities dur-
ing task execution), with self-reflection (processing the output of monitoring by
evaluation, abstraction and attribution) or with self-regulation (controlling the task
execution for the sake of effectiveness, based on information from self-observation
and self-reflection) (e.g. Simons & Beukhof, 1987; Kuhl & Kraska, 1989; Vermunt,
1992; Ng & Bereiter, 1992; Schunk, 1991; Schunk & Zimmerman, 1994). Many of
them place their theory in the context of learning processes. However, it is important
to note that for regulation of learning a certain skill depends on regulation of the
executive processes of that particular skill. Task regulation is conditional to the regu-
lation of the learning process for that task (Ng & Bereiter, 1992). By comparing the
processes of executing a task and learning to execute it, I will attempt to clarify the
key mediating function of self-monitoring and self-reflective activities.
We can divide the cognitive activities for performing a writing task into
executional activities aimed at text production (orientation, writing and revising
activities), monitoring activities aimed at acquiring on-line knowledge of one’s actual
task behaviour (self-observation, evaluation and reflection) and regulative activities
aimed at strategic control of the former types of activities, dependent on how they
are being evaluated in the course of the writing (Fig. 1).
In Fig. 1, executive and monitoring activities are placed on levels I and II, respect-
ively.
An effective temporal organisation of these activities is governed by regulative
activities placed on a third level. Straight arrows indicate the flow of information
between activity categories; curved arrows indicate activation prompts. A central
position is given to monitoring and evaluative activities, since they supply the knowl-
M. Couzijn / Learning and Instruction 9 (1999) 109–142 111
Fig. 1. Functional relationships between levels of performance activities.
edge for skilful regulation and thus execution of the entire writing process. Being
self-aware of one’s writing activities and their consequences is an essential step
towards detecting possible flaws in, and possibilities to enhance one’s, writing. Thus,
good writers invest in staying aware of their activities during the course of the writ-
ing process.
How does this perspective on performance regulation relate to learning? Writers
in a learning situation, like students at school, should consider a writing task or
exercise as being part of a learning task. They must execute two processes at the
same time: a writing process (with a material aim: producing a text) and a learning
process (with a cognitive aim: acquiring knowledge about, and skill in producing
such texts). This parallel learning process can be represented with the same mor-
phology as the writing process, including executive activities (orientation on the
learning task, performing learning activities), monitoring activities (self-observation
and evaluation of learning activities) and regulation of learning (e.g. starting over
again, or deciding to skip parts of the exercise). The writing task ought to be instru-
mental to the learning task; thus the quality of learning is dependent on they way
the student is able to connect these two tasks.
This connection lies in the instructiveness of writing experiences evoked in edu-
cation. To be instructive, writing rules, techniques, strategies must not only be
executed, but also be consciously monitored and conceptualised (“What have I been
doing now? How should I call it? Which strategy must I choose? Have I done any-
thing similar before?”), and their positive or negative effects determined (“That strat-
egy has been very time-consuming. This brainstorm gave me a lof of material. This
type of sentences served will in the conclusion.”). In this way, student writers may
use their writing experiences or evaluations (the output of the monitoring processes
on level II) as input for their learning. Students who put some effort into the evalu-
ation of their working-method invest in the meaningfulness and effectiveness of
their learning.
In sum, learning-to-write by doing writing exercises makes a strong appeal to the
learners’ self-observing and self-regulative capacities. The regulation concerns three
aspects of the task: a) students must maintain a “double agenda” with activities aimed
at text production, and activities aimed at learning; b) for each of these agendas they
must effectively alternate executional, monitoring, and regulative activities, and c)
they must control a variety of executive activities for the composition or comprehen-
sion of text.
It is, however, not possible nor productive for learners to be constantly aware of
all of these mental activities. Thus, a permanent awareness is not advocated here: it
takes much cognitive effort of learners to switch between levels of task execution,
self-observation and regulation. As a result, it may be that when learning-by-doing,
the self-observation activities focus on writing performance at the cost of learning
performance: the short-term interest may dominate. If this turns out to be a weak
spot of the learning-by-doing method, compensation must be sought in alternative
methods.
1.2. Learning to write by observation
If we assume that self-monitoring and self-evaluation are important for learning

complex skills, a simple learning-by-doing type of instruction for writing may not
offer sufficient support for many students. While some students are keen on finding
instructive aspects in even simple writing or reading assignments, others will just
go through the motions as it were and fail to observe, let alone improve, their writing
behaviour; mostly because a strategic distribution of attention across writing and
learning levels is beyond their capacities. Such students may profit from more pro-
cess-oriented types of instruction that stimulate their monitoring and evaluation of
writing activities, and thus help to balance their attention between writing and the
learning processes.
In this study, an instructional method is tested which favours observation of peers
as a learning activity over individual exercise. Observation activities are specially
focused on monitoring and evaluation, and this focus may take away some impedi-
ments for learning. Instead of self-observation, the students concentrate on observing
and evaluating other students’ performance. With its accent on observation as a learn-
ing activity, this study adheres to the observational learning or social cognitive learn-
ing paradigms (Bandura, 1977, 1986; Schunk, 1991; Schunk & Zimmerman, 1994).
We discern two functionally different types of observation, which both can be
compared to a learning-by-doing pedagogy. The two types are adapted from studies
by Sonnenschein & Whitehurst (1983); Sonnenschein & Whitehurst, (1984); two
variants of observation-of-models) and by Schriver (1989); Schriver, (1992); obser-
vation-of-readers-as-feedback). Observation of models aims at monitoring and evalu-
ating the executional activities of a task one wants to learn. Evaluation means
determining whether the observed activity is an example worth following or worth
avoiding. In this learning method, the observed behaviour represents the observer’s
learning goal, and the learner does not take part in the observed communication.
Observation-as-feedback, however, relies on the observers’ participation in the
communication. Their initial task execution (here: doing a writing exercise) is fol-
lowed by observation of a communicative partner who performs the complementary
task (here: analysing and comprehending the text). Thus, the observers acquire infor-
mation about the adequacy of their writing which they cannot acquire in another way
(Schriver, 1992). The observed behaviour is complementary (from a communicative
perspective) to the observer’s learning goal, and the observer takes part in the com-
munication.
Both learning-by-observation methods may be superior to learning-by-doing,
although for different reasons. Observation-of-models may be effective because the
models offer concrete, realistic examples, and the focus is on monitoring and evaluat-
ing the execution processes. Observation-of-feedback may be effective because the
feedback is meaningful and authentic and the observer is personally involved. There
is no theory favouring one of these methods over the other.
In analogy to Sonnenschein and Whitehurst’s study, we included two variants of
the observation-of-models pedagogy: observing one communicative role (writers)
and observating both communicative roles (writers and readers, who perform a com-
plete communicative transfer). We expected that communication rules, observed in
both writing and reading contexts, will be acquired in a more abstract way which
enables active use in either of these modes. In other words, the acquired rule will
be transferred to the reading as well as the writing mode.
In sum, this experiment investigates the effectiveness of three functionally differ-
ent types of learning-by-observation in comparison to learning-by-doing. Two vari-
ants of observation-of-models are examined: Observing Writers as models (OW),
and Observing Writers and Readers as models (OWR). One variant of observation-
as-feedback is examined: observing readers as Feedback on one’s own Writing per-
formance (FW).
We distinguish effects on learning and effects on transfer. Learning stands for the
acquisition of skill within the same mode as the learning activities are aimed at:
transfer stands for acquisition of skill within the complementary mode. Since all
activities aim at writing skill acquisition, we call the effect on writing ability learning,
and the effect on reading skill transfer.
2. Research questions and expectations
2.1. Variable structure and research questions
Fig. 2 represents the structural relations between the independent variables, and
indicate the research questions:
Fig. 2. Structure of independent variables in this experiment.
2.1.1. Questions regarding learning effects

Q1: Is observing writing (OW) more effective than doing writing exercises
(DW)?
Q2: Is observing reading as feedback on writing (FW) more effective than
doing writing exercises (DW)? (Expectation: both types of learning-by-
observation are more effective.)
Q3: Is observing writing and reading (OWR) more effective than doing
writing exercises (DW)? (Expectation: no difference (Sonnenschein and
Whitehurst, (1984)
Q4: Is observing writing and reading (OWR) more effective than observing
writing (OW)? (Expectation: no difference (Sonnenschein & Whitehurst
(1984))
2.1.2. Questions regarding transfer effects

Q5: Does observation of writing (OW) promote more transfer than doing
writing exercises (DW)?
Q6: Does observation of readers as feedback on writing (FW) promote more
transfer than doing writing exercises (DW)? (Expectations for Q5 and
Q6: observation promotes less context-specific learning and thus more
transfer (but see Sonnenschein & Whitehurst, 1984).
2.2. Operationalisation
2.2.1. Independent variables

The experimental conditions were created by manipulating one variable: the
instructional method. Content of the writing lessons and time spent on studying sub-
ject matter and on doing exercises were kept equal for all conditions. Learning-by-
doing was operationalised as individually doing various structured writing exercises,
as part of a four-lesson course on argumentative text. Learning-by-observation-of-
models (one mode) was operationalised as observing peer students who thought aloud
while performing these writing exercises. Learning-by-observation-of-models (both
modes) was operationalised as first observing a writer (thinking aloud while doing
a writing exercise) and then observing a reader (thinking aloud while analysing the
writer’s text). Learning-by-observation-as-feedback was operationalised as first writ-
ing an argumentative text, then observing a reader (a peer student who thought aloud
while analysing the text), and evaluating one’s own performance based on the obser-
vation.
2.2.2. Dependent variables

There are two dependent variables. Learning is considered the increase in writing
skill regarding argumentative texts. Transfer is the progress within the complemen-
tary mode: reading argumentative text.
We distinguished several indicators of argumentative writing and argumentative
reading, derived from the content and learning goals of the argumentative writing
course (Appendix A). These indicators were measured in the posttests. We had no
specific expectations as to which of these abilities would profit most from the experi-
mental interventions. The productive and receptive activities are considered comp-
lementary, in that they reflect how writers and readers co-operate in order to make
their communication successful. The writer’s knowledge, which students acquire,
may be generalised and transferred to the reading mode (Appendix B).
In selecting subject-matter for the argumentative writing course, we chose to join
the pragma-dialectical perspective on argumentation advocated by Van Eemeren &
Grootendorst (1983); Van Eemeren & Grootendorst, (1992)). Argumentation is
explicitly framed within the social situation of a (critical and problem-solving) dis-
cussion. We expect that this social perspective with its distinct, but related communi-
cative roles allows for meaningful and useful integration of receptive and productive
skills. The strategy a writer applies in structuring his text can be reversed and used
by the reader to analyse this structure. For instance, if the writer has used verbal
markers for his argumentation (“First... second....”) then readers may use such mark-
ers to enhance their analysis. Or readers may look for a clear presentation of the
issue (“the main problem is that...”) which the writer has intentionally formulated
in this way.
3. Method
3.1. Design
In this pretest-posttest control-group design, the learning by doing group was

denominated as control group. Pretests, assessing intelligence and skill in reading
and writing argumentative text, were used as covariates in the analyses of posttest
data in order to correct for initial group differences. Random assignment of subjects
to groups had kept these differences small (Table 1).
The experimental conditions differ in learning activities. Posttests, however, are
the same for all conditions: measurements of both writing and reading skill regarding
argumentation. For the DW, OW and FW conditions, the writing posttests measure
learning (because training and testing are within the same mode) and the reading
posttests measure transfer to the complementary communication mode.
The position of the OWR condition is different, since its training aims equally at
both modes; thus the writing and reading posttests will measure learning for each
of these modes. We do not assume the existence of between-mode transfer effects
in addition to the learning effects within each mode. Even if such transfer occurred,
it would probably sink into insignificance beside the learning effects.
3.2. Subjects
120 students from 8 city schools who had just finished the 9th grade (intermediate
and high levels) took part in the experiment. The average age was 15.5 years. 65%
of the participants were female; boys and girls were almost equally spread over the
conditions. For their participation, the students received a modest financial reward.
In the assignment of subjects, a stratification was applied regarding gender and
the level of education (intermediate vs. high); further assignment across the strata
was random. In each condition precisely 12 students from intermediate level and 18
from high level took part.
Table 1
The experimental design
Condition n Pre-tests Learning activities Post-tests
DW 30 x Doing Writing exercises x

OW 30 x Observing Writers as models x
OWR 30 x Observing Writers and Readers as x
models
FW 30 x Observing Readers as Feedback on x
Writing
3.3. Training materials: experimental courses on argumentative texts
A four-lesson course on argumentative writing was developed, which served as a

basic course for each condition. Adaptation to each of the four conditions was done
by adding a particular type of exercises that would evoke specific learning activities
(see below).
The short, self-instructing course consisted of four cumulative lessons of one hour
each. All subjects could work individually without a teacher’s help. There was a
workbook for each lesson, containing theory and specific exercises. The theory was
divided in small parts of 1 ⫺ 1 12 page and was explained using realistic examples.
The course was pre-tested for comprehension and duration. Formulations were
changed and one exercise was dropped.
3.4. Instructional sequence and types of exercises
The theory on argumentative texts forms the backbone of the four different courses
that were developed for the experimental conditions. Nevertheless, the subjects spent
about 70% of the time on the exercises in which the theory must be applied. The
nature of a course as a learning-by-doing course or a learning-by-observation course
is therefore not at all determined by the theory, but only by the type of exercises.
First, subjects in all conditions study the same theoretical part, and subsequently
answer one or two “control questions”. These questions ask for the gist of the part
that has just been studied and are intended to stimulate active reading of the theory.
Next, subjects apply the theory in one of four different types of exercises: individ-
ual writing exercises (DW), observation of writers (OW) or communicative dyads
(OWR), or observation as feedback on writing exercises (FW). After completing one
or more exercises, subjects continue with the next portion of theory, the next control
question, the next exercise, and so on.
The differences between the types of exercises can be explained by an example
from the first lesson. First, in the theoretical part, the two characteristics of argumen-
tative text have been introduced: a stated opinion, and one or more reasons for having
this opinion:
I think we should go to Italy for our holidays, because the whether is always
fine and the food is great.
You must really put the volume of your music down. I cannot work with all that
noise in my ears.
Thus, concept learning takes place (Mayer, 1983, 1987): subjects learn the concept
argumentative text, and learn to identify a text as belonging to a particular type,
according to a conceptual rule:
S and A are the essential parts/properties of a type B text
Such neutral conceptual rule can be used in either receptive or productive communi-
cation. It would have to be adapted to receptive tasks, aimed at determining the
properties from which class membership is inferred:
⬍ if (goal ⫽ typify text as type B)

then (actions ⫽ (identify characteristic A) and (identify characteristic B)) >
Its counterpart, the productive form, can be used in writing tasks for generating
activities:
⬍ if (goal ⫽ produce text type B)
then (actions ⫽ (produce characteristic A) and (produce characteristic B)) >
For further discussion, see Couzijn (1995). After the subjects reproduced the charac-
teristics on paper (control question), they start with the first exercise.
3.4.1. DW (learning by doing writing exercises)

The subjects from the DW condition do the following assignment:
Check again the three examples on page 2 and then write three new examples
of argumentative texts.
DW subjects must use the rule productively. In the workbooks a limited space is
reserved for the answer, so they must confine themselves to application of the rule.
More specifically, they must give meaning to the abstract concepts “opinion” and
“reason for having this opinion”, aided by the examples. Secondly, they must under-
stand that both characteristics are necessary to meet the rule: opinions only, however
floridly presented, will not suffice. Finally, they must generate concretisations of the
characteristic concepts.
3.4.2. OW (observation of writing)

OW subjects do not have to do the writing exercises themselves. Instead, they
observe age-group students doing theses exercises. Authentic videotape recordings
of age-group students are used for the observations. The observed students think-
aloud during writing, so the observer can closely follow the writing process. Immedi-
ately after the writing assignment is presented on paper to them, their observation
assignment runs:
You are going to see two students doing this assignment. It is your task to find
out what they do well, and what they do wrong. When you have observed both
students, you may advance to the next page.
And on the next page it reads:
You saw two students doing the assignment. They wrote the following texts:
— Student 1: “I don’t need a dog any more, because I already have three.”
— Student 2: “Dogs are more fun than cats, but they need much more attention.”
⫽ ⫽ ⫽ > > > Which student did better, according to you? Student.....
⫽ ⫽ ⫽ > > > Explain briefly why you think the other student did worse.
— Student..... did worse, because: …………………………………………
The subjects get oriented to the observation exercise by reading the writing assign-
ment. Next they are explicitly instructed to evaluate the observed students’ task per-
formance, which should stimulate engaged and therefore instructive observation.
Observation thus holds that the subject checks the application of the rule by the
observed students — and which problems may arise.
After having observed two different student writers (see section Procedures) the
subject must determine if one of them did worse, and explain what exactly made
this performance less successful. In this way the subjects are forced to designate
“good models” and “worse models”, and not take anything for granted.
It should be noted that the subjects in this condition not only observe writing
processes, but also perform comprehension processes. In order to evaluate the texts
of the observed writers, they must analyse them in terms of the argumentative charac-
teristics. This side-note is important for the explanation of transfer effects.
3.4.3. OWR (observation of writers and readers)

Whereas the evaluation in the previous condition is focused on only one communi-
cative role, for the OWR subjects it is aimed at both roles. They observe a complete
communicative process: the construction of a text by a writer and the reconstruction
of the writers’ intention by a reader. Immediately after the writing assignment is
presented to them, their exercise runs like this:
The first student you will observe is the writer who was instructed to:
“Write a short argumentative text”
After (s)he wrote the text, the second student or reader was asked to:
“Determine if this is an argumentative text. Tell us why.”
Now you are going to observe both the writer and the reader. It is your task to
find out whether each of them does well, and what they may be doing wrong.
When you have observed both students, you may advance to the next page.
And on the next page it reads:
You saw two students doing writing and reading assignments. They answered:
— Writer: I will enjoy reading this book, because the introduction pleases me.
— Yes, that is an argumentative text. She gives her opinion about the book.
Explain briefly on which aspects the communication was successful or not.
—- Did the writer do well? O Yes O No because: ………………………..
—- Did the reader do well? O Yes O No because: ………………………….
OWR subjects must divide their attention between the two communication modes.
More than for the other subjects it may become visible for them how strongly writing
and reading — or the construction and reconstruction of meaning — are related
through the use of a conceptual rule for ’argumentative text’. The subjects evaluate
writers and readers by their use of this rule; or, more precisely, of the two variants
of the conceptual rule mentioned above. Because of this varied representation of the
rule, the theoretical element may become more flexible and therefore more readily
transferable to both reading and writing.
3.4.4. FW (reader observation as feedback for writers)

In the previous observation conditions, information was gathered about successful
and unsuccessful reading and writing processes. By having the subjects focus on
process evaluation, it is expected that learning takes place by imitation of good
examples and avoidance of worse examples.
In the present condition, the observations serve a different goal. FW subjects start
with application of criteria in a writing exercise, then hand over the text to an age—
group reader, and observe this reader while (s)he performs an analysis task by the
same criteria.
FW subjects do the writing task in the same manner as a DW subject. But instead
of using all practice time for writing exercises, only 50% is spent on writing, and
50% is spent on observing readers’ as feedback. The total time-on-task is the same
for these conditions.
An FW subject also has to process the two variants of the conceptual rule: the
productive variant while writing and its receptive counterpart while following the
observed reader’s activities. The observation is aimed at discovering potential com-
prehension problems for readers, attribution of these problems to the quality of the
text, and proposing possible remedies.
Check again the three examples on page 2, and write a new example of argumenta-
tive text. When you finish your text, you present it to the reader.
And before presenting it to the reader:
Now you will see a student analysing your text while (s)he is reading aloud. It
is the reader’s task to find out: a) whether your text is argumentative or not, and
b) which part represents the opinion, and which the reasons for it
Observe this reader’s performance. Don’t interrupt. It is your task to check if the
reader can fluently perform these tasks, and if not, to find out why not.
And after the observation:
Now answer these questions:

a) Has the reader recognised your text as argumentative? O No O Yes
b) Could the reader correctly identify the opinion and thereasons for it? O No
O Yes
c) Can you change the text in order to make it (even) clearer for the reader?
O No O Yes, namely ………………………………..
3.4.5. The videotape recordings

Videotape recordings of students executing writing and reading processes are
required for the OW and OWR conditions. Recordings were made on another school,
with students of the same age as students in the sample. These were not staged, but
authentic. 16 students did do all four lessons in front of the video camera, while
thinking-aloud whenever they had to do an exercise. Microphones helped to get the
necessary clear, comprehensible speech.
Thus, many successful and less successful task executions were collected from
which we could choose in editing the tapes for the experimental sessions. These
were edited such that for almost every exercise two different processes or solutions
were to be observed. This would provoke active interest from the observer s: they
would have to choose the best from the two realistic solutions to the task.
3.5. Comparing learning activities in the DW, OW, OWR and FW conditions
We can make a comparison between the four conditions with respect to the type
of cognitive activities they require. By doing so, it becomes clearer to which differ-
ences in activities we may attribute possible differences in effectiveness (Table 2).
We see that the orientation and reflection steps in the exercises are the same for
each condition. In these steps, the learner’s cognition is construed (making an initial
Table 2
Learning activities in the four experimental conditions
DW: “Write a OW: “Check OWR: “Check FW: “Write a

correct whether P writes whether P writes correct
argumentative text” the argumentative the text correctly argumentative text
text correctly” and whether Q and check whether
analyses it Q analyses it
properly” correctly”
ORIENTATION Reading, same as DW same as DW same as DW

interpreting and
conceptualising the
coding rule:
B→S⫹A
TASK Concretising: Observe writers 1a: Observe writer; same as DW
EXECUTION writing S and concretising by 2a: Observe reader
matching A writing S and
matching A
MONITORING & Self-evaluation Observe writing 1b: Comment on Guided self-
EVALUATION and give evaluative writer; evaluation by
comments 2b: Comment on reader observation
reader
REFLECTION Accommodate: same as DW same as DW same as DW
change or confirm
conception of
coding rule or
constituting
elements
conceptualisation of the rule in the first step) or accommodated (changed or con-

firmed as a result of experiences in the exercise). The steps in between are practice
steps in which the execution of a linguistic process is either undertaken or observed,
and evaluated. The monitoring and evaluation activities are supposed to yield the
information that is the basis for learning. By invoking these learning activities in
the four conditions, we attempt to find if there are any differences in learning associa-
ted with them.
3.6. Testing materials
The posttests are aimed at the measurement of the dependent variables, as oper-
ationalised in four indicators (see Section 3). The quality of posttest measurement
was enhanced by adding pretests scores in a covariance analysis, filtering out poten-
tial disturbing effects, such as pre-experiment differences in abilitiy between groups.
3.6.1. Posttests for learning (writing ability) and transfer (reading ability)
The four indicators for writing skill (1a to 1d in Appendix A) were measured in
three posttests; the four matching indicators for reading skill (2a to 2d) were meas-
ured in three other posttests. The indicators for reading and writing skill match (e.g.
1a with 2a; 1b with 2b etc.) in that they build on the same knowledge or content
of the lessons. Some indicators are repeatedly measured in more than one posttest.
In Appendix A, second column, it is reported how the indicators were measured.
Writing indicators 1a, 1b, 1c (simple) and 1d were measured in two posttests (W1
and W3). Only indicator 1c (complex) required a specific test (W2). There were
no specific expectations as to which of the abilities would profit most from the
experimental interventions.
Reading indicators 2a, 2b, 2c (complex) and 2d were measured by having the
students perform an analysis of two larger texts (R2 and R3). In the other reading
test (R1), indicator 2c (simple) was measured. It consisted of multiple choice items
in which argumentation had to be identified and categorised.
3.6.2. Pretests (covariates) for IQ, initial writing ability and initial reading ability
Covariate analysis requires pretest measurement of relevant variables, which may
be — unintentionally — included in the posttest measurement and influence the
experimental effects. In this case, the initial skill level in reading and writing argu-
mentative text is relevant but, however, rather low (in the lower streams at school
there is not much systematic attention for argumentative texts) which impedes
measurement with the same instrument as used to measure the level after the training.
We have therefore not measured all indicators, and added three pretests for “intelli-
gence” as an alternative explanatory factor for differences in posttest performance.
Only the indicators 1a, 1b and 1c were measured in the writing pretests (PW1–
PW3), by asking the subjects to write two short argumentative essays, in which two
given standpoints had to be defended on the basis of some documentation.
Only indicator 2c was measured in a reading pretest (PR1): students had to do
two multiple choice tests, in which argumentation had to be identified (but not yet
categorised).
Added were three pretests for intelligence (IQ1–IQ3). Analysis of argumentation
has been considered as an ability to discern abstract relations between verbal units
(Oostdam, 1991); therefore we chose two validated CMR tests (“Conclusions”
(Elshout, 1966) and “Verbal analogies” (DAT, 1984)) and one CMU test (test Word
list (DAT, 1984).
The variable list can now be summarised as follows:
pretest posttest
1a) Writing — social context: PW1 W1D
1b) Writing — text structure: PW2 W1A, W3B
1c) Writing — argumentation structure — W3
Writing — argumentation structure — W1C, W2
(complex):
1d) Writing — means for presentation: PW3 W1B, W3A
2a) Reading — social context: — R3 ABCD
2b) Reading — text structure: — R2D, R3F
2c) Reading — argumentation structure PR1 R1
(simple):
Reading — argumentation structure — R2C, R3E
(complex):
2d) Reading — means for presentation: — R2AB, R3G
iq1 Intelligence — CMR Conclusions: IQ1 —
iq2 Intelligence — CMU Word list: IQ2 —
iq3 Intelligence — Verbal Analogies: IQ3 —
3.7. Procedures
For each subject, participation in the experiment took place in two four-hour ses-
sions over two days. On the first day, the pretests were administered during the first
two hours, after which students followed lesson 1 and lesson 2. On the second day,
the course continued with lesson 3 and lesson 4. After lesson 4, the posttests were
administered during the last two hours. We varied the order in which the six writing
and reading posttests were made, so that no test systematically followed another.
All subjects from conditions DW, OW, OWR and FR worked individually from
a workbook, in which theory and exercises were combined. Condition FW required
some co-operation between the subjects, so these sessions were limited to small
groups only. Students were informed about the time every fifteen minutes, so they
would not be surprised by a sudden deadline. The video-conditions were also con-
trolled by the length of the videotape (57–63 minutes) and an on-screen timer dur-
ing viewing.
3.7.1. Learning-by-doing condition (DW)

Sessions of the learning-by-doing conditions were rather straightforward. After
groupwise administration of the pretests, subjects worked individually from a work-
book in a normal tempo until the hour was over. The time estimation of one hour
appeared to be sufficient.
3.7.2. Learning-by-observation and evaluation (OW and OWR)

The learning-by-observation and evaluation sessions were more complicated. Sub-
jects had a workbook and pen at their disposal, and were seated facing a personal
videoplayer with headphones. At the start of each lesson, a tape would be inserted
and started. By means of an on-screen timer and on-screen messages, the subject
was informed about how much time was left for each acitivity: reading a piece of
theory in the workbook, answering a control question, and doing a reading or writing
exercise. Short beeps made the student aware that an observation exercise was
approaching; this to make sure that the fragments would not be missed. After the
video fragments, there was ample time to write down comments on the observed per-
formances.
In this way, the time spent on the various parts of the lessons was highly con-
trolled. Because the tempo was not too high and the tape could be stopped if neces-
sary, this method appeared not to entail more stress for the subjects than the other
conditions.
3.7.3. Learning-by-observation-as-feedback (FW)

The procedure for the FW condition was very different. Subjects of the FW con-
ditions worked in triads: one student, the feedback supplier, who did not take part
in these four conditions, served as a test reader for two FW writers, who came in
turns to offer their texts for analysis.
All subjects worked from their workbooks. After an FW writer had completed a
writing exercise, the text was presented to the test-reader. The writer was seated
opposite the test reader in order to observe the reading. The reader had a predeter-
mined amount of time to complete the reading/analysis task. After that time, it was
the second writer’s turn to have his/her text analysed. The triads thus worked accord-
ing to a strict time schedule. They timed themselves using a stopwatch and a working
schedule. The turn-taking went on throughout the course and was supervised by a
research assistant.
4. Results
The results of this study will be presented in two parts. First we will report on
the instrumentation for the measurement of pre- and posttest variables. Quality
assessment is necessary because the instruments differ in nature and length, and
because most of them were constructed for the purpose of this study and thus not
tried out elsewhere. We will also pay attention to pretest scores when reporting and
discussing their inter-correlations. Differences in pretest scores between groups are
not statistically tested, since we will only use them as covariates in the analysis of
posttest data. In such a covariance analysis, the posttest scores must be regressed
first on the relevant covariating pretest scores; then the part of the posttest score that
can be safely attributed to the pretest is subtracted from the posttest scores, and a
variance analysis is performed on the corrected posttest scores. To this end, quantitat-
ive relations between the posttests are tested and discussed using a correlation matrix.
In the second part, the research hypotheses will be statistically tested, and a report
is given on the MANOVA’s performed on the posttest data using the relevant pretest
data as covariates. The first of the two sections in this part is about the learning
measures, the second about the transfer measures.
4.1. Instrumentation
We will list the instruments used for pre- and posttest measurement, give a short
description and some psychometric data: number of items (standard and rejected after
item-analysis using Ritem-total3 0,15 as a criterion) and homogeneity (after removal of
non-fitting items).
Indicators 1b, 1d, 2b, 2c (complex) and 2d were measured with more than one
test. The relevant parts of the tests were taken together in the analysis. The psycho-
metric data reported in this table is based on the two collapsed parts.
The quality of each test is indicated by its homogeneity (reliability) and other
aspects of validity. We must confine ourselves to the assessment of homogeneity,
since we have no other validity indicators than face-validity.
The homogenity of the tests (Cronbach’s alpha without the rejected items) is in
most cases sufficient ( > 0.60), with the exception of the pretest measurement of
indicator 1a and the posttest measurements of indicators 2a and 2d. It is not surprising
that these tests all have low numbers of items. When corrected for test length (with
the Spearman-Brown formula), the reliability of these tests falls within acceptable
ranges.
Items with an item-total correlation ⬍ 0.15 were rejected from the tests: they may
have an unclear or ambiguous formulation (thus functioning as trap questions) or an
extraordinarily high p-value, which could not discriminate between overall high scor-
ers and overall low scorers (Table 3).
With the help of a pretest-posttest correlation table (Appendix C), we can deter-
mine which pretest variables correlate significantly with their corresponding posttest
scores, and thus may function as covariates. One should be careful not to take up
too many covariates in the covariance-analytical model, since they decrease the
degrees of freedom in the final analysis of variance, and thus test power; even though
they might not filter out any undesirable variance from the posttest scores.
It can be read from the correlation table (Appendix C) that none but one of the
pretests correlates significantly with the posttest that was to measure the same con-
struct. Only the (parallel) pre- and posttest measures of indicator R3 (single) corre-
late. Other theoretically related pre- and posttests do apparently not measure the
same construct. A possible explanation is that students acquired really very new
knowledge, and their behaviour in coping with argumentative texts has changed dras-
tically, compared to how they wrote and read before the experiment. In sum, only
one of the reading and writing pretests will function as covariate.
In the upper half of Appendix C one can see that the average intercorrelation of
126
Table 3
Number of Items, number of rejected items, and reliability scores for pre- and posttests
Pre-test # items #items alpha Post-test # items #items alpha correlation

total rejected total rejected pre-post
1a) Writing - social context: 1A 6 1 0.52 W1D 3 0.69 0.134

1b) Writing - text structure: 1B 18 3 0.61 W1A, W3B 9 2 0.67 0.261
1c) Writing - argumentation struct. (simple): — W3 11 1 0.82
Writing - argumentation struct. (complex): — W1C 12 3 0.67
W2 32 0.88
1d) Writing - means for presentation: 1D 6 1 0.60 W1B, W3A 14 2 0.71 0.118
2a) Reading - social context: — R3ABCD 4 0.59
2b) Reading - text structure: — R2D, R3F 14 3 0.82
2c) Reading - argumentation struct. (simple): 2Ca 33 3 0.83 R1 48 4 0.93 0.513
Reading - argumentation struct. (complex): 2Cb 20 5 0.76 R2C, R3E 19 2 0.82 0.072
2d) Reading - means for presentation: — R2AB, R3G 18 0.56
iq1 Intelligence - CMR ’Conclusions’: IQ1 40 1 0.86 —
iq2 Intelligence - CMU ’Word list’: IQ2 75 10 0.91 —
M. Couzijn / Learning and Instruction 9 (1999) 109–142
iq3 Intelligence - CMR ’Verbal Analogies’: IQ3 50 1 0.87 —

posttest measures within each of the modes (both reading and writing) is much higher
than the average intercorrelation between the modes. Thus, we cannot consider the
operationalizations of each dependent variable (the indicators) as independent. For
this reason we will use multivariate analysis of variance (MANOVA), a technique
that takes the mutual influence of dependent variables into account. Since our design
consists of only one independent variable (type of practice), we will perform a multi-
variate one-way analysis of variance.
The lower half of the table is used to select the covariates that can be included
in the analysis. Note that the first IQ-pretest does not correlate with any posttest
variable, while the second and specially the third do, with two and with seven post-
tests respectively. A covariate will only be included in the analysis of posttest meas-
ures with which it statistically and theoretically related.
The mean pretest scores for the conditions do not show relevant differences (see
Appendix D), as could be expected from the random assignment of subjects. No test
for differences was executed, since we chose to use pretest data as covariates in
posttest analysis: this procedure removes external influences at the individual level
rather than at the group level.
From the pretest intercorrelation table (Table 4) it can be concluded that the three
IQ subtests do not measure the same components of intelligence. IQ3 (verbal
analogies) is obviously an outsider, while IQ1 (logical operators) and IQ2 (word list)
share only little variance. Each of these measures may be independently added as
covariates to the analyses of posttest data, insofar they share significant variance
with these tests.
The high correlation between IQ3 and PR1 (the identification of argumentative
relations) is remarkable. A general skill like the identification of abstract semantic
relations might underlie the strong relation between the two tests. However, this
must remain speculative.
The last conclusion is that the two tests for argumentation analysis or reading do
not feed on the same cognitive skill. Mere identification of argumentative relations
Table 4
Correlations between pretest scores
Corr. IQ1 IQ2 IQ3 PW1 PW2 PW3 PW4 PR1
IQ1 1.0000
IQ2 0.2735** 1.0000
IQ3 0.0235 0.1510 1.0000
PW1 0.0638 ⫺0.0027 0.1320 1.0000
PW2 0.0447 0.0489 0.0160 0.1316 1.0000
PW3 ⫺0.0217 ⫺0.0367 ⫺0.0092 0.3168** ⫺0.2763** 1.0000
PW4 0.0347 0.0138 ⫺0.0180 0.3685** 0.0033 0.3697** 1.0000
PR1 0.0485 0.1980* 0.9244** 0.1298 0.0381 ⫺0.0399 0.0131 1.0000
PR2 ⫺0.0781 0.0130 0.0172 0.0390 ⫺0.0631 ⫺0.0152 0.0595 ⫺0.0568
(n ⫽ 119) ( * ⫽ p ⬍ 0.01, ** ⫽ p ⬍ 0.001).

(PR1) is different from the reconstruction of more complex hierarchical structures

(PR2).
4.2. Posttests data analysis: effects on learning and transfer
Table 5 contains the mean posttest scores and standard deviations for each con-
dition.
In contrast to the pretest scores, there is much variance in mean posttest scores.
The groups apparently differ on most of the measures. On the other hand, the within-
group variance is considerable in comparison to the difference in mean scores. There-
fore it must be determined by a MANOVA whether the between-group differences
can be generalised. We test the hypotheses in multivariate procedures because the
Table 5
Means and standard deviations for posttest scores across conditions
Posttests for WRITING

CONDITION W1 soc.context W2 argu. simple W3-1 W3-2 W4
(text structure) (argu. complex) presentation
DW: Learning by 1.51 (2.11) 7.51 (5.82) 21.51 (8.52) 16.55 (7.17) 4.13 (2.85)
Doing Writing
Exercises
OW: Learning by 4.13 (2.32) 9.34 (5.63) 27.44 (8.06) 22.65 (5.82) 8.89 (3.53)
Observation (1
mode)
OWR: Learning by 3.58 (2.35) 8.75 (5.16) 26.03 (8.05) 20.27 (7.92) 5.65 (3.65)
Observation (2
modes)
FW: Learning by 3.79 (2.09) 8.51 (4.68) 26.55 (7.51) 22.55 (6.83) 8.72 (3.82)
Observation as
Feedback
Max. score: 6 14 30 32 12
Posttests for READING:

CONDITION R1 soc.context R2 argu. simple R3-1 R3-2 R4
(text structure) (argu. complex) presentation
DW: Learning by 5.72 (3.25) 8.97 (5.69) 23.32 (8.15) 8.58 (5.50) 5.36 (1.82)
Doing Exercises
OW: Learning by 8.06 (2.64) 17.26 (4.59) 27.65 (10.05) 24.27 (4.62) 12.08 (2.17)
Observation (1
mode)
OWR: Learning by 8.75 (3.18) 15.54 (5.05) 28.62 (9.74) 21.44 (5.43) 12.54 (2.80)
Observation (2
modes)
FW: Learning by 8.72 (2.21) 18.10 (4.84) 29.34 (8.75) 24.84 (4.39) 12.26 (1.96)
Observation as
Feedback
Max. score: 12 22 43 34 18
dependent variable writing skill is made up of several correlating variables (the

indicators). Separate testing would camouflage this correlation.
4.2.1. Learning effects

We will start with answering the research questions regarding learning effects on
writing skill. Table 6 shows the results of the MANOVA procedure.
Research question Q1 asks for the comparison of conditions OW versus DW:
is learning-by-observing task executions more effective than learning-by-doing the
execution yourself? The mean score of the observation condition is significantly
higher than the mean of the individual practice conditions. As expected, learning-
by-observing turns out to be more effective.
Research question Q3 compares learning-by-doing to observations of both writer
and reader. A MANOVA (on line 3 in the table) shows that no experimental effect
can be found of observing “both roles”; the groups scored alike on writing skill. This
is remarkable, because subjects in the OWR condition have had 50% less practice on
writer’s tasks since they were forced to balance their time between observation of
readers as well as writers. In analogy of the findings of Sonnenschein & Whitehurst
(1984), our expectation was neutral either.
Research question Q4 asks for a direct comparison of the OW and OWR con-
ditions, to know whether observation of two roles enhances or impedes learning
compared to one role. The MANOVA procedure (fourth line) shows that there is no
difference in writing skill resulting from the conditions; again in agreement with
our expectations.
Students in the FW condition write, observe readers, and revise. The transfer effect
that the “reader observation” may yield, raised research question Q2: is learning-to-
write by observation of readers useful compared to learning-by-doing writing exer-
cises? The second line in the table shows that observation of readers as feedback is
indeed more succesful. Note that the content of the tasks, and the time-on-task has
been kept constant across the conditions.
In summary, when it comes to learning to write argumentative texts, the experi-
mental learning activities OW and FW offer an advantage in comparison with the
more traditional DW. Condition OWR is at least not second to DW. Thus, learning
to write argumentation seems best served with either observation-of-writers or feed-
back-from-readers.
Table 6
MANOVA tests for between-group differences in ‘learning to write’. In the statistical design, all five
indicators are included in the construct ‘writing skill’
Question Conditions Posttest Covariates n F Signif. Test
Q1 DW - OW Writing IQ3, PR1 59 2.67 ⬍ 0.04** one-sided

Q2 DW - FW Writing IQ3, PR1 59 2.93 ⬍ 0.02** one-sided
Q3 DW - OWR Writing IQ3, PR1 59 1.76 0.08* two-sided
Q4 OW-OWR Writing IQ3, PR! 59 1.13 0.17 two-sided
* ⫽ significant at ␣ ⫽ 0.05, one-sided testing** ⫽ significant at ␣ ⫽ 0.05, two-sided testing.

4.2.2. Transfer effects

The last two research questions (Q5 and Q6) are concerned with inter-modal trans-
fer. Sonnenschein & Whitehurst (1983); Sonnenschein & Whitehurst, (1984)) found
that transfer between the productive and receptive modes was hard to induce; only
participants placed in the role of observer/evaluator were able to transfer their experi-
ences to new tasks. We have repeated this experiment with another age group, and
quite a different task, without the external feedback that was necessary for Son-
nenschein and Whitehurst to make the intervention effective.
Transfer effects are also expected from the engaged observation group FW,
because of the precise match between writing performance and reader observation,
which may be optimal for translating encoding rules into decoding rules (as explained
before). The writing method of Schriver (1991); Schriver, (1992)) writing method,
on which this pedagogy is based, has established learning effects but did not yet
focus on transfer effects.
Neither of the groups had been told in advance that the posttest would include
writing as well as reading tasks.
For the present purpose we chose MANOVA procedures again to determine if
between-group differences exist on the transfer task. The only difference is now that
the writers’ reading skill was assessed in the posttest. The results are in Table 7.
Question Q5 asks whether learning-by-observing students doing writing exercises
leads to higher transfer to the opposite mode than learning-by-doing these exercises.
Although some counter-evidence can be found in the studies of Sonnenschein &
Whitehurst (1984), it was suggested that observation may have a beneficial effect
on transfer, because the observation activities rely to a large part on analysis and
understanding of the observed process. In Table 7 we see that OW subjects outscore
DW subjects with respect to reading skill; observation activities have apparently
prepared students better for reading tasks. Question Q5 is answered positively.
Q6 addresses transfer effects of observing one’s communicative partner: readers
instead of writers. Observation of readers appears to be highly beneficial; not only
did it afflict the student’s writing skill (Q2) but in Table 7 we see that it also enhances
their reading.
No transfer effects have been assessed or presented in the OWR condition. It was
argued that in the OWR condition, practice is not limited to one mode. Because
complete communication transactions are observed, the OWR students observed as
many readers as writers. They had to collect information about the quality of writing
Table 7
MANOVA tests for between-group differences in ‘transfer to reading’. The design includes all five indi-
cators of the construct ‘reading skill’
Question Conditions Posstest Covariates n F Signif. Test
Q5 DW-OW Reading IQ2, IQ3, PR1 59 3.34 ⬍ 0.01* one-sided

Q6 DW-FR Reading IQ2, IQ3, PR1 60 3.12 ⬍ 0.02* one-sided
* ⫽ significant at a ⫽ 0.05, one-sided testing.

and reading processes. According to our definitions of learning and transfer, we must
consider the increased reading or writing skill that results from these observations
as learning. Thus we can, and do not, pay attention to any, maybe embedded, transfer
effects between the two learning effects.
We should nevertheless compare the OWR score on the reading posttest to the
transfer-to-reading score of the DW condition. As we have seen, their scores on the
writing posttest are equal. A difference on the reading posttest would be an important
argument to favour one condition over the other. A MANOVA shows a large signifi-
cant result (F ⫽ 3,64; p ⬍ 0.01; n ⫽ 29) in favour of the OWR group in comparison
with DW. So we must conclude that, although learning-by-doing and learning-by-
observing both roles have equal learning effects, the latter method deserves more
credit because its transfer effects are larger. Its hidden strength lies in the learners’
ability to adapt their knowledge to complementary situations: the reading mode.
In summary, we found that transfer from writing practice to reading skill was
promoted more by two types of learning-by-observation than by learning-by-doing
activities. At this point we can only establish that more transfer takes place; in order
to establish how much more we must use a kind of quantification which 1. enables
incorporation of the five indicators into one construct writing skill or reading skill,
and which 2. is informative in that it expresses the achieved amount of transfer in
relation to some meaningful criterion.
5. Summary and discussion
The aim of this experiment was to test a theory about effective learning activities
for writing instruction; in this case regarding argumentative text. It was expected
that two instances of observational learning would yield higher learning effects on
writing and higher transfer effects on reading. The rationale for the learning activities
has been presented in Section 2.
The expectations were experimentally put to the test, using a full — between
pretest — posttest design, in which four groups of thirty 15-year old high school
students took part. The four treatments consisted of short experimental courses aimed
at learning to write argumentative text. The presented subject-matter was the same
for each group, but the learning activities varied systematically: doing writing exer-
cises (DW), observing writers (OW), observing both writers and their readers
(OWR), and doing a writing exercise and observing a reader as feedback (FW). After
a pretest session and four one-hour training sessions, the same set of posttests for
reading and writing skill were administered to all participants.
Multivariate analysis of variance was used in order to test the hypotheses regarding
learning effects, using the instructional method as an independent variable and a set
of five indicators for writing skill as a complex dependent variable. The hypotheses
about transfer effects were tested in the same way, with a set of five indicators for
reading skill.
The main findings are that both types of learning-by-observation (observation-as-
model and observation-as-feedback) are more effective than learning-by-doing
(research questions Q1 and Q2). Two variants of observation-as-model were dis-

tinguished: observing only writers, and observing writers and their readers (a com-
plete communicative transactions). It was found that learning-by-observation of writ-
ers and their readers was equally effective as learning-by-doing (Q3), but that
observing readers adds an important additional effect: a much stronger transfer to
reading (Q5).
Moreover, observing complete writing-reading processes is equally effective as
observing only writing processes (Q4). Finally, learning-by-observation of your own
readers as feeback promotes more transfer than learning-by-doing (Q6).
Part of this experiment can be seen as a replication of the study of Sonnenschein &
Whitehurst (1984). They also found transfer of the observation role to the ability to
perform the observed tasks themselves. In their study, however, the transfer from
observation of one role to performance in the complementary mode was not strong,
while a fair amount of such transfer is found in the present study. A difference is
that in our study, the observations are not sanctioned by a teacher who supplies oral
feedback. Moreover, the group of learners and the subject matter is very different.
A second main finding is that between-mode transfer can be observed in all of
the conditions, although transfer is strongest in the observation conditions. This is
most likely due to the many clues that the students have at their disposal to link the
writing and reading situation. They have situational clues (a different task, but in
the same time and place and with the same training assistant and materials) and
clues regarding content (the same concepts and terminology). It would indeed be
very strange if the learning-by-doing condition did not transfer at all.
In sum, observation of task execution processes turns out to be an effective learn-
ing method, both for observation-of-models (Sonnenschein & Whitehurst, 1983) as
for observation-as-feedback (Schriver, 1989). It also turns out to be an efficient
method for learning to communicate, because the observations promote transfer of
the acquired skill. It must also be noted that in these observations, that what must
be observed (comprehension and construction processes) is related to the act of
observing itself (an active process of comprehension and re-construction); this may
be a key factor in the effects we have found.
5.1. Validity
The experiment has been designed in such a way that several alternative expla-
nations have been ruled out. The experimental groups can be considered comparable,
the time-on-task was equal for all students, no teacher could have influenced the
results since all courses are self-instructive, the students were equally motivated by
a small reward, the treatments and the tests correspond for every condition because
the subject matter was the same for everyone.
Nevertheless, criticism of the validity of the results is possible. For instance, an
important difference between the learning-by-doing and the learning-by-observation
conditions is that the former are very familiar for the student and the latter not. It
may be that the novelty of observations, the use of video, the observation of live
models, was more interesting for the participants, which can partly account for the
experimental effects. On the other hand, research assistants have noticed both
enthusiastic and tedious reactions from all students while they did the tests or worked
in their workbooks. Tediousness was not necessarily greater in a particular condition,
although we have not checked this. An indication may be that the number of not
completed workbooks or not completed tests (a possible symptom of disinterest)
does not vary across the groups. It must also be added that a more active attitude
may be specific for learning activities that call for special attention, such as obser-
vations and evaluations.
Due to requirements of the organisation, workings conditions were not equal for
all conditions. Subjects in the learning-by-doing group worked individually, while
seated in a large room with 3 to 8 people at a table, leaving more than enough space
to work. It was not allowed to co-operate or to converse during the lessons. Subjects
in the model condition, who had to use a videoset, were seated in a middle-size
room with a table for themselves. Only six persons were at the same time in the
room. Subjects in the feedback condition worked in a large room, with only two
writers, the proof-reader, and the research assistant present. If group size influences
performance, then this has worked to the advantage of the feedback condition. On the
other hand, these subjects had to cope with more organisational problems (walking to
and from the proof-reader, keeping a very strict time schedule).
There are also weaknesses in the experimental and statistical design of the study.
In the first place, the pretest-posttest design is not genuine, since the pre- and post-
tests are not equal. We had to use different pre- and posttests, because two quite
different levels of mastery had to be reliably measured without bottom- or ceiling
effects. The pre- and posttests therefore aimed at the same main skills (aspects of
reading and writing argumentative text), but different subskills may be assessed.
This is related to the problem of the covariates. Pretests were included in the
design to enable covariance analysis which would filter out undesirable effects in
the posttest measurements. However, the majority of the pretests did not correlate
with posttests that were aimed at the same construct. It is uncertain what the pretests,
which in themselves are sufficiently homogeneous, have measured. It is anyway
unwise to use non-correlating pretests as covariates, so we have left them out.
There are some threats to the external validity too. Due to the organisation of the
experiment, the posttests were administered almost immediately after the training
had taken place. We can therefore not be certain about the durability of the results.
On the other hand, it was not our aim to develop long-lasting skills for the students,
but to find an answer to our research questions about effective learning activities.
Durability will be an important feature in real educational settings, but did not have
the highest priority for this study.
A similar threat to external validity may be that the pretests were administered
immediately before the first training session. The pretests did not need much instruc-
tion and explanation, but it remains possible that they influenced the prior knowledge
or attitudes of the participating students. Since all participants started with the pre-
tests, it will not account for between-group differences after the experiment. Never-
theless it can put limitations on the generality of the findings, if the pretests served
as primers to make effective observations. We have no reason to assume such prim-

ing effect for the observation conditions only.
The use of the words reading and writing is problematic. In the course of this
article the word analysing is sometimes used as a synonym for reading. These terms
are not true synonyms, because realistic reading contains more than analysis; and
real writing is more than combining sentences and inserting conjunctions. On the
other hand, the kind of technical reading and writing that is taught at school is also
more limited than what students do at home. Analytical activities are anyway
important in process-oriented instruction, because the overall process must be ana-
lysed so it can be stepwise learnt, observed, or evaluated, or performed.
This leads us to the limited scope of the learning goals we have attempted to
obtain. A course in argumentative reading or writing is always too narrow — because
argumentation is such an all-embracing, widely applicable, and last but not least
important and highly valued subject. A four-lesson course in argumentation must,
then, be extremely narrow. For our research purpose, we have merely attempted to
instruct students in a set of basic skills for argument analysis or construction. Of
course, our students will have brought along a certain level of argument control when
they entered our project: argumentation may be new to them as a school subject, but
they do have intuitive knowledge on reasoning and arguing in everyday life. The
level attained at the end of the experiment will reflect some of that prior knowledge.
We may have ensured that this entrance level was the same for all conditions and
will not have influenced our research results. But the attained level of mastery must
not be attributed to the short training only.
5.2. Final considerations
Although in this experiment learning-by-observation turns out to be more effec-

tive, we should not use the results to discredit learning-by-doing in general. We
believe that learning-by-doing remains indispensable in, and essential to language
skill education — but that it is not the only true or most obvious method. We have
contrasted this learning method to three types of observation. We chose to focus on
the writing and analysis of argumentative text; these skills were comparably new
and difficult for the students. In such a situation, the students will be in need of
good examples (models) who demonstrate what the behavior-to-be-acquired is like,
along with examples demonstrating the pitfalls to avoid; pitfalls which they, as com-
parable novices, are likely to make too.
In this situation, learning-by-observation showed to be advantageous. However,
once a basic cognitive level of knowledge and skill has been acquired, the need to
proceduralise the knowledge and making it more flexible arises (Salomon & Glober-
son, 1987; Salomon & Perkins, 1989; Anderson, 1990). This calls again for learning-
by-doing activities. Those activities may profit from a prior observation stage,
because criteria for self-evaluation have become more explicit.
In the end, we expect most of a well-balanced interplay of learning-by-observation
and learning-by-doing activities. Each of these methods has its qualities and disad-
vantages. It is up to the educator to compose learning programs in which the qualities
are combined and the drawbacks compensated. That the qualities of learning-by-
observation deserve to be studied in more detail, is what we hope to have demon-
strated.
Acknowledgements
The author wishes to thank two anonymous reviewers for their constructive and
instructive comments.
Appendix A. Indicators for and measurements of the dependent variables
Indicators for writing ability:
Indicator: Measurement/scoring:
1a) Writing — social context: the These items are scored in an

ability to put the text in its social argumentative text that the subjects had
context, by making the issue explicit, to write on a given subject. For
the defended viewpoint (positive, instance, all items are present in the
negative, neutral), the other parties in following introduction: “Do you think
the discussion and their viewpoints. that pupils should give grades to their
These concepts were taught in lesson 1. teachers? Well, the teachers certainly
don’t think so. But I think it is a great
idea.”
1b) Writing — text structure: the Every component of the standard
ability to organise argumentative texts argumentative structure is scored in
using a standard structure: Introduction texts written by the subjects: For
(attractor, issue, parties and their instance, a good attractor is: “Did you
standpoints), Body (standpoint, see Channel 3 last night? If you did,
argumentation, refutation) and Ending you may have gotten just as angry as
(conclusion, most important me? Why? I’ll tell you all about it”.
argumentation, consequence or
exhortation). These components are
taught in lesson 2.
And a good exhortation: “If you agree
with me, do something useful, and write
a letter to the director of our school.”
1c) Writing — argumentation Simple argum.: based on argumentation
structures: the ability to express structures like: (1) We really should go
hierarchical argumentation structures in home (1.1) Your mother is waiting
prose, which can be reconstructed by (1.2) You have become pretty

readers; both simple (singular, tiredstudents had to write normal
subordinate, compound) argumentation paragraphs: “Your mother is waiting
and complex (any combination of these and, besides, you have become pretty
argumentation types). This ability was tired, so we really should go home
practised in lesson 3 now.” Every correctly connected
argument was scored. For complex
argumentation, students had to write on
the basis of larger argumentation
structure, consisting of various types of
argumentation.
1d) Writing — means for Texts written by subjects were scored
presentation: the ability to apply verbal on these three types of verbal means.
means to enhance the presentation of For instance, when applied to the
argumentation and of text structure: — sentence: “The cupboards are all empty;
the use of paragraphing to highlight it’s time to do shopping” the use of
textual organisation; — use of markers standpoint and argumentation markers
to indicate specific textual parts; — use may aid to the comprehensibility,
of connectors to explicitize the relation namely: “I think the cupboards must be
between textual parts. These concepts pretty empty, because it is time again to
were taught in lesson 4. do shopping”
Indicators for reading ability:
Indicator: Measurement/scoring:
2a) Reading — social context: the Subjects must indicate the social
ability to identify certain concepts in the parameters in several argumentative
text, by which the text can be placed in texts. An example: “They are too lazy
the social context of an argumentative to work!”. That is what I keep hearing
discussion; (See 1a) above for this set when I ask people what we should do
of concepts.) about the growing army of the
unemployed ( ⫽ attractor). I find more
and more people talking about the
question whether the labour act of 1963
shouldn’t be sharpened ( ⫽ issue).
There is quite some disagreement: our
government seems to be quite fond of
the idea, and the Parliament has reacted
rather moderate — but hasn’t
condemned the plan either ( ⫽ other
parties ⫹ standpoints). Personally I feel
little sympathy for a change of law, and
I will gladly explain why” ( ⫽ defended

viewpoint).
2b) Reading — text structure: the In the posttest, students analyse two
ability to analyse argumentative texts in texts (400 and 500 words) using the
terms of a standard structure, which same structural components as presented
asks for specific subdivisions of above under 1b). The texts have been
introduction, body and ending specially constructed for the purpose,
(components: see above under 1b). which makes the job doable. Each
analysed component is scored.
2c) Reading — argumentation For simple argumentation, students made

structures: the ability to reconstruct the a m.c. test in which they had to analyse
hierarchy of argumentation written in singular, compound, and subordinate
prose, as expressed by the writer. Both argumentation. Example: “That old
simple argumentation and more complex newspaper is handy to set the barbecue
structures (see above under 1c). to fire, so I think we really shouldn’t
throw it away. And we can use it to
wrap up the flowers, too!” If the text
contains argumentation, underline the
arguments and indicate the type
(singular-compound-subordinate). For
complex argumentation, the students
analysed larger piece of prose, resulting
in a structure that contained all types of
arguments (ca. 10 arguments). Each
analysed argument was scored.
2d) Reading — means for In two texts, the subject had to

presentation: the ability to identify categorise paragraphs, and identify
verbal means by which the presentation textual markers and argumentative
and thus the comprehensibility of connectors. “In my opinion, changing
argumentation and of text structure has the law is certainly worth it, because the
been enhanced (see above under 1d). unemployment in a country should be as
low as possible, for the simple reason
that we shouldn’t waste manpower and
we shouldn’t waste money”.
Appendix B. Learning objectives for the short course in “argumentative

texts”
Writing Reading skill Theoretical

skill content
Lesson 1 Expliciting social Recognition of “Argumentative texts and

para-meters of the the social discussions”Concepts: —
dis-cussion: issue, parameters of the standpoint (opinion): positive,
parties, discussion in the negative, neutral; argument
communicative text (reason); argumentative text;
goal issue; discussionIn an
inductive fashion, the concept
of argumentative texts is
explained by means of its
constituting elements:
standpoints and arguments
(opinions and reasons for
having these opinions). The
genre of argumentative texts
is placed in the social context
of discussions aimed at
resolving a dispute, which
centers around the
acceptability of a certain
proposition (the ‘issue’).
Lesson 2 Composing a Analyzing the text “The structure of

well-structured structure on the argumentative texts”
text on the basis basis of a model
Presentation of a rhetorical
of a model
model: Introduction : request
for attention; issue at stake;
parties and standpoints; Body:
author’s standpoint, pro-
argumentation, refutation of
counter-argumentation; Ending
: conclusion; most important
arguments; consequence.
The global text structure
consisting of introduction,
body and ending is specified
for argumentative texts. The
function of each subpart is
discussed in relation to the

discussion goal. Various
examples help to give
meaning to the concepts.
Lesson 3 Writing on the Analyzing “The argumentation in
basis of simple complex argumentative texts”
and complex argumentation and
Presentation and discussion of
argumentation its simpler
singular argumentation;
structures constituents
compound argumentation;
subordinate argumentation;
and the complex
argumentative structures of
which these are the
constituents.
A simple notation system for
schematizing complex
structures is also taught.
Lesson 4 Applying various Identification of “The presentation of
means for means for argumentative texts”
presentation of presentation of
Three means for the
text elements text elements
clarifyication of the text
structure: paragraphing of the
rhetorical model; using verbal
structure markers;
argumentative connectors.
It is demonstrated how each
of these three means is helpful
in recognizing or expressing
the global text structure, the
parts that make up this
structure, or the complex
structure of the pro-
argumentation.
140
Appendix C
Pretest-posttest correlation table
Correls.: W1 W2 W3-1 W3-2 W4 R1 R2 R3A R3B R4
W1
W2 0.1017
W3-1 0.4556** ⫺0.2602**
W3-2 ⫺0.0054 0.0763 ⫺0.0741
W4 0.2996** 0.0734 0.4283** 0.0383
R1 0.1183 ⫺0.0227 ⫺0.0496 0.0849 0.1770*
R2 0.0890 0.1780 0.0166 0.1411 0.1909* 0.4523**
R3A 0.2166* 0.0191 0.1133 0.1226 0.1549 0.0584 0.1925*
R3B 0.0377 0.0342 0.0733 0.1479 0.2029* 0.4269** 0.6523** 0.1560
R4 0.1036 0.0826 ⫺0.0167 0.1117 0.1137 0.4297** 0.6342** 0.2199* 0.5773**
IQ1 0.0147 ⫺0.0751 0.1010 ⫺0.0972 0.1223 0.0963 0.1994 ⫺0.0141 0.1724 0.1470
IQ2 ⫺0.1163 ⫺0.0250 ⫺0.0392 0.0231 0.0296 0.1271 0.2423** 0.0611 0.2905** 0.1355
IQ3 0.2420** 0.0277 0.1432 0.3670** 0.2108* 0.1541 0.4268** 0.4514** 0.2995** 0.4311**
PR1 0.1897 ⫺0.0064 0.1601 0.3275** 0.2192* 0.1918* 0.4310** 0.4495** 0.3278** 0.3369**
PR2 0.0345 0.0805 0.1608 0.0205 0.0761 0.0692
⫺0.0988 ⫺0.0314 ⫺0.0342 ⫺0.0155

PW1 0.0968 ⫺0.0632 0.1291 0.1347 0.1579 0.0717 0.0649 0.0374 0.1518 0.0494
PW2 0.1293 0.2102 0.1130 ⫺0.0496 0.0775 0.0501 0.0906 ⫺0.0109 0.0978 ⫺0.0211
PW4 0.0672 0.0839 ⫺0.1044 ⫺0.0464 0.0911 0.1292 0.0350 ⫺0.0181 0.1113 0.0147
# of 1 0 0 2 2 1 3 2 3 2
covariates:
* ⫽ p ⬍ 0.01 ** ⫽ p ⬍ 0.001
Appendix D
Means and standard deviations for pretest scores across conditions
Pretests for IQ: Pretests for

READING:
CONDITION: IQ1 IQ2 IQ3 PR1 PR2

CMR-test CMU-test CMR-test argu. argu.
simple complex
Learning by Doing Exercises 19.53 8.92 46.50 7.78 22.51 10.19 22.10 6.15 5.55 2.77
Learning by Observation (1 mode) 20.60 7.22 50.55 7.82 23.31 10.01 22.03 4.72 6.06 3.18
Learning by Observation (2 20.79 7.66 52.31 11.33 25.13 10.86 23.13 6.03 4.58 2.89
modes)
Learning by Observation as 18.82 6.93 52.55 9.71 25.51 10.17 23.62 5.57 4.00 2.29
Feedback
Max. score: 40 75 50 33 20
Pretests for WRITING:
CONDITION: W1 W2 W4
soc. Context text presentation
structure
Learning by Doing Exercises 2.82 2.22 9.30 5.45 3.82 1.76

Learning by Observation (1 mode) 4.92 1.52 9.81 3.31 5.03 2.13
Learning by Observation (2 3.75 1.70 9.89 3.46 4.62 2.37
modes)
Learning by Observation as 3.85 1.26 8.13 3.53 4.20 2.57
Feedback
Max. score: 8 18 6
141
References
Anderson, J. R. (1990). Cognitive psychology and its implications (3rd ed.). New York: Freeman.
Bandura, A. (1977). Social Learning Theory. Englewood Cliffs, N.J.: Prentice Hall.
Bandura, A. (1986). Social foundations of thought and action: A social-cognitive theory. Englewood
Cliffs, N.J.: Prentice Hall.
Couzijn, M. J. (1995) Observing writing and reading processes. Effects on learning and transfer. Amster-
dam: Dissertation University of Amsterdam.
Couzijn, M. J. & Rijlaarsdam, G. C. W. (1996). Learning to write by reader observation and written
feedback. In G. Rijlaarsdam, M. Couzijn & H. v.d. Bergh (Eds.), Effective teaching and learning of
writing. Amsterdam: Amsterdam University Press.
Elshout, J. J. (1966). Conclusies. Amsterdam: Psychological Laboratory, University of Amsterdam.
(Internal publication).
Kuhl, J. & Kraska, K. (1989). Self-regulation and metamotivation: Conceptual mechanisms, development,
and assessment. In R. Kanjer, P.L. Ackerman & R. Cudeck (Eds.), Abilities, motivation, and method-
ology. Hillsdale, N.J.: Lawrence Erlbaum.
Mayer, R. E. (1983). Thinking, problem solving, cognition. New York: Freeman
Mayer, R. E. (1987). Educational psychology. A cognitive approach.. S.L.: Harper Collins.
Ng, E., & Bereiter, C. (1992). Three levels of goal-orientation in learning. The Journal of Learning
Sciences, 1(3), 243–273.
Oostdam, R. J. (1991). Argumentatie in de peiling. Amsterdam: SCO. Dissertation University of Amsterd-
am.
Salomon, G., & Globerson, T. (1987). Skill may not be enough: The role of mindfulness in learning and
transfer. International Journal of Educational Research, 11, 623–637.
Salomon, G., & Perkins, D. N. (1989). Rocky roads to transfer: Rethinking mechanims of a neglected
phenomenon. Educational Psychologist, 24(2), 113–142.
Schriver, K. A. (1989). Evaluating text quality: the continuum from text-focused to reader-focused
methods. IEEE Transactions on professional communication, 32, 238–255.
Schriver, K. A. (1991). Plain language through protocol-aided revision. In E. R. Steinberg (Ed.), Plain
Language: Principles and Practice. Detroit, MI: Wayne State UP, 148-172.
Schriver, K. A. (1992). Teaching writers to anticipate readers’ needs: a classroom-evaluated pedagogy.
Written Communication, 9(2), 179–208.
Schunk, D. H. (1991). Learning Theories. An Educational Perspective. New York. etc.: Merril/Macmillan.
Schunk, D. H. & Zimmerman, B. J. (1994). Self-regulation of learning and performance. Issues and
educational applications. Hillsdale, N.J.: Lawrence Erlbaum.
Simons, P. R. J. & Beukhof, G. (1987). Regulation of Learning. ’s-Gravenhage: SVO-Selecta.
Sonnenschein, S., & Whitehurst, G. J. (1983). Training referential communication skills: The limits of
success. Journal of Experimental Child Psychology, 35, 426–436.
Sonnenschein, S., & Whitehurst, G. J. (1984). Developing referential communication: A hierarchy of
skills. Child Development, 55, 1936–1945.
Van Eemeren, F. H. & Grootendorst, R. (1983). Argumentatieleer 1. Het analyseren van een betoog.
Groningen: Wolters-Noordhoff.
Van Eemeren, F. H. & Grootendorst, R. (1992). Argumentation, communication, and fallacies. Hillsdale,
N.J.: Lawrence Erlbaum.
Vermunt, J. D. H. M. (1992). Leerstijlen en sturen van leerprocessen in het hoger onderwijs. Lisse: Swets
and Zeitlinger.

Couzijn - 1999 - Learning To Write by Observation of Writing

Uploaded by

Copyright:

Available Formats

You might also like

Couzijn - 1999 - Learning To Write by Observation of Writing

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Couzijn - 1999 - Learning To Write by Observation of Writing

Uploaded by

Copyright:

Available Formats

Learning and Instruction 9 (1999) 109–142

Learning to write by observation of writing and

* Tel. ⫹ 31-20-525.1588/1288, Fax. ⫹ 31-20-525.1290, E-mail: couzijn@ilo.uva.nl

1. presenting subject-matter (by the teacher)

1.1. Writing versus Learning-to-Write

Educational psychologists have called attention to the crucial role of self-obser-

Fig. 1. Functional relationships between levels of performance activities.

1.2. Learning to write by observation

If we assume that self-monitoring and self-evaluation are important for learning

2. Research questions and expectations

2.1. Variable structure and research questions

Fig. 2. Structure of independent variables in this experiment.

2.1.1. Questions regarding learning effects

2.1.2. Questions regarding transfer effects

2.2.1. Independent variables

2.2.2. Dependent variables

In this pretest-posttest control-group design, the learning by doing group was

Condition n Pre-tests Learning activities Post-tests

DW 30 x Doing Writing exercises x

3.3. Training materials: experimental courses on argumentative texts

A four-lesson course on argumentative writing was developed, which served as a

3.4. Instructional sequence and types of exercises

⬍ if (goal ⫽ typify text as type B)

3.4.1. DW (learning by doing writing exercises)

3.4.2. OW (observation of writing)

And on the next page it reads:

3.4.3. OWR (observation of writers and readers)

And on the next page it reads:

3.4.4. FW (reader observation as feedback for writers)

And before presenting it to the reader:

And after the observation:

Now answer these questions:

3.4.5. The videotape recordings

DW: “Write a OW: “Check OWR: “Check FW: “Write a

ORIENTATION Reading, same as DW same as DW same as DW

conceptualisation of the rule in the first step) or accommodated (changed or con-

3.6. Testing materials

3.7.1. Learning-by-doing condition (DW)

3.7.2. Learning-by-observation and evaluation (OW and OWR)

3.7.3. Learning-by-observation-as-feedback (FW)

Pre-test # items #items alpha Post-test # items #items alpha correlation

1a) Writing - social context: 1A 6 1 0.52 W1D 3 0.69 0.134

iq3 Intelligence - CMR ’Verbal Analogies’: IQ3 50 1 0.87 —

Corr. IQ1 IQ2 IQ3 PW1 PW2 PW3 PW4 PR1

(n ⫽ 119) ( * ⫽ p ⬍ 0.01, ** ⫽ p ⬍ 0.001).

(PR1) is different from the reconstruction of more complex hierarchical structures

4.2. Posttests data analysis: effects on learning and transfer

Posttests for WRITING

Posttests for READING:

dependent variable writing skill is made up of several correlating variables (the

4.2.1. Learning effects

Question Conditions Posttest Covariates n F Signif. Test

Q1 DW - OW Writing IQ3, PR1 59 2.67 ⬍ 0.04** one-sided

* ⫽ significant at ␣ ⫽ 0.05, one-sided testing** ⫽ significant at ␣ ⫽ 0.05, two-sided testing.

4.2.2. Transfer effects

Question Conditions Posstest Covariates n F Signif. Test

Q5 DW-OW Reading IQ2, IQ3, PR1 59 3.34 ⬍ 0.01* one-sided