

THE EFFECT OF AN INTERVENTION FOCUSED ON ACADEMIC LANGUAGE ON CAF

MEASURES IN THE MULTILINGUAL WRITING OF SECONDARY STUDENTS

Imaz Agirre, A., Arias-Hermoso, R. & Ipiña, N. (2024). The effect of an intervention focused on

academic language on CAF measures in the multilingual writing of secondary students. International

Review of Applied Linguistics in Language Teaching. https://doi.org/10.1515/iral-2023-0137

Abstract

The present study aims to explore the effect of an experimental intervention based on academic writing

instruction and scientific argumentation on the argumentative multilingual writing of secondary school

students. Complexity, accuracy, and fluency (CAF) measures were used to evaluate the texts. A quasi-

experimental study with a pre-test/post-test design was carried out with a control group (n=49) and an

experimental group (n=63) of Basque-Spanish bilingual Year 8 students. The students composed

scientific argumentative texts before and after a science unit was taught. Participants in the experimental

group received instruction on academic writing and the discourse aspects of argumentation. The corpus

of 678 texts was processed using MultiAzterTest and CAF measures were retrieved. Repeated measures

ANOVAs were used to compare pre-test and post-test results. The control group exhibited a significant

decrease in some fluency, syntactic complexity and accuracy measures, while the experimental group

showed a significant improvement in some syntactic complexity and accuracy measures. These results

suggest that the experimental intervention might have had a positive impact on written CAF measures.

This study emphasises the importance of teaching academic language in multilingual contexts.

Keywords: CAF, academic language, academic writing, multilingualism, intervention, MultiAzterTest.

1. Introduction and literature review

Multilingual writing is a relatively new area in second language (henceforth, L2) writing research

(Rinnert and Kobayashi 2016). Extensive research has been carried out on multilingual students’

language learning in different linguistic skills and areas, such as overall speaking or writing skills, but

most of this research has been carried out from a monolingual perspective (Cenoz and Gorter 2011),

and consequently, there remains a lack of understanding of students’ multilingual writing skills

(Granados et al. 2022). Our definition of multilingual writing is in line with that of Rinnert and

Kobayashi (2016:365), who define it as “the ability to write in two or more languages, so is intended to

subsume bilingual writing ability or biliteracy. The study of multilingual writing involves comparison

of writing by the same writers in more than one language.” As previous language experiences affect

subsequent language learning (Cenoz 2013; Cummins 2021), education needs to consider the whole

linguistic repertoire of students to foster learning. This is especially important in multilingual or

bilingual education contexts in which the language of instruction is not the first language (henceforth,

L1) or dominant language of the majority of the students, an aspect that requires careful consideration

so that students achieve balanced literacies in all of their languages (Lorenzo and Rodríguez 2014;

Lorenzo and Trujillo 2017).

Although multilingual education programmes and second language (L2) acquisition have received

attention recently, multilingual writing remains under-researched (Granados et al. 2022). Thus, this

quasi-experimental study seeks to explore how multilingual writing is affected by an intervention

focused on scientific argumentation and academic language in the language of instruction of

multilingual students (in this case, Basque). To this end, we analysed subject-specific argumentative

compositions written in Basque, Spanish and English by a control group and an experimental group of

multilingual secondary education students enrolled in an immersion programme. This study aims to

contribute to an under-researched field within multilingual acquisition, specifically the development of

multilingual writing after experimental instruction.

1.1. Academic and disciplinary literacies

The linguistic characteristics of academic settings differ from those of everyday communication

(see Cummins 1980, 2021). In fact, academic discourse has often been characterised by its higher

complexity (Dalton-Puffer 2007), containing more complex linguistic resources, such as

nominalisation, the advanced use of cohesion devices, or abstract terminology from the disciplines

(Granados et al. 2022). Students, therefore, need to master academic discourse to be able to succeed in

academic disciplines in subject-specific ways following disciplinary conventions (Shanahan and

Shanahan 2008; Moje 2015). Furthermore, different disciplines shape the linguistic features required

for knowledge construction and communication, as some linguistic features are more prevalent in

certain disciplines. This is evident at different linguistic levels, such as terminology or discourse.

Regarding the lexical level, for example, the social sciences and humanities tend to use more abstract terms, while science and technology rely on more concrete language (Durrant 2017). Discourse-related

disciplinary specifics are also presented in terms of subject-specific cognitive discourse functions

(CDFs) (Dalton-Puffer 2013, 2016), which can be understood as discourse patterns that represent certain

cognitive actions and that arise in specific educational contexts, such as defining, arguing, hypothesising

and categorising. For example, in the subject of history, causal explanations are central for historical

narratives (Coffin 2006; Lorenzo 2017), whereas physical descriptions are more frequent in geography

(Llinares et al. 2012). These discourse-level patterns inherently require using distinctive

lexicogrammatical elements (e.g., past tenses and temporal connectives in historical narratives, or

descriptive adjectives and locators in geography).

Many of these subject-specific conventions appear in the form of written texts, as writing is a

fundamental practice of language learning (Manchón and Polio 2021) and of education overall (Christie

2012). For successful education, the learner’s home languages and all the languages of instruction need

to be taken into account in the development of literacies (Lorenzo and Trujillo 2017), as many students

in multilingual programs study through a language that is not their L1 (Cenoz 2009; Cenoz and Gorter

2011). Previous research has shown between-language commonalities in the development of academic

literacies in one’s languages (Granados et al. 2021). In order to capture the processes involved in

multilingual academic writing, in line with Cummins’ (1980) common underlying proficiency (CUP)

theory, Rinnert and Kobayashi (2016) proposed a model of multilingual writing, which is divided into

three important components: the writer’s repertoire of knowledge, the social context, and the written

outcome. The extent to which the linguistic and textual features of multilingual texts overlap largely

depends on the writer’s repertoire of knowledge and social context (Rinnert and Kobayashi 2016).

1.2. Complexity, accuracy and fluency in writing

Focusing on student writing offers an outstanding opportunity to explore linguistic and educational

development (Christie 2012) and, along with corpus linguistic approaches, can serve to track

development (Durrant et al. 2021). One of the most influential methods in corpus analysis has been the

CAF construct, a set of measures of complexity, accuracy and fluency that have long been used in

research in linguistics and applied linguistics to capture written or oral performance (Wolfe-Quintero

et al. 1998). These measures have been shown to be appropriate in the evaluation of writing quality

(Crossley and Kim 2022). Typically, such measures include analytic and quantitative descriptors of

linguistic features related to syntactic and lexical complexity, accuracy and fluency. The CAF construct

should be viewed as “multifaceted, multi-layered and multidimensional in nature and [the dimensions]

are interrelated in complex and not necessarily linear ways” (Michel 2017:52; Bulté and Housen 2012).

Due to the complex nature of the CAF construct, researchers must define and operationalise its

dimensions (Bulté and Housen 2012). In this paper, we understand syntactic complexity as the

production of a “variety and degree of sophistication of the syntactic structures deployed in written

production” (Lu 2017:494). Measures of lexical complexity reveal both the size (diversity) and depth

(sophistication) of a writer’s lexicon (Crossley and McNamara 2014; Maamuujav 2021). Accuracy

refers to the ability to produce error-free writing or speech and to follow the rule system of a language

(Bui and Skehan 2018; Wolfe-Quintero et al. 1998), while fluency is directly related to a writer’s control

of the language, as it reflects the “speed and ease” with which they produce linguistic output (Housen

and Kuiken 2009:462), and is often measured as the number of utterances produced.

Due to the aforementioned multidimensional nature of the CAF construct, it is crucial to understand that complex tasks such as writing will result in trade-off effects (Skehan 1998, 2015). In fact, writers might not have the necessary resources to focus on the successful production of all CAF dimensions, often producing texts that are either fluent or complex or accurate, but not all three, a phenomenon previously attributed to the limited attention capacity (LAC) hypothesis (Skehan 2009). Trade-off effects are

particularly seen between complexity and accuracy (Skehan 2009). For instance, as suggested by Michel

(2017), when L2 learners perform a task that requires complex language use, they might be less fluent

(e.g., produce shorter utterances, or write/speak more slowly), probably due to focusing on the

complexity of the task. Another clear example is the overuse of simple linguistic structures when

learning an L2 – learners might be more inclined to produce simple but accurate language.

Numerous studies have used some or all of the dimensions to analyse different sets of corpora, as

quantitative corpus linguistic approaches provide clear and objective indicators of a learner’s

performance (Durrant et al. 2021). Many studies, such as those by Navés (2011), Roquet and Pérez-

Vidal (2017), and Lahuerta (2020), have focused on the effects of type of instruction (e.g., Content and

Language Integrated Learning vs. English as a Foreign Language) on written outcomes by using CAF

measures. Other studies have used CAF measures to track developmental trajectories in subject-specific

language writing (e.g., Granados et al. 2021, 2022; Lorenzo and Rodríguez 2014). CAF measures have

also been used in process-oriented approaches, such as to explore how pre-planning (e.g., Ashoori

Tootkaboni and Pakzadian 2020), the writer’s perceptions (e.g., García-Ponce et al. 2022), genre

differences (e.g., Yoon and Polio 2017) or individual differences (e.g., Kormos 2012; Vasylets et al.

2022) affect writing outcomes. Most of these studies found that some or all CAF metrics were

influenced by a myriad of factors, including cognitive, sociological or instructional variables (Larsen-

Freeman 2009).

1.3. Development and instruction of CAF

Research has produced mixed findings regarding the role of CAF measures in language

development, often overusing the term development (Polio and Park 2016), as an increase in certain

measures does not imply development per se. However, there seems to be agreement that CAF measures

develop over time, with writers producing more complex, accurate and fluent texts as they become more

proficient in the language (Berninger et al. 2011; Crossley et al. 2011; Durrant et al. 2021). Nonetheless,

the aforementioned development must be considered carefully, as the acquisition and development of a

language is a dynamic process (Larsen-Freeman 2009; Michel 2017) in which development is normally

non-linear in nature (Polio and Park 2016), especially in the development of academic written language

(Pessoa et al. 2014).

As mentioned by Granados et al. (2022), much research has used the dimensions of CAF to assess

the development of academic writing at different stages and levels. Durrant et al. (2021) provide a

thoughtful review of some CAF measures in (academic) writing, considering both L1 and L2 studies.

Their review highlights the significant development of certain measures across educational levels (or

age), such as T-unit length or lexical diversity, in addition to their relationship to written quality

assessment. However, not all measures develop with time, as is proposed by the non-linear and dynamic

nature of CAF development (Michel 2017). For example, Polio and Shea (2014) and Yoon and Polio

(2017) longitudinally analysed students’ compositions using complexity and accuracy measures to capture academic language change in a one-semester-long language course, and found little significant change in L2 English writing over that period. Literature focusing on the use of

CAF for academic writing development is well-established in the field (see Durrant et al. 2021 for a

review), nevertheless, instructional studies focusing on the effects of experimental interventions are

scarcer. For example, Marashi and Chizari (2016) explored the effect of critical discourse analysis, Teng

and Huang (2021), that of metacognitive strategies, and Fathi and Rahimi (2022), that of a flipped

classroom approach.

Many of the aforementioned studies, however, have focused on general academic topics, and little

is known about their development in subject-specific writing. The studies by Lorenzo and Rodríguez

(2014) and Granados et al. (2021, 2022) explored changes in CAF measures in secondary education

history writing, and found that some of the metrics changed over time from one year to another.

Granados et al. (2022) found a similar development in students’ L1 Spanish and L2 English during a

two-year longitudinal study.

2. Research questions

To sum up, CAF measures have long been used in corpus linguistics and academic language

research, as they serve as objective indicators of linguistic performance (Granados et al. 2022; Lorenzo

and Rodríguez 2014). Using CAF measures requires defining and operationalising each dimension

(Bulté and Housen 2012) and being aware of their dynamic and multifaceted nature (Michel 2017), and

the potential trade-off effects in which learners’ attentional resources are involved (Skehan 1998, 2009,

2015; Robinson 2007). However, as has been mentioned, the number of instructional studies focusing

on CAF is still limited, even more so when considering both disciplinary and multilingual writing. In

addition, many studies utilising CAF have only employed one of the dimensions (Phuoc and Barrot

2022), or a single measure per dimension (e.g., Sagasta 2003).

This study, therefore, seeks to fill these gaps by analysing the effects of instruction focused on

CDFs and academic language on students’ written production of CAF measures in disciplinary trilingual

writing. The main objective of the present study is to explore how CDF and academic language teaching

affect student subject-specific writing, answering the following research question: Does instruction on

academic language influence complexity, accuracy and fluency measures in students’ writing in

Basque, Spanish and English?

Three main hypotheses (H) have been formulated for this study:

H1. According to the multilingual writing model proposed by Rinnert and Kobayashi (2016),

it is hypothesised that instruction focused on the students’ repertoire of knowledge (scientific

argumentation) will positively influence their written outcomes as measured by CAF.

H2. Literature suggests that the development of CAF measures is non-linear (Michel 2017) and

that trade-off effects appear in their development (Skehan 1998). It is predicted, therefore, that

not all CAF measures will develop positively, but rather, due to trade-off effects and the LAC

hypothesis (Skehan 1998, 2015), students will focus primarily on either accuracy or complexity

(Skehan 2009).

H3. The third hypothesis assumes that students in the experimental group will improve their

writing in the language of instruction (Basque), which will be reflected by improvement in CAF

measures. In addition, gains will also be significant in Spanish and English texts, as certain

characteristics are shared across languages and overlap (Rinnert and Kobayashi 2016). This is

supported by Granados et al. (2022) and Arias-Hermoso and Imaz Agirre (2023), who claim

that academic language develops in parallel across one’s languages.

3. Methodology

3.1. Participants

This was a quasi-experimental study with a pre-test/post-test design carried out in four Basque

immersion Model D schools in the Basque Autonomous Community (BAC). Under Model D

instruction, students are taught through the medium of Basque in all subjects, except for 3 hours a week

each of Spanish and English language classes (see Cenoz 2023 for more information on the language

models in the BAC). Therefore, students are exposed to Basque in school for 26 hours a week. These

four schools were selected because they are members of the same educational network and follow the

same pedagogical approaches, which would minimise the effect of confounding variables

such as different teaching methods, instructional materials or students’ previous knowledge of the topic.

Two intact Year 8 classes (13-14-year-old students) were selected in each of three initially chosen

schools, one assigned to the control group in 2021 and the other to the experimental group in 2022.

Table 1 shows the total number of participants, the number included in the final sample, and the number

that did not meet the inclusion criteria. Participants who were absent for at least one data collection

point were excluded from the final sample. Because data were collected in March 2021, during the COVID-19 pandemic, and because of the inclusion criteria, a fourth school was added to the experimental group.

Table 1: Number of participants included and excluded per school and group. School D was not
included in the control group.

School      Control group             Experimental group
            Included    Excluded      Included    Excluded

School A    16          14            12          18
School B    12          5             14          5
School C    21          5             20          2
School D    –           –             17          14
Total       49          24            63          39

All participants had Basque, Spanish or both as their L1 or dominant language. The majority of

students reported having Spanish as their L1 (67.25%), followed by those having Basque (23.89%) or

both (8.85%). Despite L1 differences, all participants had sufficient command of the language of

instruction, Basque. They were expected to be advanced learners of both Basque and Spanish (around

B2), and initial-intermediate learners of English (around A2). In order to control for potential

differences in language proficiency between schools and groups, their knowledge of Basque, Spanish

and English was measured with a LexTale test (de Bruin et al. 2017; Izura et al. 2014; Lemhöfer &

Broersma 2012). Although LexTale focuses primarily on word recognition, scores are correlated with

general proficiency and language dominance (e.g., Bonvin et al. 2023). In the task, participants have to

indicate whether the words presented to them are real words or pseudowords. The number of incorrect answers was subtracted from the number of correct answers to obtain a numerical score ranging from -1 to 1 for each student in each language, with

scores closer to 1 indicating more proficient learners. Both groups scored better in Spanish (M=0.66,

SD=0.2) than in Basque (M=0.55, SD=0.15) and English (M=0.17, SD=0.16). T-tests showed no

significant difference between the experimental and control groups in these scores (all ps>0.3); therefore, a similar baseline proficiency between the groups can be assumed.

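For illustration, the scoring described above can be expressed as a short computation. The following minimal Python sketch assumes that the difference between correct and incorrect answers is normalised by the number of test items to yield the reported -1 to 1 range; the exact formula used by the authors may differ.

```python
def lextale_score(n_correct: int, n_incorrect: int) -> float:
    """Score in [-1, 1]: +1 if all items are judged correctly, -1 if none are.
    Normalising by the number of items is an assumption; the paper only states
    that correct and incorrect answers were subtracted to obtain the score."""
    n_items = n_correct + n_incorrect
    return (n_correct - n_incorrect) / n_items

# Example: a 40-item test with 31 correct real/pseudoword decisions
print(lextale_score(31, 9))  # 0.55, i.e. the reported Basque group mean
```
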
Written informed consent from parents or guardians was obtained before data collection, and all

participants were informed about the procedure of the study. Data were treated confidentially and were

collected solely for research purposes, and participants were given the option to withdraw from

participating in the study at any time. Anonymity could not be guaranteed because data were collected via e-mail; however, all personal information was pseudonymised. This study received approval from the

university research ethics committee.

3.2. Instruments and data collection

The participants were asked to write three texts as part of the pre-test before learning the subject content,

and three as part of the post-test after having covered the topic in class. For the control group, data were

collected in March 2021 (pre-test) and three months later, in June 2021 (post-test). The experimental

group completed the tests on the same dates in 2022. At each data collection point, students were asked

to write three letters, one each in Basque, Spanish and English. In order to control for potential task

repetition and practice effects, the order in which the letters were completed was pseudo-randomised

using the following sequences: Basque-Spanish-English, Spanish-English-Basque, and English-

Basque-Spanish. Students in each class were randomly assigned to one of these sequences. They were

given no word limit for the task and had 55 minutes (the average duration of a school session) to

complete each essay. The first two texts were completed in two contiguous sessions, while the third was

completed the following day. Although this schedule might affect students’ writing due to recency,

tiredness and task familiarity, the pseudo-randomised order of completion was used to mitigate these

effects. The texts were written on a computer under individual test conditions, and the participants were

explicitly told not to use online resources, translators or any aid not provided by the researchers.

Teachers and researchers were present during the tests and students were not allowed to ask questions.

In addition to the texts, students also completed a background questionnaire and the LexTale tests in

the three languages in the corresponding sessions.

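As a rough illustration of the counterbalancing described above, the sketch below assigns students to one of the three language orders. The balancing strategy (shuffling the class list and cycling through the sequences) is an assumption; the paper only states that students in each class were randomly assigned to one of the sequences.

```python
import random

# The three pseudo-randomised language orders used for the writing tasks
SEQUENCES = [
    ("Basque", "Spanish", "English"),
    ("Spanish", "English", "Basque"),
    ("English", "Basque", "Spanish"),
]

def assign_sequences(student_ids, seed=2022):
    """Shuffle the class list and cycle through the three orders so that
    each order is used for roughly a third of the students."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    return {sid: SEQUENCES[i % len(SEQUENCES)] for i, sid in enumerate(ids)}

print(assign_sequences(["S01", "S02", "S03", "S04", "S05", "S06"]))
```
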
In their letters, the students wrote to different people or institutions to define renewable energy and

argue in favour of its use in their schools. The prompts were identical but the texts in Basque were

intended for parents, the ones in Spanish for the Spanish Ministry of the Environment, and the ones in

English for the Green School Foundation.

3.3. The intervention design and its context

In line with the pedagogical approach of the network of schools of which they are members, all four

schools participating in the present study make use of a competence-based approach in which subjects

at the secondary level are organised as three three-month-long projects. This competence-based

curriculum emphasises both the acquisition and transfer of lifelong learning knowledge based on

students’ exit profile (Antero et al. 2023). These projects are not interdisciplinary by nature, as each

content subject is taught separately. The present study was carried out during a project focused on

renewable energies, whose main objective was to learn about energy sources, their characteristics and

their use, and to critically evaluate and express opinions on that topic. Thus, the students were expected

to be able to use explanatory-sequential discourse, explore the advantages and disadvantages of

different energy sources, and provide effective arguments supported by evidence.

An interdisciplinary group was created for the study in order to collaboratively analyse and adapt

the teaching materials used to teach the project and to design subsequent instructional sequences. The

interdisciplinary group included six researchers, four secondary science teachers and two materials

designers who had designed the original materials and textbooks used for the project. The group held

six two-hour collaborative sessions between June 2021 and February 2022. The first two sessions were

mainly theoretical, focusing principally on training teachers and materials designers to acquaint them

with academic language, disciplinary literacies and the CDF construct. The third and fourth sessions

were aimed at collaboratively analysing the teaching materials already in use with a checklist (see

Lersundi, 2023). The analysis showed that, although students were expected to develop scientific

arguments and explanatory discourse, there was little explicit instruction on either in the materials. As

a result, the final two sessions focused on proposing relevant modifications to the teaching materials

and designing instructional sequences based on them.

After having adapted the materials, the four science teachers participating in the project carried out

the modified intervention in the classroom between March 2022 and June 2022. All teaching was in

person and during school hours, during the usual time periods scheduled for the subject of science (3

hours a week). The language of instruction was Basque. The instruction modifications focused mainly

on three one-hour sessions, which were specifically designed to teach students to argue scientifically.

Following Rinnert and Kobayashi’s (2016) multilingual writing model, our intervention addressed the

students’ repertoire of knowledge, focusing on all four components (topic knowledge, genre knowledge,

disciplinary knowledge and multilingual writing knowledge). Figure 1 summarises the objectives and

content of each of the three sessions. Sessions 1 and 2 took place in the middle of the project, while

session 3 took place at the end.

Figure 1. Summary of the intervention.

The first one-hour session aimed to teach the students the CDF argue. Science teachers provided

the students with a handout focusing on Toulmin’s Argumentation Pattern (1958). The main elements

of Toulmin’s Model (claim, data, warrants, rebuttals) were defined and supported with examples. The

importance of using evidence to support one’s opinion was highlighted in the handout. Subsequently,

the students were asked to analyse a text about eating sweets, identify Toulmin’s elements in the text,

and justify whether the author had produced a successful argument. The main objective of the session

was for the students to become aware of the main elements required for effective argumentation. The

second session focused on learning the lexicogrammar required to express cause and effect in Basque.

To do so, the students participated in an interactive digital simulation in which they were asked to

combine different elements such as solar panels, batteries, human force or LED lights to assess the

effect of the elements on each other. They were asked to orally explain the energy transformation

processes by using the lexicogrammatical resources that had been taught. In the third and final one-hour

session, the students had to prepare an oral defence of an infographic comparing renewable and non-

renewable energy sources. They were explicitly asked to analyse and provide arguments for the

advantages and disadvantages of each, after having carried out a small research project on the topic.

The only CAF-related element explicitly included in the instruction was the lexicogrammar needed to express cause and effect.

3.4. Data coding and analysis

For coding, the final trilingual corpus consisted of 672 texts. Student essays were given a code

(including student ID, school, language and time) and were aligned by participant, language and time.

Each text was processed using MultiAzterTest (Bengoetxea et al. 2020), a computational multilingual

corpus analysis tool previously used to track academic language development in writing in the discipline

of history in Secondary Education (e.g., Granados et al. 2022). MultiAzterTest was chosen because it

is one of the few multilingual tools that support the three languages under study and provide common

quantitative measures for them. MultiAzterTest provides 163 indices in English, 141 in Spanish and

125 in Basque, as some are language-dependent, e.g., Common European Framework of Reference

(CEFR) word classifications in English. For this study, only measures that were common to the three

languages and that had previously been used in the literature were considered. We acknowledge that

some measures might be more language-specific or more common in a certain language, however, the

objective of this study was to compare pre-post differences rather than differences across languages.

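As a purely illustrative sketch of this selection step, the code below keeps only the indices shared by the three per-language outputs. The file names, the text_id column and the tabular layout are assumptions made for the example, not MultiAzterTest's actual output format.

```python
import pandas as pd

# Hypothetical per-language MultiAzterTest outputs: one row per text,
# one column per index (file names and layout are assumed for this example).
eu = pd.read_csv("multiaztertest_eu.csv")
es = pd.read_csv("multiaztertest_es.csv")
en = pd.read_csv("multiaztertest_en.csv")

# Indices available in all three languages (drop the identifier column).
common = sorted((set(eu.columns) & set(es.columns) & set(en.columns)) - {"text_id"})

# Stack the three languages into one long table of common measures.
aligned = pd.concat(
    df.set_index("text_id")[common].assign(language=lang)
    for df, lang in ((eu, "eu"), (es, "es"), (en, "en"))
)
print(aligned.head())
```
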
Due to the multilayered nature of the construct (Bulté and Housen 2012), it is recommended to use

more than one measure for each dimension. Therefore, in this study, at least three measures were used

to track development in each of the four dimensions. Concerning syntactic complexity, the mean

number of modifiers per noun phrase, the number of subordinate clauses and the mean sentence length

(mean words per sentence) were calculated; all of these measures have been employed in previous

research (e.g., Casal and Lee 2019; Lorenzo and Rodríguez 2014; Maamuujav et al. 2021). The number

of modifiers per noun phrase, for example, has been shown to be correlated with writing quality

(Crossley and McNamara 2014), and sentence length development is regarded as an expansion of

academic language (Lorenzo and Rodríguez 2014). Logical and causal connectives were also included

as a measure of syntactic complexity due to the textual characteristics of the elicited genre.

Regarding lexical complexity, the measure of textual lexical diversity (MTLD), the number of rare

content words and the number of distinct rare content words were calculated. The MTLD is a validated

measure (McCarthy and Jarvis 2010) that is stable across texts of different lengths (Zenker and Kyle

2021). Rare words are defined as those with a word frequency lower than 4 in wordfreq (see Speer et

al. 2018). Lexical units of agglutinative languages (i.e., Basque) are segmented and lemmatised by

MultiAzterTest (Bengoetxea et al., 2020), a necessary step for the recognition of both the lemmas and

agglutinated functional words, i.e., determiners and/or declensions (Otegi et al. 2017). Semantic

similarity measures provided by the computational analysis were also included within lexical

complexity. Fluency was measured by the total number of words, sentences and paragraphs in the texts

(Wolfe-Quintero et al. 1998).

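To make the fluency and sentence-length measures concrete, the following sketch computes naive surface counts from a raw text. It is for illustration only: MultiAzterTest relies on proper tokenisation, parsing and (for Basque) lemmatisation, so its values will differ from these rough counts.

```python
import re

def surface_counts(text: str) -> dict:
    """Naive counts for illustration; real figures come from MultiAzterTest."""
    paragraphs = [p for p in text.split("\n") if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]  # rough split
    words = re.findall(r"\w+", text)
    return {
        "n_words": len(words),            # fluency
        "n_sentences": len(sentences),    # fluency
        "n_paragraphs": len(paragraphs),  # fluency
        "mean_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }

sample = "Renewable energy should be used in our school. It is cleaner, and it does not run out."
print(surface_counts(sample))  # 17 words, 2 sentences, 1 paragraph, 8.5 words/sentence
```
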
MultiAzterTest does not provide accuracy scores, so the authors analysed the accuracy of each text manually using MAXQDA, which provides inter-rater agreement scores. To check the consistency of their evaluations, the authors first coded all control texts from School C (n=126) separately; intercoder agreement for error frequency per text, as calculated by MAXQDA, was 91%. Discrepancies were discussed, and the remaining texts were then coded independently by the authors.

For the purposes of this study, the type of error was not relevant; therefore, all errors were coded

and quantified identically. If an error appeared more than once, all appearances were counted as errors.

For example, if a student wrote reniwable instead of renewable three times, it was counted as three

errors. Neither content-related errors (e.g., stating that solar energy is

non-renewable) nor style-related errors (e.g., a very informal greeting in the letter to the Ministry of the

Environment) were counted. However, if a content error produced textual incoherence, it was

considered an error. In summary, spelling, typographical, coherence, syntactic and lexical errors were

counted. It should be noted that words in non-standard Basque were also counted as errors. Previous

research has used error-free T-units to capture written accuracy, however, this measure might be

inappropriate for beginners, as they tend to make errors in every sentence (Polio and Shea 2014).

Consequently, three scores were used to measure accuracy: the total number of errors, errors per word

(Lahuerta 2020; Orcasitas-Vicandi 2021) and errors per sentence (Sagasta 2003; Wolfe-Quintero et al.

1998). Table 2 summarises the measures analysed in each dimension.

Table 2: Measures used in each dimension


DIMENSION              MEASURES INCLUDED

Syntactic complexity   Mean number of modifiers per noun phrase
                       Number of subordinate clauses
                       Mean length of sentence (words per sentence)
                       Logical connectives
                       Causal connectives

Lexical complexity     MTLD
                       Number of rare content words
                       Number of rare distinct content words
                       Semantic similarity between adjacent sentences
                       Semantic similarity between all pairs of sentences

Accuracy               Total number of errors
                       Error per word ratio
                       Error per sentence ratio

Fluency                Number of words
                       Number of sentences
                       Number of paragraphs

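The three accuracy scores in Table 2 follow directly from the manual error counts. A minimal sketch, with hypothetical numbers:

```python
def accuracy_scores(n_errors: int, n_words: int, n_sentences: int) -> dict:
    """Total errors plus the per-word and per-sentence ratios used in this study."""
    return {
        "total_errors": n_errors,
        "errors_per_word": n_errors / n_words if n_words else 0.0,
        "errors_per_sentence": n_errors / n_sentences if n_sentences else 0.0,
    }

# Hypothetical essay: 120 words, 8 sentences, 10 coded errors
# (a misspelling repeated three times counts as three errors, as described above).
print(accuracy_scores(10, 120, 8))
# {'total_errors': 10, 'errors_per_word': 0.0833..., 'errors_per_sentence': 1.25}
```
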
Statistical analyses were performed with jamovi (2022) and showed that the data were not normally distributed; therefore, non-parametric tests were carried out to address the research questions. Repeated

measures Friedman analyses of variance (ANOVAs) and Durbin-Conover post-hoc tests were

performed to explore differences between pre-tests and post-tests.

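As an illustration of this analysis, the sketch below runs a Friedman test on synthetic data with SciPy. The six repeated measurements per student (three languages at two testing times) reflect our reading of the χ²(5) statistics reported below; the actual analyses, including the Durbin-Conover post-hoc comparisons, were run in jamovi.

```python
import numpy as np
from scipy import stats

# Synthetic data for one CAF measure: six repeated measurements per student
# (Basque/Spanish/English at pre-test and post-test); the values are made up.
rng = np.random.default_rng(0)
n = 30
eu_pre, es_pre, en_pre = (rng.normal(m, 3, n) for m in (12, 18, 17))
eu_post, es_post, en_post = (rng.normal(m, 3, n) for m in (15, 20, 17))

chi2, p = stats.friedmanchisquare(eu_pre, es_pre, en_pre,
                                  eu_post, es_post, en_post)
print(f"Friedman chi-squared(5) = {chi2:.2f}, p = {p:.4f}")
```
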
4. Results

In this section, the results of the statistical analyses are presented. To facilitate a clear and organised

presentation, the findings of each dimension are reported separately. Significant differences from the

pre-test to the post-test are indicated with asterisks next to the post-test results. Only variables exhibiting

a significant change from pre-test to post-test in any of the languages (Basque, EU; Spanish, ES;

English, EN) or groups (control, experimental) are included in the tables; those measures that did not

reach statistical significance in any language or group are omitted from the tables in the upcoming

sections.

4.1. Syntactic complexity

Regarding syntactic complexity measures in student writing, some significant differences between

testing moments were found, as indicated in Table 3. The control group decreased their use of

subordinates significantly in Basque (χ²(5)=3.722, p<.001) and English (χ²(5)=2.91, p=.004), and

marginally in Spanish (χ²(5)=1.745, p=.082). In addition, a significant decrease in the use of logical

connectives was found in the texts in English (χ²(5)=2.516, p=.013). No other measure changed

significantly. In contrast, the experimental group showed no significant decreases in those measures but

improved in the mean number of words per sentence, significantly in Basque (χ²(5)=2.8, p=.006), and

marginally in Spanish (χ²(5)=1.95, p=.052). Participants in the experimental group also increased their

use of modifiers per noun phrase in Basque (χ²(5)=2.65, p=.009) and Spanish (χ²(5)=2.81, p=.006).

Changes in the other measures in this dimension (the number of causal connectives) did not reach

statistical significance.

Table 3: Syntactic complexity measures: ***p<0.001, **p<0.05, *p<0.09.
MEASURE                          T      CONTROL GROUP                                           EXPERIMENTAL GROUP
                                        EU                 ES                EN                  EU                 ES                EN

Mean words per sentence          Pre    16.542 (7.620)     22.73 (10.67)     20.00 (11.55)       12.522 (5.272)     18.637 (8.153)    17.23 (8.19)
                                 Post   16.17 (13.97)      22.58 (19.82)     19.270 (8.731)      14.809 (6.225)**   20.731 (6.967)*   16.96 (8.32)

N of subordinates                Pre    13.286 (4.869)     14.429 (5.148)    12.408 (5.330)      9.844 (4.001)      12.672 (3.948)    10.07 (5.02)
                                 Post   10.796 (6.265)***  12.245 (5.600)*   9.480 (4.105)**     9.302 (4.791)      11.672 (5.354)    9.92 (4.25)

N of modifiers per noun phrase   Pre    0.673 (0.574)      1.161 (0.169)     0.925 (0.229)       0.559 (0.103)      1.100 (0.184)     0.87 (0.1811)
                                 Post   0.585 (0.133)      1.168 (0.173)     0.899 (0.223)       0.577 (0.150)**    1.216 (0.218)**   0.96 (0.24)

Logical connectives              Pre    12.204 (5.070)     6.061 (3.037)     8.510 (4.579)       11.219 (5.144)     5.594 (2.422)     6.59 (3.589)
                                 Post   11.755 (6.610)     5.714 (3.182)     6.306 (3.709)**     10.857 (5.146)     5.328 (2.890)     6.03 (3.32)

4.2. Lexical complexity

Few differences were found in lexical complexity measures between pre-test and post-test results.

As illustrated in Table 4, there was a significant increase in the number of rare distinct words in Spanish

used by students in the control group from the pre-test to the post-test (χ²(5)=2.08, p=.039). In contrast,

the MTLD significantly decreased in Basque in the experimental group (χ²(5)=2.20, p=.029). No other

significant changes were found in any group, as neither the number of rare words nor semantic similarity

measures showed significant differences between the pre-test and post-test in any language.

Table 4: Lexical complexity measures: ***p<0.001, **p<0.05, *p<0.09.

MEASURE                    T      CONTROL GROUP                                           EXPERIMENTAL GROUP
                                  EU                 ES                 EN                 EU                 ES               EN

MTLD                       Pre    106.96 (36.701)    79.419 (22.283)    44.931 (15.989)    111.40 (41.45)     74.63 (19.34)    47.94 (17.64)
                           Post   111.214 (45.691)   76.656 (21.590)    49.719 (14.878)    104.03 (44.22)**   67.704 (18.71)   45.79 (14.54)

N of distinct rare words   Pre    5.429 (2.189)      12.898 (5.080)     7.265 (3.904)      5.17 (2.29)        13.03 (3.95)     7.13 (3.58)
                           Post   4.837 (2.392)      14.388 (5.450)**   7.429 (4.088)      5.12 (2.28)        13.34 (6.17)     7.48 (3.86)

4.3. Accuracy

Regarding accuracy measures, the control group showed an increase in errors overall in all

languages, as shown in Table 5. However, significant differences from the pre-test to the post-test were

found only in the error per word ratio in Basque (χ²(5)=2.089, p=.038) and English (χ²(5)=2.059,

p=.041). The experimental group, however, improved significantly in Basque, as they produced fewer

total errors (χ²(5)=3.466, p<.001), fewer errors per word (χ²(5)=3.089, p=.002) and fewer errors per

sentence (χ²(5)=2.210, p=.028). There was also a marginally significant improvement in the total

number of errors in English (χ²(5)=1.854, p=.065). No other significant changes were found.

Table 5: Accuracy measures: ***p<0.001, **p<0.05, *p<0.09.


MEASURE               T      CONTROL GROUP                                         EXPERIMENTAL GROUP
                             EU                ES               EN                  EU                ES             EN

Total errors          Pre    7.837 (5.467)     8.633 (8.162)    13.245 (8.543)      9.14 (5.35)       8.91 (6.63)    12.83 (8.46)
                      Post   8.082 (4.609)     9.918 (8.786)    13.592 (8.715)      6.44 (4.46)***    8.44 (7.26)    11.25 (7.68)*

Errors per word       Pre    0.069 (0.049)     0.059 (0.053)    0.098 (0.060)       0.10 (0.08)       0.07 (0.06)    0.14 (0.1417)
                      Post   0.084 (0.047)**   0.077 (0.065)    0.136 (0.108)**     0.08 (0.05)**     0.07 (0.07)    0.11 (0.08)

Errors per sentence   Pre    1.346 (1.712)     1.341 (1.304)    1.960 (1.324)       1.31 (1.02)       1.37 (1.39)    2.36 (2.63)
                      Post   1.544 (2.678)     1.871 (2.630)    2.658 (3.125)       1.20 (1.27)**     1.61 (2.02)    1.8 (1.39)

4.4. Fluency

In regard to fluency metrics, the control group showed a significant decrease from the pre-test to

the post-test in the number of words in the texts in Spanish (χ²(5)=1.993, p=.047) and in English

(χ²(5)=3.675, p<.001), as indicated in Table 6, while the number of words did not change in the

experimental group (all ps>.05). There were no significant changes in the number of sentences or

paragraphs in either group.

Table 6: Fluency measures: ***p<0.001, **p<0.05, *p<0.09.
MEASURE      T      CONTROL GROUP                                               EXPERIMENTAL GROUP
                    EU                  ES                   EN                   EU               ES               EN

N of words   Pre    117.306 (32.111)    155.551 (42.666)     142.694 (41.151)     96.67 (26.46)    133.44 (28.16)   117.33 (46.33)
             Post   106.204 (46.975)    139.184 (50.760)**   120.020 (45.203)***  96.22 (31.97)    135.09 (52.07)   116.87 (44.33)

5. Discussion

The main objective of this study was to explore how an experimental intervention focused on

academic language and argumentation influenced students’ writing as measured by syntactic

complexity, lexical complexity, accuracy and fluency measures (Bulté and Housen 2012; Michel 2017;

Wolfe-Quintero et al. 1998). Our findings showed some significant differences between groups (control

and experimental), suggesting that the experimental intervention and the modified teaching materials

had a significant effect on the outcomes of the participants’ writing. Considering that all participants in

the sample, both control and experimental, completed the same tasks under identical conditions (i.e.,

exposure to the languages, teaching hours, school teachers, data collection schedule, science content)

and had a similar language proficiency (as measured by LexTale tests), the positive effect of the

intervention can be confirmed.

Focusing explicitly on the students’ repertoire of knowledge positively influenced their written

outcomes, which supports Rinnert and Kobayashi’s (2016) multilingual writing model and confirms the

first hypothesis in the present study. The control group showed no significant changes from pre-test to

post-test in most of the measures, performed significantly worse in four of them, and improved in only one. In contrast, the experimental group showed significant improvements in five measures, mostly in Basque.

The fact that the latter showed more improvements in CAF measures might suggest that all four

components of the repertoire of knowledge require consideration when carrying out teaching

interventions. Teaching for the control group focused only on the topic and on disciplinary knowledge,

and the students in the control group showed no improvement in their writing from pre-test to post-test.

However, the proposed experimental instructional sequence also addressed the other two components

of knowledge (genre and multilingual writing knowledge). CDFs are considered the linguistic

realisation of cognitive activities (Dalton-Puffer 2013) and lie at the crossroads of language, content

and disciplinary literacies (Morton 2020). Therefore, the teaching of academic language and CDFs

might have tapped into all four components of the knowledge repertoire proposed by Rinnert and

Kobayashi (2016).

The experimental group performed worse in only one measure, the Basque MTLD. However, this

needs to be interpreted with caution due to the limitations of MultiAzterTest when performing

computational analysis of an agglutinative language such as Basque. In fact, the tool counts declined

words as different words when calculating the MTLD, which can result in inaccuracies in the analyses.

For example, the Basque word for renewable, "berriztagarri" (an adjective with no declension marks),

was counted differently depending on whether it included a declension mark, such as "berriztagarriak"

(plural accusative) or "berriztagarrienak" (plural accusative with superlative). Therefore, the decrease

might be due to fewer declension marks being used, rather than to the use of fewer different words.

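A small illustration of this declension issue (with hypothetical tokens; MTLD itself is computed over running text rather than as a raw type count, so this only shows how counting surface forms instead of lemmas inflates apparent diversity):

```python
# Three surface forms of the same Basque lemma, as in the example above.
tokens = ["berriztagarri", "berriztagarriak", "berriztagarrienak",
          "berriztagarriak", "berriztagarri"]

# Hypothetical lemma mapping; MultiAzterTest derives lemmas automatically.
lemma_of = {form: "berriztagarri" for form in tokens}

surface_types = len(set(tokens))                        # 3 distinct surface forms
lemma_types = len({lemma_of[form] for form in tokens})  # 1 distinct lemma
print(surface_types, lemma_types)                       # prints: 3 1
```
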
Overall, teaching academic language was shown to be beneficial and had a significant positive effect

on students’ writing.

H2 was partially confirmed. The control group showed non-significant changes from the pre-test to

the post-test in lexical and syntactic complexity measures, and worse performance in fluency and

accuracy, whereas the experimental group improved in two dimensions (accuracy and syntactic

complexity) at the expense of the other two (fluency and lexical complexity), which did not change

significantly. The worse performance of the control group might be explained by a lack of genre and

writing strategies, which might have influenced how the students expressed their recently acquired topic

knowledge (Rinnert and Kobayashi 2016). As for the experimental group, the students’ writing showed

non-linear development after the experimental intervention (Michel 2017).

Two possible explanations may account for these findings. One is that the present

findings might not provide evidence to support trade-off effects if we acknowledge that the lack of

change does not necessarily mean that students have not developed their writing skills (Durrant et al.

2021). The fact that not all measures improved from the pre-test to the post-test does not necessarily

mean that no linguistic improvement took place (Michel 2017). As suggested by previous research (see

for example Polio and Shea 2014; Yoon and Polio 2017), CAF measures might need more time to

change significantly, which would explain the lack of development in many of the measures.

Additionally, the time between the pre-test and the post-test (3 months) might not have been sufficient

for significant change to occur, probably due to the cognitive, linguistic and maturational features of

the participants. Furthermore, the experimental intervention was not focused on teaching CAF or on

raising the students’ awareness of them, and measuring the effects of explicit instruction of CAF along

with academic language would help us to better understand this.

Another possible interpretation of the findings could support the LAC hypothesis and trade-off

effects between dimensions (Skehan 1998, 2009, 2015; de Jong et al. 2015). Other analyses of the

corpus showed that students in the sample had a better command of argumentation skills after the

experimental intervention (Garro et al., under review), which further supports the LAC hypothesis, as

it suggests that students in the experimental group focused primarily on arguing properly in the post-

test, rather than on the linguistic features of their compositions. However, mastering discourse-related

aspects such as argumentation might be correlated with the development of CAF measures. The

resource-directing dimensions of the task (see Robinson 2007; Robinson and Gilabert 2007) could have resulted in beneficial effects on accuracy and complexity measures. Further

research is needed to shed light on these two possible explanations and their potential interpretations,

especially regarding subject-specific lexical complexity measures.

Our third hypothesis predicted that improvement in CAF measures would be present mainly in the

language of instruction (Basque), but parallelisms in the other two languages were also expected. This

hypothesis was confirmed by the findings, as Basque measures were the most influenced by the

experimental intervention. Improvement in Basque CAF measures probably came about as a

consequence of Basque being the students’ language of instruction and the language in which the

experimental instruction was conducted. However, as hypothesised, improvement tendencies were also

found in Spanish (syntactic complexity measures) and English (accuracy measures). These findings

may suggest that interventions and instructional sequences aimed at fostering students’ academic

language skills, such as scientific literacy, expressing cause and effect or argumentation, are indeed

beneficial for their acquisition of balanced literacies in their languages (Lorenzo and Rodríguez 2014;

Lorenzo and Trujillo 2017). The results of the present study seem to support Cummins’ (1980, 2021)

CUP theory in that they confirm the idea that some aspects of language are crosslinguistic, going beyond

language barriers. Along the same line, our results align with those reported by Granados et al. (2022)

and Arias-Hermoso and Imaz Agirre (2023) in that a parallel development of disciplinary academic

language takes place throughout a learner’s whole linguistic repertoire, presenting some overlapping

characteristics in multilingual writing (Rinnert and Kobayashi 2016).

6. Conclusions, future research, limitations, implications

In conclusion, this study suggests that a focus on academic language and argumentation has a

positive impact on multilingual writing as measured by complexity, accuracy and fluency. Teaching

academic language was found to have a positive influence on students’ production of syntactic

complexity and accuracy in Basque, the language of instruction. In addition, trade-off effects were

observed in the results, as lexical complexity and fluency did not change significantly in the

experimental group. Unexpectedly, and in contrast to previous studies (e.g., de Jong et al. 2015), both

syntactic complexity and accuracy developed in parallel. This development, however, did not occur

only in the language of instruction but also in the students’ other two languages: their L1 (Spanish) and

the foreign language (English); this result supports multilingual models of writing (Arias-Hermoso and

Imaz Agirre, 2023; Cummins 2021; Granados et al. 2022; Rinnert and Kobayashi 2016).

Certain limitations to the present study must be acknowledged. First, participant exclusion in this

study was high, and the resulting sample was smaller than expected. Data collection took place during

the 2020-2021 and 2021-2022 academic years, during which COVID-19 health recommendations and

policies resulted in the implementation of lockdown protocols for students who tested positive or who

had been in contact with someone who tested positive. Consequently, 34% of the participants had to be

excluded from the sample because they were absent for at least one data collection point. As a result,

the students could not be separated according to their L1, which could have provided a clearer picture of the

effectiveness of the experimental intervention. Second, language proficiency was measured by LexTale

tests. While these tests provide a good representation of general proficiency (Bonvin et al. 2023; de Bruin

et al. 2017), they might not capture all skills related to language proficiency, as they focus solely on

word recognition. Moreover, the fact that Basque is an agglutinative language poses challenges when

conducting computational linguistic analyses. Some inaccuracies emerge, such as those mentioned in

the Discussion section regarding the MTLD in Basque. In addition, limitations related to the accuracy

analysis emerged during coding, including the fact that error types were not distinguished and the impact of errors on lexical complexity measures. Finally, we have to acknowledge the potential effect of the

different registers in the essays, which could have affected the production of CAF to a certain extent.

More research is needed to fully understand how disciplinary or subject-specific writing develops,

and quantitative computational tools that can carry out subject-specific language analyses are needed to

identify possible subject-sensitive aspects so that quantitative descriptors of subject-specific indicators

can then be created. Future studies should further investigate the effects of different types of

interventions on student writing measured by the CAF construct, since few intervention studies with a

(quasi-)experimental design have been conducted. Moreover, researchers and curriculum designers should

consider interdisciplinary approaches in which science and language teachers collaborate to focus on

both disciplinary and language requirements. In addition, a larger sample of students with different

linguistic profiles and backgrounds is needed to explore whether similar developmental paths take place

after an intervention. Additionally, the attentional requirements of producing CDFs need to be addressed

in subsequent studies.

Some implications can be drawn from this study, at both the theoretical and practical levels. The

present study theoretically supports the multilingual writing model suggested by Rinnert and Kobayashi

(2016) in that the repertoire of knowledge of a writer affects subsequent written production.

Furthermore, our study contributes to the field of multilingual writing with findings from a subject-

specific trilingual corpus, a type of data that remains scarce in research. However, the main implications of this study are

pedagogical, as several applications for teaching and learning (disciplinary) languages can be drawn

from the present findings. Our study highlights the importance of focusing on academic and subject-

specific language conventions (scientific argumentation, in this case), as it may benefit students

not only in genre mastery but also in their written production of CAF. Benefits were observed despite

Basque being the L2 of the majority of the subjects, which emphasises the prominent role of the

language of instruction in building disciplinary literacies. As a concluding remark, and also as suggested

by previous research (Banegas and Mearns 2023; Sato 2023), it is crucial that researchers, educators

and material developers collaborate to design learning contexts, activities and interventions that foster

students’ development of content, language and literacy.

REFERENCES

Arias-Hermoso, Roberto, & Ainara Imaz Agirre. 2023. Exploring multilingual writers in secondary

education: insights from a trilingual corpus. European Journal of Applied Linguistics. Advance

online publication. https://doi.org/10.1515/eujal-2023-0022

Antero, Amaia, Artolazabal, Amaia, Garaialde, Esther, & Ibarzabal, Zigor. 2023. Bazatoz?

Ikastolen hezkuntza marko orokorra. Ikastolen Elkartea.

Ashoori Tootkaboni, A., & Pakzadian, M. 2020. Exploring the effects of pre-task planning time

on EFL learners’ narrative writing. Bellaterra Journal of Teaching & Learning Languages &

Literature, 13(4). https://doi.org/10.5565/rev/jtl3.851

Banegas, Dario L., & Mearns, Tessa. 2023. The Language Quadriptych in content and language

integrated learning: Findings from a collaborative action research study. Journal of Multilingual

and Multicultural Development. Advance online publication.

https://doi.org/10.1080/01434632.2023.2281393

Bengoetxea, Kepa, Gonzalez-Dios, Itziar, & Aguirregoitia, Andoni. 2020. AzterTest: Open Source

Linguistic and Stylistic Analysis Tool. Procesamiento Del Lenguaje Natural, 64, 61-68.

https://doi.org/10.26342/2020-64-7

Berninger, Virginia, Nagy, William, & Beers, Scott. 2011. Child writers’ construction and

reconstruction of single sentences and construction of multi-sentence texts: Contributions of syntax

and transcription to translation. Reading and Writing, 102, 151-182.

https://doi.org/10.1007/s11145-010-9262-y

Bonvin, Audrey, Brugger, Ladina, & Berthele, Raphael. 2023. Lexical measures as a proxy for

bilingual language dominance?. International Review of Applied Linguistics in Language

Teaching, 61(2), 257-285. https://doi.org/10.1515/iral-2020-0093

Bui, Gavin, & Skehan, Peter. 2018. Complexity, Accuracy, and Fluency. The TESOL Encyclopedia

of English Language Teaching, 1-7. https://doi.org/10.1002/9781118784235.eelt0046

Bulté, Bram, & Housen, Alex. 2012. Defining and operationalising L2 complexity. In A. Housen,

F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 Performance and Proficiency. Complexity,

Accuracy and Fluency in SLA (pp. 50-68). John Benjamins Publishing Company.

Casal, J. Elliot, & Lee, Joseph. J. 2019. Syntactic complexity and writing quality in assessed first-

year L2 writing. Journal of Second Language Writing, 44, 51-62.

https://doi.org/10.1016/j.jslw.2019.03.005

Cenoz, Jasone. 2009. Towards multilingual education: Basque educational research from an

international perspective. Bristol: Multilingual Matters.

Cenoz, Jasone. 2013. The influence of bilingualism on third language acquisition: Focus on

multilingualism. Language Teaching, 46(1), 71-86. https://doi.org/10.1017/S0261444811000218

Cenoz, Jasone. 2023. Plurilingual education in the Basque Autonomous Community. In J. M. Cots

(Ed.). Profiling plurilingual education: A pilot study of four Spanish autonomous communities (pp.

33-53). Edicions de la Universitat de Lleida.

Cenoz, Jasone, & Gorter, Durk. 2011. A Holistic Approach to Multilingual Education:

Introduction. The Modern Language Journal, 95(3), 339-343. https://doi.org/10.1111/j.1540-4781.2011.01204.x

Christie, Frances. 2012. Language education throughout the school years: a functional

perspective. Wiley-Blackwell.

Coffin, Caroline. 2006. Historical discourse: the language of time, cause and evaluation.

Continuum.

Crossley, Scott A., & Kim, Minkyung. 2022. Linguistic Features of Writing Quality and Development: A Longitudinal Approach. Journal of Writing Analytics, 6, 59-93. https://doi.org/10.37514/JWA-J.2022.6.1.04

Crossley, Scott A., & McNamara, Danielle S. 2014. Does writing development equal writing quality? A computational investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66-79. https://doi.org/10.1016/j.jslw.2014.09.006

Crossley, Scott A., Weston, Jennifer L., McLain Sullivan, Susan T., & McNamara, Danielle S. 2011. The development of writing proficiency as a function of grade level: A linguistic analysis. Written Communication, 28(3), 282-311. https://doi.org/10.1177/0741088311410188

Cummins, Jim. 1980. The exit and entry fallacy in bilingual education. NABE Journal, 4(3), 25-60. https://doi.org/10.1080/08855072.1980.10668382

Cummins, Jim. 2021. Rethinking the Education of Multilingual Learners: A Critical Analysis of Theoretical Concepts. Multilingual Matters.

Dalton-Puffer, Christiane. 2007. Discourse in Content-and-Language-Integrated Learning (CLIL) Classrooms. John Benjamins. https://doi.org/10.1075/lllt.20

Dalton-Puffer, Christiane. 2013. A construct of cognitive discourse functions for conceptualising content-language integration in CLIL and multilingual education. European Journal of Applied Linguistics, 1(2), 216-253.

Dalton-Puffer, Christiane. 2016. Cognitive discourse functions: Specifying an integrative interdisciplinary construct. In T. Nikula, E. Dafouz, P. Moore, & U. Smit (Eds.), Conceptualising integration in CLIL and multilingual education (pp. 29-54). Multilingual Matters. https://doi.org/10.21832/9781783096145-005

de Bruin, Angela, Carreiras, Manuel, & Duñabeitia, Jon Andoni. 2017. The BEST Dataset of Language Proficiency. Frontiers in Psychology, 8, 1-7. https://doi.org/10.3389/fpsyg.2017.00522

de Jong, Nivja H., Groenhout, Rachel, Schoonen, Rob, & Hulstijn, Jan H. 2015. Second language fluency: Speaking style or proficiency? Correcting measures of second language fluency for first language behavior. Applied Psycholinguistics, 36, 223-243. https://doi.org/10.1017/S0142716413000210

Durrant, Philip. 2017. Lexical Bundles and Disciplinary Variation in University Students’ Writing: Mapping the Territories. Applied Linguistics, 38(2), 165-193. https://doi.org/10.1093/applin/amv011

Durrant, Philip, Brenchley, Mark, & McCallum, Lee. 2021. Understanding Development and Proficiency in Writing: Quantitative Corpus Linguistic Approaches. Cambridge University Press. https://doi.org/10.1017/9781108770101

Fathi, Jalil, & Rahimi, Masoud. 2022. Examining the impact of flipped classroom on writing complexity, accuracy, and fluency: a case of EFL students. Computer Assisted Language Learning, 35(7), 1668-1706. https://doi.org/10.1080/09588221.2020.1825097

García-Ponce, Edgar Emmanuelle, Mora-Pablo, Irasema, & Segovia-Hernández, Juan Gabriel. 2022. Role of EFL learners’ perceptions of task difficulty in complexity, accuracy and fluency: An exploratory case study. Porta Linguarum, 37, 123-142. https://doi.org/10.30827/portalin.vi37.15855

Granados, Adrián, Lorenzo-Espejo, Antonio, & Lorenzo, Francisco. 2021. Evidence for the interdependence hypothesis: a longitudinal study of biliteracy development in a CLIL/bilingual setting. International Journal of Bilingual Education and Bilingualism, 25(8), 3005-3021.

Granados, Adrián, Lorenzo-Espejo, Antonio, & Lorenzo, Francisco. 2022. A portrait of academic literacy in mid-adolescence: a computational longitudinal account of cognitive academic language proficiency during secondary school. Language and Education. https://doi.org/10.1080/09500782.2022.2079951

Housen, Alex, & Kuiken, Folkert. 2009. Complexity, Accuracy, and Fluency in Second Language Acquisition. Applied Linguistics, 30(4), 461-473. https://doi.org/10.1093/applin/amp048

Izura, Cristina, Cuetos, Fernando, & Brysbaert, Marc. 2014. Lextale-Esp: a test to rapidly and efficiently assess the Spanish vocabulary size. Psicológica, 35, 49-66.

Kormos, Judit. 2012. The role of individual differences in L2 writing. Journal of Second Language Writing, 21(4), 390-403. https://doi.org/10.1016/j.jslw.2012.09.003

Lahuerta, Ana. 2020. Analysis of accuracy in the writing of EFL students enrolled on CLIL and non-CLIL programmes: the impact of grade and gender. The Language Learning Journal, 48(2), 121-132. https://doi.org/10.1080/09571736.2017.1303745

Larsen-Freeman, Diane. 2009. Adjusting Expectations: The Study of Complexity, Accuracy, and Fluency in Second Language Acquisition. Applied Linguistics, 30(4), 579-589. https://doi.org/10.1093/applin/amp043

Lemhöfer, Kristin, & Broersma, Mirjam. 2012. Introducing LexTALE: A quick and valid Lexical Test for Advanced Learners of English. Behavior Research Methods, 44, 325-343. https://doi.org/10.3758/s13428-011-0146-0

Lersundi, Amaia. 2023. Analysis of Subject-Specific Literacies in a Multidisciplinary Project in Upper-Secondary Education. Case Study [Doctoral thesis, Mondragon University. Original in Basque: Arloetako alfabetatzearen azterketa batxilergoko diziplinarteko proiektu batean. Kasu azterketa]. https://hdl.handle.net/20.500.11984/5964

Llinares, Ana, Morton, Tom, & Whittaker, Rachel. 2012. The Roles of Language in CLIL. Cambridge University Press.

Lorenzo, Francisco. 2017. Historical literacy in bilingual settings: Cognitive academic language in CLIL history narratives. Linguistics and Education, 37, 32-41. https://doi.org/10.1016/j.linged.2016.11.002

Lorenzo, Francisco, & Rodríguez, Leticia. 2014. Onset and expansion of L2 cognitive academic language proficiency in bilingual settings: CALP in CLIL. System, 47, 64-72. https://doi.org/10.1016/j.system.2014.09.016

Lorenzo, Francisco, & Trujillo, Fernando. 2017. Languages of schooling in European policymaking: Present state and future outcomes. European Journal of Applied Linguistics, 5(2), 177-197. https://doi.org/10.1515/eujal-2017-0007

Lu, Xiaofei. 2017. Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Language Testing, 34(4), 493-511. https://doi.org/10.1177/0265532217710675

Maamuujav, Undarmaa. 2021. Examining lexical features and academic vocabulary use in adolescent L2 students’ text-based analytical essays. Assessing Writing, 49, 100540. https://doi.org/10.1016/j.asw.2021.100540

Maamuujav, Undarmaa, Olson, Carol Booth, & Chung, Huy. 2021. Syntactic and lexical features of adolescent L2 students’ academic writing. Journal of Second Language Writing, 53, 100822. https://doi.org/10.1016/j.jslw.2021.100822

Manchón, Rosa M., & Polio, Charlene. 2021. L2 Writing and Language Learning. In R. M. Manchón, & C. Polio (Eds.), The Routledge Handbook in Second Language Acquisition: Second Language Acquisition and Writing (pp. 1-7). Routledge.

Marashi, Hamid, & Chizari, Azam. 2016. Using Critical Discourse Analysis Based Instruction to Improve EFL Learners’ Writing Complexity, Accuracy and Fluency. Journal of English Language Pedagogy and Practice, 9(19), 37-61.

McCarthy, Philip, & Jarvis, Scott. 2010. MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381-392. https://doi.org/10.3758/BRM.42.2.381

Michel, Marije. 2017. Complexity, Accuracy and Fluency in L2 Production. In S. Loewen, & M. Sato (Eds.), The Routledge Handbook of Instructed Second Language Acquisition (pp. 50-68). Routledge.

Moje, Elizabeth Birr. 2015. Doing and teaching disciplinary literacy with adolescent learners: a social and cultural enterprise. Harvard Educational Review, 85(2), 254-278. https://doi.org/10.17763/0017-8055.85.2.254

Morton, Tom. 2020. Cognitive Discourse Functions: A Bridge between Content, Literacy and Language for Teaching and Assessment in CLIL. CLIL Journal of Innovation and Research in Plurilingual and Pluricultural Education, 3(1), 7-17. https://doi.org/10.5565/rev/clil.33

Navés, Teresa. 2011. How promising are the results of integrating content and language for EFL writing and overall EFL proficiency? In Y. Ruiz de Zarobe, J. M. Sierra, & F. Gallardo del Puerto (Eds.), Content and Foreign Language Integrated Learning: Contributions to Multilingualism in European Contexts (pp. 155-186). Peter Lang.

Orcasitas-Vicandi, María. 2021. Towards a multilingual approach in assessing writing: holistic, analytic and cross-linguistic perspectives. International Journal of Bilingual Education and Bilingualism, 25(6), 1-22. https://doi.org/10.1080/13670050.2021.1894089

Otegi, Arantxa, Imaz, Oier, Díaz de Ilarraza, Arantza, Iruskieta, Mikel, & Uria, Larraitz. 2017. ANALHITZA: a tool to extract linguistic information from large corpora in Humanities research. Procesamiento del Lenguaje Natural, 58, 77-84.

Pessoa, Silvia, Miller, Ryan T., & Kaufer, David. 2014. Students’ challenges and development in the transition to academic writing at an English-medium university in Qatar. International Review of Applied Linguistics in Language Teaching, 52(2), 127-156. https://doi.org/10.1515/iral-2014-0006

Polio, Charlene, & Park, Ji-Hyun. 2016. Language development in second language writing. In R. Manchón, & P. K. Matsuda (Eds.), Handbook of Second and Foreign Language Writing (pp. 287-307). De Gruyter Mouton. https://doi.org/10.1515/9781614511335-016

Polio, Charlene, & Shea, Mark C. 2014. An investigation into current measures of linguistic accuracy in second language writing research. Journal of Second Language Writing, 26, 10-27. https://doi.org/10.1016/j.jslw.2014.09.003

Rinnert, Caroline, & Kobayashi, Hiroe. 2016. Multicompetence and multilingual writing. In R. M. Manchón, & P. Matsuda (Eds.), Handbook of Second and Foreign Language Writing (pp. 365-386). De Gruyter Mouton. https://doi.org/10.1515/9781614511335-020

Robinson, Peter. 2007. Task complexity, theory of mind, and intentional reasoning: effects on L2 speech production, interaction, uptake and perceptions of task difficulty. International Review of Applied Linguistics in Language Teaching, 45(3), 193-213.

Robinson, Peter, & Gilabert, Roger. 2007. Task complexity, the Cognition Hypothesis and second language learning and performance. International Review of Applied Linguistics in Language Teaching, 45(3), 161-176. https://doi.org/10.1515/iral.2007.007

Roquet, Helena, & Pérez-Vidal, Carmen. 2017. Do Productive Skills Improve in Content and Language Integrated Learning Contexts? The Case of Writing. Applied Linguistics, 38(4), 489-511. https://doi.org/10.1093/applin/amv050

Sagasta, María Pilar. 2003. Acquiring writing skills in a third language: The positive effects of bilingualism. International Journal of Bilingualism, 7(1), 27-42. https://doi.org/10.1177/13670069030070010301

Sato, Masatoshi. 2023. Navigating the research–practice relationship: Professional goals and constraints. Language Teaching, 1-16. https://doi.org/10.1017/S0261444823000423

Shanahan, Timothy, & Shanahan, Cynthia. 2008. Teaching disciplinary literacy to adolescents: Rethinking content-area literacy. Harvard Educational Review, 78(1), 40-59.

Skehan, Peter. 1998. A cognitive approach to language learning. Oxford University Press.

Skehan, Peter. 2009. Modelling second language performance: Integrating complexity, accuracy, fluency, and lexis. Applied Linguistics, 30(4), 510-532. https://doi.org/10.1093/applin/amp047

Skehan, Peter. 2015. Limited Attention Capacity and Cognition: Two hypotheses regarding second language performance on tasks. In M. Bygate (Ed.), Domains and Directions in the Development of TBLT: A decade of plenaries from the international conference. John Benjamins Publishing Company.

Speer, Robin, Chin, Joshua, Lin, Andrew, Jewett, Sara, & Nathan, Lance. 2018. Luminosoinsight/wordfreq: v2.2, October.

Teng, Mark Feng, & Huang, Jing. 2021. The effects of incorporating metacognitive strategies instruction into collaborative writing on writing complexity, accuracy, and fluency. Asia Pacific Journal of Education. https://doi.org/10.1080/02188791.2021.1982675

The JAMOVI Project. 2022. JAMOVI (Version 2.3) [Computer Software]. Retrieved from https://www.jamovi.org

Toulmin, Stephen. 1958. The Uses of Argument. Cambridge University Press.

Vasylets, Olena, Mellado, M. Dolores, & Plonsky, Luke. 2022. The role of cognitive individual differences in digital versus pen-and-paper writing. Studies in Second Language Learning and Teaching, 12(4), 721-743. https://doi.org/10.14746/ssllt.2022.12.4.9

Wolfe-Quintero, Kate, Inagaki, Shunji, & Kim, Hae-Young. 1998. Second Language Development in Writing: Measures of Fluency, Accuracy, and Complexity. University of Hawaii Press.

Yoon, Hyung-Jo, & Polio, Charlene. 2017. The Linguistic Development of Students of English as a Second Language in Two Written Genres. TESOL Quarterly, 51(2), 275-301. https://doi.org/10.1002/tesq.296

Zenker, Fred, & Kyle, Kristopher. 2021. Investigating minimum text lengths for lexical diversity indices. Assessing Writing, 47, 100505. https://doi.org/10.1016/j.asw.2020.100505
