Conrad 2023
Article
Susan Conrad*
Register in corpus linguistics: the role and
legacy of Douglas Biber
https://doi.org/10.1515/cllt-2022-0032
Received April 16, 2022; accepted November 3, 2022; published online November 25, 2022
1 Introduction
This article provides an overview of the pivotal role that Douglas Biber’s work has
played in the development of register as both a research focus within corpus
linguistics and a theoretical construct important for understanding humans’ use
of language. Given his prodigious output on multiple fronts, condensing his
contributions into a short article is a daunting task, which could be organized in
numerous ways. I frame my commentary around four developmental phases that
I perceive from having been a periodic collaborator as well as reader of his work
over the past several decades. The phases correspond roughly to spans of years,
but the threads of work within them overlap and intertwine throughout the
phases. Over the years he has also worked with numerous collaborators, and I
*Corresponding author: Susan Conrad, Applied Linguistics, Portland State University, Box 751,
Portland, OR, USA, E-mail: conrads@pdx.edu
over the years: Any general description of a language would hide important dif-
ferences about the language used in different situations, and any single perspective
on a register gave an incomplete representation of it. Other better-known corpus
work of the time, especially the COBUILD project (Sinclair 1991), was not covering
register variation.
During this phase, Biber also started work on a more specific thread of
research within the study of speech-writing differences: the nature of gram-
matical complexity. In the 1988 MDA of English, grammatical complexity was
represented by different features on different dimensions, with spoken registers
as well as written registers having grammatical complexity, but of different
kinds. More focused work was required to understand complexity more fully.
Biber’s (1992) study of discourse complexity used confirmatory factor analysis to
test the predictive adequacy of theoretically motivated models of complexity. The
findings showed that spoken registers were more unified in their complexity
profile, while written registers had more variation, suggesting speech may
constrain grammatical complexity in some ways. This study was one step on an
empirical path and associated theorizing that continue in Biber’s work
throughout the phases.
Though his first MDA study was of English, Biber was immediately interested in
investigating other languages, too, recognizing that the question of speech-writing
differences was fundamental to human language, not just to a single, specific lan-
guage. His next book, Dimensions of Register Variation: A Cross-Linguistic Compar-
ison (Biber 1995) included MD analyses of Nukulaelae Tuvaluan, Somali, and Korean
which had appeared in other articles and his students’ dissertations. The book’s
concluding chapter discusses the possibility of cross-linguistic universals of register
variation. Most intriguing was the finding of some systematic linguistic variation
related to real-time production circumstances for spoken texts versus time for careful
production with planning and editing for written texts.
Another thread of research Biber started in this phase concerned the his-
torical development of registers. His work on Somali examined the linguistic
consequences of literacy there, showing that the range of linguistic diversity
increased when written registers were added (Biber and Hared 1991). Studying
English from 1650 to the present, he found patterns that first reflected writers
taking advantage of the careful production circumstances of writing to produce
discourse that had more integration of information and more precise meanings,
but later written registers splitting into those for more specialized audiences and
those for more popular, general audiences (Biber 1995: Ch. 8; Biber and Finegan
1989a; Biber and Finegan 1997).
In this first phase of work, and at the beginning of the next phase, Biber sought
to reach an audience of traditional sociolinguists, trying to integrate register into
more general theories of language variation. Using evidence from MDA work and
other phonological, lexical, and grammatical studies, he and Finegan advanced
“an integrated theory” of register and social dialect variation (Finegan and Biber
1994), later called the “Register Axiom” (Finegan and Biber 2001). They argued that
the empirical evidence showed that variation due to the situation of use (in other
words, register variation) underlay much of what had been identified as social
dialect variation. The argument was criticized on many levels, a primary one being
the emphasis given to writing (Bell 1995; Preston 2001). Biber’s interest in a wide range
of registers, function-related linguistic differences, and corpus techniques was not
well received by the traditional sociolinguistics community, where quantitative
studies usually focused on analyses of variants that had no impact on meaning,
data was typically gathered in sociolinguistic interviews, and spontaneous speech
was given primacy over all other language. As his research progressed, Biber
addressed traditional sociolinguistics audiences less and instead moved to audi-
ences in more general descriptive linguistics, applied linguistics, and especially
the developing field of corpus linguistics.
Early in the development of corpus linguistics, Biber started conducting
studies to help improve corpus design and research methods. In the early 1990s,
his was some of the rare work that addressed reliable sample sizes and repre-
sentativeness empirically (Biber 1990, 1993). In this first phase, too, and as a
contrast with traditional variationist studies, he began emphasizing “the unit of
analysis” for corpus studies – specifically, a feature or a text (Biber et al. 1998:
Methodology Box 8). This issue would gain even more importance over the years.
The book remains the only reference grammar where register comparisons are
integrated throughout. That it has been redesigned and newly published 20 years
later (Biber et al. 2021b) is a testament to its importance and usefulness. Yet when
Biber took on the project years earlier, few people could envision how it would
work. As told to me when I started working on the project in 1994, the then-head of
the Longman dictionary division thought their dictionary corpus ought to be useful
for a grammar, too, and asked Biber if he could take on the project. Although the
general concept of a corpus-based reference grammar was agreed upon, no one
could predict exactly how the concept would translate into a book. Again, Biber’s
computational expertise and research design insights were critical. Biber would
work out the analytical processes for the major corpus investigations, using both
automatic and interactive computer programs, and, with his students, write pro-
grams and conduct the corpus analyses. Meanwhile, he also coordinated the work
of the team of authors and wrote his own chapters. Further complicating the
project was the slow speed of computers as compared to today. I remember, for
example, running the first lexical bundle analysis. This analysis could be done
almost instantaneously today, but in the early 1990s, we left the program to run on
a Friday afternoon. We expected it would take some time over the weekend since
the analysis required moving through every text four words at a time, comparing
the sequence to all sequences already stored, and either adding to the count or
adding the new sequence to the dictionary. Only on Monday, when we arrived to
find the computer still running, did we realize we had forgotten to write a line in the
code to show what file was being processed. Doug and I stood there staring down at
the computer, debating whether to kill the program and potentially lose a week-
end’s worth of work or let it keep going in what could be an endless loop. (Luckily,
we decided to let it run and it finished before the end of the day.) Overall, the size,
complexity, and initial ambiguity of the project would have overwhelmed many
researchers. Randolph Quirk’s comment in the foreword was no exaggeration: “The
co-authors were lucky in being led by a man of such determination, vision, energy,
and a fine track record in corpus theory and computational practice.”
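The counting procedure described above for the first lexical bundle analysis can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for the grammar: the function name, the toy corpus, and the `report` parameter are all hypothetical, the texts are assumed to be tokenized already, and any frequency or distribution cut-off applied afterwards is left out. The one deliberate addition is the progress line that reports which file is being processed, the line the anecdote says was forgotten.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


def count_four_word_sequences(
    texts: Dict[str, List[str]],
    n: int = 4,
    report: Optional[Callable[[str], None]] = print,
) -> Counter:
    """Count every n-word sequence across a set of tokenized texts.

    `texts` maps a (hypothetical) file name to its list of word tokens.
    """
    counts: Counter = Counter()
    for name, words in texts.items():
        # Progress line: show which file is currently being processed.
        if report is not None:
            report(f"processing {name}")
        # Move through the text n words at a time, one word per step.
        for i in range(len(words) - n + 1):
            sequence = tuple(words[i:i + n])
            # A stored sequence has its count increased; a new one is
            # added to the dictionary with a count of 1.
            counts[sequence] += 1
    return counts


# Toy corpus, invented for illustration only.
toy_corpus = {
    "text_a": "i do not know what you mean".split(),
    "text_b": "well i do not know what to say".split(),
}
bundle_counts = count_four_word_sequences(toy_corpus)
```

Because the dictionary lookup is keyed on the sequence itself, each comparison against "all sequences already stored" is a single hash lookup rather than a scan of the stored list, which is one reason the same analysis finishes almost instantly on current hardware.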
Besides the new reference grammar, strands of work that had begun previ-
ously expanded in this phase. Biber continued to research grammatical complexity
(Biber 2003). Multi-dimensional analysis was also becoming more widely known,
with a collection highlighting the variety of domains it had been applied to (Conrad
and Biber 2001) along with more studies of other languages (e.g., Spanish, Biber
et al. 2006) and specialized domains (e.g., outsourced call centers, Friginal 2008).
Many MDA studies found that the expression of stance figured prominently in
a dimension, and the study of stance itself was also an interest of Biber’s. Begun in
the first phase (e.g., Biber and Finegan 1989b), his study of stance diversified in this
second phase. The Longman Grammar highlighted how certain grammatical
The web register studies are innovative in using manual coding and statistical
analysis of situational characteristics, a technique that can be applied to studying
other registers. Again the work highlights Biber’s commitment to accounting for
empirical evidence and to trying new techniques to solve research questions.
Future years are sure to see the impact this work has on studying register variation
and defining the construct of register.
Biber and colleagues have also started new areas of work in this phase. A
notable one concerns the thorny issue of conversational discourse. Although
conversation has long been acknowledged as having chunks with different
purposes and topics, this phenomenon tended to be addressed in intensive
studies of turns, as in conversation analysis, and not systematically addressed in
most corpus studies. Using some techniques from the web register studies, Biber
et al. (2021a) and Egbert et al. (2021) present a system for the functional seg-
menting of conversation – an advancement that will surely lead to understanding
the situational and linguistic variation in conversation more fully.
6 Conclusion
The phases of Biber’s work constitute a developmental arc that has resulted in
register gaining empirical and theoretical status in studies of language use. Other
scholars have worked on register variation, but none have so consistently argued
its importance, produced so much empirical evidence, or worked so persistently
to improve both research methods and the conceptualization of register. Some
scholars work in just one area, focusing for instance on cross-linguistic studies,
diachronic studies, academic registers, or methodological improvements, but
Biber has been prolific in numerous areas. Although each area is interesting on
its own, he saw each contributing empirical evidence to the larger picture of how
human language varies based on situations of use.
Biber’s contribution to register studies has also extended to being exceed-
ingly generous with his time and expertise in mentoring the next generation.
Many of his publications are collaborations with his students. Look at current
professionals whose work explores or applies register variation, and you will find
a remarkable number of his former students, whether university faculty, lan-
guage teachers, journal editors, book authors, principal investigators on major
grants, or computer programmers. The specific skills and opportunities offered
by his mentoring are invaluable. Yet, from my perspective almost 30 years after
having been his graduate student, an even greater, longer-lasting impact of his
mentoring actually comes from something else: experiencing his general attitude
toward research and problem-solving. He makes clear through his own actions
that you can commit to a challenging research goal without knowing exactly how
you are going to accomplish it all and even realizing that you probably don’t
already know everything you will need to know. In other words, the appropriate
response to an important theoretical or practical problem that intrigues you is
to dive in and keep figuring out ways to work on it.
Not just his students but also international collaborators, visitors to Northern
Arizona University’s corpus linguistics lab, and even conference attendees who
have talked with him have benefitted from Doug sharing his expertise and his
enthusiasm for research. I have lost track of the number of people who have
spoken or emailed with him and later told me they didn’t expect someone so
smart and well-known to be so nice and helpful. In sum, Douglas Biber’s legacy
extends not just throughout the literature about register, but throughout the
human community as well.
References
Bell, Allan. 1995. Review of Sociolinguistic perspectives on register edited by Douglas Biber and
Edward Finegan. Language in Society 24(2). 265–270.
Berber-Sardinha, Tony & Marcia Veirano-Pinto (eds.). 2014. Multi-dimensional analysis, 25 years
on: A tribute to Douglas Biber. Philadelphia: John Benjamins.
Berber-Sardinha, Tony & Marcia Veirano-Pinto (eds.). 2019. Multi-dimensional analysis: Research
methods and current issues. London: Bloomsbury.
Biber, Douglas. 1984. A model of textual relations within the written and spoken modes. Los
Angeles: University of Southern California dissertation.
Biber, Douglas. 1986. Spoken and written textual dimensions in English: Resolving the
contradictory findings. Language 62(2). 384–414.
Biber, Douglas. 1988. Variation across speech and writing. Cambridge: Cambridge University
Press.
Biber, Douglas. 1990. Methodological issues regarding corpus-based analyses of linguistic
variation. Literary and Linguistic Computing 5(4). 257–269.
Biber, Douglas. 1992. On the complexity of discourse complexity: A multidimensional analysis.
Discourse Processes 15(2). 133–163.
Biber, Douglas. 1993. Representativeness in corpus design. Literary and Linguistic Computing
8(4). 243–257.
Biber, Douglas. 1995. Dimensions of register variation: A cross-linguistic comparison. Cambridge:
Cambridge University Press.
Biber, Douglas. 2003. Compressed noun phrase structures in newspaper discourse: The
competing demands of popularization vs. economy. In Jean Aitchison & Diana Lewis (eds.),
New media discourse, 169–181. New York: Routledge.
Biber, Douglas. 2004. Historical patterns for the grammatical marking of stance: A cross-register
comparison. Journal of Historical Pragmatics 5(1). 107–136.
Biber, Douglas. 2006. University language: A corpus-based study of spoken and written registers.
Amsterdam: John Benjamins.
Biber, Douglas. 2012. Register as a predictor of linguistic variation. Corpus Linguistics and
Linguistic Theory 8(1). 9–37.
Biber, Douglas. 2014. Using multi-dimensional analysis to explore cross-linguistic universals of
register variation. Languages in Contrast 14(1). 7–34.
Biber, Douglas & Federica Barbieri. 2007. Lexical bundles in university spoken and written
registers. English for Specific Purposes 26. 263–286.
Biber, Douglas & Susan Conrad. 2001. Quantitative corpus research – much more than bean
counting. TESOL Quarterly 35(2). 331–336.
Biber, Douglas & Susan Conrad. 2009/2019. Register, genre, and style. Cambridge: Cambridge
University Press.
Biber, Douglas & Jesse Egbert. 2016. Register variation on the searchable web: A multi-
dimensional analysis. Journal of English Linguistics 44(2). 95–137.
Biber, Douglas & Jesse Egbert. 2018. Register variation online. Cambridge: Cambridge University
Press.
Biber, Douglas & Edward Finegan. 1989a. Drift and the evolution of English style: A history of three
genres. Language 65(3). 487–517.
Biber, Douglas & Edward Finegan. 1989b. Styles of stance in English: Lexical and grammatical
marking of evidentiality and affect. Text 9(1). 93–124.
Biber, Douglas & Edward Finegan. 1997. Diachronic relations among speech-based and written
registers in English. In Terttu Nevalainen & Leena Kahlas-Tarkka (eds.), To explain the
present: Studies in the changing English language in honour of Matti Rissanen, 253–276.
Helsinki: Société Néophilologique.
Biber, Douglas & Bethany Gray. 2010. Challenging stereotypes about academic writing:
Complexity, elaboration, explicitness. Journal of English for Academic Purposes 9. 2–20.
Biber, Douglas & Mohamed Hared. 1991. Literacy in Somali: Linguistic consequences. Annual
Review of Applied Linguistics 12. 260–282.
Biber, Douglas & James K. Jones. 2009. Quantitative methods in corpus linguistics. In
Anke Lüdeling & Merja Kytö (eds.), Corpus linguistics: An international handbook,
1286–1304. Berlin: Walter de Gruyter.
Biber, Douglas & Jerry Kurjian. 2007. Towards a taxonomy of web registers and text types: A multi-
dimensional analysis. In Marianne Hundt, Nadja Nesselhauf & Carolin Biewer (eds.), Corpus
linguistics and the web, 109–132. Amsterdam: Rodopi.
Biber, Douglas & Elena Seoane (eds.). 2021. Corpus-based approaches to register variation.
Amsterdam: John Benjamins.
Biber, Douglas, Susan Conrad & Randi Reppen. 1998. Corpus linguistics: Investigating language
structure and use. Cambridge: Cambridge University Press.
Biber, Douglas, Susan Conrad & Viviana Cortes. 2004a. “Take a look at…”: Lexical bundles in
university teaching and textbooks. Applied Linguistics 25(3). 371–405.
Biber, Douglas, Susan Conrad, Randi Reppen, Patricia Byrd & Marie Helt. 2002. Speaking and
writing in the university: A multi-dimensional comparison. TESOL Quarterly 36(1). 9–48.
Biber, Douglas, Eniko Csomay, James Jones & Casey Keck. 2004b. A corpus linguistic investigation
of vocabulary-based discourse units in university registers. In Ulla Connor & Thomas Upton
(eds.), Applied corpus linguistics, 53–72. Amsterdam: Rodopi.
Biber, Douglas, Mark Davies, James Jones & Nicole Tracy-Ventura. 2006. Spoken and written
register variation in Spanish: A multi-dimensional analysis. Corpora 1. 7–38.
Biber, Douglas, Bethany Gray & Kornwipa Poonpon. 2011. Should we use characteristics of
conversation to measure grammatical complexity in L2 writing development? TESOL
Quarterly 45(1). 5–35.
Biber, Douglas, Jesse Egbert & Daniel Keller. 2020. Reconceptualizing register in a continuous
situational space. Corpus Linguistics and Linguistic Theory 16(3). 581–616.
Biber, Douglas, Jesse Egbert, Daniel Keller & Stacey Wizner. 2021a. Towards a taxonomy of
conversational discourse types: An empirical corpus-based analysis. Journal of Pragmatics
171. 20–35.
Biber, Douglas, Bethany Gray, Shelley Staples & Jesse Egbert. 2022. The register-functional
approach to grammatical complexity. New York: Routledge.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. The
Longman grammar of spoken and written English. Harlow, England: Pearson Education.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 2021b. The
grammar of spoken and written English. Amsterdam: John Benjamins.
Conrad, Susan & Douglas Biber (eds.). 2001. Variation in English: Multi-dimensional studies.
Harlow: Pearson Education.
Crystal, David & Derek Davy. 1969. Investigating English style. London: Routledge.
Egbert, Jesse, Douglas Biber & Bethany Gray. 2022. Designing and evaluating language corpora:
A practical framework for corpus representativeness. Cambridge: Cambridge University
Press.
Egbert, Jesse, Stacey Wizner, Daniel Keller, Douglas Biber, Tony McEnery & Paul Baker. 2021.
Identifying and describing functional discourse units in the BNC spoken. Text & Talk 41(5–6).
715–737.
Finegan, Edward & Douglas Biber. 1994. Register and social dialect variation: An integrated
approach. In Douglas Biber & Edward Finegan (eds.), Sociolinguistic perspectives on register,
315–347. New York: Oxford University Press.
Finegan, Edward & Douglas Biber. 2001. Register variation and social dialect variation: The
register axiom. In Penelope Eckert & John Rickford (eds.), Style and sociolinguistic variation,
235–267. Cambridge: Cambridge University Press.
Friginal, Eric. 2008. Linguistic variation in the discourse of outsourced call centers. Discourse
Studies 10(6). 715–736.
Friginal, Eric. 2013. Twenty-five years of Biber’s multi-dimensional analysis: Introduction to the
special issue and an interview with Douglas Biber. Corpora 8(2). 137–152.
Halliday, Michael A. K. 1978. Language as social semiotic: The social interpretation of language
and meaning. London: Edward Arnold.
Hymes, Dell. 1974. Foundations in sociolinguistics. Philadelphia: University of Pennsylvania
Press.
Labov, William. 1972. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Lee, David. 2001. Genres, register, text types, domains, and styles: Clarifying the concepts and
navigating a path through the BNC jungle. Language Learning & Technology 5(3). 37–72.
Preston, Dennis. 2001. Style and the psycholinguistics of sociolinguistics: The logical problem of
language variation. In Penelope Eckert & John Rickford (eds.), Style and sociolinguistic
variation, 279–304. Cambridge: Cambridge University Press.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A comprehensive
grammar of the English language. London: Longman.