Quantifying Language Dynamics
On the Cutting Edge
of Areal and Phylogenetic Linguistics

Edited by

Søren Wichmann
Jeff Good

Leiden | Boston
The following articles published in this paperback originally appeared in Brill's journal Language Dynamics
and Change:

Bentz, Christian and Bodo Winter. 2013. Languages with More Second Language Learners Tend to Lose
Nominal Case. Language Dynamics and Change 3: 1-27.
Hammarström, Harald and Tom Güldemann. 2014. Quantifying Geographical Determinants of Large-Scale
Distributions of Linguistic Features. Language Dynamics and Change 4: 87-115.
Jäger, Gerhard. 2014. Phylogenetic Inference from Word Lists Using Weighted Alignment with Empirically
Determined Weights. Language Dynamics and Change 3: 245–291.
Michael, Lev, Will Chang, and Tammy Stark. 2014. Exploring Phonological Areality in the Circum-Andean
Region Using a Naive Bayes Classifier. Language Dynamics and Change 4: 27-86.

Library of Congress Control Number: 2014944681.

This publication has been typeset in the multilingual “Brill” typeface. With over 5,100 characters covering
Latin, IPA, Greek, and Cyrillic, this typeface is especially suitable for use in the humanities. For more
information, please see www.brill.com/brill-typeface.

ISBN 978-90-04-28131-8 (paperback)
ISBN 978-90-04-28152-3 (e-book)

Copyright 2014 by Koninklijke Brill NV, Leiden, The Netherlands.
Koninklijke Brill NV incorporates the imprints Brill, Brill Nijhoff, Global Oriental and Hotei Publishing.
All rights reserved. No part of this publication may be reproduced, translated, stored in a retrieval system,
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,
without prior written permission from the publisher.
Authorization to photocopy items for internal or personal use is granted by Koninklijke Brill NV provided
that the appropriate fees are paid directly to The Copyright Clearance Center, 222 Rosewood Drive,
Suite 910, Danvers, MA 01923, USA. Fees are subject to change.

This book is printed on acid-free paper.


Contents

Notes on Contributors

Introduction
Søren Wichmann and Jeff Good

Exploring Phonological Areality in the Circum-Andean Region
Using a Naive Bayes Classifier
Lev Michael, Will Chang and Tammy Stark

Quantifying Geographical Determinants of Large-Scale Distributions
of Linguistic Features
Harald Hammarström and Tom Güldemann

Languages with More Second Language Learners Tend to Lose
Nominal Case
Christian Bentz and Bodo Winter

Using Phylogenetic Networks to Model Chinese Dialect History
Johann-Mattis List, Shijulal Nelson-Sathi, William Martin
and Hans Geisler

Phylogenetic Inference from Word Lists Using Weighted Alignment
with Empirically Determined Weights
Gerhard Jäger

Does Structural-Typological Similarity Affect Borrowability?
A Quantitative Study on Affix Borrowing
Frank Seifart

Index of Authors
Index of Languages, Dialects, Language Groups and Families
Subject Index
Notes on Contributors

Christian Bentz
studied Germanistics, Macroeconomics, and Philosophy in Heidelberg and
Rome. He was a visiting researcher at the Cognitive Neuroscience Lab at Cor-
nell University and at the Max Planck Institute for Evolutionary Anthropology.
He received his MPhil in English and Applied Linguistics from the University
of Cambridge, where he is currently a Ph.D. student in Computation, Cognition
and Language.

Will Chang
is a graduate student at the University of California, Berkeley. He works on
phylogenetic models and Polynesian languages.

Hans Geisler
completed his Ph.D. on typological evolution from Latin to French at the
University of Munich in 1980. In 1987 he became Private Lecturer with a disser-
tation (Habilitationsschrift) on sound change in Romance languages. In 1996
he received an appointment as Professor at Heinrich Heine University Düssel-
dorf where he is currently Chair at the Department of Romance Languages and
Literatures.

Tom Güldemann
is Professor of African linguistics at the Humboldt University Berlin and is
also associated with the Max Planck Institute for Evolutionary Anthropology in
Leipzig. He specializes in language typology, historical linguistics, and language
documentation and description, with a field research focus on Khoisan and
Bantu languages.

Harald Hammarström
Ph.D. (2009), Chalmers University, is Research Staff at the Max Planck Institute
of Psycholinguistics, Nijmegen. He has published papers and monographs in
computational linguistics and linguistic typology.

Gerhard Jäger
Dr. phil. (1996 at Humboldt University Berlin), is Professor of General
Linguistics at Tübingen University. He has published on formal semantics, optimality
theory, game theoretic linguistics and language evolution. Since 2013
he has held an ERC Advanced Grant, “Language Evolution: The Empirical Turn
(EVOLAEMP)”.

Johann-Mattis List
Ph.D. (2013), Heinrich Heine University Düsseldorf, is Post-Doctoral Researcher
at Philipps-University Marburg. He has published several articles on compu-
tational methods in historical linguistics, including “Networks of lexical bor-
rowing and lateral gene transfer in language and genome evolution” (Bioessays,
2014).

William Martin
completed his Ph.D. in 1988 in Cologne with Heinz Saedler on molecular genet-
ics and plant evolution. He then joined Rüdiger Cerff at the University of
Braunschweig to work on molecular evolution and endosymbiosis. In 1999 he
received an appointment as Professor at Heinrich Heine University Düsseldorf
where he is currently head of the Institute for Molecular Evolution.

Lev Michael
Ph.D. (2008), University of Texas at Austin, is Associate Professor of Linguistics
at the University of California, Berkeley. He has carried out fieldwork on several
Amazonian languages, including Iquito (Zaparoan), Nanti (Arawak), and
Máíhɨ̃ki (Tukanoan), and has published on the anthropological, comparative,
and areal linguistics of Amazonian languages.

Shijulal Nelson-Sathi
Ph.D. (2013), Heinrich Heine University Düsseldorf, is a Post-Doctoral Re-
searcher at Heinrich Heine University Düsseldorf under William F. Martin on
Molecular Evolution.

Frank Seifart
works at the Max Planck Institute for Evolutionary Anthropology and the University
of Amsterdam and coordinates a project on frequencies of nouns and
verbs cross-linguistically. His main research interests are linguistic typology,
language history and contact, and documentation and description of Bora-
Miraña and Resígaro (North West Amazon).

Tammy Stark
is a graduate student at the University of California, Berkeley. Her research
focuses on morphosyntactic variation in the Northern Arawak languages of
South and Central America.

Bodo Winter
Ph.D. candidate at the Cognitive and Information Sciences group, University of
California, Merced, does research within the domain of experimental cognitive
linguistics, language evolution and statistical methods.

Editor Biographies

Jeff Good
Ph.D. (2003), University of California, Berkeley, is Associate Professor of Lin-
guistics at the University at Buffalo. His research interests include comparative
Niger-Congo linguistics, morphosyntactic typology, and language documen-
tation. He has published in Language, Diachronica, and Morphology, among
others, and serves as General Editor of Language Dynamics and Change.

Søren Wichmann
Ph.D. (1996), University of Copenhagen, is Senior Scientist at the Max Planck Institute
for Evolutionary Anthropology and works on historical linguistics, typology,
and Mesoamerican languages, often applying quantitative, computational
methods. He is founder and General Editor of the journal Language Dynamics
and Change.
Introduction
Søren Wichmann
Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany

Jeff Good
University at Buffalo, Buffalo, New York, USA

With this book, which contains selected papers that are published or forthcoming
in the journal Language Dynamics and Change (LDC), we wish to celebrate
the fact that the journal, launched in 2011, will soon enter its fifth year. When the
first author of this introduction was approached in February 2010 by
a representative from Brill, the publisher's idea was to start a new journal for
historical linguistics. The first author was interested in this idea, but he had some
additional motivations, which the second author also supported. One motivation
was that, at the time, there was no obvious outlet for papers addressing
questions of historical linguistic relevance through proper statistical hypothesis
testing. Another motivation was to create a forum for the wider field of language
dynamics discussed in Wichmann (2008) and Loreto et al. (2011). Here we
reproduce the tentative definition of language dynamics by Wichmann (2014:
303) with some minor modifications:

The study, through observations, reconstructions or simulations, and,
whenever possible, quantitative methods, of processes of emergence,
change and interaction of languages at any time scale, possibly in relation
to processes within or among human agents and their specific sociohistorical
and geographic environments.

A portion of this definition would describe historical linguistics in a traditional
sense, which tends to focus on the study, through observations and reconstructions,
of processes of change and interaction of languages at a time scale where
the comparative method can be applied. Shortening the time scale further also
brings in dialectology and sociolinguistics. Widening it opens up to the study
of language evolution and large-scale historical typology. Considering human
agents in interaction brings in the sociology of language and studies of language
competition; and modeling processes within human agents invites psycho-
and neurolinguistic perspectives. Explicitly bringing ‘environment’ into the
picture follows recent shifts in typological investigation that emphasize areal
and historical explanations for contemporary typological distributions over
universalist ones. Computational simulation is a tool which can aid in all
these enterprises, but one which is alien to historical linguistics as traditionally
practiced. Finally, the definition deliberately avoids viewing the field solely as
a branch of linguistics, since other fields, such as archaeology, genetics, and
geography, are clearly important for understanding key aspects of linguistic
diachrony.

© Koninklijke Brill NV, Leiden, 2014 | doi: 10.1163/9789004281523_001
Within the wide range of topics which we deem suitable for LDC, there
has—predictably and not at all regrettably—been a clear dominance of papers
relating to linguistic diachrony in a quantitative perspective. This, of course,
reflects the fact that work of this kind has been steadily increasing in the
last decade, but we also hope that the existence of an outlet like LDC has
encouraged investigators to embark upon investigations of language dynamics
knowing that there is a journal which welcomes such work. Here we present
some of these papers, selected on two criteria: quality and thematic coherence.
The papers appear as they did or will appear in various issues of LDC, with the
exception of a few typographical corrections and bibliographical updates.
The first two papers (Michael et al. and Hammarström & Güldemann) are
areal-typological, exploring geographical distributions of linguistic features
and thus emphasizing, in particular, the environmental component of language
dynamics. The next two (Bentz & Winter and Seifart) examine possible predictions
about the linguistic outcome of language contact. These, too, consider
environmental features as they relate to language change, although it is the
interaction between agents and their languages rather than the geographical
environment which they consider. The last two (List et al. and Jäger) represent
different approaches to phylogenetics. As such, they are interested in one of
the core issues of traditional historical linguistics—language classification—
approaching it with new kinds of quantitative methods. In the following we
provide brief summaries of the papers. These are meant only to give a general
feeling for the nature of the individual contributions and to provide some
context for the volume. We do not attempt to explain the methods introduced
in anything but very superficial ways, since these are amply described in the
papers themselves.
Michael, Chang, and Stark illustrate a novel method for delineating linguistic
areas in a more explicit and principled way than is current in areal linguistics,
where impressionistic methods tend to prevail. Following this method, the
researcher first selects languages that seem like good candidates for belonging
to a certain linguistic area—the core area—as well as a control group for
which there are good reasons to assume that little or no interaction has taken
place with the core group. Additional intermediate languages for which no
hypotheses are made concerning membership in either the core or the control
group can be added to the dataset under investigation. Next, Bayesian logic is
applied in order to estimate the probability that a given language belongs to
the core area group given its inventory of features in some linguistic domain.
This probability is not interpreted as some sort of evidence for the linguistic
area or for the membership therein of individual languages. Rather, it serves as
a descriptive means of distinguishing between more or less focal members. The
paper applies this method using a dataset of phonological inventories of South
American languages, with the aim of delineating a linguistic area in the Andean
highlands and adjacent regions. The method was not only able to identify such
an area but was also able to further delineate two major subareas within the
larger Andean area, in both cases confirming linguists’ intuitions.
Drawing upon data from thousands of languages, Hammarström and Gül-
demann set out to test geographical correlates of the distribution of linguis-
tic features in two domains: numeral systems and basic word order. Testing
whether climatic zones have an effect on their distribution, they find that
homogeneity of numeral systems tends to be higher within different climatic
zones than within zones picked randomly, but the same is not the case for basic
word order. The difference between the two domains is tentatively ascribed to the
possibility that the link between climate and numeral systems is mediated by
subsistence strategies, which have some correlation with numeral systems (as
per earlier, unpublished work by Hammarström). They further test whether
there is a preference for similar features to be distributed along an east-west
axis, rather than a north-south one, as might be expected from the work of Dia-
mond (1997), and find some evidence for this distribution in both domains. A
final test is whether areas defined in terms of languages having identical val-
ues for the linguistic features investigated tend to be bounded by coastlines
and mountains/valleys to a greater extent than randomly-picked, spatially-
coherent geographic areas. The authors do not find such an effect. The paper
takes great care to present every step of its analyses to facilitate replication and
presents neat solutions to practical problems when dealing with the messy real-
ity of geography.
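The logic of comparing observed zones with randomly picked ones can be sketched as a permutation test. What follows is a minimal illustration and not the authors' procedure: the homogeneity measure (share of languages carrying the majority value of their zone), the function names, and the toy data are all assumptions made for the example, and shuffling zone labels ignores the spatial coherence that the paper itself takes care to respect.

```python
import random
from collections import Counter, defaultdict

def homogeneity(zones, values):
    """Share of languages that carry the most common value of their zone."""
    by_zone = defaultdict(list)
    for zone, value in zip(zones, values):
        by_zone[zone].append(value)
    majority = sum(Counter(vs).most_common(1)[0][1] for vs in by_zone.values())
    return majority / len(values)

def permutation_p_value(zones, values, n_perm=2000, seed=0):
    """Compare observed homogeneity with homogeneity under shuffled zone labels."""
    rng = random.Random(seed)
    observed = homogeneity(zones, values)
    shuffled = list(zones)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of languages to zones
        if homogeneity(shuffled, values) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data, invented for illustration only.
zones = ["tropical"] * 6 + ["arid"] * 6
values = ["decimal"] * 6 + ["restricted"] * 6
observed, p_value = permutation_p_value(zones, values)
```

A small p-value here only says that the feature values cluster by zone more than a random relabeling would produce; it says nothing about why, which is where mediating factors such as subsistence strategies enter the discussion.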
Trudgill (2011) and others have argued that language change induced by
adult L2 learners is expected to lead to reduced grammatical complexity. Bentz
and Winter set as their goal to test this claim statistically. They look specifically
at case distinctions. The authors were able to obtain information on both the
number of cases and the proportion of L2 speakers for 66 languages. The statistically
well-controlled analysis revealed that languages with more L2 speakers
tend to have fewer cases and also that there is an inverse relation between the
proportion of L2 speakers and the number of nominal case markers. Because of
the care with which the analyses are carried out and the thorough discussion of
possible confounding factors, this paper serves as a good model for future stud-
ies testing hypotheses concerning the linguistic outcomes of different language
contact situations.
There is a widespread belief among students of language contact, expressed
explicitly by Weinreich (1953) and several others, that structural features of a
borrowing language constrain the kinds of morphemes that can be borrowed.
Seifart’s contribution is the first systematic test of this claim. Using his own
database of morphological borrowing and information on structural features
from Dryer and Haspelmath (2011) supplemented by some additional data
points, the author inspects 78 pairs of donor and borrowing languages to see
whether there is a correlation between structural similarity in general and the
amount of affix borrowing, counted as the number of broad morphological
categories expressed by the borrowed affixes. The straightforward and unam-
biguous result of this test is that there is no such relationship to be found:
the number of affix categories borrowed cannot be predicted from the struc-
tural distance between the donor and the borrowing language. This paper thus
serves as an exemplary case study demonstrating how the availability of large-
scale typological databases combined with quantitative methods allows for the
rigorous examination of long-standing hypotheses previously supported pri-
marily by impressionistic evidence.
List, Nelson-Sathi, Martin, and Geisler present a phylogenetic method for
characterizing relationships among languages that builds both vertical and
lateral transmission (inheritance and borrowing) into its basic design, and they
illustrate this method using Chinese data. The beauty of the method is its small
number of assumptions. Initially no decision is made about what is a borrowing
and what is not. It is only decided which words are homologs—i.e., whether
they are related in one or the other way. For instance, English mountain is
homologous with French montagne and Spanish montaña, but only the latter
two are inherited vertically. If mountain were inherited in the same way that
montagne and montaña are, the proto-language from which Germanic and
Romance both descend would have to have had two synonyms for mountain,
one for the proto-form giving rise to mountain, montagne, etc., and one for the
proto-form giving rise to German Berg, Danish bjerg, etc. With the additional
assumption that a proto-language should not have many more synonyms than
its descendant languages, forms such as mountain, which would contribute
to the proliferation of proto-synonyms and whose distributions also have a
poor fit with the structure of a reference tree (developed independently), are
singled out as candidates for being loanwords. In a network of languages where
the skeleton is the reference tree, such lateral connections can be depicted
as edges connecting languages or intermediate proto-languages criss-crossing
the reference tree. This method thus extends popular existing phylogenetic
techniques which are more strongly oriented towards modeling vertical over
lateral transmission.
Jäger’s paper represents a significant step forward in the application of
distance-based methods to lexical language data for the purpose of producing
phylogenies. In dialectological studies, it has repeatedly been found that simple
Levenshtein distance (LD) as a tool for producing an overall measure of distinctiveness
between forms is reasonably accurate, even if the LD simply counts
the number of operations needed to transform one string into another without
taking into account differences in the phonetic interpretations of aligned
symbols via some sort of system weighting some changes as more likely than
others. A possible reason why elaborate weighting schemes have not been very
influential is that assumptions about how to define weights are likely to be
controversial if they are based solely on theoretical criteria such as the nature
of phonetic features involved in a change. Instead, Jäger determines weights
empirically, first by developing a conservative criterion for cognacy and then,
after aligning cognates, estimating the weights based on the alignments found.
A large-scale test of the results shows that weights, when properly assigned, do
in fact improve the accuracy of classifications.
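The contrast between plain and weighted Levenshtein distance can be made concrete with a short sketch. It is an illustration under stated assumptions, not Jäger's procedure: the function names are hypothetical, and the substitution weight for the pair t/d is invented for the example rather than estimated from aligned cognates.

```python
def edit_distance(a, b, sub_cost=None, indel_cost=1.0):
    """Dynamic-programming edit distance between two symbol sequences.

    With the default sub_cost this is plain Levenshtein distance; passing a
    custom sub_cost(x, y) makes some substitutions cheaper than others.
    """
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete x
                curr[j - 1] + indel_cost,      # insert y
                prev[j - 1] + sub_cost(x, y),  # substitute x by y, or match
            ))
        prev = curr
    return prev[-1]

def toy_sub_cost(x, y):
    """Hypothetical weights: a t/d substitution counts as a minor change."""
    if x == y:
        return 0.0
    return 0.3 if {x, y} == {"t", "d"} else 1.0

plain = edit_distance("tag", "dag")                   # one full substitution
weighted = edit_distance("tag", "dag", toy_sub_cost)  # same change, cheaper
```

Under the plain measure every substitution costs the same, so a change between phonetically close symbols is penalized as heavily as one between unrelated symbols; a weighting function removes that limitation, at the price of having to justify the weights.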
What unites the papers in this volume is that they bring large datasets and
sophisticated statistical arguments to bear on simple and fundamental ques-
tions in language dynamics. The advantage of hypothesis-driven statistical
inquiry over common-sensical reasoning supported merely by cherry-picked
examples is that the former greatly helps us to distinguish between produc-
tive avenues of investigation and dead ends, whereas the latter only sets up a
kind of magnetic field where opinions will fluctuate between opposite poles
without any detectable progress. For instance, the results of the paper by Ham-
marström and Güldemann favor a model of population interaction on a large
scale where geographic axes should be taken into account, while other possi-
ble geographic parameters, such as the presence of geographical boundaries,
can largely be ignored. Bentz and Winter show that the hypothesis that simplification
of grammatical systems can be a consequence of interference from
L2 speakers is worth pursuing further through studies on other domains of
grammars and with an extended empirical base. In contrast, Seifart’s results
strongly suggest that another popular idea, that there are grammatical con-
straints on borrowing, is basically a dead end. There is progress in uncovering
the lack of productivity in a particular avenue of investigation, just as there
is in suggesting new methodologies with provable advantages—with the lat-
ter eminently embodied in the contributions by Michael et al., List et al., and
Jäger.

Launching Language Dynamics and Change has been a rewarding experience
because of the innovative character and high quality of the research that
we have had the good fortune to be able to see into print. We are very grateful
to the publisher, Brill, for having been open to an editorial policy challenging
traditional notions about the discipline of historical linguistics and its bound-
aries. First and foremost, however, we are grateful to all the authors who have
entrusted us with the results of the best of their research at a stage where our
journal was still in its infancy.

References

Diamond, Jared. 1997. Guns, Germs and Steel: The Fates of Human Societies. London:
Cape.
Dryer, Matthew S. and Martin Haspelmath (eds.) 2011. The World Atlas of Language
Structures Online. Max Planck Digital Library. Accessible at http://wals.info/.
Loreto, Vittorio, Andrea Baronchelli, Animesh Mukherjee, Andrea Puglisi, and Francesca
Tria. 2011. Statistical physics of language dynamics. Journal of Statistical Mechanics:
Theory and Experiment P04006 (doi: 10.1088/1742-5468/2011/04/P04006).
Trudgill, Peter. 2011. Sociolinguistic Typology: Social Determinants of Linguistic Complexity.
Oxford: Oxford University Press.
Weinreich, Uriel. 1953. Languages in Contact. New York: Linguistic Circle of New York.
Wichmann, Søren. 2008. The emerging field of language dynamics. Language and
Linguistics Compass 2: 442–455.
Wichmann, Søren. 2014. The challenges of language dynamics. Comment on “Modelling
language evolution: Examples and predictions” by Gong, Shuai & Zhang.
Physics of Life Reviews 11: 303–304.
Exploring Phonological Areality
in the Circum-Andean Region
Using a Naive Bayes Classifier
Lev Michael, Will Chang and Tammy Stark
University of California, Berkeley, USA

Corresponding author:
levmichael@berkeley.edu

Abstract

This paper describes the Core and Periphery technique: a quantitative method for
exploring areality that uses a naive Bayes classifier, a statistical tool for inferring class
membership based on training sets assembled from members of the classes in ques-
tion. The Core and Periphery technique is applied to the exploration of phonological
areality in the Andes and surrounding lowland regions, based on the South American
Phonological Inventory Database (SAPhon 1.1.3; Michael et al., 2013). Evidence is found
for a phonological area centering on the Andean highlands, and extending to parts of
the northern and central Andean foothills regions, the Chaco, and Patagonia. Evidence
is also found for Southern and North-Central phonological sub-areas within this larger
phonological area.

Keywords

naive Bayes classifier – linguistic areality – Andean languages – South American
languages

1 Introduction

The goals of this paper are twofold: first, to describe the Core and Periphery
technique, an intuitively appealing quantitative method for exploring large
linguistic datasets for evidence of linguistic areality; and second, to illustrate
the utility of this technique by applying it to a dataset of South American
phonological inventories, focusing on the evidence of phonological areality in
the Andes and surrounding lowland areas.

© Koninklijke Brill NV, Leiden, 2014 | doi: 10.1163/9789004281523_002



Core and Periphery is a method that uses as a starting point linguists’ knowledge
of the languages and history of a region to generate initial hypotheses
regarding ‘cores’: sets of languages that constitute possible linguistic areas
(Campbell et al., 1986; Thomason, 2000; Muysken, 2008), or parts of such areas.
These hypotheses serve as the seed for the application of a statistical technique,
naive Bayes classification (NBC), which determines what features, if any, distinguish
the core languages from other languages in the region, and also to
what degree languages outside the proposed core resemble the core languages.
Those languages deemed core-like, together with the proposed core, constitute
a candidate linguistic area, to be evaluated against pertinent sociohistorical
and geographical facts. If the languages deemed core-like fail to make sense
geographically, then the Core and Periphery technique has failed to identify a
linguistic area around the proposed core.
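As a rough illustration of how a naive Bayes classifier can score languages for core-likeness, consider the following sketch. It is written under simplifying assumptions and should not be read as the implementation described later in the paper: inventories are treated as sets of segments, each segment's presence is modeled independently, add-one smoothing and equal class priors are assumed, and the function names and toy inventories are invented for the example.

```python
import math

def presence_probs(inventories, features, alpha=1.0):
    """Smoothed P(feature present | class), estimated from training inventories."""
    n = len(inventories)
    return {f: (sum(f in inv for inv in inventories) + alpha) / (n + 2 * alpha)
            for f in features}

def log_odds_core(inventory, core_probs, control_probs):
    """Log-odds that an inventory is core-like rather than control-like.

    Equal class priors are assumed, so the prior term cancels out.
    """
    score = 0.0
    for f, p_core in core_probs.items():
        p_ctrl = control_probs[f]
        if f in inventory:
            score += math.log(p_core / p_ctrl)
        else:
            score += math.log((1 - p_core) / (1 - p_ctrl))
    return score

# Toy training sets, invented for illustration only.
core = [{"p", "t", "k", "q", "k'"}, {"p", "t", "k", "q", "kh"}]
control = [{"p", "t", "k", "b", "d"}, {"p", "t", "k", "b", "g"}]
features = sorted(set().union(*core, *control))
core_probs = presence_probs(core, features)
control_probs = presence_probs(control, features)

# Languages outside both training sets are scored against the two classes.
others = {"lang_A": {"p", "t", "k", "q"}, "lang_B": {"p", "t", "k", "b", "d"}}
scores = {name: log_odds_core(inv, core_probs, control_probs)
          for name, inv in others.items()}
```

A positive score marks a language as more core-like than control-like, and the per-feature terms of the sum show which segments drive the distinction. Whether the core-like languages form a plausible area still has to be judged against geography and history, as the text notes.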
The Core and Periphery technique improves on conventional practices of
‘eyeballing’ areas in three ways. First, it provides a quantitative evaluation of the
degree to which the languages of a proposed area in fact exhibit features that
distinguish them from the languages of the larger region containing the pro-
posed area. Second, it provides a quantitative measure of similarity between
languages that can be applied to large datasets, allowing linguists to locate
unexpected similarities that help identify new areas or redefine accepted ones.
And third, quantitative measures of similarity also make it possible to visualize
and cogently discuss the structure of linguistic areas whose boundaries are gra-
dient in nature. Note, however, that Core and Periphery is not strictly speaking
a statistical test of areality, a point we return to in Section 6.
In this paper, we carry out two different Core and Periphery explorations
of phonological areality in the circum-Andean region, first treating the entire
Andean highlands from northern Chile to northern Ecuador as a single core,
and then treating the Andean highlands as consisting of two cores, a Southern
Andean core and a North-Central Andean core. The dividing line between the
latter two cores runs through the southern Peruvian Andes, grouping Cuzco-
Collao Quechua and Jaqaru with the Southern Andean core, while the remain-
ing Quechuan languages constitute the North-Central core. This dual core anal-
ysis is motivated by the qualitative observation that the Southern Andean lan-
guages, delimited in this way, share a number of phonological characteristics
otherwise rare in South America, including a three-way contrast between plain,
aspirated, and ejective stops.
The single core analysis reveals several clusters of languages in the Andean
foothills and adjacent lowland regions that pattern more strongly with the
languages of the Andean core than other lowland languages, including an
Ecuadorean Andean foothills cluster, a Huallaga River valley cluster, a cluster
of Arawak languages of the southern Peruvian Andean foothills, and a cluster
of Chacoan and Patagonian languages. These results support the existence
of a large South American phonological area that encompasses the Andean
highlands and parts of the Andean foothills regions, with a tongue that extends
from the Southern Andes into the Chaco and Patagonia.
The dual core analysis builds on the single core analysis by revealing a
finer structure to this area, showing that the non-Andean languages which
exhibit similarity to Andean languages generally resemble those of the core to
which they are most proximally located. The relevant Chacoan and Patagonian
languages resemble those of the Southern core, and the relevant languages of
Peru and Ecuador resemble those of the North-Central core.
This paper is organized as follows: Section 2 presents a qualitative overview
of the Core and Periphery technique, and Section 3 presents the data to which
this technique is applied, as well as the overall goals of the analysis. A more
technical description of the statistical method underlying the Core and Periphery
technique, the naive Bayes classifier, is provided in Section 4, with additional
details provided in Appendix b.1–b.3. The results of single and dual core analy-
ses are presented and examined in Section 5, and Section 6 evaluates the Core
and Periphery technique, discussing its strengths and weaknesses.

2 The Core and Periphery Technique: A Qualitative Overview

The basic strategy for exploring phonological areality implemented by the
Core and Periphery technique is to use a measure of inter-language similarity
to bootstrap from a given set of geographically clustered and phonologically
similar languages (the ‘proposed core’) to a larger set of similar languages
(the ‘core and periphery’) that are deemed to form a quantitatively consistent
linguistic area.
In a one-core analysis, the first step is to divide the languages of a region
(South America, in our case) into three sets: a proposed core, a control class,
and an equivocal class. The proposed core is a set of languages that are
hypothesized to form a part of a larger linguistic area. The control set consists of
languages that are unlikely to have been in contact with the core languages, and
are therefore deemed unlikely to belong to the core or periphery ex hypothesi.1
The equivocal class is the one about which nothing is claimed in advance.

1 As one reviewer suggested, even languages on another continent could serve as control
languages.

Motivations for choosing a proposed core may include ethnographic or
historical observations that suggest the existence of a culture area, intuitions
regarding areality based on ‘eyeballing’ the linguistic data, or even previous
proposals that the core constitutes a linguistic area. As will become clear,
the original rationale for selecting a particular core is unimportant for the
operation of the quantitative analysis described below, since the results of
that analysis will indicate whether the proposed core does in fact constitute
a distinctive and homogeneous sub-area of a larger linguistic area.
Choosing the control class entails identifying a set of languages that are
unlikely to have been influenced by contact with the core languages. The
ultimate choice of non-core or ‘control’ languages depends a great deal on the
analyst’s knowledge of the history and geography of the region, but we have
generally allowed the possibility of quite distant linguistic influence, leading
us to select control regions that are remote from the cores. In the case of the
single Andean core that we discuss in Section 5.1, for example, we define the
control languages as consisting of all languages further than 1500 kilometers
from the Andean core.2
After determining the three sets (core, control, and equivocal), a naive Bayes
classifier is trained on the proposed core and the control class. These two
classes serve to exemplify the opposite ends of an axis along which the clas-
sifier will then score the languages, including, again, those from the proposed
core and the control class. The highest-scoring languages constitute a refined
hypothesis for a linguistic core, which likely includes most or all of the pro-
posed core, providing it was well chosen to begin with. At the opposite end of
the spectrum, there will be languages with very low scores, most of which will
be non-core languages, if the proposed core was well chosen. Finally, in some
analyses such as ours, there will be languages with intermediate scores that are
geographically clustered near the proposed core. These constitute the periph-
ery.
With the nbc analysis complete, the final step of the Core and Periphery
technique is to evaluate whether the peripheral languages with relatively high
nbc scores were ever plausibly involved in a donor relationship with core
languages, in light of available geographical, ethnohistorical, and archeological
data. If such a relationship is plausible, we attribute the high nbc score to

2 The Core and Periphery results actually suggest that in most cases, the range of phonological
influence of the Andes into the surrounding lowlands does not exceed a few hundred kilo-
meters, but by choosing so distant a control class, we allow for the possibility of more distant
influence.

‘linguistic admixture,’ i.e. the diffusion of linguistic features between one or
more of the core languages and the high-scoring peripheral language, with
the result that it exhibits a mixture of core and non-core features. If the dis-
tribution of high-scoring languages makes no sense geographically or other-
wise, then Core and Periphery essentially fails to support the proposed lin-
guistic area. Note that even when Core and Periphery is successful, the prob-
abilistic nature of nbc, and the limitations of using phonological inventories
as evidence for contact, may yield ‘false positives,’ i.e. languages that exhibit
high nbc scores despite there being no plausible basis for contact between
those languages and core languages. Such languages should be discarded, leav-
ing a phonological area that is defensible both quantitatively and qualita-
tively.
A two-core analysis, in contrast, produces a four-way division of languages
(Core 1, Core 2, the control class, and the equivocal class). The naive Bayes clas-
sifier is trained on each core and the control set, and a three-way classification is
then performed, yielding three scores for each language of the equivocal class,
which indicate similarity to each of the cores and to the control class. Those
languages that obtain high scores for either of the two cores are then evaluated
for plausibly having been in contact with a core language.

3 Dataset and Analytical Goals

3.1 SAPhon
The quantitative exploration of phonological areality presented in this paper
is based on the analysis of the phonological inventories found in the South
American Phonological Inventory Database, version 1.1.3 (SAPhon 1.1.3; Michael
et al., 2013).3 In this section we briefly describe the structure of the database,
and discuss particular decisions that we made in populating the database and
preparing it for quantitative analysis.
SAPhon 1.1.3 incorporates 359 phonological inventories that have been har-
vested from published sources, or contributed by linguists currently working on
the languages in question. This represents over 95 % coverage of South Amer-
ican languages for which phonological descriptions are known to exist in one
form or another.4 The vast majority of inventories in the SAPhon database

3 Available online: http://linguistics.berkeley.edu/~saphon


4 This estimate is based on Fabré’s (2005) extensive bibliography of publications on South
American languages, from which our list of languages is largely drawn.

belong to living languages, but SAPhon also includes inventories from recently
extinct languages, such as Chamicuro (Parker, 1991), as well as inventories based
on the careful interpretation and re-analysis of older resources, as in the case
of Cholón (Alexander-Bakkerus, 2005).
To facilitate quantitative analysis, the phonological inventory of each lan-
guage is coded in a comprehensive phonological feature matrix, with lan-
guages along the y-axis and features along the x-axis,5 with a column for every
phoneme and contrastive suprasegmental feature (e.g. nasal harmony) attested
in a South American language. Each phonological inventory is coded as a row
of ones and zeros in the table, where the presence of a given segment for a
given language is coded as 1 in the appropriate column, and absence coded
as 0. Exhaustively coding the inventories in this fashion relieves us of having to
decide in advance which segments or contrasts are relevant to the exploration
of areality.6
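
As an illustration only, the coding scheme described above might be sketched in Python as follows. The inventories are invented toy data (they are not SAPhon entries), and the function name `build_feature_matrix` is hypothetical.

```python
def build_feature_matrix(inventories):
    """Code each inventory as a row of 0/1 values, one column per attested feature."""
    # Columns: every segment or suprasegmental feature attested in any language.
    features = sorted({f for inv in inventories.values() for f in inv})
    languages = sorted(inventories)
    # Rows: 1 if the language has the feature, 0 otherwise.
    matrix = [[1 if f in inventories[lang] else 0 for f in features]
              for lang in languages]
    return languages, features, matrix


# Toy inventories, invented for illustration only.
toy = {
    "LangA": {"p", "t", "k", "i", "a"},
    "LangB": {"p", "t", "k'", "i", "e", "a"},
}
languages, features, matrix = build_feature_matrix(toy)
```

Because the columns are the union of everything attested, no decision about which segments matter has to be made before the analysis is run.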
We now turn to a number of methodological and analytical issues posed
by the nature of the data on which SAPhon is based. Since SAPhon draws
data from a considerable range of published and unpublished sources, issues
of heterogeneity in those sources pose challenges for the development of the
database, and for the analytical purposes to which we put that data.
The first type of heterogeneity we must contend with is the existence of mul-
tiple, sometimes incompatible, phonological descriptions for a given language.
Since allowing multiple inventories for a given language poses significant ana-
lytical difficulties, we typically select one inventory from among the various
proposed for a given language, preferring those given in works that present
considerable supporting data and analytical detail, and prepared by authors
with substantial linguistic training. We also typically prefer inventories based
on more recent work, on the grounds that recent work takes into account both
previous analyses and new data. To improve the quality of our judgments in
evaluating conflicting analyses, we also consulted specialists in particular lan-
guages, language families, and known linguistic areas in South America. In
cases where there is compelling evidence that the differences between inven-
tories proposed for a given language are due to dialectal differences, we include
both dialects in the database.
The second type of heterogeneity stems from the divergent ways in which
different linguists treat the same empirical phenomena. In particular, different

5 In this article, feature always refers to a feature of a language as a whole (such as the
presence or absence of a particular phoneme in the phonological inventory) rather than to
phonological features such as labial or unrounded.
6 We thank Mark Donohue for sharing this very useful coding technique with us.

representational choices can lead to differences between the inventories given
for different languages that do not reflect real differences between the inven-
tories in question. To remove these spurious differences, we subject the coded
inventories to phonological regularization prior to quantitative analysis (while
leaving the original coding intact in SAPhon).
To understand the motivation for phonological regularization, and to dem-
onstrate how it is carried out, it is useful to consider some concrete exam-
ples. We first discuss the treatment of non-high front vowels in Tupí-Guaraní
(tg) languages. All tg languages exhibit two contrastive front vowels, repre-
sented in descriptions as /i/ and either /e/ or /ɛ/ (and in one case, /ɪ/). In some
of the languages where the symbol chosen to represent the front mid vowel
phoneme is /e/, the description explicitly indicates that this vowel is phonetically
realized as [ɛ] (e.g. Kamaiurá; Seki, 2000), and in other tg languages
the symbol chosen for the mid front vowel phoneme is /ɛ/ (e.g. Nhandeva;
Costa, 2003). In addition, there are several tg languages where the symbol used
to represent the non-high front vowel phoneme is /e/, but no information is
provided as to its phonetic realization. Crucially, no tg language exhibits two
contrastive front mid vowels: we never encounter a contrast between /e/ and
/ɛ/.
For purposes of the analysis presented in this paper, we treat all tg languages
as having the same two front vowels phonologically: a high front vowel /i/
and a mid front vowel /e/. We implement this regularization by recoding the
phonemes given as /e/ or /ɛ/ in these languages as {e} (leaving the phonemes
in the underlying database untouched). The result of this normalization is to
recast the inventories of tg languages as exhibiting no difference in their front
vowels for the purposes of our quantitative analysis. This treatment of vowel
systems of these types is extended to all languages in our dataset. That is, we
treat all languages that exhibit only /i, e/ or /i, ɛ/ in their inventory of front
vowels as exhibiting /i, {e}/. Of course, in languages in which /e/ and /ɛ/ do
contrast, as in the majority of Macro-Ge languages, no regularization of these
segments is carried out.
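
A minimal sketch of this regularization rule, under stated assumptions, is given below. The set `FRONT_VOWELS` is an illustrative assumption about which symbols count as front vowels, and the function name is hypothetical; the actual rules are those listed in Appendix a.

```python
# Assumed set of front vowel symbols, for illustration only.
FRONT_VOWELS = {"i", "ɪ", "e", "ɛ", "æ"}


def regularize_front_mid(inventory):
    """Return a copy of the inventory with a lone /e/ or /ɛ/ recoded as {e}.

    The recoding applies only when the front vowels are exactly /i, e/ or
    /i, ɛ/; inventories in which /e/ and /ɛ/ contrast are left unchanged.
    """
    inventory = set(inventory)  # work on a copy; the input stays untouched
    front = inventory & FRONT_VOWELS
    if front == {"i", "e"} or front == {"i", "ɛ"}:
        return (inventory - {"e", "ɛ"}) | {"{e}"}
    return inventory
```

Returning a new set mirrors the practice of leaving the underlying database untouched and regularizing only the copy used for analysis.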
The motivation for regularization as outlined thus far stems from the fact
that linguists vary in their choices of symbol to represent a given phoneme,
but there are also methodological and typological motivations for regulariza-
tion. First, given the phonetic similarity of [e] and [ɛ], it is likely that not
all field linguists systematically distinguish the two phones in languages in
which they do not contrast. Moreover, one would expect to often find non-
contrastive variation between these two phones within such languages, based
on a variety of phonetic and sociolinguistic factors. This means that using both
/e/ and /ɛ/ to represent the single mid front vowel present in different languages
suggests a greater degree of phonetic precision than is probably warranted.
Second, it is clear that in cases like that of Kamaiurá, mentioned above,
linguists choose the phoneme label that represents not the precise phonetic
value of its basic allophone (i.e. [ɛ]), but the typologically expected phoneme
in that area of the phonemic space (i.e. /e/), as delimited by the phonemes
with which it contrasts. Therefore, phoneme representations of this sort are not
directly comparable to those which opt for a representation that is more pho-
netically faithful to the basic allophone of the phoneme (i.e. /ɛ/). Regularization
resolves the discrepancy between these two principles for choosing phoneme
symbols by converting all ‘phonetically faithful’ phoneme symbols to ‘typolog-
ically unmarked’ ones.
A second phenomenon that illustrates a more analytically profound moti-
vation for regularization comes from the treatment of contrastive nasality in
South American languages, as exemplified by the treatment of surface nasal
vowels in Tukanoan languages. Briefly, surface nasal vowels are accounted for
in two ways in these languages: as the surface realization of underlying nasal
vowels, or as vowels that have undergone nasalization due to a morpheme-
level nasalization feature that spreads nasalization onto the vowels in question
(see, e.g. Gomez-Imbert, 1993 and Stenzel, 2004). The former analysis tends to
be common in earlier works on languages of this family, and the morpheme-level
nasal spreading analysis is typical of more recent works. In general, these
appear to be two different ways to analyze materially similar distributions
of nasal features, and we regularize the phonological systems in question by
including the nasal counterparts of all oral vowels in the phonological invento-
ries of languages that have been analyzed as exhibiting morpheme-level nasal
spreading.
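
The nasal regularization could be sketched along the following lines. This is an assumption-laden illustration: the set `ORAL_VOWELS`, the use of a combining tilde to form nasal vowel symbols, and the flag `has_nasal_spreading` are all hypothetical choices, not the SAPhon encoding.

```python
# Assumed oral vowel symbols; a real analysis would use the full vowel list.
ORAL_VOWELS = {"i", "e", "a", "o", "u", "ɨ"}
NASAL_MARK = "\u0303"  # combining tilde, used here to build nasal vowel symbols


def add_nasal_counterparts(inventory, has_nasal_spreading):
    """Add a nasal counterpart for every oral vowel if the language has been
    analyzed as exhibiting morpheme-level nasal spreading."""
    inventory = set(inventory)  # copy, so the source coding is not modified
    if not has_nasal_spreading:
        return inventory
    nasal = {v + NASAL_MARK for v in inventory & ORAL_VOWELS}
    return inventory | nasal
```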
We list the regularization rules and discuss how they are applied to the
SAPhon dataset in Appendix a.

3.2 Applying Core and Periphery to Andean Languages


In this paper we illustrate the Core and Periphery technique by using it to
explore the Andean phonological area, and two phonological sub-areas within
this larger area: the Southern Andean phonological area and North-Central
Andean phonological area. In doing so, we exemplify how the technique works
when selecting cores of varying degrees of initial insightfulness.
The choice of the Andean highlands as a candidate core is an obvious one
for areal specialists. Büttner (1983: 179), for example, observed that Southern
Andean languages exhibit similar phonological inventories, and observations
by linguists like Dixon and Aikhenvald (1999) regarding the phonological
distinctiveness of the Andean and Amazonian regions are generally
deemed uncontroversial (even if detailed evidence for such claims is not pre-
sented). Similarly, the Andes is generally recognized as a culture area which
has, at different points in time, been dominated by large empires or poli-
ties, including Wari, Tiwanaku, and the Inkas (Steward and Faron, 1959: 5–
16).
In our first Core and Periphery analysis, we operate on a proposed Andean
core that consists of the 23 languages located in the contiguous mountainous
region of western South America above 2,000 meters in elevation, from Patag-
onia in the south to the Ecuadorean Andes in the north. The 2,000 meter limit
clearly separates Amazonian groups whose territory extends into the Andean
piedmont from Andean peoples, and the northern limit of the Ecuadorean
Andes corresponds to the extent of the Andean culture area, as defined by
the northernmost limit of Quechuan expansion. In the control set we include
the 113 languages of the region beginning at 1,500 km from the nearest Andean
language, extending to the furthest limits of the continent. The remaining 223
languages in the 1,500 kilometer-wide strip between the core and control
languages make up the equivocal set of languages about which we posit noth-
ing in advance.
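
The three-way division just described can be sketched as follows. The language names and distances are hypothetical, core membership is simply supplied by the analyst, and treating a language at exactly the threshold distance as a control language is an assumption of this sketch.

```python
def partition_languages(distance_km, core, control_min_km=1500.0):
    """Split languages into core, control and equivocal sets.

    distance_km maps each language to its distance from the nearest core
    language; core is the set of languages proposed as the core.
    """
    core = set(core)
    # Control: non-core languages at or beyond the threshold distance.
    control = {lang for lang, d in distance_km.items()
               if lang not in core and d >= control_min_km}
    # Equivocal: everything else, about which nothing is posited in advance.
    equivocal = set(distance_km) - core - control
    return core, control, equivocal


# Hypothetical languages and distances, for illustration only.
toy_distances = {"HighlandA": 0.0, "FoothillB": 250.0,
                 "LowlandC": 900.0, "CoastD": 2100.0}
core, control, equivocal = partition_languages(toy_distances, {"HighlandA"})
```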
Our second Core and Periphery analysis is motivated by the observation
that, although all Andean languages share features that distinguish them from
non-Andean languages, the Southern Andean languages exhibit distinctive
features not found in most Central or Northern Andean languages (e.g., a
series of ejective consonants) while the latter group of languages exhibits
distinctive features not generally found in the former group (e.g., retroflex
affricates). These facts suggest that it may be useful to treat the Andean area
as comprising two subcores: a Southern core and a North-Central core. There
are also sociohistorical facts suggesting that it may be useful to distinguish two
cores in this way, namely, the fact that the Southern core corresponds roughly
to extensions of the Tiwanaku empire (approximately the area of modern
highland Bolivia) and that the North-Central core corresponds roughly to the
extension of the Wari horizon (Isbell, 2008). For the purposes of this analysis,
we posit a Southern Andean core of 10 Andean languages south of the line
that separates languages with ejectives from those without ejectives, with the
remaining 19 Andean languages constituting the North-Central core.

4 Exploring Language Contact with a Naive Bayes Classifier

4.1 Overview
A naive Bayes classifier is a probabilistic model that classifies objects into K
classes. Such a classifier is first trained on many examples, each labeled by a
human expert with the class to which it belongs. Thereafter, when presented
with a novel object, the classifier will report with what probability the object
belongs to each of the K classes.7
A common application of this technology is spam filtering. An e-mail
account may receive dozens of unwanted messages every day, but a typical clas-
sifier is smart enough to put almost all of them into a spam folder, saving the
user the trouble of ever having to look at them. In this application there are
two classes: spam and non-spam. The classifier is trained on messages that it
knows to be spam (such as those the user manually flags) and those it knows
to be non-spam (such as those that the user does not flag after reading). This
continuously-trained classifier is applied to incoming messages, and usually
works very well.8
A naive Bayes classifier analyzes each object in terms of features that char-
acterize it. In the case of e-mail, the features are the words that a message
contains. When an incoming message is analyzed, each word will push the
classification toward spam or non-spam, depending on how strongly the word
is associated with spam or non-spam in the messages on which the classifier
has been trained. A word such as Viagra is a strong indicator of spam, whereas
most low-frequency words (such as analysis or linguistics) are weak indicators
of non-spam. The classifier combines the evidence from each word to reach a
verdict about the message as a whole.

7 The origin of the naive Bayes classifier is obscure. It is a straightforward but non-trivial
application of Bayes’ Theorem, which dates from the 18th century. Widely-used texts such as
Mitchell (1997), Manning and Schütze (1999), Bishop (2007), and Jurafsky and Martin (2009)
discuss it without commenting on its origin. Gale et al. (1992), cited in Manning and Schütze
(1999), apply a naive Bayes classifier to the problem of word-sense disambiguation in natural
language processing, without referring to it as such. That paper, in turn, cites Mosteller and
Wallace (1963), a famous paper that used a naive Bayes classifier (also not referred to as such)
to determine the authorship of twelve of the Federalist Papers. We suspect that naive Bayes
classifiers were used in diverse settings before the name itself caught on.
8 The first academic papers to discuss Bayesian spam classifiers appeared in 1998 (Pantel and
Lin, 1998; Sahami et al., 1998). However, it was an essay from 2002 titled A Plan for Spam that
popularized the concept and made specific proposals to lower the rate of false positives to
the point where the technology became usable (Graham, 2008).

Adapting this technology to classifying languages is straightforward: we train
a classifier on training languages from K classes of interest, and use it to prob-
abilistically classify a test language. (If there are multiple test languages, the
classifier is run once for each test language.) The analyst provides a featural
specification for each language, and a class label for each training language.
As explained in Section 3.1, the featural specification is an encoding of the
phonological inventory in which each feature is a phoneme or a suprasegmen-
tal feature that is either present or absent in the language. During training,
the classifier calculates how strongly each phoneme is associated with each
class. Then, in order to classify a test language, the classifier combines the
evidence from each feature and assigns K probabilities to the test language—
these are the probabilities that the test language belongs to each of the K classes.9

4.2 Two-Way Classification


A two-way classifier is a special case of a general K-way classifier that can
be explained in simpler terms, so it will be discussed first. Training a classi-
fier with two classes entails calculating a feature weight for each feature that
expresses how strongly each feature is associated with each class. The weight
for feature l is

$$ u_l = \log\left( \frac{N_{1l}}{N_{2l}} \div \frac{N_1}{N_2} \right). \tag{provisional} $$

N1l is the number of training languages in class 1 that have feature l, and N1
is the total number of training languages in class 1. N2l and N2 are analogous
quantities for class 2. The first ratio N1l/N2l is a comparison of the counts

9 When we were devising the Core and Periphery technique, we tried using other kinds of
classifiers besides nbc, such as support vector machines and logistic regression. The latter two
are most often presented as classifying objects into two classes, but multiclass versions exist.
All three classifiers are supervised learners, in that they classify based on examples provided
by the analyst. In practice, nbc worked better than the other two methods, perhaps because
it is a generative model, whereas the other two are discriminative models. Generative models
tend to work better when the number of data points in the training data is relatively small
and the dimensionality of the data is large (Ng and Jordan, 2001).
As for unsupervised analyses such as principal components analysis or multidimensional
scaling, these are certainly useful as exploratory data analyses, and they may even identify
potentially interesting linguistic areas. But since they are unsupervised, they cannot be
directed by an analyst to examine an areal hypothesis that the analyst is specifically interested
in. We thus omit mention of these analyses in discussing the Core and Periphery technique.

of feature l in the two classes. This is counterweighted by the second ratio
N1/N2, which expresses the relative sizes of the two classes. The logarithm
has the effect of causing the weight to be zero when the feature is neutral,
positive when it is associated with class 1, and negative when associated with
class 2.
One problem with this formula is that when any of the counts are zero, the
feature weight ul ends up at either positive or negative infinity. To prevent this,
we inflate the counts by a small amount in order to regularize the result:

$$ u_l = \log\left( \frac{\alpha + N_{1l}}{\alpha + N_{2l}} \div \frac{\alpha + \beta + N_1}{\alpha + \beta + N_2} \right). $$

For many applications it suffices to set 𝛼 = β = 1/2, but in our analyses we fit
these parameters to the data, as explained in Appendix b.3.
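
To make the effect of the inflated counts concrete, the following sketch computes the presence weight for a single feature. It uses the default 𝛼 = β = 1/2 rather than fitted values, and the function name and the counts in the example are hypothetical.

```python
import math


def presence_weight(n1l, n1, n2l, n2, alpha=0.5, beta=0.5):
    """Smoothed weight u_l for the presence of feature l.

    n1l, n2l: training languages with the feature in class 1 and class 2.
    n1, n2:   total training languages in class 1 and class 2.
    """
    count_ratio = (alpha + n1l) / (alpha + n2l)
    size_ratio = (alpha + beta + n1) / (alpha + beta + n2)
    return math.log(count_ratio) - math.log(size_ratio)


# A feature found in 8 of 10 class-1 languages and in none of 50 class-2
# languages: the unsmoothed formula would divide by zero, the smoothed one
# yields a finite positive weight.
w_distinctive = presence_weight(8, 10, 0, 50)

# A feature found in half of the languages of each class carries no evidence.
w_neutral = presence_weight(5, 10, 25, 50)
```

With 𝛼 = β = 0 the first call would fail with a division by zero, which is the problem that the regularization is meant to prevent.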
Strictly speaking, the above expression gives the feature weight for the pres-
ence of a feature. It is also necessary to calculate weights for the absence of a
feature, via

$$ v_l = \log\left( \frac{\beta + N_1 - N_{1l}}{\beta + N_2 - N_{2l}} \div \frac{\alpha + \beta + N_1}{\alpha + \beta + N_2} \right). $$

The main difference is that counts for the presence of a feature N1l and N2l
have been replaced by counts for the absence of the feature N1 − N1l and
N2 − N2l. Once feature weights (for both the presence and the absence of each
feature) have been calculated, the classifier is ready to classify.
For the test language, the classifier produces a score

$$ s = \sum_{l=1}^{L} \begin{cases} u_l & \text{if feature } l \text{ is present in the test language,} \\ v_l & \text{if feature } l \text{ is absent in the test language.} \end{cases} \tag{1} $$

This score is a summation over all features (numbered from 1 to L) of feature
weights, using ul if feature l is in the test language, or vl if feature l is not. The
interpretation of the score is similar to that of the weights. A score of zero
means that the test language is equally likely to belong to either class; a positive
score means that it is more likely to belong to class 1; and a negative score means
that it is more likely to belong to class 2.
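
The whole two-way procedure (training, then scoring as in Eq. 1) can be sketched as follows. The training rows are invented toy data, the function names are hypothetical, and 𝛼 = β = 1/2 is used instead of fitted parameters.

```python
import math


def train_two_way(class1, class2, alpha=0.5, beta=0.5):
    """Return presence weights u and absence weights v, one per feature.

    class1 and class2 are lists of 0/1 rows of equal length.
    """
    n1, n2 = len(class1), len(class2)
    size_ratio = math.log((alpha + beta + n1) / (alpha + beta + n2))
    u, v = [], []
    for l in range(len(class1[0])):
        n1l = sum(row[l] for row in class1)
        n2l = sum(row[l] for row in class2)
        u.append(math.log((alpha + n1l) / (alpha + n2l)) - size_ratio)
        v.append(math.log((beta + n1 - n1l) / (beta + n2 - n2l)) - size_ratio)
    return u, v


def score(test_row, u, v):
    """Eq. 1: add u_l for each present feature and v_l for each absent one."""
    return sum(u[l] if present else v[l] for l, present in enumerate(test_row))


# Toy training data (three features), invented for illustration only.
class1 = [[1, 1, 0], [1, 0, 0], [1, 1, 0]]
class2 = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
u, v = train_two_way(class1, class2)
s_like_class1 = score([1, 1, 0], u, v)
s_like_class2 = score([0, 0, 1], u, v)
```

A test language resembling class 1 receives a positive score and one resembling class 2 a negative score, as the interpretation above requires.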

4.3 Underlying Model and K-Way Classification


The previous section discussed naive Bayes classification from a procedural
perspective. Now we engage in a brief discussion of the model that underpins
the procedures. The model posits that our data, which comprise the training
languages, the test language, and the labels for the training languages, were
generated via a set of random events, which are as follows.10

– Randomly generate a feature frequency θkl for each feature l and each class
k. This is the probability that a language in class k will have feature l. Feature
frequencies are unobserved.
– Assign each language, including the test language, to one of K classes with
probability 1/K.11 The assignments of the training languages are observed.
The assignment of the test language is unobserved.
– For each language, endow it with feature l with probability θkl , where k is the
class of the language. Each feature is generated independently of the others,
conditional on k. The features that a language has are all observed.

With this as the premise, the classifier seeks to infer the class of the test
language. It calculates, for each class k, the probability f(k) that the test language
would be generated by the feature frequencies θk1, …, θkL of class k. From this
it infers that the test language belongs to class k with probability

$$ p_k = \frac{f(k)}{f(1) + f(2) + \cdots + f(K)}. \tag{2} $$

If the feature frequencies were known, the formula for f(k) would be straight-
forward:

$$ f(k) = \prod_{l=1}^{L} \begin{cases} \theta_{kl} & \text{if feature } l \text{ is in the test language,} \\ 1 - \theta_{kl} & \text{if feature } l \text{ is not in the test language.} \end{cases} \tag{provisional} $$

The classifier is essentially calculating the likelihood of each choice f(k) by
taking the product of the probability of generating each feature value (present

10 When thinking about such models, W.C. finds it helpful to imagine a deity generating the
data according to the procedure given, with some of the deity’s choices hidden from view.
What is not hidden comprises the data. On the basis of this data, we infer some of the
hidden things.
11 In a more sophisticated variant of this model, each language is assigned to class k with
some probability πk. The random variable πk is not observed, and must be inferred from
the data. In two-way classification, this adds a term such as log[N1/N2] to the score of the
test language. When the number of training languages is fixed (as in our analyses), this
term moves all scores up or down by a fixed amount and does not alter any conclusions.

or absent) in the test language. We do not know what these feature frequencies
are, but we can obtain some insight (albeit not exactly the right answer)
by estimating the feature frequencies directly from the data via the formula
θkl = Nkl/Nk, where Nkl is the number of times feature l exists among training
languages of class k, and Nk is the total number of training languages of class k.
We get:

$$ f(k) = \prod_{l=1}^{L} \begin{cases} \dfrac{N_{kl}}{N_k} & \text{if feature } l \text{ is in the test language,} \\[1ex] \dfrac{N_k - N_{kl}}{N_k} & \text{if feature } l \text{ is not in the test language.} \end{cases} \tag{provisional} $$

The correct equation, obtained by integrating over all possible values for all
feature frequencies, is similar. If we posit a beta distribution prior for each
feature frequency θkl ∼ Beta(𝛼, β), we get the following expression for the
likelihood:

$$ f(k) = \prod_{l=1}^{L} \begin{cases} \dfrac{\alpha + N_{kl}}{\alpha + \beta + N_k} & \text{if feature } l \text{ is in the test language,} \\[1ex] \dfrac{\beta + N_k - N_{kl}}{\alpha + \beta + N_k} & \text{if feature } l \text{ is not in the test language.} \end{cases} \tag{3} $$

This, along with Eq. 2, yields the probabilities p1 , …, pK for K-way classification.
Appendices b.1 and b.2 restate the contents of this section more formally and
expand on it.
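
As a rough illustration of how Eq. 2 and Eq. 3 combine, the sketch below computes the K class probabilities for one test language. It is not the implementation used in our analyses: the data are invented, the function name is hypothetical, 𝛼 = β = 1/2 is assumed rather than fitted, and the computation is carried out on log f(k) purely as a numerical convenience.

```python
import math


def class_probabilities(classes, test_row, alpha=0.5, beta=0.5):
    """Return p_1, ..., p_K for a test language.

    classes is a list of K training sets, each a list of 0/1 rows.
    """
    log_f = []
    for rows in classes:
        nk = len(rows)
        total = 0.0
        for l, present in enumerate(test_row):
            nkl = sum(row[l] for row in rows)
            # Eq. 3: one factor per feature, depending on presence or absence.
            numerator = alpha + nkl if present else beta + nk - nkl
            total += math.log(numerator / (alpha + beta + nk))
        log_f.append(total)
    # Eq. 2, normalised after subtracting the maximum to avoid underflow.
    m = max(log_f)
    weights = [math.exp(x - m) for x in log_f]
    z = sum(weights)
    return [w / z for w in weights]


# Toy three-class training data, invented for illustration only.
toy_classes = [
    [[1, 1, 0], [1, 0, 0]],
    [[0, 1, 1], [0, 0, 1]],
    [[0, 0, 0], [0, 1, 0]],
]
probs = class_probabilities(toy_classes, [1, 1, 0])
```

Subtracting the largest log f(k) before exponentiating leaves the ratios in Eq. 2 unchanged, but keeps the products from underflowing when the number of features L is large.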

4.4 Probabilistic Interpretation of nbc Weights and Scores

Eq. 1 in Section 4.2 describes how to compute an nbc score, which indicates how
a test language is classified when K = 2. However, Eq. 2 in Section 4.3 derives
a different indicator of classification: pk, the probability with which the test
language belongs to class k. How do these two kinds of indicators relate to each
other?

It turns out that when K = 2, there is a straightforward mapping between
the score s and p1 (the probability that a test language belongs in class 1). They
are related by the function $S(o) = 1/(1 + e^{-o})$. This function is plotted here:

[Figure: plot of the sigmoid function S(o)]

In general, this sigmoid function translates from values that have a range of
[−∞, ∞] to a probability, which has a range of [0, 1]. The argument o is a
log-odds, so-called because it is the log of an odds ratio.

When K = 2, s is the log-odds that corresponds to the probability p1, i.e.
S(s) = p1. Conversely we can apply the inverse function $S^{-1}(p) = \log[p/(1 - p)]$
to p1 to get s. According to Eq. 2, the probability that the test language belongs
to class 1 has the form p1 = f(1)/[f(1) + f(2)]. Converting this probability to a
log-odds yields a score s = log[f(1)/f(2)], which expands to

$$ s = \sum_{l=1}^{L} \log \begin{cases} \dfrac{\alpha + N_{1l}}{\alpha + N_{2l}} \div \dfrac{\alpha + \beta + N_1}{\alpha + \beta + N_2} & \text{if feature } l \text{ is in the test language,} \\[1ex] \dfrac{\beta + N_1 - N_{1l}}{\beta + N_2 - N_{2l}} \div \dfrac{\alpha + \beta + N_1}{\alpha + \beta + N_2} & \text{if feature } l \text{ is not in the test language.} \end{cases} $$

We see here that each feature value (present or absent) contributes in an
additive way to the score. Comparing this to Eq. 1 shows how the feature weights
were derived.

When K > 2, the structure of the computation in Section 4.3 does not result
in additive feature weights for each feature, and, since we do not compute
feature weights before computing p1, …, pK for the test language, there is no
distinct training stage. Also, since the classification results in more than two
probabilities, it is no longer possible to indicate the classification of the test
language with a single score. We can, however, convert each pk into a log-odds
and indicate the classification with K scores. When reporting the results of
3-way classification in Appendix c.2, this is what we do.
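
The relationship between scores and probabilities can be checked numerically. In the sketch below, the likelihood values and the three-class probabilities are hypothetical numbers chosen only to exercise the formulas.

```python
import math


def sigmoid(o):
    """S(o) = 1 / (1 + e^(-o)): maps a log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-o))


def log_odds(p):
    """Inverse of the sigmoid: maps a probability to a log-odds."""
    return math.log(p / (1.0 - p))


# Two-way case with hypothetical likelihoods f(1) and f(2).
f1, f2 = 0.03, 0.01
p1 = f1 / (f1 + f2)      # Eq. 2 with K = 2
s = math.log(f1 / f2)    # the nbc score as a log-odds

# K-way case: one log-odds score per class, from hypothetical probabilities.
three_way_probs = [0.7, 0.2, 0.1]
three_way_scores = [log_odds(p) for p in three_way_probs]
```

Here `sigmoid(s)` recovers `p1` and `log_odds(p1)` recovers `s`, while in the three-way case each class receives its own score.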

4.5 Feature Non-Independence and the Interpretation of Results


The name of the naive Bayes classifier derives from the naive assumption that
the features in a language are generated independently, given the class of the
language. In reality, however, the existence of one phoneme in an inventory is
often strongly correlated with the existence of other phonemes in that inven-
tory. For example, a language with /e/ often tends to have /o/, and vice versa.
Similarly, a language with an ejective stop at one place of articulation also
tends to have ejective stops at other places of articulation. In this respect,
having multiple mid-vowels or having multiple ejective stops is a single ‘fact’
about a language, but a naive Bayes classifier will treat each fact of this sort
as a set of multiple, independent facts. That is, the presence of mid-vowels
is treated as two facts: the presence of /e/ and the presence of /o/. Similarly,
the presence of ejective stops is treated as multiple facts about the presence
of ejective stops at each place of articulation. This kind of multiple count-
ing results in inflated scores, producing an effect of exaggerated certainty in
classifying a language. All languages will suffer from this effect to some extent
when undergoing classification, since feature non-independence (or, more col-
loquially, feature clumping) occurs frequently. Vowels of a given height, nasal
vowels, long vowels, voiced stops, aspirated stops, ejective stops, etc.: each of
these classes of sounds tends to be a clump. The presence or absence, in a test
language, of any of these clumps exaggerates classification probabilities, ren-
dering a literal probabilistic interpretation problematic. In our analyses, we
sidestep this problem by disregarding the literal interpretation of the classifica-
tion probabilities and reinterpreting them as measures of linguistic admixture.
This interpretive leap calls for a careful explanation of admixture and how it is
that admixture is not directly modeled by a naive Bayes classifier, to which we
now turn.
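
The effect of multiple counting can be seen in a toy calculation. The sketch below assumes a hypothetical weight of 1.0 for a single 'fact' such as having mid vowels; the number is invented purely to show how counting the same fact twice inflates the resulting probability.

```python
import math

def sigmoid(o):
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-o))

def score(weights):
    """A naive Bayes score adds one weight per observed feature value."""
    return sum(weights)

# Hypothetical weight contributed by one 'fact' about a language.
w = 1.0

p_once = sigmoid(score([w]))        # the fact counted a single time
p_clumped = sigmoid(score([w, w]))  # /e/ and /o/ counted as two independent facts
```

With these made-up numbers the probability rises from about 0.73 to about 0.88 although no new information has been added, which is the exaggerated certainty described above.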
By the term ‘admixture’ we refer to the phenomenon where the features of
a language derive from two or more sources. This is analogous on some level
to genetic admixture, where a person inherits certain genes from one parent
and certain genes from the other; or, more abstractly, where a person inherits
features from each of the K distinct ancestral populations in his or her ancestry.
If we were to posit admixture for circum-Andean languages, one way to do this
would be to posit two sources, one for the Andean core and one for the control
class, described in Section 2. Each source is a hypothetical ancestral population
in which there is a certain amount of linguistic diversity. A source does not
have to be an actual set of precursor languages, though this is a good way to
conceptualize it.12 Each modern language descends from one or more sources.
A pure language derives its features from just one source. If, for example, all
of the languages in the ancestral population have /p/, then a descendant of that
source will also have /p/. If 60% of the languages in the ancestral population
have /x/, then a descendant of that source will have /x/ with 60% probability. In
general, the probability that a descendant has a feature matches the probability
that a randomly-chosen constituent of the ancestral population has it.13 Since
there is some diversity in any ancestral population, one pure descendant does
not have to be identical to another, but it will in almost all cases be classified
as descending from that population with little ambiguity, when all features are
taken into account.

12 Formally, a source is represented by a bank of feature frequencies, one for each feature.
Source k is represented by feature frequencies (θk1, …, θkL), where θkl is the frequency of
feature l among the languages of ancestral population k. This is formally identical to how
a class is modeled in nbc; see first bullet in Section 4.3.
13 This is formally identical to how languages are generated in nbc; see third bullet in Section
4.3.

A mixed language derives its features from more than one source. If, for
example, two ancestral populations are involved, then a certain fraction of the
mixed language’s features may derive from one, while the rest derive from the
other.14 It is often much more reasonable to posit that a language is mixed
rather than pure. For instance, if a language has many distinctively Andean
features and also many distinctively non-Andean features, then it is, on an
intuitive level, best to posit admixture. (Just as, if a dog has many poodle
features and many labrador features, one surmises that it is a mixed breed.)
When a language is mixed, it is often possible to infer the extent to which it
drew from each ancestral population. For circum-Andean languages, such a
statistic would indicate how core-like or control-like a language is.
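
The generative picture of pure and mixed languages can be sketched as a small simulation. Everything named here is hypothetical: `sample_language`, the source frequencies, and the admixture fractions are illustrative assumptions, and choosing a source independently for each feature is just one simple way to realize "a certain fraction of features from each source".

```python
import random

def sample_language(sources, fractions, rng):
    """Generate one language's feature vector.

    sources[k][l] is the frequency of feature l in ancestral population k;
    fractions[k] is the share of features expected to come from source k.
    """
    n_features = len(sources[0])
    language = []
    for l in range(n_features):
        # Pick the source for this feature, then draw presence or absence
        # with that source's frequency for the feature.
        k = rng.choices(range(len(sources)), weights=fractions)[0]
        language.append(rng.random() < sources[k][l])
    return language

rng = random.Random(0)
core = [1.0, 0.0, 1.0]     # hypothetical feature frequencies, source 1
control = [0.0, 1.0, 1.0]  # hypothetical feature frequencies, source 2

pure = sample_language([core, control], [1.0, 0.0], rng)   # one source only
mixed = sample_language([core, control], [0.5, 0.5], rng)  # even admixture
```

A pure language is the special case in which one fraction is 1 and the rest are 0.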
However, as previously mentioned, the naive Bayes classifier is not a model
of admixture. Rather unrealistically, every test language is assumed to be a pure
language. Classification involves determining not to what extent the language
descended from each ancestral population, but with what probability. Our inter-
pretive leap is to use the latter as an indicator of the former. Unfortunately,
the coarseness of this method of interpretation does not allow us to infer the
absolute proportions of admixture in a language. If the model reports that a
language belongs to class k with probability 0.7, that is by no means the same
as indicating that 70% of the phonemes of the language are from the source
identified with class k. We can only conclude that, if pk is higher for language
X than for language Y, then X probably derives more of its phonemes from the
source corresponding to class k than Y. This relativistic interpretive strategy,
whatever its drawbacks, has the benefit that it allows us to work around the
fact that feature clumping exaggerates classification probabilities and deprives
them of their usual interpretation.

4.6 Details in Applying the Model


4.6.1 Feature Culling
We previously assumed that feature clumps tend to be of limited size, so
that there is a limit to how much a single clump can affect classification
probabilities. In general this seems to be true, but there is a notable exception:
rare features. Rare features tend to occur together in very large clumps. For
instance, there are 112 features in our dataset that occur in exactly one language,
but twelve of them occur in the same language, Paez, causing the classification
of Paez to be greatly exaggerated. To prevent outcomes of this sort, we discarded
from our analyses all features that occur in five or fewer training languages. This
amounted to discarding 225 of the 304 features in the dataset, leaving 79.

14 For an example of a model that implements admixture in exactly this way, see Pritchard
et al. (2000).

To be consistent with culling rare features, we have also culled near-universal
features on the theory that, when absences are rare, the absences can clump
together just like rare features. Thus, we discarded any feature that is present
in all but five or fewer training languages. This resulted in discarding /t/, /k/,
/i/, and /a/ from our analyses, leaving 75 features.
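
The two culling rules can be written as a single filter. The sketch below assumes a hypothetical dictionary `counts` that maps each feature to the number of training languages in which it occurs; the example counts are invented.

```python
def cull_features(counts, n_languages, threshold=5):
    """Keep features that are neither rare nor near-universal.

    A feature is rare if it occurs in `threshold` or fewer training languages,
    and near-universal if it is absent from `threshold` or fewer of them.
    """
    kept = {}
    for feature, n_present in counts.items():
        rare = n_present <= threshold
        near_universal = (n_languages - n_present) <= threshold
        if not rare and not near_universal:
            kept[feature] = n_present
    return kept

# Invented counts for a training set of 100 languages.
counts = {"q": 30, "rare": 1, "t": 98, "five": 5, "six": 6, "ninety_five": 95}
kept = cull_features(counts, n_languages=100)
```

Applying the same threshold to presences and absences keeps the treatment of rare features and rare absences symmetric.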

4.6.2 Measuring Admixture in a Training Language


When using a naive Bayes classifier to measure admixture, we should not
exempt the training languages from scrutiny. However, it would prejudice the
model for a test language to be a training language too. When we wish to apply
the classifier to a training language in class k, we remove it from the set of
training languages first. This lowers the count Nk in Eq. 3 by one, and lowers
Nkl by one for each feature l present in the language. After this adjustment,
classification proceeds as before.
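
The count adjustment for a held-out training language is small enough to sketch directly. The names `n_k`, `n_kl`, and `language_features` are hypothetical stand-ins for Nk, the per-feature counts Nkl, and the feature vector of the language being removed.

```python
def hold_out(n_k, n_kl, language_features):
    """Return the counts for class k with one training language removed.

    n_k is the number of training languages in class k, n_kl[l] the number of
    them that have feature l, and language_features[l] whether the held-out
    language has feature l.
    """
    adjusted_n_k = n_k - 1
    adjusted_n_kl = [
        count - 1 if present else count
        for count, present in zip(n_kl, language_features)
    ]
    return adjusted_n_k, adjusted_n_kl

# Invented example: a class of 10 languages and three features.
new_n_k, new_n_kl = hold_out(10, [9, 2, 0], [True, False, False])
```

The counts of the other classes are left unchanged, since the held-out language never contributed to them.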

4.6.3 Feature Deltas


In a two-way classifier, the feature weights ul and vl give measures of the
association between class 1 and, respectively, the presence or the absence of
feature l. Having two weights for each feature is cumbersome if all we wish
to know is the degree of association between a feature and a class. Using the
formulas in Section 4.2, we define a measure called delta:

\[
\delta_l = u_l - v_l
  = \log\left(\frac{\alpha + N_{1l}}{\beta + N_1 - N_{1l}}\right)
  - \log\left(\frac{\alpha + N_{2l}}{\beta + N_2 - N_{2l}}\right).
\]

This measure is zero if the feature is neutral, positive if it is associated with class
1, and negative if it is associated with class 2. We can generalize delta to K-way
classification by defining a set of K deltas for each feature:

\[
\delta_{kl} = \log\left(\frac{h_{kl}}{1 - h_{kl}}\right)
  - \log\left(\frac{\sum_{j \neq k} h_{jl}}{\sum_{j \neq k} (1 - h_{jl})}\right),
\]

where hkl = (α + Nkl)/(α + β + Nk). The summations are from 1 to K, excluding k.
The element δkl is zero if feature l is neutral with respect to class k, and
positive or negative if feature l is positively or negatively associated with class
k, respectively. A feature that is neutral with respect to all K classes will have
zeros for all K deltas.
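
Both definitions of delta can be checked against each other numerically. The following sketch uses hypothetical function names and placeholder values alpha = beta = 1, which are assumptions made for illustration only; with K = 2 the general definition should reduce to the two-way one.

```python
import math

def h(n_kl, n_k, alpha=1.0, beta=1.0):
    """Smoothed frequency of a feature in class k."""
    return (alpha + n_kl) / (alpha + beta + n_k)

def delta_two_way(n1l, n1, n2l, n2, alpha=1.0, beta=1.0):
    """Two-way delta: the difference of the log-odds of the feature per class."""
    return (math.log((alpha + n1l) / (beta + n1 - n1l))
            - math.log((alpha + n2l) / (beta + n2 - n2l)))

def deltas_k_way(n_l, n, alpha=1.0, beta=1.0):
    """K deltas for one feature; n_l[k] and n[k] are its count and the class size."""
    hs = [h(n_kl, n_k, alpha, beta) for n_kl, n_k in zip(n_l, n)]
    deltas = []
    for k, h_k in enumerate(hs):
        others = [x for j, x in enumerate(hs) if j != k]
        deltas.append(
            math.log(h_k / (1.0 - h_k))
            - math.log(sum(others) / sum(1.0 - x for x in others))
        )
    return deltas

# Invented counts: the feature occurs in 9 of 10 and in 4 of 20 languages.
two_way = delta_two_way(9, 10, 4, 20)
k_way = deltas_k_way([9, 4], [10, 20])
neutral = deltas_k_way([5, 5, 5], [10, 10, 10])
```

A feature with the same smoothed frequency in every class yields zeros for all K deltas, matching the neutrality property stated in the text.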

5 Results

5.1 Single Andean Core


The feature deltas (henceforth ‘deltas’) resulting from the nbc analysis of the
Andean core are given in Fig. 1. Positive deltas contribute to the classification of
the languages that bear them as Andean, while negative deltas contribute to the
classification of the languages that bear them as non-Andean. The presence of
phonemes like /q/ and /ʎ/ in the inventory of a given language thus contributes
strongly to its classification as Andean, while the presence of /ɨ/ or /ã/ strongly
contributes to its classification as non-Andean.
The deltas given in Fig. 1 yield the distinctive phonological profile for the
Andean core given in Table 1. In these tables we (somewhat arbitrarily) select a
delta of ±2 (p = 0.88) as the cutoff for segments whose presence or absence is
strongly characteristic of the Andean core, and deltas between 1 and 2 (0.73 <
p < 0.88) and −1 and −2 as the range for segments whose presence or absence,
respectively, is moderately characteristic of the Andean core. Strongly
characteristic segments are printed in bold, while moderately characteristic ones
are printed in normal weight.
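
The probabilities quoted alongside the cutoffs appear to be the sigmoid of the delta values, which a short check makes concrete. In the sketch below, the function `characterize` and its labels are hypothetical, and treating the cutoffs as inclusive is an assumption, since the text does not say how boundary values are handled.

```python
import math

def sigmoid(delta):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-delta))

def characterize(delta):
    """Label a feature delta using the cutoffs of 1 and 2."""
    size = abs(delta)
    if size >= 2:
        strength = "strong"
    elif size >= 1:
        strength = "moderate"
    else:
        return "not distinctive"
    kind = "presence" if delta > 0 else "absence"
    return f"{strength} {kind}"

# Probabilities at the two cutoffs, rounded as in the text.
cutoff_probabilities = (round(sigmoid(1.0), 2), round(sigmoid(2.0), 2))
```

A delta of −2.4 would be labeled a strong absence, and a delta of 1.3 a moderate presence.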
The distinctive phonological profile of the Andean core languages, i.e. the set
of segments that distinguish the Andean core languages from control languages
in terms of either their presence or their absence, is large. The size of this
distinctive phonological profile strongly suggests that the chosen core forms
part of a phonological area distinguishable from the set of control languages.
The distinctive Andean consonantal profile can be positively characterized
as exhibiting contrastive aspirated and ejective stops (a contrast found also in
the postalveolar affricate), as well as a comparatively large number of affricates,
fricatives, and liquids. Less common places of articulation that contribute
positively to the profile include palatal (nasal and liquid) and uvular (stop and
fricative). The consonantal profile can be negatively characterized as excluding
the voiced alveolar stop and affricate, the labialized velar voiceless stop and
nasal, voiced bilabial and voiceless labiodental fricatives, and the glottal stop
and fricative. The distinctive Andean vocalic profile is positively characterized
by /u/ and /iː, uː, aː/, but negatively by the absence of mid vowels, non-low
central vowels, nasal vowels, and long versions of many of these vowels.
The nbc score of each language is given in Appendix c.1 and is plotted on a
map in Fig. 2, where the orange line is a smoothed version of the 2000-meter
elevation contour. Languages with nbc scores near zero, and hence, difficult
to classify as either Andean or non-Andean, appear in light gray. Higher nbc
scores for a language correspond to greater red saturation, while the lower (i.e.
negative) nbc scores correspond to greater blue saturation.

Figure 1 Feature deltas for single core Andean analysis

Table 1 Distinctive features of the Andean core languages. Left: distinctive phonemes
(positive feature deltas). Right: distinctive absences (negative feature deltas).

Distinctive phonemes        | Distinctive absences
pʰ p’ tʰ t’ kʰ k’ q qʰ q’   | d kʷ ʔ
tʃ tʃʰ tʃ’ ʈʂ               | dʒ
s ʃ x χ                     | β f h
ɲ                           | ŋʷ
l ɾ ʎ                       |
iː u uː                     | ĩ ĩː ɨ ɨː ɨ̃ ɨ̃ː ũ ũː tone
                            | e ẽ ẽː ɛ ɛ̃ ə əː ə̃ ə̃ː o õ õː ɔ ɔ̃ ɤ
aː                          | ã

Inspection of Fig. 2 reveals that a penumbra of languages with high nbc
scores surrounds the posited Andean core, which is dense with languages with
very high nbc scores. Following our discussion of the interpretation of nbc
scores in Section 4.5, the high nbc scores of many of the languages in the
circum-Andean peripheral region indicate that their phonological inventories
much more closely resemble those of core Andean languages than those of
the control languages, suggesting phonological admixture with Andean lan-
guages.

Figure 2 Languages of South America (two-way Andean core nbc scores)

Figure 2 also shows that the nbc score tapers gradually with distance from
the Andean core. The periphery of this phonological area is thus diffuse, lack-
ing a clear boundary separating peripheral languages that are unambiguously
members of the phonological area, such as Yanesha’ [ame], from those that
are clearly not, such as Aguaruna [agr]. If we consider any language with an
nbc score greater than zero to be a candidate for membership in the area, and
(somewhat arbitrarily) any language with an nbc score in the 95th percentile
or greater to be a strong candidate for membership in the area, we obtain a
partitioning of the periphery into ‘strong’ and ‘weak’ members of the linguistic
area. These peripheral members of the Andean core mostly cluster geograph-
ically, as indicated below, and are displayed in the more detailed maps in Figs
3–5.

ecuadorean foothills
Strong: Cha’palaa [cbi] (Barbacoan)
Weak: Kamsá [kbh] (isolate)
huallaga valley
Strong: Chamicuro [ccc] (Arawak), Cholón [cht] (isolate)
Weak: Shiwilu [jeb] (Cahuapanan), Candoshi [cbu] (isolate)
southern peruvian foothills
Strong: Yanesha’ [ame] (Arawak)
Weak: Ashéninka (Apurucayali [cpc] and Pichis [cpu] dialects) (Arawak)
chaco
Strong: Vilela [vil] (isolate), Maká [mca], Chulupí [cag] (both Matacoan)
Weak: Wichí [mtp] (Matacoan), Toba Takshek [tob_tks], Toba Lañagashik
[tob_lng], Mocoví [moc] (all three Guaicuruan)
patagonia
Strong: Ona [ona], Haush [ona_mtr], Puelche [pue], Tehuelche [teh] (all
Chon)
Weak: Northern Alacalufan [alc_nth], Central Alacalufan [alc_cnt], and
Southern Alacalufan [alc_sth] (Alacalufan)
miscellaneous
Weak: Arabela [arl] (Zaparoan), Leko [lec] (isolate)
lowland quechuan languages
Strong: Ferreñafe Quechua [quf], Inga (Jungle dialect) [inj], Napo Quichua
[qvo], San Martín Quechua [qvs], Santiago del Estero Quechua [qus]
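
The partition into strong and weak candidates can be sketched as a simple thresholding step. Several details are assumptions made for illustration: the chapter does not say how the 95th percentile is computed or over which set of languages, so the nearest-rank percentile over all scored languages is used here, and the function names and scores are invented.

```python
def percentile_cutoff(scores, pct=95):
    """Nearest-rank percentile of a list of scores."""
    ordered = sorted(scores)
    # Integer ceiling of pct * n / 100, with a minimum rank of 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def partition(scores_by_language, pct=95):
    """Split languages with positive scores into strong and weak candidates."""
    cutoff = percentile_cutoff(list(scores_by_language.values()), pct)
    strong, weak = [], []
    for name, score in scores_by_language.items():
        if score <= 0:
            continue  # not a candidate for membership in the area
        if score >= cutoff:
            strong.append(name)
        else:
            weak.append(name)
    return strong, weak

# Invented scores for twenty placeholder languages, from -10 to 9.
toy_scores = {f"lang{i}": i - 10 for i in range(20)}
strong, weak = partition(toy_scores)
```

Languages with scores at or below zero fall into neither group, matching the criterion that only positive scores make a language a candidate.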

Figure 3 Languages of the North Andes and Circum-Andean regions (two-way nbc scores).
See Fig. 9 for language names.

In several of these regions, such as the Ecuadorean foothills, the Huallaga
River valley region, and the southern Peruvian foothills region, significant
contact between speakers of Andean languages and the relevant non-Andean
languages is either known to have taken place (see, e.g. Adelaar and Muysken,
2004, 411–413; Payne, 1990, 1–10), or such contact is generally plausible, due to
geographical proximity and the ubiquity of trade between adjacent highland
and lowland regions.

Figure 4 Languages of the Central Andes and Circum-Andean regions (two-way nbc scores).
See Fig. 10 for language names.

Figure 5 Languages of Patagonia (two-way nbc scores). See Fig. 11 for language names.

Somewhat more surprising is the fact that Patagonia and the Chaco constitute
an essentially contiguous phonological area with the southern Andes.
Although there is evidence of trade between the Tiwanaku polity and the
inhabitants of the Chaco between approximately AD 100 and AD 1100 (Angelo
and Capriles, 2000; Lecoq, 1991; Torres and Repke, 2006), it is unclear whether
those relations would have been sufficiently intense to produce the kind of
convergence we see between the southern Andean languages. Nevertheless, one
Chacoan linguistic isolate (Vilela) and several Chacoan languages of the
Matacoan and Guaicuruan families exhibit features strongly statistically associated
with the Andean highlands, including ejectives, uvular consonants, and the
palatal lateral. Evidence of contact between Patagonian and southern Andean
peoples is even sparser, but the former languages likewise exhibit features
characteristic of the Andean core languages. It should be noted that in
pre-Columbian times, the territory occupied by speakers of Patagonian languages
was contiguous with that occupied by Chacoan peoples (Viegas, 2005: 30),
raising the possibility that the similarity between Andean and Patagonian
languages arose not from direct contact between the languages of these two
regions, but was mediated by Chacoan languages.
Admixture between circum-Andean languages and more northern lan-
guages of the Andean core appears to involve relatively local and recent con-
vergence of these peripheral languages to Andean core ones, but the pho-
nological convergence evident among Chacoan, Patagonian, and southern
Andean languages does not exhibit clear directionality. The circumstances
that led to this broader areal convergence are less clear, suggesting that much
older, possibly multilateral, processes of phonological borrowing are respon-
sible for the large-scale phonological areality we see in the South American
Cone.
In addition to the languages enumerated above, which comprise an essen-
tially contiguous region with the Andean highlands, we find three other lan-
guages with positive nbc scores whose participation in the Andean and circum-
Andean phonological area is dubious. These languages, listed below as out-
liers, obtain their high nbc scores due, in large part, to having aspirated stops
and/or a palatal lateral in their phonological inventories. Given the probabilis-
tic nature of nbc results and the great distance of these languages from the
Andean core, which renders historical contact with the Andean core languages
extremely unlikely, we conclude that these languages simply bear a chance
resemblance to the languages of the Andean core.

outliers:
Strong: Yawalapití [yaw] (Arawak)
Weak: Yucuna [ycn] (Arawak), Yaathe [fun] (Macro-Ge)

5.2 Southern and North-Central Cores


Although there are sound reasons for positing a single Andean core, there
are also linguistic and socio-historical reasons to suspect that the Andean
highlands exhibit linguistically distinguishable sub-areas. For example, simple
inspection of Andean phonological inventories reveals that southern Andean
languages exhibit a three-way aspirated/ejective/plain stop contrast and uvular
consonants, whereas these features are rare or entirely absent in central or
northern Andean languages. The social histories of the two regions are also
quite different, with the southern Andes historically dominated first by the
Tiwanaku polity and then by Aymaran peoples, who only partially penetrated
into the central Andes (Adelaar, 2012: 578). The central and northern Andes, in
contrast, were dominated first by the Wari horizon and later by Quechuan peo-
ples, who penetrated into the southern Andean region only shortly before the
arrival of Europeans.
These observations motivate a dual core analysis that distinguishes Southern
and North-Central cores, where the division is defined by a line that groups
Jaqaru and Cuzco-Collao Quechua with all Andean languages to their south,
and Ayacucho Quechua with all Andean languages to its north.15 The deltas for
the Southern core are given in Fig. 6, and its distinctive phonological profile
in Table 2. The deltas for the North-Central core can be found in Fig. 7, and its
distinctive phonological profile in Table 3.
The deltas and distinctive profiles for the two cores exhibit significant
differences, while sharing some characteristics that distinguish them both from
languages outside either core. Consonants that positively characterize the
distinctive phonological profiles of both Andean cores include /s l ʎ ɲ/, and those
that negatively characterize both cores include /β f ŋʷ/. Vowels that positively
characterize both cores include /iː uː aː/, while those that negatively characterize
them include mid vowels, non-low central vowels, and nasal vowels. Both cores
also lack tone.
Other features yield large positive deltas for one core but negative ones
for the other, distinguishing the cores not only from control languages, but
from each other. Ejective and aspirated consonants yield positive deltas for
the Southern Andean core, as do uvular stops and the lateral fricative /ɬ/, but
negative deltas for the North-Central Andean core. The converse holds for
/tʂ g z ʃ/.
Yet other features yield large positive or negative deltas for one core, but do
not yield large deltas for the other. For the Southern core these include /x χ/
and the absence of /d ɸ w ɤ/. In contrast, /ts/ is positively associated with the
North-Central core profile and /ɣ/ negatively with it, but neither is salient for
the Southern core. Turning to the vowels, both cores are negatively associated
with central vowels, but the North-Central core exhibits a stronger negative
association with short mid-vowels, as /e o/ are not significantly negatively
associated with the languages of the Southern core.
The three-way nbc scores are plotted on a map in Fig. 8. Whereas the two-
way, single core results provide a one-dimensional measure of how core-like
or control-like a given language is, the three-way, dual core results indicate
to what degree a given language resembles the languages of either of the two
cores, as well as the non-core languages. We interpret this as different degrees
of admixture between the North-Central core, the Southern core, and the
control class. The amount of yellow, red, and blue in the color of each dot
encodes the proportion of those three components, respectively. In point of
fact, there are no instances of significant admixture between just the two
Andean cores, and all cases of significant admixture involve sizeable non-core
components.

15 This line was chosen to group together the Andean languages with a three-way contrast
between plain, aspirated, and ejective stops.

Figure 6 Southern Andean core feature deltas

Table 2 Distinctive features of the Southern Andean core languages. Left: distinctive
phonemes (positive feature deltas). Right: distinctive absences (negative feature
deltas).

Distinctive phonemes        | Distinctive absences
pʰ p’ tʰ t’ kʰ k’ q qʰ q’   | d ɡ ʔ
tʃʰ tʃ’                     | dʒ ʈʂ
s x χ                       | ɸ β f z ʃ
l ɬ ʎ                       | ŋʷ
ɲ                           | w
iː uː                       | ĩ ĩː ɨ ɨː ɨ̃ ɨ̃ː ũ ũː tone
                            | eː oː eː ẽ ẽː ɛ ɛ̃ ə əː ə̃ ə̃ː õ õː ɔ ɔ̃ ɤ
aː                          | ã ãː

Figure 7 North-Central Andean core feature deltas

Table 3 Distinctive features of the North-Central Andean core languages. Left: distinctive
phonemes (positive feature deltas). Right: distinctive absences (negative feature
deltas).

Distinctive phonemes        | Distinctive absences
ts tʃ ʈʂ ɡ                  | pʰ p’ tʰ t’ kʰ k’ kʷ q qʰ q’ ʔ
s z ʃ                       | tʃʰ tʃ’
                            | β f ɣ
l ʎ                         | ɬ
ɲ                           | ŋʷ
iː uː                       | ĩ ĩː ɨ ɨː ɨ̃ ɨ̃ː ũ ũː
                            | e eː ẽ ẽː ɛ ɛ̃ ə əː ə̃ ə̃ː o oː õ õː ɔ ɔ̃ ɤ
aː                          | ã ãː

The qualitatively most significant result of the dual core analysis is that
the majority of the languages of the Andean periphery identified in the single
core analysis do in fact align with one of the two sub-cores, and do so in a
geographically plausible manner. Languages which exhibit high Southern Core
nbc scores are generally closer to the Southern Core than to the North-Central
Core, and conversely for languages with high North-Central nbc scores. The
fact that Andean-like languages in the peripheral region pattern with the
nearest core, rather than being randomly associated with either sub-core, indicates
that convergence between circum-Andean languages and Andean languages is
a relatively local effect, attributable to language contact between the Andean
languages of each sub-core and their circum-Andean neighbors.
Another random document with
no related content on Scribd:
support and the guardian angel of the Netherlands must remain, and whoever
thinks otherwise is my enemy!...”
Some of the Geuzen stood amazed and speechless; most, however, listened
with clenched teeth and with an expression of contempt.
“The wind has turned rather quickly!” cried Van der Voort, “yesterday a
Geus, today a Papist!”
“No, no,” cried Lodewijk, “I have never changed. I swore to conspire with
you against the Spaniards; that was on the condition that nothing would be
demanded of me against religion, and I would not have sworn that oath, which
has weighed so heavily on my heart, had it not been to satisfy the wish of
Godmaert. It is you, gentlemen, who have changed; you have forsaken the faith
of your forefathers to cling to a new persuasion.”
“That is not true,” Van Halen interrupted him. “I am faithful to
religion.”
“What will you do tomorrow, then?” asked the young nobleman.
“Tomorrow,” answered Van Halen, pressing Lodewijk's hand, “tomorrow I
shall stand at your side, and I shall fight with you against the
schismatics.”
A general cry of indignation went up among the Geuzen:
“Another coward! Another traitor! Banish the fanatics! Away with the
friends of Spain! Out the door!”
The whole assembly was in an uproar. Daggers were brought out, and they
were about to carry out the threat of “out the door!” when Mother Schrikkel,
full of anguish and with her arms raised, came running into the hall and
wailed:
“Quick, quick, gentlemen, flee! To the attic, into the gutter, into the
cellar! The watch is here, the house is surrounded by armed men! Quick,
quick!”
The Geuzen cast a burning glance at Lodewijk, as if they now accused him of
outright treason; none of them did what Mother Schrikkel had so anxiously
advised. On the contrary, they all drew up in a half circle, readied their
pistols, drew their swords or daggers, and stood with the intention of
defending themselves bravely.
The door of the room opened. A man of exceptional height and strength
entered. Heavy moustaches hung down along his cheeks; weapons of every kind
hung from his belt.
“Wolfangh!” cried the Geuzen in amazement, as they sheathed their swords
and daggers again.
“Gentlemen,” said Wolfangh, taking off his hat, “what is this? Why this
battle order?... Come up, then!” he cried, turning toward the stairs, “come
up, men!”
Some twenty robbers pushed into the hall and found themselves in the midst
of the Geuzen, who drew away from them in disgust.
The laboured steps of men carrying something heavy could still be heard on
the stairs.
“What do you bring us, then, Wolfangh?” asked Lodewijk.
“What I bring you, my lord? Godmaert.”
“Godmaert!!” they all cried in wonder.
Four men carried the grey-haired Geus on a feather bed and set him gently
down on the floor.
“Friends!” he said, “I am glad to see you once more. Who will press my
hand?”
Lodewijk had already taken it and kissed it lovingly. The Geuzen came, one
after another, to embrace the old man with compassion. All stood silent,
gazing at him with astonished eyes.
“Wolfangh,” asked Schuermans, “how did you free our master?”
“Gentlemen,” answered the robber, “it cost little trouble. I already had it
in mind yesterday, and wanted to give you a pleasant surprise. I had thought,
however, that we would find Godmaert in a better condition.... Well then, I
came quietly with my comrades to the Steen. ‘Who goes there?’ cried a
guardsman, who stood with many others at the gate. ‘Wolfangh!’ I answered in
a thundering voice; and before I had reached the Steen, they had all run
across the Palingbrug and down the Vischberg. The warden of the Steen would
not open, but when he saw the gate shaking under the blows of our
sledgehammers and under the force of our levers, he quickly let us in and
begged for his life. We then went, accompanied by him, down into the
murderers' pits, where we found Godmaert lying. Then we lifted the noble
prisoner from his straw and, taking the warden's bed as a stretcher, we
brought him here at his request.”
Wolfangh turned to Lodewijk and asked in a low voice:
“My lord, what is the name of the priest who was with Godmaert?”
“Father Franciscus of the Dominican monastery.”
The robber put his finger to his forehead, like someone who wants to press
a word into his brain so as not to forget it.
“Oh, if Godmaert's daughter knew that her father is out of prison, what joy
it would be to her!....” sighed Lodewijk.
“Father Franciscus has taken this message upon himself,” answered
Wolfangh. “Men!” he went on, turning to his comrades, “let each go to his
quarters. Tomorrow at eight o'clock! You stay here,” he said to the four who
had carried the bed.
The robbers cleared the hall and, after the Geuzen had given Godmaert many
signs of friendship and compassion, it was asked whether they should begin.
The chairs were brought in and placed so that all could sit down around the
old man. He, having grown a little stronger through rest and the presence of
his friends, could already move his arms, and Lodewijk noticed with the
utmost joy that death would not strike him. His heart flew to his beloved
Geertruid. He was vexed that this news had been carried to her by another.
“Gentlemen,” said Godmaert, after calling for silence with a sign of the
hand, “I have had myself brought to this meeting in order to deliberate with
you about what must be done. Have you already dealt with the matter?”
Houtappel looked at Lodewijk with a mocking expression and came forward to
Godmaert; then he said:
“Tomorrow at eight o'clock we shall be on the Groote Markt. That is
settled. We shall stir the people to unrest with the cry: Long live the
Geuzen! Herman's sermon in the cathedral will cause a great commotion in the
city; we shall turn this to our advantage. Then to the town hall; everything
Spanish or Spanish-minded taken prisoner; the city occupied by armed men, and
our friends in Brussels and in the northern provinces informed of the good
outcome. Then new magistrates appointed, the people sent out to go through
the towns and villages of the margraviate and to drive out the Spaniards
everywhere. I am certain that this plan will meet with your approval.”
Godmaert remained for a moment in deep thought. Meanwhile the Geuzen waited
for an answer, although they did not doubt that the old warrior would applaud
their undertaking.
But how dismayed they were when Godmaert said to them:
“No, I cannot approve this plan. The time has not come. We must not fight
the Spaniards now.”
“He too!” cried Houtappel, as if carried away by raging anger. “Well then,
brothers, we are betrayed, but not delivered up. Let us continue our work
without acknowledging those cowards any longer. They may go to heaven alone
with the Spaniards, nuns and priests!”
That mockery stirred Godmaert; a faint glow of anger coloured his pale
forehead, and he said with a stern face:
“You may be thankful, Houtappel, that my body is exhausted by suffering, or
I would make your godless mockery die on your lips. Quiet, Lodewijk, calm
yourself, my son.”
Houtappel no longer dared to insult the old man, and went on quietly
spreading reproaches and hatred among his comrades.
“Ha, now I understand!” said Godmaert to himself, “now I know you. It is
true, what Father Franciscus told me: there are heretics among us.
Gentlemen,” he went on with more force, “to you, who are my friends, I owe an
explanation of my conduct. We all hate the Spaniards, some for personal
reasons, all because they are foreigners and insult us. I have contributed
much to kindling that hatred among you; but now I regret it.... My eyes have
been opened, and I have found with pain that all our efforts, without my
knowing it and without many among us knowing it, were directed against our
religion. So, however fierce my hatred of the Spaniards may be, never shall I
conspire with the enemies of my faith.”
“What has confession to do with tomorrow's revolution?” shouted Houtappel
from a corner of the room.
“Wat zij er mede gemeens heeft, weet gij best,” hernam Godmaert.
“Gij weet, dat Herman Stuyck en zijne aanhangers de kerk van Onze
Lieve Vrouw willen ontheiligen: gij weet, dat de scheurders eene
gelegenheid zoeken om al onze tempels te verwoesten en de
beelden te breken; en gij hoopt, dat de beroerten van morgen die
gelegenheid van zelf zullen doen geboren worden. Ik beklaag mij,
dat ik machteloos ben.... want anders zou ik u misschien kunnen
ontmoeten en bestrijden, in uwe goddelooze aanvallen. En gij, mijne
vrienden, die mij altijd met achting aangehoord hebt, ik bezweer u,
helpt de ketters niet; stelt de omwenteling uit. Verlaat de zijde
dergenen, die zich niet schamen, in deze vergadering zelve met
spotternij te spreken van voor ons heilige zaken.”
Eene merkbare scheuring was er onder de Geuzen gebeurd. In
het diepe der kamer, rond Houtappel en Van der Voort, stonden die,
welke van geen uitstel wilden hooren. Omtrent Godmaert bevonden
zich Lodewijk, Van Halen, De Rydt en bijna de eene helft der
Geuzen. Schuermans liep over en weer, en wist niet bij wat gedeelte
hij zich voegen zou, terwijl Wolfangh zich als een vreemdeling in
deze onderhandeling gedroeg.
Nadat Houtappel met eenigen zijner makkers gesproken had,
kwam hij in het midden der kamer staan, als iemand, die eene
uitdaging gaat doen, en, de hand in de hoogte heffende, riep hij:
“Wij scheiden ons af van de bevreesden! Al wie den naam van
Geus liefheeft, al wie met ons tegen de Spanjaarden strijden wil, dat
hij ons volge.... Wij gaan in eene andere plaats onze
beraadslagingen voortzetten! Verraders mogen ons niet hooren!”
Omtrent de helft gingen de deur uit en verlieten vloekend de
kamer. Houtappel vond zich niet weinig bedrogen, wanneer hij zag,
dat Wolfangh geene beweging deed om met hem te gaan.
“Kom aan, Wolfangh,” riep hij. “Wat kont gij bij deze vreedzame
menschen doen? Gij behoort er bij als een hond in een kegelspel!”
De roover sloeg zijne hand aan een pistool en wilde Houtappel die
scherts met het leven doen betalen; maar Lodewijk belette hem dit
met een teeken.
“Gij zijt gelukkig,” riep Wolfangh. “Ga, ik heb met u niets gemeens,
en laat mij met vrede, of ik zal u leeren spotten!”
Houtappel ging morrend de trappen af. Er bleef dan in de kamer
nog één Geus, die niet wist wat hij doen zou; hij sloeg zich met de
handen tegen het hoofd om een besluit er uit te krijgen; eindelijk riep
hij:
“Zult gijlieden morgen niet vechten?”
“Ja, Schuermans,” antwoordde Van Halen, “tegen de ketters zullen
wij strijden.”
“Ha, dan blijf ik nog liever met u.”
“Ik versta de vreeze van den edelen Godmaert zeer wel,” sprak De
Rydt. “Die vervloekte predikers hebben den haat van een deel des
volks tot hun voordeel gekeerd en hen tot beeldenstormen
opgemaakt. Daar zij in ’t eerst, evenals wij, de Spanjaarden alleen
als vijanden aanzagen, hebben die aanbrengers eener nieuwe leer
het volk haat voor den godsdienst ingeboezemd, en nu denkt het,
dat beelden en Spanjaarden één zijn.”
“Ik heb gehoord,” sprak Van Halen, “dat zij morgen iets tegen
Onze-Lieve-Vrouwekerk willen ondernemen. Zij spreken niet meer
dan van branden en verwoesten. Hoe gaan wij die heiligschenderij
beletten?”
“Ik heb twintig uitgelezene mannen,” zei Wolfangh; “dezen zullen
uwe bevelen stiptelijk ten uitvoer brengen.”
“Meester,” viel een der vier roovers hem in de rede, “zoo wij niets
stelen mogen, zullen die heeren Geuzen hunne beloften ook moeten
volbrengen, of....”
“Zwijg, kerel!” riep Wolfangh.
De roover zweeg en gaf zijne wezenstrekken eene zeer
wantrouwende uitdrukking. Vele Geuzen waren over zijne woorden
verbaasd; want zij wisten niets van deze beloften. Godmaert alleen
kende ze, mits hij ze gedaan had.
“Onze zaak,” sprak de zieke, “is te edel en te verheven geworden
om nog betaalde mannen er toe te gebruiken. Ik zal u het beloofde
loon doen geven. Maar van nu af aan zijt gij ontbonden. Keert terug
naar Zoersel, indien gij wilt.”
“Zij zullen blijven!” riep Wolfangh met een bliksemenden oogslag.
“Ik zal hen dwingen tot goeddoen.... Geen woord meer, kerel!”
De roover sloeg zijne oogen nederwaarts voor de bedreiging van
zijnen meester.
“Luistert, mijne heeren,” hernam Godmaert. “Ziet hier wat gij zoudt
kunnen doen: er zijn nog genoeg getrouwe burgers in onze stad; wij
kennen er veel, die tegen de ketters zijn. Roept die morgen bij
elkander, en gebruikt hen om alle beroerte te beletten en de kerken
te beschutten. Dat Schuermans het volk van het Klapdorp met zich
brenge, De Rydt, gij de trouwe burgers der Nieuwstad, Lodewijk,
onze vrienden van het Kipdorp, Van Halen, de bootsliên van den
Burcht, enzoovoorts, ieder van ulieden degenen, die hem toegedaan
zijn. Gij zult u dan morgen op de Groote Markt bevinden en de
wapenbroeders helpen, indien het noodig is. Op de plaats zelve zult
gij misschien betere maatregelen uitvinden. Alles zal wel gaan.”
Godmaert had tweemaal eenen schotel wijn tot den bodem
geledigd, en dit had hem wonderlijk versterkt, want zijne wangen
waren reeds zacht gekleurd. Lodewijk zag met opgetogenheid den
verbeterden staat des grijsaards: hij verliet hem geen oogenblik en
scheen ten uiterste voor hem bezorgd; op het minste teeken vloog
hij Godmaerts wenschen vooruit, lichtte zijn hoofd op, dekte zijne
ledematen of reikte hem het drinkvat, om zijnen vrienden bescheid te
doen.
Nu hoorde men de voordeur opengaan, en het gerucht van een
krijschend zijden kleedsel deed zich op de trap hooren. Na eenige
oogenblikken lag Geertruid op de borst haars vaders te weenen, niet
van droefheid, maar van verrukking en blijdschap.
“Vader, vader!” riep zij, “ziet gij wel, dat gij genezen zult? O, gij
bloost reeds! En uwe armen kunnen zich om mijnen hals drukken,
laat mij u kussen; gij weet wel, dat de zoenen uwer dochter warm en
krachtig zijn. Vader, lieve vader, gij lacht mij toe!...”
En hare handen lagen plat op des grijsaards wangen. Deze
genoot met verrukking de liefde zijner dochter.
“Lief kind!” zuchtte hij, “gij zijt mij een zegen des hemels!”
Hij knelde haar met teederheid op zijne borst.
De omstanders schouwden in godsdienstig stilzwijgen op dit
tooneel. Schuermans en vele anderen leekten warme tranen van de
wangen. Wolfangh, die nu de belooning eener weldaad smaakte,
had zijne oogen met de handen bedekt en stond in eenen hoek der
zaal geweken. Lodewijk, die geenen enkelen oogwenk van zijne
Geertruid ontvangen had, was half treurig; doch die aandoening was
kort, want Geertruid vatte hem de hand en drukte ze teederlijk. De
jongeling verstond het meisje; een heldere glimlach rees over zijn
gelaat.
“Wolfangh, waar zijt gij?” riep Geertruid, de kamer rondziende.
“Ha, daar zijt gij, verlosser mijns vaders! Dank moet gij hebben; — ik
zal voor u bidden....”
De oogen des roovers blonken van ontroering.
“Ik ben uwe erkentenis onwaardig, edele jonkvrouw,” sprak hij.
“Niettemin acht ik mij gelukkig, iets te hebben kunnen doen, dat u
aangenaam is. Uwe blijdschap is mij eene zoete belooning.”
“Heer Wolfangh,” hernam Geertruid met eene droeve, doch
vriendelijke uitdrukking, “O, het spijt mij, dat een moedig mensch als
gij....”
“Ik versta u, jonkvrouw,” antwoordde de roover, “maar alle hoop is
niet verloren.... Gedenk mijner in uwe gebeden.”
Terwijl Geertruid voortging met Wolfangh te spreken, stond de
oude Theresia, die met de jonkvrouw was binnengekomen, bij haren
grijzen meester te weenen. Duizend uitroepingen kwamen haar uit
den mond, en zij vervulde de kamer met droefheidsgillen; want zij
zag hem voor de eerste maal en kon des meisjes blijdschap niet
begrijpen. Had zij hem zoo nabij het graf gezien als zijne dochter, zij
zou zeker ook wel verheugd zijn geweest. Op Lodewijks bevel
zweeg zij, doch weende voort met doffe snikken.
“Vader,” sprak Geertruid, “laat mij u in onze woning brengen, opdat
gij rusten moget en morgen welgemoed onder mijne zoenen
ontwaket.”
“Heeren,” riep Godmaert, “ik verlaat u. Maakt, dat de dag van
morgen geene gruwelen zie.... Komt, uwe hand nog eens gedrukt,
mijne vrienden, en blijft met God!”
Allen kwamen hem beurtelings de hand drukken en een eerbiedig
vaarwel zeggen.
Wolfangh deed de draagbaar naderen.
“Mannen,” sprak hij tot zijne makkers, “dat men den edelen
Godmaert naar zijne woning drage! Gij allen zult bij het huis blijven
waken en mij op uw leven voor al wat hem geschieden kan,
verantwoorden.”
“Ik dank u, heer Wolfangh,” zei Geertruid, zich voor hem buigende.
De grijsaard werd voorzichtig door de vier roovers opgelicht en
verliet de zaal onder het gejuich zijner vrienden.
“Lodewijk, als gezegd is, heden te acht uren!” riep Schuermans.
In min dan een oogenblik was de kamer ledig; de stappen der
heengaande personen weergalmden op de trappen, en de voordeur
werd achter hen gesloten.
“Jezus, Jezus! wat zal er vandaag nog gebeuren!” zuchtte moeder
Schrikkel.
En zij schoof den laatsten grendel toe.
IX
...onedele gemeente,
Wat bitse nyd verteert het merch in u gebeente?
Wat dolheid u vervoert?

Joost van Vondel.

Alles was bereid gemaakt tot het omverwerpen der Spaansche
beheersching. Eenigen der Antwerpsche Geuzen, die meest allen
edellieden waren, wilden slechts tegen den vreemdeling strijden;
doch er heerschte nog eene andere en veel talrijkere gezindheid
onder de woelende scharen. Dit was de haat, dien menigeen den
beelden toedroeg. Pieter Herman was de prediker, die toen ter tijd bij
Antwerpen met den grootsten nijd tegen deze uitvoer. Hij had zich
door eene misbruikte welsprekendheid veel invloed bij de
misnoegden verworven, en zich daarvan bediend om hen aan den
Roomschen godsdienst te onttrekken. Dat het gemeene volk zich
door zijnen haat tegen de Spanjaarden had laten verleiden, hebben
de navolgende jaren bewezen; want de menschen kwamen allen, de
eene vóór, de andere na, van hunne dwaling terug. Op dit tijdstip
waren er evenwel zeer vele en vurige voorstanders der hervormde
leer.
Den negentienden Augustus, dag van gisteren, had er eene
buitengewone preek bij Borgerhout plaats gehad. Eene groote
menigte volk was er tegenwoordig. De regen, die bij groote vlagen
op het veld nederstortte, deed hen allen de plaats verlaten. Er werd
dan onder hen gezegd, dat zij ook eenen tempel hebben moesten;
en met vloeken en zweren werd deze begeerte nog sterker
uitgedrukt. Herman, die gevoelde, dat de tijd gekomen was om zijn
doel te bereiken, hield zijne aanhoorders een weinig buiten de
Kipdorppoort staan, en klom op de trap van eenen windmolen. Het
volk luisterde met angstige nieuwsgierigheid. Herman riep hun deze
roekelooze woorden toe:
“Morgen, te acht uren, preek in Onze-Lieve-Vrouwekerk!”
En hij kwam onder het gejuich: Leven de Geuzen! de molentrap
af.
Nu begon de schrikkelijke dag van morgen in het Oosten zich als
eene schemering te vertoonen. Een dikke grauwe nevel rees uit het
Westen het morgenlicht te gemoet en bedekte de zon met een
ondoordringbaar floers. Het scheen, dat die heerlijke parel van Gods
kroon hare stralen niet over zulke gruwelen zenden wilde en de
koude dampen als een scherm tot zich had geroepen. Dezen
ganschen dag bleef het blauwe hemelwelfsel onzichtbaar; de lucht
was met stofregen als bezwangerd, en de natuur kreeg eenen dier
dagen, op welke de dieren der aarde zich, alsof het nacht ware,
verschuilen.
De deuren en vensters werden krakend geopend. De vreedzame
daglooner ging met haast aan zijn werk, zijnen knapzak met het
dagelijksch brood gevuld; de kooplieden zetten hunne waren uit, de
huisvrouw strooide met zorg het witte zand voor hare deur, want
geen van hen wist wat er gebeuren zou.
Om acht uren veranderde de rustige stand der stad in een woelig
tooneel, waarop het volk als de baren eener onstuimige zee
rondstroomde. Door nieuwsgierigheid aangedaan, verlieten de
werklieden hunne winkels, de bootslieden hunne schepen, de vaders
hunne huisgezinnen; en boven deze duizenden vlottende hoofden
staken de vuurroeren der wapenbroeders blinkend uit. Niets
voorspelde, dat er gruwelen zouden begaan worden; want zulke
rondstrooming van volk werd er in die tijden meest alle dagen in de
stad gezien. Bij afwisseling kwam het geroep: “Leven de Geuzen!”
eenen onvoorzichtigen mond uit, en dan ging een nare schreeuw ten
hemel op, en verlengde zich door al de straten der stad. De meeste
toeloop was op de Groote Markt; daar stonden talrijke schutters voor
het stadhuis geschaard. Zeker hadden de weldenkende wethouders
iets van der Geuzen opzet vernomen, want nooit was het stadhuis
zoo wel met krijgslieden bezet geweest.
Lodewijk, Van Halen, Schuermans en hunne vrienden waren daar
ook tegenwoordig. Eenigen van hen hadden zich onkennelijk
gemaakt. Schuermans had het dikke wambuis en de blauwe broek
eens schippers aan, de anderen droegen den wijden mantel op de
schouders en den breeden hoed op het hoofd.
Juist waren zij bezig met te beraadslagen, hoe zij zich gedragen
zouden, wanneer zij al het volk naar de hoofdkerk zagen loopen.
Angstig voor hare behoudenis, drongen zij met geweld door de
dichtgeslotene scharen, tot in het midden des tempels. Gods woning
werd door vloeken en zweren van het grauw onteerd, de wapens
klonken tegen de marmeren pilaren, en de graven der heiligen
werden van goddelooze voeten vertreden.
“Het sermoen! de predikatie!” werd er geroepen.
Dokter Herman klom op den predikstoel, met den bijbel in de
hand. Hij dacht zeker, dat hij daar niet rustig zou geweest zijn, want
in de andere hand nam hij een geladen pistool en riep, dat hij het op
degenen, die hem durfden storen, zou losbranden.
Lodewijk en zijne makkers hadden dit met ongeduld aangezien.
“Daar hebt gij een der voornaamste opstokers,” sprak de
jongeling.
“Wilt gij eens zien, Lodewijk, dat ik hem op het oogenblik doe
zwijgen?” vroeg Schuermans.
Op een bevestigend teeken van den jonkheer liep hij driftig den
predikstoel op. Eer Herman hem bemerkte, had Schuermans hem
reeds het pistool uit de handen gewrongen en het verre van hem op
den tempelvloer geworpen.
“Ga hier af, ketter!” riep hij, “of ik werp u, als eenen hond dat gij
zijt, ten gronde!”
Dokter Herman wilde niet afgaan. Op zijn gezelschap steunende,
poogde hij Schuermans vast te grijpen; doch deze, den prediker om
de middel vattende, wierp hem als eenen steen te midden in het
volk, dat schreeuwend achteruitdeinsde. Vele gewapende mannen
vielen op Schuermans aan, om den hoon, dien hij hunnen meester
had aangedaan, te wreken. Misschien zouden zij den moedigen
Antwerpenaar wel onbarmhartiglijk gedood hebben, waren zijne
vrienden hem niet ter hulp gevlogen.
Hier begon nu eene hevige worsteling. De beeldenstormers wilden
den predikstoel hebben, en schreeuwden den anderen toe, dat zij
Spanjaarden waren. Echter, zij het tegendeel wetende, werd er van
de dolken geen gebruik gemaakt. De krachtige spieren en de zware
vuisten alleen dienden hun tot wapen. Dit worstelen had nu al
eenigen tijd geduurd, wanneer een moedwillige vreemdeling
Schuermans eenen dolksteek toestuurde en hem een weinig aan
den arm kwetste. Eenige droppelen bloeds rolden hem over de
vingers. Zijne vrienden werden op dit gezicht verbolgen en trokken
hunne dolken. Een bloedig gevecht scheen onvermijdelijk; velen
liepen vervaard en schreeuwend de kerk uit.
Op eens werd het volk, dat bij den ingang stond, met
onweerstaanbaar geweld tempelwaarts ingedreven; de predikstoel
scheen onder de drukking der achteruitdeinzende schaar van zijne
grondvesten te worden gerukt.
Wolfangh kwam aan het hoofd van twintig welgewapende roovers
als uitzinnig de kerk binnen. Op het gezicht dezer onbekende
mannen, die met zulke dreigende blikken op het volk staarden en
den tempel tot een moordkuil schenen te willen maken, werd het
worstelen geëindigd. Niemand durfde zich nog roeren.
“Lodewijk,” vroeg Wolfangh, “wat gebiedt gij?”
Hij zwaaide zijn rapier met vlammende oogen tusschen de
beeldenstormers. Eer Lodewijk een woord gesproken had, lagen er
reeds drie gewond op den vloer.
“Houd op! houd op!” riep de jongeling, “stort geen bloed! Wij zijn te
gering in getal om de predikatie te beletten; laat ons liever naar het
stadhuis loopen om hulp te vragen. Wij zullen terugkomen met eene
goede bende schutters en deze goddeloozen de kerk doen
ontruimen. Komt aan met spoed!”
Zij gingen ter tempeldeur uit, in de gedachte dat men gedurende
hunne afwezigheid zou voortgaan met prediken. Maar niet zoodra
hadden zij de plaats verlaten, of een lang geschreeuw van “de
afgoden aan stukken! de afgoden aan stukken!” vervulde den tempel
als een vernielingskreet.
De ketters begonnen dan tegen de beelden allen smaad te
roepen, en wierpen ze met vuiligheden in het aangezicht. Zij hadden
evenwel nog niets gebroken, toen een van hen, voor St. Rochus
staande, luidop riep, dat er geene beesten in Gods tempel zijn
mochten. En hij rukte den marmeren hond van de voetzuil ter aarde.
Een ander vatte den heilige bij de voeten, en daar het beeld in den
muur vast was en niet onder zijn geweld breken wilde, trok hij met
zulke kracht er aan, dat de twee voeten hem in de hand bleven. De
ketter stortte achterover op den grond. Het bloed liep hem langs
mond en ooren uit.
“De afgoden aan stukken! De afgoden aan stukken!” riepen
duizenden stemmen. “Leven de Geuzen!” en in een oogenblik
hadden zij zich van koorden, bijlen, houweelen en ander werktuig
voorzien.
Nu liepen zij razend naar de tempelmuren, en hakten met geweld
alles, wat maar een beeld gelijk was, ter neder. De menigvuldige
kostelijke altaren, de schilderijen, de marmeren versiersels, alles
werd onder het uitbraken van godlasterende woorden ten gronde
gesmeten en met hamers verbrijzeld. Het heilig lichaam onzes
Heeren eerbiedigden zij niet meer dan het gevoelloos marmer. Zij
smeten de hostiën op den vloer en vertraden ze onder hunne
voeten.
Het scheen, dat de almachtige God Zijnen arm wederhield om
hunne gruwelen des te zwaarder te laten worden, en hun de straffen
boven het hoofd te verzamelen.
Tot hiertoe hadden zij de beelden en alles, wat zij bereiken
konden, ontleed en verbrijzeld. Één tafereel hing nog aan den muur.
Christus, voor ons allen aan het kruis stervende, was er kunstig op
afgemaald. Velen der stormers hadden reeds hunne oogen met
nijdige blikken er heen gewend; doch geen van hen dorst het
overgeblevene tafereel genaken. Een man, wiens grijze haren over
zijne schouders in wanorde hingen, stond voor de schilderij, de kolf
van een zinkroer tegen de borst, en bereid om zijn wapen los te
branden op dengene, die hem zou naderen.
De heiligschenders kwamen eindelijk in groot getal naar den
grijsaard, en wierpen hem met de stukken der beelden, om hem te
doen wijken; doch hij bewoog zich niet en scheen ongevoelig aan
hunne boosaardige woorden en daden. Op eens kwam er één
behendiglijk achter hem en trok hem achterover op den vloer. Het
roer ging af, en een der stormers kreeg het lood in de borst.
Nu galmde de schreeuw “slaat dood! slaat dood!” door de gansche
kerk.
“Mijn tafereel!” schreide de schilder, “o, mijn Christus!”
En hij reikte de armen smeekend ten hemel. Hij zag het tafereel,
gebroken en aan flarden gescheurd, nevens zijne zijde vallen op
hetzelfde oogenblik, als een Geus hem met eenen dolksteek het hart
doorboorde. De ongelukkige kunstenaar sprong op door eene laatste
zenuwspanning en viel, zoolang hij was, op de stukken der schilderij.
Zooals hij weleer aan Lodewijk gezegd had: zijn bloed stroomde, der
kunst ten offer, over het werk zijner handen.
De beeldenvijanden lieten het lijk van Van Hort liggen en begaven
zich opnieuw aan het breken. De twaalf apostelen stonden eerlijk en
verheven boven de pilaren, die het welfsel ondersteunden. Hooge
ladders werden er tegen gesteld, en met haken en koorden werkten
de schenders zoolang, totdat deze marmeren beelden alle op den
grond verbrijzeld lagen. Velen werden door den val gewond, en
kermen hoorde men de gansche kerk door. Doch niets kon hen
wederhouden; zij waren uitzinnig geworden. Alles was nu aan
stukken, en de vloer met hoofden, voeten en andere deelen der
beelden dusdanig bedekt, dat men met moeite er over kon.
Een prachtig beeld alleen stond nog ongehinderd boven deze
puinhoopen van heilige zaken. Dit was het miraculeus beeld van
Onze-Lieve-Vrouw van Antwerpen. Zij was nog in plechtgewaad,
zooals zij twee dagen te voren in den ommegang was rondgedragen
geworden. Eene kroon van de kostelijkste diamanten versierde haar
hoofd. Een mantel van goudlaken, met schitterende parelen
doorwrocht, viel achter haar in kunstige vouwen neder. Het goddelijk
kind Jezus droeg den zilveren wereldbol op zijne vingers.
Waarom dit beeld nog niet gebroken was, is moeilijk te zeggen.
Allen hadden het gezien, mits het in ’t midden der kerk op eene
prachtige draagbaar geplaatst was. Het is denkelijk, dat geen dezer
goddeloozen het op zich dorst nemen, de anderen tot het breken
dezes beelds op te maken.
Nu alles verbrijzeld was, en de haken en bijlen stil lagen,
begonnen zij allengskens de moeder Gods te naderen en zagen
elkander in de oogen met ondervragende blikken. Op dien stond
kwam een van hen, die dronken was, want hij kon zich nauwelijks
recht houden, toegeloopen.
“Wel, mannen!” riep hij, “zijt gij vervaard van dit stuk hout, of zijt gij
bang van de bellekens, die haar aan ’t lijf hangen? Kom, kom, smijt
die.... maar op den grond!”
En een zoo schrikkelijk smaadwoord viel van zijne lippen, dat het
zijne makkers verbaasde.
“Roep, vivent les Gueux! of gij moet aan stukken,” brulde hij
nogmaals.
Willende de daad bij de woorden voegen, vatte hij met zijne twee
handen de armen der draagbaar, en deze omkeerende, wierp hij de
Lieve Vrouw op den vloer. De juweelen werden ontroofd, de mantel
gescheurd, de kroon verbrijzeld, en het beeld bleef naakt en
geschonden liggen.
Hadden de mannen van Wolfangh hunnen meester verlaten, om
zich onder de beeldenbrekers te vermengen? Dit was waarschijnlijk,
want onder die, welke eerst de hand legden aan de juweelen der
moeder Gods, waren vier of vijf kerels, die een uur vroeger met
Wolfangh waren uitgegaan.
Wanneer de ketters eenigen tijd nutteloos hadden rondgezien
naar beelden, die konden gebroken worden, begaven zij zich tot
rooven. Zij namen de gewijde kelken, remonstrantiën, kandelaren en
kruisen; alles wat maar eenige waarde had, werd gestolen. De deur
der sacristie werd opengeloopen, en de booswichten, niet
vergenoegd met rooven en stelen, kleedden zich spotsgewijze als
priesters en zongen vuile liedjes, als lofpsalmen, beschimpend ten
hemel op.
Dit alles gebeurde zonder eenigen tegenstand. Lodewijk was met
Wolfangh naar het stadhuis geloopen, en had den burgemeester
verzocht een deel schutters met hem naar de kerk te sturen; maar
een ander gevaar belette de overheid dit verzoek toe te staan. Men
hoorde in de richting van het Spaansch kwartier een hevig geschut
van zinkroeren, een verward krijgsgeroep en al de kenteekenen van
een bloedig gevecht. Vele schutters hadden hunne gelederen
verlaten, om zich naar huis te begeven en hunne eigene goederen
voor plundering te beschermen, zoodat de burgemeester de
weinigen, die overbleven, niet van het stadhuis durfde wegzenden.
Het gerucht en het geschiet, dat men hoorde, was veroorzaakt
door eenen aanval van Houtappel en zijne vrienden tegen het
Spaansch kwartier.
De Spanjaarden hadden zich aan dien aanval verwacht en hunne
dienstboden gewapend langs hunne huizen in de Kloosterstraat
geschikt. Ook, wanneer de Geuzen zich eerst vertoonden, vonden zij
eenen goeden tegenstand en moesten met verlies van vier mannen
terugwijken. Maar hunne razernij werd heviger door dit ongeval.
Houtappel sprak zijne makkers aan en liep met hen opnieuw vooruit.
Nu hoorde men op de Groote Markt de afwisselende schoten der
roeren en het geraas der smaadkreten, die de twee benden elkander
vechtend toestuurden. De Geuzen behaalden ditmaal een groot
voordeel op hunne vijanden, doordien zij moediger en in grooter
getal waren; zij smeten zich weldra te midden der Spanjaarden,
vermoordden al degenen, die tegenweer boden, en dreven de
anderen op de vlucht, zoodat zij zich eindelijk meester van het
slagveld zagen.
De lijken en gekwetsten werden opgenomen en bij de Hoogstraat
in het Paardeken gebracht. Wanneer de gewonden verbonden
waren, begaven zich de overblijvende Geuzen terug naar de
Kloosterstraat en liepen er de deuren der Spaansche woningen
open, welke bezigheid zij bleven voortzetten totdat er geen enkel
vijand meer te vinden was.
Gedurende dien tijd waren de beeldenstormers nog bezig met in
de kerk van Onze-Lieve-Vrouw alles aan stukken te slaan of te
rooven. Doctor Herman, die hen niet had verlaten, wakkerde hen
aan om in het breken der afgoden, zoo hij zeide, voort te gaan, en
deed hen het voornemen opvatten om de andere parochiekerken der
stad op dezelfde wijze te ontheiligen.
Zij trokken dan met kruisvanen, standaarden, zilveren lantaarnen
en kruisen, welke zij geroofd hadden, als eene processie de kerk uit.
Een groot getal onder hen hadden kazuifelen, stolen en ander
geestelijk plechtgewaad aan. Zij zongen met verwarde stemmen de
psalmen, door Clement Marot op rijm gesteld. De kostelijke
kruisvanen wentelden zij ten schroom der verbaasde burgers, in het
slijk, en hieven ze dan weder vuil en onkennelijk in de hoogte.
Het geschreeuw: Leven de Geuzen! herhaalden zij onophoudelijk.
Lodewijk met Wolfangh en een tiental hunner vrienden stonden bij
het stadhuis en staarden met wanhoop op die verfoeilijke
heiligschending; zij poogden nogmaals de wethouders over te halen
tot eenen aanval tegen de beeldenstormers; doch zij gelukten hierin
niet, aangezien de overheden het voorzichtiger oordeelden, de
weinige krijgsknechten, die hun getrouw gebleven waren, niet in
gevaar te stellen.
Lodewijk leunde moedeloos en bijna weenend tegen eenen paal
der markt; zijne oogen dwaalden met afgrijzen en met toorn
tusschen de ontheiligde standaarden. Misschien ware hij in die
beweegloosheid zeer lang verzonken gebleven; maar iets, dat hij nu
zag, deed hem opspringen als iemand, die door eenen pijnlijken slag
getroffen wordt. Hij bracht de twee handen voor de oogen, om niets
meer te zien; weldra nogtans hief hij het hoofd op en riep tot zijne
vrienden:
“O, hemel! Ongehoorde boosheid! Ziet, zij hebben het heilig
sacrament! Onzen levenden God zelven durven zij bespotten! Nu
weerhoudt ons niets meer.... Sterven wij als ware Christenen, indien
het zijn moet! Ontrukken wij hun ten minste het allerheiligste!”
Met deze woorden trok hij zijnen degen uit de scheede en wilde
zich vooruitwerpen, om te midden der schenders te loopen; maar
Wolfangh weerhield hem en sprak met doffe stem:
“Bezie mij, Lodewijk. Is er bloed in mijne oogen of niet? Brandt in
mij de razernij als een verslindend vuur? Ja, niet waar? Nogtans,
ditmaal zal ik mijne drift overwinnen. Aan mij zal de eer toekomen
van het uitvoeren dezer taak. Gijlieden kunt ze niet volbrengen; gij
zijt te woedend, te onvoorzichtig, met geweld is hier niets te
winnen.... Laat mij doen; blijft hier stil staan,... verroert u niet....”
Wolfangh haalde bij deze woorden eenen moordpriem van onder
zijnen mantel en beproefde met den vinger of de punt nog scherp
was. Dan ging hij met sluipende stappen tot tusschen de schenders
en naderde allengs tot bij dengene, die het allerheiligste droeg. Maar
hoe ontvlamde hij in gramschap, wanneer hij in dezen spotter eenen
roover zijner bende herkende! Hij bleef staan, stak de hand onder
zijnen mantel en vatte den moordpriem, maar eene plotselijke
gedachte deed hem dien weder loslaten. Hij bracht zijnen mond aan
het oor des roovers en sprak met eenen nadrukvollen klem:
“Gij gaat sterven, Bernhard. Mijn moordpriem weet reeds de
plaats, waar hij u doorboren zal.”
De roover werd bleek als een doode; hij had de stem, die in zijn
oor sprak, herkend. Eene siddering liep hem over het lichaam.
“Luister,” hernam Wolfangh, “ik zal u genade schenken, ik zal u
niet vermoorden, indien gij hetgeen gij draagt mij overgeeft, zonder
dat het iemand bemerke.”
De roover bukte zich alsof hij iets, dat aan Wolfanghs voeten lag,
wilde vatten. Hij stond op: de remonstrantie was verdwenen....
Alleenlijk kon men bemerken, dat Wolfangh met den linker elleboog
de eene zijde van zijnen mantel omhoog hief, eene houding, die men
in hem zeer zelden bespeurde. Hij ging niet rechtstreeks tot
Lodewijk, maar draaide langs de Handschoenmarkt af en kwam zoo
