Professional Documents
Culture Documents
(Download PDF) Statistical Universals of Language Mathematical Chance Vs Human Choice 1St Edition Kumiko Tanaka Ishii Online Ebook All Chapter PDF
(Download PDF) Statistical Universals of Language Mathematical Chance Vs Human Choice 1St Edition Kumiko Tanaka Ishii Online Ebook All Chapter PDF
https://textbookfull.com/product/human-universals-donald-e-brown/
https://textbookfull.com/product/mathematical-foundations-of-
infinite-dimensional-statistical-models-1st-edition-evarist-gine/
https://textbookfull.com/product/mathematical-literacy-on-
statistical-measures-christian-buscher/
https://textbookfull.com/product/a-metaphysics-of-platonic-
universals-and-their-instantiations-shadow-of-universals-jose-
tomas-alvarado/
Philosophy and Linguistics Kumiko Murasugi
https://textbookfull.com/product/philosophy-and-linguistics-
kumiko-murasugi/
https://textbookfull.com/product/mathematical-and-statistical-
applications-in-food-engineering-1st-edition-surajbhan-sevda-
editor/
https://textbookfull.com/product/choice-words-how-our-language-
affects-childrens-learning-2nd-edition-peter-johnston/
https://textbookfull.com/product/the-business-of-choice-how-
human-instinct-influences-everyones-decision-2nd-edition-matthew-
willcox/
https://textbookfull.com/product/mathematical-and-statistical-
applications-in-life-sciences-and-engineering-1st-edition-
avishek-adhikari/
Mathematics in Mind
Kumiko Tanaka-Ishii
Statistical Universals
of Language
Mathematical Chance vs. Human Choice
Mathematics in Mind
Series Editor
Marcel Danesi, University of Toronto, Canada
Statistical Universals
of Language
Mathematical Chance vs. Human Choice
Kumiko Tanaka-Ishii
Research Center for Advanced
Science and Technology (RCAST)
The University of Tokyo
Tokyo, Japan
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
v
vi Contents
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Part I
Language as a Complex System
Chapter 1
Introduction
1.1 Aims
For nearly hundred years, researchers have noticed how language ubiquitously
follows certain mathematical properties. These properties differ from linguistic
universals that contribute to describing the variation of human languages. Rather,
they are statistical: they can only be identified by examining a huge number of
usages, and none of us is conscious of them when we use language.
Today, abundant data is available in various languages, and it provides a
clearer picture of what these properties are. They apply universally across genres,
languages, authors, and time periods, in a range of sign-based human activities,
even in music and computer programming. Often, these properties are called scaling
laws, but the term is not applicable to all of them. Because they are both statistical
and universal, we call them statistical universals. This book’s aims are to provide
readers with a review of recent findings on these statistical universals and to present
a reconsideration of the nature of language accordingly.
A key representative of previous literature on statistical universals is Zipf (1949).
In that study, George K. Zipf described certain statistical universals and considered
them evidence of the efficiency underlying language. Prior to that book were other
important works such as Yule (1944). After Zipf’s book, Herdan (1964, 1956) and
Thom (1974) also showed the mathematical nature underlying language. Baayen
(2001) presented an important overall analysis of rare words in relation to Zipf’s law.
Recently, Kretzschmar Jr. (2015) considered the law as evidence of the emergent
nature of language and argued its relation to complex systems within the field of
linguistics.
Numerous researchers on specific themes related to statistical universals, in
the fields of statistical mechanics and computational linguistics, have discovered
other statistical qualities of language that go beyond Zipf’s law. Nevertheless, the
reports to date are relatively short individual papers and focus on specific topics. A
book chapter by Altmann and Gerlach (2016) presented an overview of statistical
universals, but it was too brief to cover the complete, scattered nature of the studies.
Therefore, the relations among the various findings have not yet been clarified, and
the frontier of research into statistical universals remains obscure.
Against that background, this book provides an up-to-date argument on the statis-
tical universals in a larger volume than a research article. Specifically, the aim is to
provide researchers in computational linguistics with a consistent understanding of
statistical universals. The argument is based on analyzing the mathematical behavior
of large numbers of samples, as is often done in fields such as statistical mechanics
and complex systems theory. The reason for studying large amounts of data is that
it can reveal properties that are invisible when we only study smaller samples. For
example, a few tosses of a die result in a short sequence of numbers, but a billion
tosses show a new picture of the die’s nature. In the long run, an ideal die should give
almost the same number of results for each of the six faces. A real die, however, is
not a perfectly cubic shape, and therefore, the distribution of tosses will eventually
show this bias.
Hence, this book considers how the statistical universals stipulate the char-
acteristics of language. At the same time, it highlights how language deviates
from standard statistical behaviors. Indeed, certain statistical universals confirm the
expected mechanics, as if language behaves like an ideal die. In these cases, the
statistical universals are trivial, yet an important question remains: how does this
mechanics stipulate the nature of language? In contrast, other universals deviate
from the expected mechanics, reflecting how language also behaves like a biased
die. In those cases, the bias might represent some human factor requiring further
study to reveal its origin.
The poet Stéphane Mallarmé once compared the action of composition to a throw
of dice (Mallarmé, 1897).1 He pointed out that our use of language can never be
free from chance, and his poetic composition highlighted the challenge of this fact.
A linguistic act could perhaps be a mixture of chance and choice (Herdan, 1956).
Statistical mechanics should partly reveal the nature of the first factor, of chance,
as linguistic acts proceed while partly sampling past phrases. The second factor, of
choice, is then what drives us to speak: it derives from human intention, as opposed
to chance. If that is the case, then identifying the statistical component of the entire
phenomenon would reveal the nature of intention.
Figure 1.1 situates this book at the intersection of four subjects. The left part of the
figure shows various factors related to language, in particular those underlying the
usage of a language system. These factors include the language faculty, cognition
1 Mallarmé wrote a composition entitled Un coup de dés jamais n’abolira le hasard (A Throw of
of language, and intention, and the social foundation necessary for language to
work. This book mainly considers language usages accumulated in the form of
a corpus, a large quantity of language data, which Chap. 3 defines in detail.
The right part of the figure shows the statistical universals of language, which
are revealed by certain computational procedures, and the mechanics of chance
underlying them. The mechanics of chance is an inevitable consequence of a large
number of events involving chance. The book is thus an overview of the statistical
universals underlying language, especially in regard to human factors, and how
those universals stipulate language.
After Part I, which positions this book within a multidisciplinary academic
context, the chapters establish relations between the left and right parts of Fig. 1.1.
The first half, in Parts II and III, explains different statistical universals obtained
with large corpora, as represented by the rightward arrow in the figure. The first
half also discusses the different characteristics of statistical universals for language
sequences artificially created by chance.
As represented by the leftward arrow in the figure, the second half of the
book considers how the statistical universals explain the nature of language. It
focuses on the reasons for the statistical universals, namely how they stipulate
the nature of language, or what mathematical and human factors underlie these
phenomena. In particular, it examines whether random processes can fulfill the
statistical universals of language. A random process approximates language as a
sequence of chance events. The results show that certain state-of-the-art processes
have the potential to reproduce the statistically universal nature of language, but
their capability is currently still limited. This state of affairs implies future directions
for understanding language from a mathematical perspective.
6 1 Introduction
As Hey et al. (2009) assert, data science has become the fourth paradigm of science,
and it holds the key to better engineering of large-scale data. Language data is one
of the largest and most important forms of big data. Such big data is now being
processed to support human linguistic activities. The techniques and methods of
language engineering are studied in the field of natural language processing. The
primary target of the field is thus engineering to provide people with computational
assistance for processing language.
Issues in engineering often attract scientific interest. The scientific view of
natural language processing is highlighted by the term computational linguistics.2
Moreover, the intersection of computation and language includes other fields that
study language by means of computers, including quantitative linguistics and
corpus linguistics.
In computing with language, we must understand the properties of language from
a computational perspective. This book attempts to provide one such perspective.
I believe that it not only fulfills a scientific aim but also contributes to the goals
of language engineering. One possible engineering objective would be to build
computational language models that exhibit those properties. That is, good language
models should reproduce the properties of natural language, to better assist language
processing.
Over the years, Zipf’s law and other related laws have been incorporated in
language models. Part II discusses the frontier of studies addressing those traditional
laws. The main focus of this book, however, lies rather in the properties described in
Part III. Specifically, a property of language called long memory has been quantified
more recently by borrowing concepts from complex systems theory. Whether
computers can reproduce long memory is an open question in machine learning.
Therefore, this characteristic of language must be computationally quantified to
clarify the frontier of more advanced language computation. Reproducing only
Zipf’s law with a language model is not especially difficult, but reproducing all the
properties, including those of Part III, remains challenging. Part V describes these
issues.
2 In
this book, the term computational linguistics stands for both natural language processing and
computational linguistics, following convention.
1.3 Position of This Book 7
The properties considered in this book hold universally across languages. The
fields of study dedicated to language are broader than those using computational
means, and the question of universal properties that hold across a variety of
languages, or even all languages, has been an important one in the long history
of linguistics. Therefore, the statistical universals must be considered in relation to
linguistic universals. Thus, Chap. 2 positions the statistical universals within the
history of linguistic universals.
Gaining an understanding of universals involves other factors besides the quality
of data, because such data is generated by humans. Therefore, this book also takes a
cognitive approach in places, by showing how the statistical properties of language
relate to linguistic universals and the findings of recent cognitive studies. In this
sense, the book partly involves cognitive linguistics, too. Various researchers have
chosen approaches based on their own interests and backgrounds. Nevertheless,
given the common target of language, the essential questions should be common,
irrespective of the disciplines in which they arise. Hence, this book provides one
perspective on language, gained through my learning from previous studies bridging
those divisions.
On the other hand, this book takes a holistic approach by examining language
through the holistic properties of corpora. The father of modern linguistics, de Saus-
sure (1916), suggested this approach, as follows:
We should not start from words, or terms to deduce the system. This would assume that
terms have absolute values, and the system is acquired only by constructing the terms one
with the others. Conversely, we should start from the < system > that works altogether; this
last decomposes into certain terms, although this is not at all so easy as it seems.
1.4 Prospectus
The goals of this book are thus to summarize the current understanding of statistical
universals and to consider how they might function as a precursor to language,
stipulating both its elements and individual linguistic acts. In other words, this book
is about the structural, holistic properties of language systems, as found empirically
in data. The content is interdisciplinary: it treats computational linguistics from a
perspective of language as a complex system.
The book is based on the great insights of various forerunners, with additional
findings from my previous studies. Although it is limited by the current state of our
knowledge about language, and by my capability of communicating with different
audiences, I have tried to cross borders between disciplines.
The prospective audience includes the following readers. For those who study
language with computers, the book provides an overview of the global properties
of language and how they relate to important notions gained through computing.
For linguists, it provides a macroscopic perspective that differs from the perspective
of traditional linguistics. For physicists who are interested in language, it provides
basic examples showing how the methods of physics can be applied to language and
how language is yet another complex system. Finally, for general readers who are
interested in language, the book explains the new, emerging frontier of using big
data to study and understand language.
For those who are at ease with mathematical formulas, I formally define
properties when necessary. Some content involves rigorous formulations, for which
the theoretical mathematical background, including proof summaries, is given in
Chap. 21. To make the book self-contained, summaries are provided for most of
the theoretical rationales. Theorizing through mathematical contemplation often
requires making assumptions about the object of interest. As Part III demonstrates,
however, language is likely not conducive to simple assumptions. Hence, the book
does not presume that arbitrary properties underlie language.
Questions about language tend to attract researchers and students in the human-
ities. Although this book must invoke mathematical concepts, I have made all
22 3 Language as a Complex System
3.1.3 On Infinity
3 There are computational means of evaluation, such as the BLEU score (Papineni et al., 2002),
that compare machine-translated text with human-translated text. Therefore, a human must first
translate the text, which of course presumes a human process based on meaning. Furthermore, to
evaluate the true quality of the machine-translated text, people must still evaluate the translation
quality manually.
4 Although this book assumes a finite number of types of linguistic sounds, Kretzschmar Jr. (2015)
advocates that a phonetic system must be analyzed as a complex system. Research along this line
was reported in ?
Another random document with
no related content on Scribd:
O thou blossom of eastern silence,
Take thy ancient way untrodden of men.
Go on thy dustless path of mystery,
Reach thou the golden throne of God,
And be our advocate
Before His Silence and His compassionate speechlessness.