(Kyo Kageura) The Dynamics of Terminology A Desc PDF
Series Editors
Marie-Claude L’Homme
Ulrich Heid
Consulting Editor
Juan C. Sager
Volume 5
Kyo Kageura
KYO KAGEURA
National Institute of Informatics, Tokyo
Acknowledgements
Introduction
Appendices
Appendix A. List of Conceptual Categories
Appendix B. Lists of Intra-Term Relations and Conceptual Specification Patterns
Appendix C. List of Terms by Conceptual Categories
Appendix D. List of Morphemes by Conceptual Categories
Bibliography
Index
Acknowledgements
I started the research described in this book around 1990, when I began
working on conceptual descriptions of term formation patterns with Professor
J. C. Sager at the Centre for Computational Linguistics, UMIST.
Although the direction of my research has changed gradually and become
more mathematically technical since then, all the theoretical aspects of
this book have their roots, either explicitly or implicitly, in my intensive
and intriguing discussions with Professor Sager. Just how indebted I am to
him for his constant encouragement and understanding, I find it difficult to
express.
From around 1995, as my theoretical standpoint on the study of terminology
took shape, I felt the necessity to augment my conceptual approach
with a quantitative approach on the descriptive front. I had the luxury of
thinking intensively about this in 1996, when I stayed with the Natural Language
Processing group at the Department of Computer Science, the University
of Sheffield. I would like to thank Professor Yorick Wilks and all
the staff there.
One of the most important triggers for the development of an interpretative
framework for the quantitative part of my work was the International
Quantitative Linguistics Conference held in Helsinki in 1997, where I had
the opportunity to learn LNRE models from Dr. R. Harald Baayen of the
Max Planck Institute of Psycholinguistics, Nijmegen, the Netherlands. This
helped me to finalise the work described in Part III of this book. For that, I
owe him a great deal.
I knew the work of the "French connection" of computational terminology
well from their publications, but after meeting them face-to-face
in 1998 I benefited greatly from the stimulation their intellectual input gave
to my work. Among them, I especially wish to thank Professor Christian
Jacquemin of CNRS-LIMSI and Professor Béatrice Daille of IRIN, the
31 May 2002
Tokyo, Japan
Introduction
In recent times there has been a growing interest in the study of technical
terms (henceforth simply "terms" for succinctness), as can be witnessed by
the publication of textbooks (Cabré 1993; Felber 1984; Picht & Draskau
1985; Sager 1990), of collections of papers (Rey 1995; Sonneveld & Loening
1993) and of a journal (Terminology), as well as regular conferences on
terminology such as Terminology and Knowledge Engineering and Computerm.
Despite this fact, the study of terminology, i.e. the theoretical and
applied study of terms as coherent systems of lexical items endowed with
a singular creative dynamism, is as yet neither clearly defined nor is there
general agreement about its scope.
A related problem is the fact that, while work concerning what is traditionally
known as the "theory" or "principles" of terminology is pursued
simultaneously with, but independently of, terminology-related NLP applications,
little effort is being devoted to theories underlying the descriptive
analysis of terms. Besides, most of what currently passes for a theoretical
foundation of terminology amounts to little more than a simplified, a priori
theory of conceptual structures supported by largely prescriptive principles
of what "should be" rather than what is the actual usage of terms.
This situation seems to reflect basic characteristics of terms, i.e. terms
manifest themselves as concrete linguistic objects within a specialised discourse
and their number is constantly growing. The fact that terms are first
and foremost concrete linguistic objects makes it difficult to define the theory
of terms at a proper level of abstraction. Many so-called theories about
terms are really only theories of something — for instance, of concepts —
that can be used to describe terms. In addition, many studies treat only
a very limited number of terms, mostly for exemplification. The fact that
terminology (and the number of terms) is constantly growing, on the other
The book is divided into four main parts. Part I (Chapters 1 and 2) is devoted
to clarifying the author's view of the object of the study as well as
defining his theoretical standpoint. On the basis of close observation of
terms and examinations of the existing theoretical studies of terms, it will
be argued that a theory of terms or terminology should deal with the terminology
of a domain in its totality, because it is only with respect to individual
domains that the very concept of "term" is consolidated. It will also
be argued that a theoretical study of a terminology should be accompanied
by the descriptive study of a terminology, for proper descriptive studies are
theories of terms. The concept of the dynamics of terminological growth
is then introduced, and the overall framework is illustrated by means of the
description of conceptual patterns of term formation, complemented by the
analysis of quantitative regularities of terminological growth.
Parts II and III are devoted to the detailed development of the
theoretico-descriptive framework for the study of the dynamics of terminology
and the required concrete description of the actual manifestation of
this dynamics. Throughout these two parts, the Japanese terminology of
documentation (introduced below) is used in the analyses.
As a first step towards the characterisation of the dynamics of terminology,
Part II (Chapters 3-6) is devoted to the description of the conceptual
patterns which determine the formation of terms within the chosen domain.
The basic aspects to be observed at the conceptual level are the relationships
between terms and their constituent elements, the relationships among the
constituent elements, as well as the type of conceptual combinations used
in the construction of the terminology. In Chapter 3, the basic descriptive
framework, as well as the elements necessary for the description of the
conceptual patterns underlying term formation, are discussed. Chapter 4
is devoted to a presentation of the conceptual categories discovered in the
analyses of terms. Chapter 5, then, examines the conceptual relationships
between constituent elements of terms. Chapter 6 describes the characteristics
of term formation patterns in the field of documentation, which are
based on the concrete descriptive devices introduced in Chapters 4 and 5.
The description of conceptual patterns of term formation, to be examined
in more detail, has a logical limitation. If, by describing conceptual
patterns, we try to give necessary and sufficient conditions for the formation
of terminology, we end up listing all the combinations of linguistic items in
the terminological data used in the study. This, however, would obscure the
observation of the dynamics and reduce the study to a natural history
of existing terms. The description of conceptual patterns, therefore, must
necessarily remain somewhat general, at the level where the broad regularities
of term formation patterns in a domain can most properly be described.
This immediately leads to the loss of the fine granularity of the description.
Part III (Chapters 7-9) explores the quantitative analysis of the patterns
of terminological growth, which compensates for the limitation of the description
of conceptual patterns of term formation and thus completes the
description of the dynamics of terminology. There is a statistical method for
observing the growth patterns of lexical items in their entirety within a certain
category or set of lexical items. Applying this method, it becomes possible
to give a detailed description of the dynamics of the potential growth patterns
of the terminology of a domain. In Chapter 7, the statistical method
is presented together with factors that should be taken into account in the
application of the method. Chapter 8 is, then, devoted to describing the
growth patterns of constituent elements within the terminology. Chapter 9
details the growth patterns of terms for each subset of terms, the formation
patterns of which were observed in Chapter 6.
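As an illustrative aside, the kind of counting that underlies such a quantitative analysis can be sketched in a few lines of Python. This is not the method presented in Chapter 7; it is a minimal sketch that assumes a hypothetical toy list of constituent elements (the `morphemes` list and both function names are invented for illustration). It computes the frequency spectrum, i.e. the number of types occurring exactly m times, and the simple estimate V(1, N)/N of the chance that the next token is a previously unseen type.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Return a dict mapping frequency m to V(m, N), the number of
    distinct types that occur exactly m times among the N tokens."""
    type_frequencies = Counter(tokens)
    return dict(Counter(type_frequencies.values()))


def growth_rate(tokens):
    """Estimate the probability that the next token is a new type,
    using the proportion of tokens whose type was seen exactly once,
    i.e. V(1, N) / N."""
    if not tokens:
        raise ValueError("need at least one token")
    return frequency_spectrum(tokens).get(1, 0) / len(tokens)


# Hypothetical constituent elements extracted from a handful of terms.
morphemes = ["information", "retrieval", "information", "system",
             "index", "retrieval", "information", "card"]

print(frequency_spectrum(morphemes))
print(growth_rate(morphemes))
```

In this toy sample of eight tokens, three types occur once, one occurs twice and one occurs three times, so the estimated growth rate is 3/8 = 0.375. A real analysis would of course need a representative terminology rather than eight tokens.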
Part IV (Chapter 10) concludes the study. By examining the theoretical
standpoint introduced in Part I with respect to the findings of the concrete
description carried out in Parts II and III, it evaluates what was achieved
and what should be explored further.
The data
The present study takes the position that a theoretical work on terminology
should, as a logical requirement, be accompanied by a concrete description
of the terminology of a domain. As such, Parts II and III, together with
theoretical and methodological discussions, present the results of analyses
of terminology. For the concrete analyses, Japanese terminological data in
the field of documentation are used, a field of which the author has an in-depth
knowledge. The data are taken from Wersig & Neveling (1984), the
Japanese version of Wersig & Neveling (1976), a small but representative
terminological glossary in the field of documentation¹. This is a multilingual
glossary with the indication of related terms.
A few normalisations were applied to the Japanese entries of Wersig
& Neveling (1984), such as the normalisation of orthographic and minor
notational variants. Also, a single non-noun simple entry, (automatic),
was omitted. As a result, 1,228 terms were obtained.
There are no mechanically applicable criteria for delimiting constituent
elements or morphemes in Japanese, as Japanese lacks boundaries between
linguistic units such as spaces or hyphens. So the morphemes of terms
were identified manually, based on the criterion originally introduced by
Nomura & Ishii (1988), which reflects an average Japanese speaker's intuition
about morphemes and was successfully applied to large-scale analysis
of Japanese terms. The method is briefly described as follows:
1. A minimal element is defined as the minimal linguistic element which
bears a meaning in current Japanese.
2. According to the origin of linguistic elements, i.e. wago (original
1
The idea of collecting terms from articles and texts was examined but discarded because
the available texts depended too much on the circumstantial tendencies and the data was
sparse from the point of view of collecting a representative set (sample) of terms. Among
reference sources such as glossaries, Wersig & Neveling (1984) was a semi-optimal choice
from the point of view of representativeness and balance of terms, partly because no better
alternatives existed. There were three other glossaries when the study was started: Monbusyou
(1958) was too old and too prescriptive; Young (1988), being a translation of Young
(1983), was too biased to the U.S. library services and some Japanese terms were artificially
coined; and JIS Series (1989) was incomplete and, again, too prescriptive. A new glossary,
JSLIS (1997), which is better in its coverage and timeliness, has appeared since, but by
the time of its publication, the author's analysis based on Wersig & Neveling (1984) had
already been completed.
Theoretical Background
Chapter 1
In this chapter, the basic status and nature of terminology within language is
clarified. Then, the traditional approach to the study of terminology is summarised
and critically examined, and some recent developments in the study
of terms are briefly introduced. This chapter is intended to give readers the
basic background against which the theoretical position of the present study
is outlined. This will be elaborated in Chapter 2.
Any discussion about the basic status, nature and function of terms within
language must start with a provisional definition of "term" and immediately
related concepts. According to Bessé, Nkwenti-Azeh & Sager (1997) — a
compact and convenient glossary of the expressions of this field — "term"
and "terminology" are defined as follows:
term : A lexical unit consisting of one or more than one word which represents
a concept inside a domain.
terminology : The vocabulary of a subject field.
Two sets of expressions are important in these definitions, i.e. "lexical
unit" and "vocabulary" on the one hand and "concept inside a domain"
on the other. "Lexical unit" and "vocabulary" are conventional linguistic
notions defined adequately in many dictionaries of linguistics. They need
not concern us further at the moment.
terms to be the study of terms. For instance, observing the ratio of various
forms of terms in a representative corpus of terminology, though simple,
would be useful to attribute the observation of forms of terms to the study of
terminology (cf. Pugh 1984). Any theoretical work on terminology should
be at least conscious of the fact that the level at which the category "term"
and "terminology" is consolidated is different from the levels at which the
category "word" is recognised, even though as empirical objects terms are,
like words, manifested as lexical items. We will come back to this point
later in 1.4.
As the concept "term" belongs to the level of parole, the basic characteristics
of terms should be and can only be observed at the level of parole. So
let us examine what seem to be the basic characteristics of terms in actual
usage.
"As linguistic signs, terms are a functional class of lexical units" (Sager
1998), and the basic function of terms is to express more sharply delineated
meanings identified as necessary within a particular domain by the complexity
and number of concepts that have to be clearly distinguished. From
the angle of specialised discourse, we can state that some meanings of lexical
units are consolidated by clarification and narrower determination in
order to satisfy the degree of specification required by the domain in which
they are used. Roughly speaking, it is in this way that lexical units become
the 'terms' of the domain.
Thus the division between general words and terms as empirical objects
is not rigid. As Sager (1998) states, "it can happen that non-specialists
consider a word to be a term which is, however, only a general word for
the specialist; equally, it can happen that specialists use terms which their
non-specialist audience take to be words in the general language". What is
more, "the possibility of many lexical units to function both as words and
as terms may even be a question of individual choice and interpretation of
the speaker and listener". Individual terms constantly interact and intersect
with general words because they share the same linguistic forms (see
also Budin & Oeser 1995). So the particular range of terms representing a
domain is fluid.
Basic Observations 15
On the other hand, however, in the modern world where science and
technology become more and more specialised, we tend to regard terms and
terminologies as having a clear and independent status within our language
experience. This tendency is accelerated because scientific and technical
discourse is spreading more and more into our daily communication due
to the arrival of the "information age". That terminology has a clearly observable
existence reflects an aspect of the truth, because the terminology
of a domain is a representation of the systematic part of the knowledge
of the domain, as can most typically be observed, for instance, in the terminology
of mathematics. The terminology of a domain, therefore, taken
independently, can be regarded as having its own structure, representing the
"concepts" of the domain³.
The fact that terms are located within the tension between the need
for efficient communication and the requirement of representing the concepts
of a domain makes terminology somewhat unique as a linguistic phenomenon.
To the extent that the functional requirement of terminology is to
gain the precision necessary for expressing restricted meaning, terminology
tends towards stronger systematisation of its internal structure, not only in
its function of creating stable relations between lexical item and meaning,
but also in linguistic form. This makes terminology approach the nature of
the lexicon of artificial languages, which are characterised by rigidity and
systematicity of reference and designation (Hoffmann 1979; Sager 1994).
At the same time, to the extent that terminology shares its linguistic form
with the general vocabulary, it tends towards using the full flexibility of
natural language, not only in its lexical-formal dynamics but also in its capacity
of establishing dynamic relations between lexical items and meaning.
This dynamic force, inherited from natural language, is strengthened
by intersecting with general-language words in real discourse. Figure 1.2
illustrates this specific characteristic of terminology within the dimension
of natural/artificial language.
3
As such, from the point of view of linguistics, "concepts" are clearer and more restricted
meanings specific to a domain, though this alternative to a definition includes an
element of circularity.
Now that we have seen the basic position and characteristics of terminology,
it is convenient to examine the existing perception of what constitutes
the theory of terminology. We start the discussion with the examination
of the "Vienna" school of terminology (Felber 1984). It is undeniable that
this school, originally based on the work of Wüster (Wüster 1959/60), contributed
to opening the research field of terminology. The "Vienna" school
has also strongly asserted claims for the independence of terminology as
a separate discipline, with its own theory and methods. As a result, some
textbooks of terminology devote significant space to the explanation of the
"Vienna" school or Wüster's ideas (Picht & Draskau 1985; Cabré 1993). In
addition, many of the current developments started from a critical examination
of the propositions made by the "Vienna" school.
1. "Any terminology work starts with concepts. It aims at the strict delimitation
of concepts. The sphere of concepts is independent of the sphere
of terms."
2. "Only the terms of concepts, i.e. the terminologies, are of relevance to
the terminologist, not the rules of inflections and the syntax."
3. "The terminological view of language is a synchronic one, i.e. for terminology
the present meanings of terms are important. For terminology
the system of concepts is what matters in language."
In short, the traditional theory of terminology addresses the relation
between concepts and terms, starting from concepts and focusing on the
present state of the conceptual structure and its representation. In this
framework, it is "concept" that takes a crucial role. "Concept" — "the
cornerstone of the GTT (general theory of terminology) and the starting
point of any terminology work" (Felber 1984: 102) — is defined as an
"element of thinking", which "consists of an aggregate of characteristics",
which "themselves are concepts" (Felber 1984: 103).
In support of these claims concerning the theoretical study of terms,
Felber (1984: 98-99) points out three peculiarities of the nature of terminology:
1. "Terminologies are deliberate creations. In common language the standard
is the usage of language... In terminology the free play of language
would lead to a chaos."
2. "The standardisation of single terms requires unified translinguistic
guidelines."
3. "Preference of the written form to the phonic form."
one aims at. The fact that terms are observed in parole is a clear indication
that they can be studied diachronically.
The three peculiarities of the perception of the nature of terminology
also reveal the excessive restriction of the scope of terminology in the traditional
theory.
Among them, the second point, i.e. standardisation, is of little relevance
to the theoretical study of terminology. The standardisation of terms
is by its very nature prescriptive and cannot be part of what we currently understand
by "theory". The process of standardising terms can theoretically
be studied as an (external) factor in terminological phenomena, probably as
a kind of terminological socio-politics, but there is no room in any theory
of terms to incorporate this sort of study.
The claim that "terminologies are deliberate creations", on the other
hand, reflects an element of truth. As we discussed in 1.1.3, the unique
position of terminology is characterised by the combination of two contradictory
factors, i.e. the quest for systematicity and flexibility. If, regarding
terminology as being close to artificial nomenclatures, the systematic aspect
of terms is dealt with in the study, it is useful to adopt the operational
characterisation that "terminologies are deliberate creations". Combining
the claim of systematicity in term creation with standardisation as can be
observed in Felber (1984: 98) is, however, not only irrelevant to the scientific
study of terms but also harmful to the practical aim of standardisation.
This is because useful standardisation should be based on the observation
of the objects to be standardised. Linking the claim of the systematicity of
terminology to the quest for standardisation may well result in an estrangement
of the standard from the reality. The validity of the assumption of
deliberateness must instead be examined with reference to the actual descriptive
results of a scientific study.
A preference for the written form also seems to reflect a part of the
nature of terms. In comparison to words, the written forms of technical
terms are relatively more important than the phonic form. But this does
not automatically mean that consideration of the phonic form should not
be included in the theoretical framework. In practice it may be the spoken
language which encourages the creation of terminological variants. Although
the present study is based on the written form, some interesting and
important aspects of terminological phenomena may be explained by their
phonetic characteristics.
5
It was on the basis of this recognition that various studies about terms, such as the rules
of inflection or syntax, were defended in 1.2.2. Once it becomes clear that the relationship
between concept and terms in traditional theory does not have any privileged status, we lose
the argument for excluding various other aspects as a study about terms.
Figure 1.3. The conceptual study of terms and the semantic study of words.
to do with terminology in any inherent sense6; they are not concerned with
the position of concepts within the theory of terminology.
We are here facing an essential problem concerning the theory of terminology,
as opposed to the theory of something that can be used for the
description of terms or terminology. The use of "concept" in the description
of terms, though essential, does not automatically make a study about
terms a theory of terms or terminology. The simple incorporation of more
advanced theories of concepts does not help either. The problem resides
in the theoretical framework of terminology in which "concept" is located,
rather than which theory of "concept" is used in the description of terms.
This recognition will be the starting point of the next chapter, where the theoretical
and methodological framework of the present study is elaborated.
6
Interestingly, the title of Temmerman's article (Temmerman 1998/99) is "Why traditional
terminology theory impedes a realistic description of categories and terms ...", where
the theory, which in fact is of concepts (which Temmerman compares with a theory of
understanding), is useful for the description of terminological phenomena. It is implied
here that terminological theory is the theory of something which is used for the description
of terminology. Incidentally, Bessé, Nkwenti-Azeh & Sager (1997) explicitly introduce
domain-specificity in their definition of concept (see 1.1.1) to avoid the problem examined
here at the level of definitions, but they do not say anything about how domain-dependent
concepts can be differentiated from ordinary concepts.
Chapter 2
In 1.1.2, we saw that terms are functional variants of words. Thus "formally
terms are indistinguishable from words" (Sager 1998/99). This offers
a good, though implicit, starting point for understanding the status of
the "theoretical foundation" of terminology. That terms are functional variants
of words reflects the essential nature of terms, not only of terms as
empirical objects but also of "term" as a category, as we postulated from
the viewpoint of the epistemological conditions, on the basis of which a
discussion about terms becomes possible in the first place.
The essential point is that terms (and terminology), being a functional
variant of words at the level of parole, constitute an aspectual category of
the frame category of "lexical unit" or "word"¹. Thus "formally terms are
indistinguishable from words" simply because the very category of term is
not consolidated from the formal point of view. We may even go so far as
to claim that formally terms are words. This partly explains why the word
"word" was used in the definition of "term" by Bessé, Nkwenti-Azeh &
Sager (1997) (see 1.1.1).
Two problems immediately arise from the fact that terms constitute an
aspectual category. Firstly, the fact that a study is based on lexical items
that fall into the category of "terms" does not mean that any claims drawn
from the data are of terms and terms only. Secondly, some characterisations
of terms, even if they are true concerning terms, may not be meaningful.
For instance, assume that we discovered empirically, on the basis of the
observation of terms, that the part-of-speech categories for words can be
used to describe the formal patterns of terms. This observation, though
true, applies not only to terms but also to words in general. What is more,
it adds nothing new to our current state of knowledge, because this can be
deduced on the basis of the position of the category "term" with respect to
1
The situation is simplified slightly and a small group of terms, including the Latin
names of biology or geology and chemical and pharmaceutical names, is ignored.
Theoretical Framework 27
2
Note that the theory of physics never asks what the physical objects are outside the
theoretical description of the physical phenomena themselves, i.e. how they are. In terminology,
there has been too much talk about "What is terminology?" or "What is a concept?"
unaccompanied by actual studies of terms.
3
In the traditional theory of terminology, "concept" is regarded as having this privileged
characteristic, which was falsified in 1.4.
On the other hand, if the study aims at certain kinds of generalisation, then
it tends to overgeneralise and the results of the study would no longer be
attributable exclusively to terms or terminology. What was discussed in 1.4
about "concept" was in fact an example of this general problem.
One way to deal with this difficulty is to introduce a comparative view
point and discuss and establish the borderline between terms and non-terms
or words, e.g. contrast the difference or non-difference between terms and
words from some common points of view. For instance, Miyajima (1981)
introduces a comparative point of view to the study of terms with general
words. What makes Temmerman (2000) interesting as a study of terminology
is that the concrete descriptions in the study are related to a particular
aspect of the borderline of terms and non-terms.
Still, even in studies concerned with the borderline between terms and
words, the problem of the range of possible generalisations remains. Unless
one knows in advance the range of terminology to which a discussion on the
basis of exemplar terms is relevant, one has to continue analysing concrete
examples ad infinitum.
Would, then, the only task of the "theoretical" study of terms and ter
minology be to keep incorporating, or developing, various viewpoints in
accordance with the development of related fields for describing terminological
data? And would such a "theoretical" study continue to analyse
individual terms without knowing the range of possible generalisations?
If one is not satisfied with such a prospect, then it is necessary to examine
what needs to be taken into account in order for a study based on
terms to become a theoretical study of terms. This should be addressed first
and foremost with respect to the logical status of the theory vis-à-vis terminology.
To do this, it is necessary to go back to the fundamentals which
determine the status of the theory and to ask "What is terminology?" but
this time from a different perspective. This is our next task.
ogy of a domain⁴. If one wants to carry out a theoretical study of terms from
the point of view of concepts, the emphasis should be put on the concepts
of the target domain, rather than concepts in general.
Two types of theoretical studies can be distinguished that satisfy the
condition for a theory of terminology:
1. Studies which are concerned with characteristics of individual terms.
In these studies, ideally speaking, the descriptions should be based on
some characteristics which are common to and only to the terms of the
domain. Or alternatively, a comparative point of view should be introduced
to show that the characterisations of terms are differentiated from
those of lexical items in general.
2. Studies which are concerned with characteristics of the terminology of
a domain as a whole. Though these studies can constitute theories of
terminology de jure, how meaningful they can be depends greatly on the
granularity of the descriptions.
The present study takes the second approach, i.e. to characterise the terminology
of a domain.
The objective of the present work is the description of the dynamics of terminology
or, more precisely, the dynamics of the terminology of a domain
in its totality, not of individual terms. Below, the basic perception of the
two key concepts, i.e. "terminology" and "dynamics", is clarified first.
The methodological framework will then be sketched.
ple used in the study; in the process of description, we would obscure the
observation of dynamics.
So the description of conceptual patterns must necessarily remain
somewhat general, at a level where the general regularities of term formation
patterns in a given domain can most properly be described. This, however,
immediately means that we lose the fine granularity of the description.
As a result, the range of the possible terms expected within the conceptual
patterns of term formation would necessarily remain rather broad. This is
not desirable — even if the form of conceptual description is so defined
that the resultant characterisations can be said to be of terms de jure — for
a proper theory should aim at gaining due granularity.
The natural question that arises at this point is: Beyond the level of
the granularity logically and theoretically required in the conceptual approach,
is there any way of describing more fine-grained regularities of the
dynamics of terminology? The answer is yes, though the type of regularity
that can be grasped becomes inevitably different from that observed in the
conceptual approach.
2.2.2.2.2 Quantitative patterns
In order to analyse the fine-grained regularities of the dynamics of terminology
beyond the level that can be addressed by the description of conceptual
patterns, we will explore the quantitative patterns of the occurrence of lexical
or conceptual items as a mass, within the conceptual patterns.
The current development of quantitative linguistics has shown that,
given a certain amount of data, it is theoretically possible to describe the
growth patterns of lexical items in their entirety within a certain category
(Good 1953; Good & Toulmin 1956; Chitashvili & Baayen 1993; Kageura
1998b), including the potential lexical items. Applying this framework,
and regarding individual lexical items as representing more specific
concepts within the general conceptual categories, it is possible to give a more
detailed description of the dynamics of the potential directions of growth
of the terminology of a domain.
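As an illustration of the kind of estimate this line of work builds on, the sketch below computes the frequency spectrum of a sample, the Good-Turing estimate of the probability mass of unseen types, and the Good-Toulmin extrapolation of the number of new types expected when the sample is enlarged. It is a minimal sketch under stated assumptions, not the procedure used in the present study: the function names and the sample morphemes are hypothetical, and the extrapolation is only reliable for an enlargement factor t of at most 1.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Return {m: V(m, N)}: the number of types that occur exactly m times."""
    type_frequencies = Counter(tokens)
    return Counter(type_frequencies.values())


def unseen_mass(tokens):
    """Good-Turing estimate of the probability that the next token is a new type."""
    spectrum = frequency_spectrum(tokens)
    return spectrum.get(1, 0) / len(tokens)


def expected_new_types(tokens, t):
    """Good-Toulmin estimate of the new types seen in t * N further tokens.

    Alternating series over the frequency spectrum; intended for 0 <= t <= 1.
    """
    spectrum = frequency_spectrum(tokens)
    return sum((-1) ** (m + 1) * t ** m * v for m, v in spectrum.items())


# Hypothetical morpheme tokens for one conceptual category (not data from the study).
sample = ["information", "retrieval", "system", "information", "index",
          "system", "information", "catalogue"]

print(frequency_spectrum(sample))       # how many types occur once, twice, ...
print(unseen_mass(sample))              # share of hapax legomena in the sample
print(expected_new_types(sample, 0.5))  # growth expected from 50% more data
```

Applied separately to each conceptual category, such estimates indicate which categories are still likely to grow, which is the sense in which potential lexical items enter the description.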
The quantitative analysis must be based on the basic features or
viewpoints that reflect the conceptual and linguistic categorisations relevant to
the phenomena of term formation of the domain, for it also needs to be
linked with the totality of the terminology of the domain. In the present
study, priority will be given to the conceptual categorisation. In that sense,
38 Dynamics of Terminology
we should keep in mind the claim that "the quantitative and qualitative
sides of human speech are correlated and interdependent" (Piotrowski
1968; cited in Tuldava 1995: 133). This is not only relevant for the
observation of language phenomena but also for the methodological restrictions
of the quantitative analysis.
2.2.2.2.3 A representative sample
Analyses cannot be carried out properly without a representative sample of
the terminology of the target domain. As described in the Introduction,
representative terminological data of documentation will be used in the present
study. So the present work would be a theory of term formation and
terminological growth in the field of documentation.
2.2.2.3 Limitations
The current approach has two major limitations. The first is of the kind
discussed in the previous chapter with respect to the traditional approach.
The internal characterisation of terminology — even if it properly describes
the essential nature of the terminology of the domain — does not in itself
provide sufficient conditions to distinguish the terminology of the target
domain from the general vocabulary or the terminologies of some other
domains. This is an essential limitation which is caused by the facts that
terms are a functional variety of lexical items recognised at the level of
parole only and that they constantly interact with general words. What
is necessary in order to differentiate the terminology of one domain from
others, then, is to describe the terminologies of a number of domains and
then compare them. There is no alternative to this. What is important at
the first stage is to carry out the descriptive study of the terminology of a
domain in such a way that the resultant description can at least be said to
be attributed to the terminology of the domain de jure. At a later stage, it
may become necessary to compare the formation patterns of terminologies
of different domains or of general words, but this can only be done after
the formation patterns of the terminology of individual domains have been
clarified.
The second limitation is directly related to the choice of theoretical
position and methodology. We only focus on the systematicity of
terminology, and thus various aspects relevant to the observation of the totality
Theoretical Framework 39
study would deal with the story of the historical origin of the terminology
of a domain and the actual historical evolution of the terminology to the
present state. It might also talk about its possible evolution for the future.
This type of study contrasts with the present study in that the former
addresses the diachronic concept of dynamics.
We can also trace the historical profile of individual terms (cf. Sakakura
1966). In this case, the concept of dynamics would be more strongly related
to the actual historical discourse in which a term has been used. When in
history a particular term is created, in what discourse and how the concept
which the term represents has changed, etc., would be some typical
questions to be answered in this type of research.
The study of the synchronic dynamics of individual terms in discourse
seems a slightly twisted notion, because the introduction of discourse tends
to imply the introduction of the real-world time scale. We can conceive,
however, the study of a common metaphoric transformation or a meaning
shift observed in individual terms in the present state of discourse as an
example of the study of the synchronic dynamics of terms in discourse, on
condition that these dynamics are attributed to the historically abstracted
structure of discourse.
Figure 2.2 illustrates the distinction between the structural study of the
dynamic potentiality of terminology, the synchronic study of the dynamics
of individual terms, the historical study of the evolution of individual terms
and the historical study of the evolution of terminology. In the reality of
terminology, these aspects interact with one another. For instance, the actual
use of individual terms in discourse may change the structural dynamics of
terminology, while the structural dynamics in its turn restricts the direction
of the creation of new terms and the use of existing terms.
So the overall dynamics of terminology would only be theorised as the
interaction of these different perspectives of dynamics. Within the
overall configuration of the study of the dynamics and growth of terminology,
the present study, i.e. the study of the structural dynamics of terminology,
constitutes a small but essential part10.
10
Related discussion on the relation between synchronic and diachronic aspects in the
structural approach can be found in Vachek (1983) and Jakobson (1980).
Although simple and complex terms are historically created not only
by compounding but also by abbreviation, metaphorical transformation,
etc., these methods of individual term formation will not be taken into
consideration. Instead, it is assumed that they are created according to
the simple building blocks of compounding on the basis of the constituent
elements1, while the creation of individual terms is carried out within, and
controlled by, the overall structure of terminology. This is an obvious
simplification, but we can nevertheless adopt this assumption on the basis of the
rationale that, whatever profile individual terms have, the collective
dynamics of terminology functions in such a way that the resultant terminological
structures have the overall systematicity as manifested in the building
blocks of their constituent lexical items.
In 1.4, it was argued that in many cases conceptual studies of terms cannot
be distinguished from semantic studies of words. To avoid this problem and
to make a study's descriptions be of terms, a proper descriptive framework
for term formation is required.
Take the worst exemplar case within this framework, in which only the
conceptual restrictions of formation patterns are described. The resultant
description will then consist of a list of statements such as: Constituent
elements that represent such and such a type of concept may be combined
as a nucleus with elements representing such and such a type of concept
as a determinant. Even if the description is obtained through the analysis
of a set of terms of a domain, it cannot be claimed that such a study is
exclusively about the formation patterns of terms of that domain and not
about general word formation that happens to be based on terms.
If some clearly domain-dependent phenomena are observed in the
description, it can be seen that the description at least reflects the nature of
term formation as distinct from word formation in general. Even then, the
overall form of description does not guarantee that the resultant description
is of terms.
according to our claim that term formation should be described within the
overall terminological structure, the classificatory aspect of term formation
is essential. The conceptual system, as opposed to fragmentary conceptual
elements, in the description of term formation patterns formally
allows us to take into account the classificatory aspects of the dynamics of
term formation.
The difference between this descriptive framework and the semantic
description of word formation can be illustrated as in Figure 3.1. Note that
this descriptive framework still presents only the minimum necessary con-
Conceptual Patterns of Term Formation 51
ditions for the study of term formation with respect to the theoretical status
of a descriptive framework. It does not guarantee a sufficient
characterisation of terminological phenomena, as distinct from general
phenomena of word formation, at the level of actual description. This is one of
the reasons why the description of conceptual patterns of term formation
should be complemented by a quantitative description of terminological
growth, which sheds more detailed light on different aspects of the
dynamics of term formation.
All in all, the formal and linguistic elements to be used in the
description of term formation patterns are as follows:
1. Distinction between determinant and nucleus.
2. Representation of dependency structure of complex terms. This includes
the distinction between intra-term relations (or conceptual specification
patterns) that form a term and those that do not, the latter of which will
be called local or secondary combinations.
3. Distinction between constituent elements that can be terms and others.
These are subsidiary to the conceptual elements used in the description.
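The three elements listed above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions, not the representation used in the study: the class names, field names and category labels are hypothetical, and the bracketing of "information transmission system" as an embedded AFO combination inside a FUN combination is one plausible reading of the notation [AFO]FUN used later in the book.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Element:
    """A constituent element; can_be_term marks elements that are terms on their own."""
    form: str
    category: str            # conceptual category label (hypothetical values below)
    can_be_term: bool = False


@dataclass(frozen=True)
class Combination:
    """A binary determinant-nucleus combination within a complex term."""
    determinant: Union["Combination", Element]
    nucleus: Union["Combination", Element]
    relation: Optional[str]  # intra-term relation; None marks a local combination

    @property
    def forms_term(self) -> bool:
        """True when the combination is linked by a term-forming relation."""
        return self.relation is not None


def head(unit):
    """Follow the nuclei down to the element that names the conceptual class."""
    while isinstance(unit, Combination):
        unit = unit.nucleus
    return unit


information = Element("information", "entity", can_be_term=True)
transmission = Element("transmission", "activity", can_be_term=True)
system = Element("system", "entity", can_be_term=True)

inner = Combination(determinant=information, nucleus=transmission, relation="AFO")
term = Combination(determinant=inner, nucleus=system, relation="FUN")

print(head(term).form)    # the nucleus that anchors the whole term
print(term.forms_term)
```

Keeping the nucleus as the anchor of every combination mirrors the view that a complex term specifies a concept within the class represented by its nucleus.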
this study should reflect human cognitive activity as limited by the target
domain. Referring to cognitive or psychological studies of concepts is also
justified on this basis.
What is adopted here as a descriptive device is, be it a mental or social
construct, something that "exists", in order that the concept should have a
minimal explanatory substance. Leaving the problem of what "concept"
is to the safe hands of philosophers and psychologists while keeping the
problem of what "concept as used in the study of terminology" is to the
present study, concrete studies concerning concepts will be freely referred
to in the discussion of concept structure and type.
It is relevant here to mention the distinction between the terms
"conceptual" and "semantic". In many cases, these two terms are not
distinguished; thus Collins & Quillian's (1969) "semantic" memory is frequently
discussed in the study of concepts (e.g. Roth & Frisby 1986: 39-43). On
the other hand, Jackendoff (1983: 95), for instance, describes "semantic"
as a term which is used to refer to linguistic properties which explain such
phenomena as synonymy, anomaly, etc. of sentences and "conceptual" as
a term which refers to the level at which linguistic and non-linguistic
information are mutually compatible (although he himself does not rigidly
distinguish these two terms in his own writing). Trying to distinguish these
two is neither useful nor necessary from the perspective of the present study,
as should be clear from the discussions in Chapter 1. In this book, the
term "conceptual" is preferred, as the term "semantic" gives the impression
that anything referred to by this term is closely related to or dependent on
language structure as perceived from a "theoretical synchronic
microlinguistic" point of view (Lyons 1981) and excludes other aspects related to
"concept".
viewpoint and describes the relations as the role of the nucleus with
respect to the determinant, if only for the purpose of description. Ishii (1986)
takes a verbal morpheme-centred approach, i.e. the relations are seen as
the position or role of the non-verbal morphemes with respect to the verbal
morphemes5.
The intra-term relations are defined in this study as the status or role of
the determinant with respect to the nucleus. This is a natural consequence
of our standpoint that we observe term formation not just from its
syntagmatic combination patterns but from the point of view of the overall system
of terminology formation. To capture the patterns of term formation within
the overall classificatory system of terminology formation, it is a basic tenet
to see term formation as the specification of concepts within a conceptual
class, as represented by the nucleus, by means of modifications represented
by the determinants. Figure 3.3 illustrates this.
5
This is possible because Ishii (1986) dealt only with complex terms that have verbal
morphemes as a constituent element. Note that this verb-centric view is common in
automatic processing of complex words (e.g. Finin 1980; Fujita 1984; McDonald 1982).
There are many studies in various fields concerned with conceptual
categories. One of the earliest efforts to establish a set of conceptual
categories was made by Aristotle, whose categories include Substance,
Quantity, Quality, Relation, Time, Position, State, Activity and Passivity
(Aristotle 1963). Since the time of Aristotle, every science has every so often felt
the need to identify the conceptual categories with which it operates, and
philosophers of science, notably Hempel (1952) and Achinstein (1968),
have tried to define the concepts with which natural science operates.
Conceptual structures, however simplified, also underlie documentation
thesauri and classification systems built by library and information
scientists (e.g. Aitchison & Gilchrist 1997; Ranganathan 1962; Chan 1994).
Ranganathan's famous PMEST categories in his Colon Classification system,
i.e. Personality, Material, Energy, Space and Time, can be regarded
as the broadest conceptual categories. Datta & Farradane (1978)
propose a simple scheme for general classification consisting of three basic
types, i.e. Entities, Activities and Abstracts, with a fourth type,
Properties. Dahlberg (1978), emphasising the verbally observable units of
knowledge, introduces the following four categories: Entities (principles,
immaterial objects, material objects), Properties (quantities, qualities, relations),
Activities (operations, states, processes) and Dimensions (time, positions,
space).
In linguistics, the conceptual approach has reappeared with the
recognition of the limitation of formal approaches. For instance, Jackendoff
(1983; 1987; 1990), who in his conceptual semantics assumes the
existence of a unified abstract level (conceptual level) of linguistic meaning,
perception, etc., introduces the following categories (which he calls
ontological category features): Object, Place, Path, Action, Event, Sound,
Manner, Amount, Number, Property, Smell and Time (Jackendoff 1987: 148-152).
Some linguistic thesauri also introduce conceptual categories. Kokuritsu
Kokugo Kenkyusho (1961), which is a linguistic thesaurus based on
semantic principles, first divides the categories according to syntactic word
class and then introduces the conceptual categories: Abstract Relations,
Agents of Human Activity, Human Activity - Mind and Action, Products
and Tools, Natural Objects and Natural Phenomena.
Terminology is one of the sub-fields of linguistics which emphasises
the importance of concepts and conceptual structures. Sager (1990), in his
introductory book to terminology, recognises four basic categories, i.e.
Entities, Activities, Qualities and Relations. Pugh (1984) subdivides the entity
concepts into Material, Abstract, and Neutral. In addition, she introduces
two entity categories, i.e. Representational and Software, which are
dependent on the field of information processing that she investigated. Sager &
Kageura (1995) further elaborate terminological concept systems along the
same line.
In the field of artificial intelligence, computational linguistics and
cognitive science, many conceptual categorisation schemes are established
under various names. Most of them are intended for specific systems, but
some are intended for general use. Initial efforts in artificial intelligence
Conceptual Categories for Documentation Terms 63
current study, what is aimed at is to obtain categories that map the
conceptual regularity of formation patterns of documentation terms as closely as
possible. Below, we first introduce broader concept categories in a rather
top-down manner, referring to the schemes briefly reviewed in the previous
section, and then establish finer categories through examining
documentation terms and their constituent elements.
1
This can be regarded as a mapping of linguistic function to the conceptual system, i.e.
quasi-concept.
2
At the same time, however, note also that we are trying to keep our criteria of concept
subcategorisation as general as possible. This is to avoid falling into the trap of
tautology, i.e. establishing convenient categories purely to demonstrate the formation of terms in
the corpus to be consistent and coherent, which would exclude the chance of grasping the
dynamics of term formation.
3
In the process, in fact, some ambiguities were collapsed. For
instance, the place/organisation ambiguity of such morphemes as (library) was
recognised as irrelevant as it does not affect the description of regularities of term formation
patterns. For a related discussion, see 4.3.1.
4
Though this might be better dealt with as a problem of polysemy, in the end,
(index) was classified under representational entity.
these criteria were decided at each level of categorisation vis-à-vis the data
and the general rationale of the conceptual categories.
way that the allocation is not totally dependent on their specific use in and
only in the data. Most morphemes clearly fall under one of these
categories, but some ambiguities and doubtful cases were recognised7. These
are as follows:
1. For some morphemes, the distinction between material and
representational documentation entities was difficult; e.g. (index) and
(bibliography) can be material or representational depending on use.
Instead of assigning two categories, a single category has been allocated as
far as possible, in order not to lose the overall picture of formation
patterns by being too fine-grained for individual cases. The validity of the
7
The ambiguities and doubtful cases discussed here are only with respect to the
conceptual system established here, i.e. only when they are expected to affect the overall
description of conceptual regularities of term formation patterns.
The terms "type" and "token" here are used in the technical sense of quantitative
linguistics (Herdan 1960).
disciplines. They do not constitute a large source for the terminology but
tend to be used repeatedly.
Table 4.12 shows the quantities of morphemes of QL and RL at the first
level of subcategorisation. The similarity of these two categories is obvious.
In both cases, labels of qualities and relations are small in number while
frequent in their use, i.e. in a way, they are closer to entity concepts (see
Table 4.10), while QL2 and RL2 accommodate large numbers of morpheme
types with much lower levels of use.
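The contrast drawn here, between few types used often and many types used rarely, rests on counting types and tokens per category. The sketch below shows one way such counts could be obtained. It is illustrative only: the function name, the morphemes and their counts are hypothetical, and the label "QL1" is assumed here as the counterpart of QL2; none of the figures come from Table 4.12.

```python
from collections import Counter, defaultdict


def type_token_counts(occurrences):
    """Map each category to (number of types, number of tokens).

    occurrences is an iterable of (morpheme, category) pairs, one pair per
    occurrence of a morpheme in a term.
    """
    per_category = defaultdict(Counter)
    for morpheme, category in occurrences:
        per_category[category][morpheme] += 1
    return {category: (len(counter), sum(counter.values()))
            for category, counter in per_category.items()}


# Hypothetical occurrences, not data from the study.
occurrences = [
    ("form", "QL1"), ("form", "QL1"), ("form", "QL1"),
    ("size", "QL1"), ("size", "QL1"),
    ("alphabetical", "QL2"), ("chronological", "QL2"), ("bilingual", "QL2"),
]

counts = type_token_counts(occurrences)
for category, (types, tokens) in sorted(counts.items()):
    # Tokens per type is a simple indicator of how repeatedly types are used.
    print(category, types, tokens, tokens / types)
```

A high tokens-per-type ratio corresponds to the "small in number while frequent in their use" profile, and a ratio close to one to the opposite profile.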
mation vis-à-vis conceptual categories. Table 4.16 shows the number and
ratio of simple terms, terms with two morphemes, three morphemes and
four or more morphemes for each broad conceptual category. Some
categories, i.e. abstract entity (AE) and, to a lesser extent, material entity (ME),
tend to accommodate longer terms, while some, such as classificatory entity
(CE) or relation (RL), accommodate a higher ratio of shorter terms.
complex terms with more than three constituent elements. Formally, there
are 458 embedded combinations in the corpus, i.e. 283 in three-morpheme
terms, in four-morpheme terms, in five-
morpheme terms, and in six-morpheme terms.
As the present study concerns the term formation patterns and not the
combination patterns of items, embedded combinations are, in principle,
regarded as single units, just like simple morphemes. However, not all
the embedded combinations can be regarded as such. Firstly, as will be
shown in Chapter 5, there are non-conceptual combinations, which will be
ignored. Secondly, some embedded combinations constitute a part of term-
forming conceptual specification patterns (see 3.3.2.2 and the next chapter).
Excluding these, there are in total 281 tokens of embedded combinations
that should be treated as complex morphemes. They comprise 201 types.
Table 4.17 lists the number of these embedded combinations by
conceptual category. The categories with an asterisk show that the combinations
consist of units which represent different subcategories of equal status. The
table also shows the number of types that occur as terms. In the descriptions
of term formation patterns in Chapter 6, these embedded combinations are
regarded the same as simple morphemes.
As the second and last step of preparation for the observation of the
conceptual patterns of term formation, this chapter introduces intra-term relations
and conceptual specification patterns. As in the previous chapter, some
related studies are briefly reviewed first, and then the binary intra-term
relations and conceptual specification patterns arrived at for the corpus of
documentation terms are explained.
There have been many studies of Japanese complex nouns. While many
analyse and describe complex words at the grammatical level (e.g. Nomura
1973; Saito 1981; Nagashima 1980; Okutsu 1975; Kageyama 1982; 1989;
Tamamura 1985), only semantically- or conceptually-oriented analyses and
descriptions are reviewed. Note that the studies reviewed are not limited to
the study of terms.
Saiga (1957), in his examination of the grammatico-semantic patterns
of complex words based on data collected from Japanese magazines,
introduced the following relations:
- Coordinate relation, in which two constituent elements are coordinated
with the same status. This is further divided into synonymous
coordination and antonymous coordination.
- Subject-predicate relations, in which the first constituent functions as the
subject of the second element.
For the compounds which designate objects, the following nine
relations are defined: (a) comparison, i.e. the determinant compares the nucleus
to another object; (b) material, i.e. the determinant specifies the material of
which the nucleus is made; (c) inherent property, i.e. the determinant
expresses an inherent property of the new concept which is not inherent in
the nucleus; (d) use, i.e. the determinant specifies the use to which the
nucleus is regularly put; (e) origin/product, i.e. the determinant expresses the
product which is regularly associated with the nucleus, and the determinant
indicates the origin of the nucleus by specifying the primary product; (f)
instrument, i.e. the nucleus expresses the instrument which operates on the
determinant; (g) mode of operation, i.e. the determinant specifies the mode
of operation of the nucleus; (h) whole/part, i.e. the determinant is the whole
of which the nucleus is a part; and (i) identity, i.e. the determinant asserts
identity with the nucleus.
They also describe briefly the compounds designating properties and
those designating processes or operations. Compounds designating
properties are formed by the determinant specifying the concept to which the
property term is related, and compounds designating processes or operations are
usually created by the determinant specifying the subject, the object or the
instrument of the corresponding action.
Pugh (1984), analysing formation patterns of terms of information
processing, introduced 18 intra-term relations:
- Destination: The new concept is a class of (the concept represented by)
the nucleus which is used for or intended for (the concept represented
by) the determinant1. This relation can be classified into direct and
indirect function. An example of direct destination is "checking program",
and of indirect destination is "computer instruction".
- Means/Mode of Operation: The new concept is a class of the nucleus
which functions or occurs by means of or in the mode of the determinant.
This also can be divided into direct, e.g. "dichotomising search", and
indirect, e.g. "asynchronous computer".
- Affected Object: The new concept is a class of the nucleus which acts
on the determinant. This can also be divided into direct, e.g. "data
processing", and indirect, e.g. "data processor", relations.
- Partitive: The new concept is a part represented by the nucleus of a
1
Henceforth the phrase "the concept represented by" is omitted for succinctness.
Conceptual Specification Patterns 95
2
Ishii (1986) is excluded here as his verb-centric viewpoints are limited in range and are
very different from the other five.
The type of diversity in USE, AFO, PRO, MEA, MAN, ICR and ORI,
which are all related to case-like roles centred on activity-type concepts,
is related to the direct and indirect relations explained in Pugh (1984). In
all these cases, they can be clarified from the point of view of the roles
of the nuclei. For instance, both (reference book) and
(call number) are recognised to have intra-term relations of USE, though
the role of the nucleus vis-à-vis the determinant in the former case is that
of direct object, while the role of the nuclei in the latter is that of tools or
instruments. Other relations have the same sort of direct and indirect
variations. Collapsing the distinction between the direct and indirect relation is
justified because the intra-term relations are defined in terms of the role of
the determinant vis-à-vis the nucleus, which in turn is anchored to the
overall conceptual system; the term formation patterns are observed within the
conceptual system to which the different categories of nuclei are anchored.
The internal diversities of FOA and NAT come from the generic nature
assigned to them. FOA can, for instance, be further divided into size, shape,
composition, etc., but we did not adopt it here for the practical reason that
it was not necessary to go down to this level for clarifying the characteristic
patterns of term formation. NAT can include internal diversities because of
its dustbin nature.
5.2.2.1.2 Affinity of relations
Though at the level of definition the ranges of the relations are reasonably
clear, in the process of assignment it was recognised that some relations are
rather close to each other. The three pairs, i.e. FUN and USE, PRO and
PAR, and PAR and LOC, deserve some comments.
Firstly, the following examples show the affinity and the difference
between FUN and USE:
FUN : (retrieval system)
USE : (copying technique; reprography)
In the second case, the nucleus (technique) is used for (copying).
On the other hand, (retrieval system) can be interpreted in
two ways: (a) a system which retrieves something and (b) a system which
is used by somebody for retrieving something. In the first interpretation, the
nucleus is the central focus from where the role of the determinant is
interpreted, while in the second, a third party is implied. Because the naming
process in principle starts from the core concept (nucleus), the first
interpretation is theoretically better than the second one, a criterion which was
consistently maintained in the assignment.
The following example shows the affinity of PRO and PAR:
PAR : (entry word)
Taken independently, the determinant, (entry) is a product made up
of the nucleus (word), thus it is a product. On the other hand, the
product and its constituent elements naturally make up a part/whole relation.
In such cases, the relation PAR is preferred because the dynamic aspect
implied by PRO is weak.
In a few terms in the corpus, PAR and LOC became potential
candidates, as in (library person; librarian). The relation PAR was
preferred in this and similar terms, as in the field of documentation "library"
is understood to be an organisation rather than a simple place.
MAN (manner) in the terms that represent activity concepts were
together recognised as constituting a single specification pattern, in which
a function concept is determined by such complementary elements as
object, product, means and manner. Other than the intuition that they
constitute a natural conceptual link of functions - objects/products -
means/manner, this is supported by the observation that (i) the
combination of MEA or MAN always precedes AFO and PRO when they
co-occur in a complex term; (ii) they co-occur frequently; and (iii) even
when MEA or MAN is used without AFO or PRO in a term, the
existence of the objects or products is implied. For instance, we regard
(data processing: AFO), (index making: PRO),
(questionnaire survey: MEA), (multiple access: MAN) and
(automatic data processing: [MAN]AFO) as belonging to this same
specification pattern. This relation will be referred to as FCOM.
Specification of functional link: The intra-term relation FUN (function),
combined with the specification FCOM constitutes an integrated
conceptual specification pattern, establishing a link of agents - functions -
objects/products - means/manner. This pattern is called here functional
link (FFUN). As the difference between FCOM and FFUN is the role of
the nucleus, FCOM can be regarded as the sub-pattern of FFUN, if we
follow the same argument put forward in the integration of direct and
indirect relations. These two were nevertheless distinguished, mainly
for convenience. Some examples of this pattern are:
(output unit: FUN), (information system: AFO),
(analogue computer: MAN), (information
transmission system: [AFO]FUN), (magnetic
storage device: [MEA]FUN) and (electric
data processing system: [MEA[AFO]]FUN). Many of the frequent
combinations of intra-term relations in Table 5.3 fall under this specification
pattern.
Specification of use: Another specification pattern identified as a
functional or related type is the specification of use (FUSE), which is based
on USE. In accordance with the distinction of direct and indirect USE
relation, FUSE can also be subdivided into two: (a) the terms whose
nuclei have the role of the objects of the determinants; and (b) the terms
whose nuclei have the role of the tools or instruments of the
determinants. Though these two are clearly different from the viewpoint of
the role of the nuclei with respect to the determinants, the basic aspect
of the nuclei specified by these two are the same, i.e. both specify the
use of the core concepts represented by the nuclei. On the other hand,
the actual specifications differ between these two variations. When
the nuclei represent the tools or instruments of the determinants, the
intra-term relation USE can be extended with the specification pattern
FCOM, i.e. tools/instruments/methods - activities - objects/products -
means/manner, while such a link is not established when the nuclei have
the role of the objects. Some examples are: (deposit
collection: USE), (classifying method: USE) and 8
(octave expression method: [MAN]USE).
Specification of destination: The intra-term relation DES was recognised
as constituting a specification pattern by itself.
Among these functional specifications, FUSE and FDES are
considered to represent passive functional relations with respect to the status of
the nucleus.
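The bracketed annotations attached to the examples above, such as [MAN]AFO or [MEA[AFO]]FUN, can be read mechanically. The sketch below tokenises such an annotation, records the bracket depth of each relation label, and checks observation (i) that MEA or MAN precedes AFO and PRO when they co-occur. It is a minimal sketch under stated assumptions: the function names are hypothetical, and treating the label outside all brackets as the outermost relation is an interpretation of the notation rather than a definition given in the text.

```python
import re


def relation_depths(notation):
    """Return [(label, depth), ...] for an annotation such as '[MEA[AFO]]FUN'."""
    depth = 0
    labels = []
    for token in re.findall(r"\[|\]|[A-Z]+", notation):
        if token == "[":
            depth += 1
        elif token == "]":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced brackets in {notation!r}")
        else:
            labels.append((token, depth))
    if depth != 0:
        raise ValueError(f"unbalanced brackets in {notation!r}")
    return labels


def outer_relation(notation):
    """Return the label that stands outside all brackets (assumed outermost)."""
    outer = [label for label, depth in relation_depths(notation) if depth == 0]
    return outer[-1] if outer else None


def means_precede_objects(notation):
    """Check that every MEA/MAN label comes before every AFO/PRO label."""
    labels = [label for label, _ in relation_depths(notation)]
    means = [i for i, label in enumerate(labels) if label in {"MEA", "MAN"}]
    objects = [i for i, label in enumerate(labels) if label in {"AFO", "PRO"}]
    return not means or not objects or max(means) < min(objects)


# Annotations quoted in the examples above.
examples = ["AFO", "[MAN]AFO", "[AFO]FUN", "[MEA]FUN", "[MEA[AFO]]FUN", "[MAN]USE"]
for notation in examples:
    print(notation, relation_depths(notation), outer_relation(notation),
          means_precede_objects(notation))
```

Note that the annotation alone does not decide between FCOM and FFUN, since that distinction depends on the role of the nucleus; the sketch therefore parses and checks ordering without assigning a specification pattern.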
3) Specification from part/whole viewpoints
The following three, each having a one-to-one correspondence to an
intra-term relation, were recognised. These patterns include the prefix
"P" for part.
Specification of whole or affiliations: This specification pattern, referred
to as PPAR, corresponds to PAR (partitive).
Specification of constituent elements: This corresponds to CON
(constituents). This pattern is called PCON.
Specification of information content and representation: This pattern
(PICR) corresponds to ICR (information content & representation).
4) Specification from the viewpoint of internal attributes
The following specification patterns were recognised. The patterns are
referred to by adding "I", standing for internal attributes, at the head of
the corresponding intra-term relations.
Specification of formal attributes: The intra-term relation FOA (formal
attributes) constitutes this pattern (IFOA).
Conceptual Specification Patterns 109
Table 5.4. Relations between intra-term relations and conceptual specification patterns.

Specification  Intra-term relation      Specification  Intra-term relation
EJXT           JXT                      INAT           NAT
FROL           ROL                      IACO
FCOM           AFO/PRO, MEA/MAN         IQUA           QUA
FFUN           FUN, AFO/PRO, MEA/MAN    RSTA           STA
FUSE           USE, AFO/PRO, MEA/MAN    RLOC           LOC
FDES           DES                      RTIM           TIM
PPAR           PAR                      RORI           ORI, MAN
PCON           CON                      RSCO           SCO
PICR           ICR                      RSUB           SUB
IFOA           FOA                      ODTA           DTA
IMAN           MAN
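As an illustration only, the correspondences of Table 5.4 can be held in a small lookup structure. The sketch below assumes that a slash pair such as "AFO/PRO" stands for two alternative relations; the names SPEC_TO_RELATIONS and patterns_for are hypothetical and are not part of the study. The IACO entry is left empty because its relation is not legible in the table.

```python
# Table 5.4 as a mapping: specification pattern -> intra-term relations.
# Assumption: "AFO/PRO" and "MEA/MAN" are read as alternatives and split.
SPEC_TO_RELATIONS = {
    "EJXT": ["JXT"],
    "FROL": ["ROL"],
    "FCOM": ["AFO", "PRO", "MEA", "MAN"],
    "FFUN": ["FUN", "AFO", "PRO", "MEA", "MAN"],
    "FUSE": ["USE", "AFO", "PRO", "MEA", "MAN"],
    "FDES": ["DES"],
    "PPAR": ["PAR"],
    "PCON": ["CON"],
    "PICR": ["ICR"],
    "IFOA": ["FOA"],
    "IMAN": ["MAN"],
    "INAT": ["NAT"],
    "IACO": [],  # relation not legible in the table; left empty on purpose
    "IQUA": ["QUA"],
    "RSTA": ["STA"],
    "RLOC": ["LOC"],
    "RTIM": ["TIM"],
    "RORI": ["ORI", "MAN"],
    "RSCO": ["SCO"],
    "RSUB": ["SUB"],
    "ODTA": ["DTA"],
}


def patterns_for(relation):
    """Return the specification patterns in which `relation` can take part."""
    return [spec for spec, rels in SPEC_TO_RELATIONS.items() if relation in rels]


# MAN is the relation shared by the largest number of patterns.
print(patterns_for("MAN"))
```

The reverse lookup makes visible that most relations map to a single pattern, while AFO, PRO, MEA and MAN are shared among the functional patterns.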
Now that all the necessary information has been gathered, it is possible
to describe the formation patterns of terms in the field of documentation.
In this process, the appropriateness of the descriptive devices, i.e.
conceptual categories and specification patterns, will also be evaluated. In the
following, the major tendencies of term formation in important conceptual
categories are observed first. This part introduces term formation patterns
specific to documentation terms with the full presentation of the data. The
description comprises the first major part of the pursuit of the dynamics of
the terminology of documentation. Then, the methodological aspects are
examined, both with respect to the description of documentation terms and
modelling term formation in general.
concepts. The determinants of AFO and PRO relations are mostly
subject-specific, such as AE212 (specific subject field) or RE311 (complex
document entities). As for PPAR, all the determinants represent organisations.
In fact, all of them represent subject-specific organisations, i.e.
(book=house; library) and its variations.
In short, the documentation terms representing people are formed either
by specifying their function, directly by subject-specific activities or
indirectly by subject-specific objects or products (this also applies to
the 3 simple terms, all of which incorporate functions, e.g. [programmer]),
or by specifying the organisations (more specifically libraries) to which
the people belong.
mobile library). On the other hand, all but two terms constructed by other
specification patterns take (book=house; library) as a nucleus.
We can clearly observe two different levels at which term formation
is active: (i) when forming subject-specific organisations by taking general
organisation concepts and specifying their function; and (ii) when forming
various subcategories of libraries by specifying scope, the whole or origin.
Figure 6.1 illustrates this. Taking into account the fact that the concept
"library" takes a notable role also in the formation of ME111 terms, this
category may better be treated as an independent subcategory of the
organisation concepts in the field of documentation so as to more explicitly
reflect the conceptual regularities of term formation.
become the nuclei, with the common characteristic being that both content
and structure are clearly specified. This pattern takes a wide range of
determinants. The formation of ME21112 terms by RSCO is active only for
specific central categories of documentation entities.
We can also observe interesting characteristics in IFOA and FFUN
terms. The nuclei of IFOA terms are limited to a subset of ME21112, in the
same manner as RSCO terms. This and the fact that IFOA is rarely used
in ME21111 terms indicate that formal attributes of ME2111 concepts
occur only after the content is specified. The FFUN terms follow a relatively
Formation Patterns of Documentation Terms 123
Though they are all formed by FFUN, the differences in their actual
representation, including that of linguistic items, reflect different levels of
term formation; terms with FUN or AFO are considered to be one step
broader than those with MEA or MAN.
Figure 6.3. Hierarchy of RE1 concepts as seen from the viewpoint of term formation.
in the field of documentation. All but 3 nuclei belong to RE22. The patterns
of 2 terms with CE (classificatory entity) nuclei are similar to those of RE1
(broad representational entity) terms with CE nuclei.
filiations (PPAR). Thus, one of the main patterns of forming RE3121
concepts is to indicate functional aspects (or, in a few cases, "whole"), on the
basis of broad representational or linguistic entities. This contrasts with the
fact that PICR, IFOA, RSCO (scope) and RSTA (status) are only applied
to RE3121 nuclei. This and the fact that no independent terms represent
RE3122 indicate that simple units of documentation entities (RE312) are
first identified from a functional point of view. Thus, RE3121 terms are
formed: (i) at the level of broad representational or linguistic entities to
RE3121 by specifying functional aspects; and (ii) at the level of general
RE3121 to specific RE3121 by various specifications.
6.1.2.3.3 Documentation entities - parts (RE32)
There are 18 terms (11 simple and 7 two-item) in this category. Table 6.11
shows the patterns of complex terms. Among the complex terms, 3 take
PPAR (whole or affiliations), which specifies the whole of the part
represented by the nuclei. The low ratio of complex terms shows that RE32
terms are not very productive.
is FFUN (functional link: 12). The nuclei are either CE1221 (independent
compound classificatory entities) or RE41 (programme units). RE42 is too
small for any meaningful generalisation. The specification patterns of the
8 RE43 complex terms are FUSE (use), FDES (destination), IFOA (formal
attributes) and INAT (nature). Compared to RE41 terms, RE43 terms
lack the FFUN relation; the former are defined as having a positive role or
function by themselves, while the latter have only a passive role.
and 7 terms (all simple) in AE13 (operational information entities). The
formation patterns of complex terms belonging to AE11 and AE12 are listed
in Table 6.13. AE13 terms are not treated here as they only accommodate
simple terms.
INAT (nature), which is a dustbin category, is dominant in AE11, and
FUSE (use) is dominant in AE12. Although we cannot see positive
tendencies in these terms with respect to conceptual specification patterns,
the absence of the specification of functional aspects is notable for term
formation. This is because, in documentation, AE11 concepts (information,
meaning, concepts, etc.) are regarded as abstract theoretical objects.
Another interesting phenomenon is that AE11 terms with PCON (constituent
elements) or PPAR (whole or affiliations) take CE (classificatory entity) and
RL1 (labels of relation) elements as nuclei and AE11 elements as
determinants. This reveals the role of some of the classificatory entity
elements in term formation, as well as the affinity between RL1 and abstract
or classificatory entities. The roles of CE in term formation will be
discussed later.
AE12 terms can be divided into three groups, i.e. those related to
natural language, to documentation languages (for indexing and retrieval),
and to programming languages. Though we cannot observe any consistent
characteristics of this category as a whole, it is noted that specification
of functional aspects (namely FUSE) is used to form documentation or
programming languages only.
are clearly more specific than the terms in (ii). Figure 6.5 illustrates this
situation.
We can also observe some tendencies of the determinants. Many
determinants of USE belong to the activity concepts which imply types of
products or objects, e.g. representational entity production (AE232) or
material or representational entity state change (AE242). Most determinants
of AFO or PRO, on the other hand, belong to material or representational
document-related entities (ME21 or RE31). This corresponds to the
explanation for the coherence of the groups (i) and (ii) above¹.
¹ FUSE is expected to be the main specification pattern in the formation of
methodology terms irrespective of the domain, and the formation pattern of
this category specific to the field of documentation may reside more in the
types of determinant categories. To confirm
sive functional aspects. This contrasts with such categories as AE11, where
the lack of specification of functional aspects is noteworthy.
pattern of IMAN; with only two exceptions, the terms of this category
represent various subconcepts of (relation), which is itself a term in the
field of documentation. Like CE terms, these terms might also be regarded
as being very close to subtypes of abstract entity concepts, together with
terms representing quality concepts.
6.1.7.2 Action (AC21)
The conceptual category of action (AC21) accommodates 97 (14 simple,
80 two-item and 3 three-item) terms. Table 6.22 lists the formation patterns
of complex AC21 terms. The majority (69) of action terms take FCOM
(complementary elements of functions) as the specification pattern, the
actual linguistic representations of which vary: affected object (AFO: 40),
manner (MAN: 23), means (MEA: 3), MAN + AFO (2) and MEA + AFO
(1). The lack of PRO is basically due to the definition of the subdivisions of
the category AC, where a concept category AC23 (production) is separately
defined.
The terms with other than FCOM specifications tend to take independent
terms as nuclei. Though the nuclei of many FCOM terms are themselves
terms, we can recognise two levels of term formation; the general
non-terminological action concepts are first restricted by the specification
of complementary elements of the action concepts, while other specification
patterns are mostly applied to already restricted concepts.
pattern. The intra-term relations are product (PRO: 4), manner (MAN: 5)
and means (MEA: 2). AC231 nuclei tend to require PRO, while AC232 or
AC233 nuclei take MAN or MEA. This can be explained by means of the
nature of the nuclei, i.e. the nuclei which are not themselves terms tend to
take PRO. It is interesting to compare AC232 and AC233 terms with AC222
and AC223 terms, where the types of objects are implied but are actually
represented linguistically. In AC22 terms, the specific subtypes of objects
seem to be important (perhaps partly because many AC22 morphemes imply
types of means or manner within the domain), while in AC23, not the
specific subtypes of the products but means or manner are regarded as
important for differentiation.
nipulated from the point of view of the function they take. It is
interesting to contrast these with the conceptual categories characterised by the
dominance of FUSE (where functional aspects are noted only passively),
i.e. ME212 (non-document information carriers), AE12 (linguistic entities)
and AE2 (subjective entities). These are regarded as taking a passive role
in documentation.
Table 6.26. Classificatory entity items used in the terms representing other categories.
of terms that belong to this class but because of the behaviour of CE
morphemes used as nuclei, which appear in other broad categories.
Table 6.26 lists the linguistic items representing CE concepts used as
the nuclei for constructing the terms of other categories. It shows that the
category shifts are triggered by only a limited number of conceptual
specification patterns, i.e. FFUN (functional link), FUSE (use), PPAR (whole
or affiliations), PCON (constituent elements), RORI (origin), IFOA (formal
attributes) and a single case of PICR (information content and
representation).
The category shifts can be divided into three types:
a) Combinations whose categories are systematically determined by the
categories of the determinants: PPAR, PCON or PICR terms belong to
this group. Examples are (code element) and (concept system). Though
the examples in the corpus are limited, this pattern of category shift
is transparent and seems to have wider applicability.
b) Combinations whose categories are not systematically determined from
the point of view of nuclei and categories of determinants: They
constitute the central patterns of the resultant categories, and terms are
formed basically in the same way as the terms whose nuclei belong to the
same categories as the resultant concepts. Examples are FFUN terms
representing ME22, e.g. (reception body; receptor), and RORI terms
representing ME21111, e.g. (publication thing; publication).
c) Combinations whose categories are not systematically determined and
which do not belong to the central patterns of term formation of the
resultant categories: From the point of view of the nuclei, this type is
similar to b), but from the point of view of the resultant categories, this
type is more idiosyncratic, because the formation patterns do not belong
to the central patterns of the categories. Examples are FUSE and IFOA
terms with the nucleus (element), e.g. 2 (binary element).
In the field of documentation, a) and b) seem to be productive, while c)
is not. In correspondence with the a) and b) productive patterns of category
shifts, the use of CE nuclei in term formation can be described as follows:
a) CE are used as secondary concepts with respect to the categories of
resultant concepts, specifying (from the reverse point of view)
configurations, structures, units, etc. The CE concepts, not being nuclei of
substance themselves, do not occupy the generic position in the set of
resultant concepts. Thus, these CE concepts are used as "helpers" for
the formation of terms in the relevant conceptual categories.
b) CE concepts are used in the same way as the generic concepts of the
target categories. In this case, the CE concepts share at least one
characteristic with the generic concepts of the target categories, e.g. "system"
(CE) and "device" (ME). In this case, as nuclei, these CE concepts can
be seen as representing generic categories of special hierarchies which
cut across different broad categories.
In addition, of course, CE concepts are used as the nuclei of independent
CE term formations, in which case the category of CE is very close to
AE.
Quantitative Patterns of Terminological Growth
Chapter 7
In Part II, the conceptual patterns of term formation were observed. The
level of description remained necessarily broad to leave room for the
description of general tendencies of term formation. A quantitative approach
can be used to complement the conceptual description and explore finer
tendencies of term formation or, more precisely, the growth patterns of
terminology. This area has so far remained virtually unexplored in
terminology research and even in lexicology in general. By exploring
it, we will be able to shed light on aspects of the dynamics of terminology
that have been unexposed so far. This chapter first explains what aspects of
terminological growth can be captured by the quantitative approach, then
presents the quantitative method. Lastly, the conditions and assumptions
necessary for applying the quantitative method are examined.
terminology when the size of the terminology changes. Figure 7.1 gives a
rough image of the position of the quantitative description of
terminological growth in relation to the descriptions of the conceptual patterns of term
formation.
In order to pursue the characterisation of patterns of terminological
growth as exemplified above, it is necessary to clarify the mathematical
methods and the relation between the mathematical methods and the
phenomenon of the dynamics of terminology. Let us turn to these tasks now.
For the sake of convenience, let us assume a situation where we are
interested in the increase in the number of constituent elements, or
morphemes, as the size of a terminology increases. In other words, we
are concerned with knowing, on the basis of given terminological data,
how many different morphemes would be used if the terminology became
1.5 times, twice, three times, etc. the size of the given data. By
assuming the existence of the same terminology in these various sizes, we
may regard the original terminological data as a sample, derived from the
population which has the "essential" probabilistic structure of the
terminology construction. In the following, starting from the basic
explanation of the binomial model of morpheme distribution in terminology,
we introduce binomial interpolation and extrapolation, which, if properly
applied, fit the concept of terminological growth very elegantly. In this
section, the basic terminology of probability and statistics will be used
without explanation; see DeGroot (1984) for the basic background of
probability and statistics, and Baayen (2001) for a detailed explanation of
the quantitative model adopted here. The notation used in the present study
is taken from Baayen (2001).
of words in texts (Yule 1944; Chitashvili & Baayen 1993; Ogino 1998) and
of morphemes in terminologies (Kageura 1998a).
In the present case, the mathematical model is constructed over the
distribution of morphemes in a given terminology. The model assumes that
there exist S element types¹ in the urn. This is equivalent to assuming
that the terminology population consists of S different morphemes in total.
It is also assumed that a number r of tokens of each morpheme type is
included in the urn, such that the relative token frequency of each morpheme
type in the urn equals the population probability of the morpheme type. This
constitutes a basic model of the population from which the actual
terminological data is assumed to be obtained.
The sampling of the terminological data corresponds to selecting
morphemes from the urn, one after another. We assume that we are only
concerned with the selection of morphemes in the terminology as a whole, and
not with the construction of individual terms. Sampling with replacement
corresponds to the binomial model, while sampling without replacement
corresponds to the hypergeometric model. These two are asymptotically
identical, and either can be used in the present situation. In any case, the
sample terminological data consisting of N morpheme tokens is regarded
as equivalent to the set of N morpheme tokens chosen randomly from the
urn. So the basic model is constructed on the basis of morphemes and their
distribution in the terminology, and the level of individual terms is
disregarded in the model.
Formally, the situation can be defined as follows. First, assume that
there are S different types or events, w_1, w_2, ..., w_S, in the population:
Definition 1 S: Population number of types or events.
Definition 2 w_1, w_2, ..., w_i, ..., w_S: Types or events in the population.
In the present case, an event corresponds to a morpheme type.
To each of these types, the population probability p_i is assigned:
Definition 3 p_1, p_2, ..., p_i, ..., p_S: Population probabilities of the
events w_1, w_2, ..., w_i, ..., w_S.
This is equivalent to what is represented by the urn.
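The urn scheme defined above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure used in the study: the four morphemes and their probabilities are invented, and the function names sample_from_urn and count_types are hypothetical. Sampling is done with replacement, which corresponds to the binomial model.

```python
import random


def sample_from_urn(probabilities, n_tokens, seed=0):
    """Draw n_tokens morpheme tokens with replacement (binomial model).

    `probabilities` maps each morpheme type w_i to its population
    probability p_i; the values are assumed to sum to 1.
    """
    rng = random.Random(seed)
    types = list(probabilities)
    weights = [probabilities[t] for t in types]
    return rng.choices(types, weights=weights, k=n_tokens)


def count_types(tokens):
    """V(N): the number of different morpheme types in a sample."""
    return len(set(tokens))


# Hypothetical population with S = 4 morpheme types.
urn = {"information": 0.4, "system": 0.3, "retrieval": 0.2, "thesaurus": 0.1}
sample = sample_from_urn(urn, n_tokens=20)
print(len(sample), count_types(sample))
```

Low-probability types may fail to appear in a small sample, which is the source of the incompleteness of data discussed in this chapter.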
¹ In this and the following chapters, we use the terms "type", "token",
"different" and "running" in the standard sense of quantitative linguistics.
For instance, the meanings of "morpheme type" and "different morpheme" are
identical; so are the meanings of "morpheme token" and "running morpheme".
We keep using both pairs of terms because one or the other fits some
contexts better.
Quantitative Framework 169
= (1 - p_i)^N,
which constitutes the unbiased estimator for the binomial model (Minoya
1994).
For language data, however, it is widely known that this estimation is
not optimal (Baayen 2001; Chitashvili & Baayen 1993; Kita 1999;
Manning & Schütze 1999). This problem arises because it is nearly always
expected that there are events (or morphemes in the present case) which do
not appear in a given sample. As long as we see the data as a sample of the
population, a view which we naturally have come to adopt in dealing with
the concept of dynamics, the data is incomplete in the sense that there are
events that exist or may come to exist but do not appear in the given sample.
In order to assess the degree of the incompleteness of data, Chitashvili
& Baayen (1993) introduce a measure called the coefficient of loss (CL).
This measure calculates the ratio of the number of events which are lost
when the number of events in a sample equal in size to the original is
estimated using the sample relative frequencies as estimates of the
population probabilities, based on the binomial model²:
Kageura (1998a) has shown that the coefficients of loss of the English and
Japanese terminological data of the four domains, i.e. computer science,
agriculture, psychology and physics, are all greater than 20 per cent. This
indicates that the terminological data is statistically incomplete and, thus,
the estimation of population probabilities by sample relative frequencies is
not valid. Note that, from the point of view of the dynamics of
terminology, this incompleteness is an opportunity, not a deficiency, because
the completeness of a given terminological sample means that the dynamics of
terminology is already exhausted and that the terminology is dead.
² The reason why the view adopted in CL, i.e. collapsing the distinction of
the events and focusing only on the number of events, can be justified is
examined in Kageura (2000).
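The coefficient of loss described above can be sketched as follows. The code assumes one reading of the verbal definition, namely CL = (V(N) - Ê[V(N)]) / V(N), where Ê[V(N)] is the expected number of types in a sample of the original size N when each population probability is replaced by the sample relative frequency f_i/N under the binomial model. The function names are hypothetical and the input frequencies are invented.

```python
def expected_types_binomial(freqs):
    """Expected number of types in a new sample of the same size N,
    using sample relative frequencies f_i / N as probabilities.

    A type with probability p is absent from N draws with probability
    (1 - p)^N, so it is present with probability 1 - (1 - p)^N.
    """
    n = sum(freqs)
    return sum(1.0 - (1.0 - f / n) ** n for f in freqs)


def coefficient_of_loss(freqs):
    """CL = (V(N) - E_hat[V(N)]) / V(N) for a list of type frequencies."""
    v = len(freqs)
    return (v - expected_types_binomial(freqs)) / v


# Hypothetical sample: four morpheme types, each occurring once.
print(coefficient_of_loss([1, 1, 1, 1]))
```

A sample dominated by types that occur only once yields a high coefficient of loss, whereas a sample consisting of a single, often repeated type yields zero.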
Ê[V(k, N)] = V(k, N)
and
Ê[V(N)] = V(N)
The function G(p) is a step function whose value jumps at each p such that
at least one event takes the population probability p. Let p_j be a
population probability that is assigned to at least one event in the
population, where the subscript is assigned in ascending order of
probability, from the smallest value to the largest. Then the value of the
jump at the probability p_j is given by:
ΔG(p_j) = G(p_j) - G(p_{j+1}).
Secondly, we rewrite the equation given in (7.2) using the Poisson
approximation to the binomial distribution:
Rewriting the equation (7.6) by using (7.5), we obtain the integral form
for the expression of E[V(N)]:
The penultimate step uses the fact that E[V(1, N)] = Npe^{-Np} according
to the Poisson distribution.
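As a sketch only, the step referred to here can be written out by differentiating the Poisson form of E[V(N)] with respect to N. The sum over the jumps ΔG(p_j) is used in place of the integral, and the Poisson expression for E[V(N)] in the first line is assumed rather than quoted:

```latex
\begin{align*}
E[V(N)] &\approx \sum_{j} \left(1 - e^{-N p_j}\right) \Delta G(p_j), \\
\frac{d}{dN} E[V(N)] &\approx \sum_{j} p_j \, e^{-N p_j} \, \Delta G(p_j) \\
  &= \frac{1}{N} \sum_{j} N p_j \, e^{-N p_j} \, \Delta G(p_j) \\
  &= \frac{E[V(1, N)]}{N}.
\end{align*}
```

The last equality uses the Poisson expression for the expected number of events that occur exactly once, summed over all jumps of G.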
In the actual application, further clarification of the concept of growth
rate is needed; this will be given in 8.2.3.1. It is interesting to observe that
the growth rate of the morphemes at sample size N is given by the estimated
number of morphemes that occur only once divided by the sample size
N. If we approximate E[V(1,N)] by V(1,N), the growth rate can be
calculated on the basis of the given data. Incidentally, the growth rate is
equal to the probability mass assigned to the unseen events by the
Good-Turing estimates (Good 1953).
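The approximation of the growth rate by V(1, N)/N can be computed directly from a list of morpheme tokens. The following is a minimal sketch; the function names and the six-token sample are hypothetical.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Map m -> V(m, N), the number of types occurring exactly m times."""
    type_freqs = Counter(tokens)
    return Counter(type_freqs.values())


def growth_rate(tokens):
    """P(N) approximated by V(1, N) / N.

    The same quantity is the probability mass that the Good-Turing
    estimate assigns to unseen events.
    """
    n = len(tokens)
    return frequency_spectrum(tokens)[1] / n


# Hypothetical morpheme tokens: two of the four types occur only once.
tokens = "information retrieval system information system library".split()
print(growth_rate(tokens))
```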
N/T = α,
the condition that α, the average length of a term as counted by the number
of morphemes, is constant, irrespective of the size of the terminology. The
actual value of α is obtained by N/T of the original data.
This effectively assumes that the ratio of terms of each length does not
change. Therefore, for instance, if our original terminological sample
consists of 20 terms, of which 10 are simple terms, 5 are terms with two
constituent morphemes, 3 are terms with three morphemes and 2 are terms with
four morphemes, then we assume that, when the sample becomes twice as
big and contains 40 terms, it will consist of 20 simple terms, 10 terms with
two morphemes, 6 terms with three morphemes and 4 terms with four
morphemes.
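The worked example above can be checked with a short sketch. The function names are hypothetical; the term counts are those of the example, and exact fractions are used so that the equality of α before and after scaling is not blurred by rounding.

```python
from fractions import Fraction


def average_length(length_counts):
    """alpha = N / T, where length_counts maps term length -> number of terms."""
    n_terms = sum(length_counts.values())                       # T
    n_tokens = sum(l * c for l, c in length_counts.items())     # N
    return Fraction(n_tokens, n_terms)


def scale_terminology(length_counts, factor):
    """Grow the terminology while keeping the ratio of term lengths fixed."""
    return {length: count * factor for length, count in length_counts.items()}


# The example in the text: 20 terms with 10, 5, 3 and 2 terms of length 1 to 4.
original = {1: 10, 2: 5, 3: 3, 4: 2}
doubled = scale_terminology(original, 2)
print(doubled, average_length(original), average_length(doubled))
```

In this example N = 37 and T = 20, so α = 37/20 = 1.85, and the value is unchanged when the sample is doubled.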
This is an obvious oversimplification of the actual terminological
phenomena. It is intuitively more natural to assume that the ratio of longer
terms increases when the terminology grows (cf. Baayen 2001). It is,
however, not a critical oversimplification with respect to the structural
assumption that was adopted in Part I (see especially 2.2.2)⁴.
Note that this simplification only becomes apparent because a rigid
mathematical model is adopted, while in Part II (and, in fact, in many
qualitative studies of terminology) this problem simply remains unnoticed,
though it has always existed. In that respect, the simplification introduced
by assuming α to be constant is no worse than many of the implicit and
unnoticed simplifications adopted in existing studies.
The second problem, the randomness assumption, is a well-known
problem in lexical statistics in general (Baayen 2001). In the present case,
assuming the randomness of the occurrence of morphemes means ignoring
two factors in terminology. The first is related to the distinction between a
core and a non-core set of terms or morphemes within the terminology of a
domain and within each conceptual category. The second is related to the
dependency of morphemes within individual terms.
⁴ As will be shown in Chapter 9, the actual analysis sometimes leads to
results that contradict the assumption that the distribution of the length
of terms does not change. Within the overall assumption adopted in the
present study, it is technically possible to introduce another model which
traces the pattern of changes of the average length of terms and the ratio
of simple and complex terms according to the changes in sample size. In that
sense, the assumption adopted here is still a simplification even under the
structural assumption. We nevertheless do not incorporate the more complex
model because the distribution of word-length is a matter that needs
thorough examination as a separate research topic.
The distinction between core and non-core terms or morphemes has
not been taken into account in the present study. We have assumed, not
unreasonably, that the data constitutes a representative sample of a
terminology. In that sense, as long as we do not try to draw any conclusions
about the centrality and/or non-centrality of terms or morphemes, omitting
the distinction of core and non-core terms and morphemes does not cause
any trouble concerning the consistency of the present study⁵.
The dependency of morphemes in individual terms is an aspect that has
so far been maintained and taken into account by means of the distinction
between nucleus and determinant, as well as the distinction between
term-forming and embedded combinations. The binomial model ignores this.
Can this be justified? Based on an experiment, Kageura (1998a) showed
that disregarding the intra-term dependency of morphemes does not affect
the accuracy of the quantitative model. The experiment, which is based
on the same idea as Baayen (1996), in which the lexical dependency in texts
is examined, compares the growth patterns of morphemes calculated by
the random permutation of terms in a terminology and by the random
permutation of morphemes disregarding the unit of terms. No statistically
significant difference was observed for either the English or the Japanese
terminologies of the four domains, i.e. computer science, agriculture,
physics and psychology⁶. In general, therefore, as long as we can disregard
the distinction or grade of central and non-central sets of terms in
terminology, it is safe to disregard the intra-term coherence of morphemes.
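The comparison of the two permutation schemes can be sketched as follows. This is an illustration of the idea only, not a reproduction of the experiment in Kageura (1998a): the five terms are invented, the function names are hypothetical, and no significance test is included.

```python
import random


def growth_curve(tokens):
    """Number of morpheme types seen after each successive token."""
    seen, curve = set(), []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve


def term_level_curve(terms, rng):
    """Shuffle whole terms, keeping the morphemes of each term together."""
    shuffled = terms[:]
    rng.shuffle(shuffled)
    return growth_curve([m for term in shuffled for m in term])


def morpheme_level_curve(terms, rng):
    """Shuffle all morpheme tokens, disregarding the unit of terms."""
    tokens = [m for term in terms for m in term]
    rng.shuffle(tokens)
    return growth_curve(tokens)


def mean_curve(curve_fn, terms, runs=200, seed=0):
    """Average the growth curve over `runs` random permutations."""
    rng = random.Random(seed)
    curves = [curve_fn(terms, rng) for _ in range(runs)]
    return [sum(col) / runs for col in zip(*curves)]


# Hypothetical terminology: each term is a list of morphemes.
terms = [["information", "retrieval"], ["information", "system"],
         ["library"], ["retrieval", "system"], ["mobile", "library"]]
term_mean = mean_curve(term_level_curve, terms)
morph_mean = mean_curve(morpheme_level_curve, terms)
print(term_mean)
print(morph_mean)
```

Both mean curves end at the same value V(N); the question addressed by the experiment is whether they differ noticeably at intermediate sample sizes.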
data of reasonable size for the quantitative approach to work properly. The
actual analyses to be carried out in the following two chapters are subject
to this constraint as well.
Chapter 8
This attitude is most typically held by the studies in which concepts are
claimed to precede terms in terminology (Felber 1984; Lara 1998/1999).
It is in fact straightforward to relate individual morphemes to concepts;
simply declaring that individual morphemes represent corresponding
individual concepts is sufficient and theoretically valid. Formally, by
indicating concepts with < > and lexical representations with italics, we can
define two one-to-one functions, one constituting the inverse of the other,
between a concept and a morpheme. This is illustrated in the following (with
"information" used as an example):
<information> = f_c(information)
information = f_r(<information>)
where f_c is the function that maps a lexical representation to a concept and
f_r is the function that maps a concept to a lexical representation.
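The pair of mutually inverse functions can be illustrated with two lookup tables. This is a minimal sketch: the helper make_concept_maps and the three morphemes are hypothetical, and concepts are simply written as the morpheme enclosed in angle brackets.

```python
def make_concept_maps(morphemes):
    """Build f_c (morpheme -> concept) and its inverse f_r (concept -> morpheme)."""
    f_c = {m: f"<{m}>" for m in set(morphemes)}
    f_r = {concept: m for m, concept in f_c.items()}
    return f_c, f_r


f_c, f_r = make_concept_maps(["information", "system", "library"])

# Applying one function after the other returns the starting point.
print(f_c["information"], f_r[f_c["information"]])
```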
Theoretically, this one-to-one mapping appears at the final stage of
the detailed conceptual analysis of linguistic items. If one tries to obtain
the conceptual category that can distinguish every different linguistic
phenomenon, then the concepts indicated by different linguistic
representations should all be distinguished, simply because they take
different linguistic representations. The concepts thus established can be
referred to straightforwardly by the corresponding linguistic representations
using the function f_c. It is only when some sort of generalisation is
important, as in the analyses carried out in Part II, that concepts or
conceptual categories that do not correspond one-to-one to linguistic items
are required. Because we aim here to carry out an analysis of the detailed
patterns of terminological growth which complement the generalised
descriptions of conceptual patterns, it is natural to carry out the analyses
at the most detailed level, where individual linguistic items have a
one-to-one correspondence with concepts. Thus, the morphological dynamics is
assumed to represent the conceptual dynamics at the finest level.
The only problems are synonymy and polysemy. The problem of
polysemy can be avoided by proper conceptual analysis, and we have already
clarified the morphemes to which more than one conceptual category is
allocated in Chapter 4. In the present analysis, which uses the conceptual
categories established in Chapter 4, therefore, polysemy does not cause a
problem. The issue of synonymy has not been properly addressed in the
present study. It might be possible to bypass this problem by simply claim-
Growth Patterns of Morphemes 185
N = 2668,
V(N) = 845.
Table 8.1 shows V(m, N), the number of morphemes that occur m
times in the data. More than half of the morpheme types occur only once.
The growth rate P(N), calculated by letting E[V(1, N)] be approximated
by V(1, N), is:
P(N) = V(1, N)/N = 0.177.
This means that a new morpheme token used to construct a new term is
a new type with a probability of 0.177. The coefficient of loss of the
data is:
CL = 0.236.
The values of V(N) and CL clearly indicate that the data does not exhaust
the potentiality of terminology as seen from the angle of the distribution of
morphemes, i.e. there is room for the terminology to grow with the
incorporation of new morphemes.
the result of term-level permutation, and the solid line indicates the result
of morpheme-level permutation. No noticeable difference can be observed
between these two. The difference is well within the 95 per cent interval
indicated by the dashed lines. The right panel of Figure 8.1 shows
the discrepancy in the numbers of morpheme types between the two for
20 equally-spaced intervals, i.e. the number of morphemes calculated by
morpheme-level permutations minus the number of morphemes calculated by
term-level permutations. The difference is at most one morpheme type. Thus,
we can conclude that the randomness assumption of the distribution of
morphemes will not affect the conclusions drawn from the mathematical model.
As mentioned in 7.3, we assume that the average length of a term, i.e.
α = N/T = 2.17, is constant, irrespective of the size N. The ratio of
simple terms, terms with two morphemes, etc. is assumed to be constant as
well. Thus, the observation made on the basis of N can always be
transferred to the observation on the basis of the number of terms. In addition,
in the following observations, we assume that the ratio of morpheme tokens
representing each conceptual category is constant irrespective of N.
Figure 8.2. The growth curves of morphemes for broad conceptual categories.
Figure 8.3. The growth curves of morphemes for broad conceptual categories (part).
Figure 8.4. The average use of morphemes for broad conceptual categories.
of use N/V(N) is one such measure; it reveals the degree of the repeated
use of morphemes.
Figure 8.4 illustrates the changes in the average use of morphemes
for each broad conceptual category, plotted against T. The average use
of morphemes representing AE (abstract entities) is the highest when the
size of the terminology is small. When the size becomes bigger, the average
use of CE (classificatory entity) morphemes rises above that of abstract
entity morphemes. Through the repeated use of morphemes, abstract entity
and classificatory entity concepts contribute to establishing connections
among terms in the terminology. This observation also matches our general
understanding of the nature of these conceptual categories: The stores of
highly-abstract concepts such as abstract entities and classificatory entities
are inherently small, and this small number of concepts can be used, in
combination with other concepts, in a variety of situations.
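The average use N/V(N) is straightforward to compute. The sketch below assumes that each morpheme token comes tagged with one conceptual category; the function names, the toy tokens and their category tags are hypothetical. Only N = 2668 and V(N) = 845 are taken from the text.

```python
from collections import defaultdict

def average_use(n_tokens, n_types):
    """Average use of morphemes: tokens per type, N / V(N)."""
    return n_tokens / n_types

def average_use_by_category(tagged_tokens):
    """Average use per category from (morpheme, category) pairs."""
    token_counts = defaultdict(int)
    type_sets = defaultdict(set)
    for morpheme, category in tagged_tokens:
        token_counts[category] += 1
        type_sets[category].add(morpheme)
    return {cat: token_counts[cat] / len(type_sets[cat]) for cat in token_counts}

# Overall figures reported in the text: N = 2668, V(N) = 845.
overall = average_use(2668, 845)

# Hypothetical tagged tokens (category tags are for illustration only).
TOY_TAGGED = [("information", "AE"), ("system", "CE"), ("information", "AE"),
              ("retrieval", "AC"), ("system", "CE"), ("information", "AE"),
              ("index", "CE")]
by_category = average_use_by_category(TOY_TAGGED)
```

For the whole data this gives an average use of about 3.16 tokens per morpheme type.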
On the other hand, the average use of both QL (quality)
and RL (relation) morphemes remains low. This corresponds to the
interpretation that these concepts are used for the differentiation of related
P_m(N_CAT) measures the conditional probability, i.e. the probability that
the token is a new type of CAT when a new token is CAT. From the
point of view of constructing terminology, therefore, P_m(N_CAT) can be
interpreted as indicating the degree of expectation of a new morpheme type
when the particular category CAT is called upon. Note that P(N) shown
in Table 8.2 is in fact P_m(N_CAT).
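A minimal sketch of the conditional growth rate follows. It assumes that the rate for a category is estimated in the same way as P(N), i.e. by the number of once-occurring types of that category divided by the number of tokens of that category. The function name, the toy tokens and their category tags are hypothetical.

```python
from collections import Counter, defaultdict

def conditional_growth_rates(tagged_tokens):
    """Estimate the growth rate within each category.

    For every category CAT, the estimate is V_CAT(1, N_CAT) / N_CAT, where
    N_CAT is the number of tokens tagged CAT. Assumption: each morpheme type
    belongs to exactly one category.
    """
    tokens_by_category = defaultdict(list)
    for morpheme, category in tagged_tokens:
        tokens_by_category[category].append(morpheme)
    rates = {}
    for category, tokens in tokens_by_category.items():
        counts = Counter(tokens)
        hapaxes = sum(1 for c in counts.values() if c == 1)
        rates[category] = hapaxes / len(tokens)
    return rates

# Hypothetical tagged tokens (category tags are for illustration only).
TOY_TAGGED = [("information", "AE"), ("system", "CE"), ("information", "AE"),
              ("retrieval", "AC"), ("system", "CE"), ("information", "AE"),
              ("index", "CE")]
rates = conditional_growth_rates(TOY_TAGGED)
```

In this toy sample the AE category has no once-occurring type, so its estimated rate is zero, while every AC token seen so far is a new type.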
Alternatively, we can define the growth rate of morphemes of individual
categories vis-à-vis the size of the terminology:
Figure 8.5. The growth rate of morphemes for broad conceptual categories.
middle position. Material entity concepts other than these two show low
average use and a high growth rate. Towards the right-hand end of the
observation range, we see that the growth rate of information carriers is expected
to become larger than that of the morphemes representing non-information
carriers. This implies that, among material entity concepts, the bigger the
terminology becomes, the higher is the demand for new morphemes for
information carriers.
Three of the four subcategories of representational entity concepts, i.e.
broad representational entities (RE1), linguistic entities (RE2) and
documentation entities (RE3), are very close to each other, both in average use
and in growth rate. Despite the different token sizes (RE3 is two to three
times larger), the degree of repetition and of new morphemes required in
the construction of the terminology is similar. The morphemes representing
software entities (RE4) show a somewhat unstable transition, perhaps
because of the size of the data (but we can still rely on the overall tendencies).
It is notable that when the terminology becomes bigger, the decrease in
the growth rate of documentation entities (RE3) becomes much slower than
Figure 8.6. Average use and growth rate of subcategories for ME, RE, AE and CE.
Figure 8.7. Average use and growth rate of subcategories for QL, RL and AC.
Figure 8.7 shows the average use and the growth rate P_m(N_CAT) of the
subcategories of the quality, relation and activity concepts. As in Figure
8.6, they are plotted up to 1.5 times the original data size, with the same
absolute scale.
Morphemes representing quality and relation concepts show very similar
internal patterns, though their absolute values differ. The average use of
"labels" (QL1 and RL1) is consistently higher than that of "values" (QL2
and RL2). The growth rates of "values" are higher at the beginning, but
they quickly decrease and come closer to, or even lower than, the growth
rates of "labels". The patterns confirm that the qualitative difference
between labels and values is reflected in the quantitative patterns of the use of
the morphemes in terminology construction.
The difference between quality concepts and relation concepts also
becomes clear, especially at the right-hand end of the transition patterns of
the growth rates. In the case of quality concepts, the decrease in the growth
rates slows down for both QL1 and QL2. In fact, the curve of the growth
rate of QL1 is nearly flat towards the end of the observation range. In
the case of RL, on the other hand, the decrease in the growth rate of RL1
does not seem to slow down at all, while that of RL2 does slow down.
Thus the phenomenon observed in 8.2.3.2 seems to come mainly from RL1
morphemes. This corresponds to our qualitative observation of the
characteristics of relations, i.e. some relation concepts, especially in RL1, have a
similarity with abstract entity concepts.
The subcategories of activity concepts show a certain degree of
diversity. Action (AC21) and production (AC23) concepts are somewhat closer
in terms of their patterns of average use, while the other three
subcategories show similar transition patterns. This also holds for the patterns of
the growth rates when the size of the terminology is not very big.
Unfortunately, we cannot readily see a qualitative explanation behind these
patterns.
Chapter 9
subject field). A certain number of categories are not observed here because
they are too small for quantitative analysis and cannot be collapsed into
neighbouring categories. These are: document parts (ME2112: 3 terms),
places/locations (ME24: 4 terms), parts of information entities (ME25: 5
terms), parts of documents (RE32: 18 terms) and subjective entities (AE3:
14 terms). Some categories, i.e. RL2 (values of relations), AC1 (units of
activities), AC22 (transference), AC23 (production) and AC24 (state change),
may be too small for quantitative analysis, so the results of quantitative
analysis applied to these categories will have to be interpreted with care.
for λ = 0 to 1.5.
Though we stated above that the relationships among the growth
patterns of the four elements are taken into account, these measures cannot be
used for direct comparison. For instance, suppose that the number of
nuclei increases much faster than the number of specification patterns. This
comparison does not mean anything in itself because the nuclei and the
specification patterns are qualitatively different in the first place.
To solve this problem, referring to the growth and growth rate of all
the terms in the data as the baseline, we introduce a third measure, i.e. the
relative number of types. The relative number of types is defined as the
\[ \frac{V_{CAT}(\lambda N)}{V_{ALL}(\lambda N)} \]
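The relative number of types, i.e. V_CAT(λN) divided by V_ALL(λN), can be sketched as below. Two assumptions are made and should be read as such. First, the expected number of types is obtained from a frequency spectrum by the binomial formula, which serves here as a stand-in for whatever model is fitted to the data and is usable only for 0 ≤ λ ≤ 2. Second, the numerator and the denominator are evaluated at the same number of tokens. The function names and the toy spectra are hypothetical.

```python
def expected_types(spectrum, lam):
    """Expected number of types at lam times the observed sample size.

    spectrum maps m to V(m, N). Uses sum_m V(m, N) * (1 - (1 - lam)**m),
    i.e. binomial interpolation for lam <= 1 and its alternating-series
    continuation for 1 < lam <= 2. This is an illustrative stand-in only.
    """
    if not 0.0 <= lam <= 2.0:
        raise ValueError("lam must lie between 0 and 2")
    return sum(v * (1.0 - (1.0 - lam) ** m) for m, v in spectrum.items())

def sample_size(spectrum):
    """Number of tokens N implied by a frequency spectrum."""
    return sum(m * v for m, v in spectrum.items())

def relative_types(spectrum_cat, spectrum_all, lam):
    """Types of one category relative to all data at the same token count."""
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    n_tokens = lam * sample_size(spectrum_cat)
    lam_all = n_tokens / sample_size(spectrum_all)
    return expected_types(spectrum_cat, lam) / expected_types(spectrum_all, lam_all)

# Hypothetical spectra: a category with 4 tokens, a whole data set with 20.
TOY_CAT = {1: 2, 2: 1}
TOY_ALL = {1: 4, 2: 3, 5: 2}
toy_relative = relative_types(TOY_CAT, TOY_ALL, 1.0)

# A sample of 20 tokens that are all different types grows to 30 at 1.5N.
all_hapax = expected_types({1: 20}, 1.5)
```

The last line agrees with the value E[V(1.5N)] = 30.00 given for the determinants of RL2 in Table 9.20, where all 20 tokens are distinct types. Agreement on this one case does not show that the same estimator was used for the tables.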
Let us observe here the developmental patterns for all 1,228 terms in the
data and then look at the general tendencies of the relative number of types
Table 9.2. Basic quantities of nuclei, determinants, determinant categories and specification
patterns.
                        N     V(N)  V(1,N)  P(N)   CL
Nuclei                  1229  443   284     0.231  0.259
Determinants            1104  632   448     0.406  0.286
Determinant categories  1104  90    9       0.008  0.053
Specification patterns  981   20    1       0.001  0.019
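As a check on how the columns of Table 9.2 relate to one another, the short sketch below recomputes the growth rate of each element as V(1,N)/N from the N and V(1,N) columns. The figures are copied from the table; the variable and function names are hypothetical.

```python
# (N, V(1,N), reported growth rate) for each element, copied from Table 9.2.
TABLE_9_2 = {
    "Nuclei": (1229, 284, 0.231),
    "Determinants": (1104, 448, 0.406),
    "Determinant categories": (1104, 9, 0.008),
    "Specification patterns": (981, 1, 0.001),
}

def recomputed_growth_rates(table):
    """Return {element: V(1,N) / N rounded to three decimal places}."""
    return {name: round(v1 / n, 3) for name, (n, v1, _) in table.items()}

recomputed = recomputed_growth_rates(TABLE_9_2)
matches = {name: recomputed[name] == TABLE_9_2[name][2] for name in TABLE_9_2}
```

All four recomputed values coincide with the reported ones, which confirms that the fourth numerical column of the table is the growth rate.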
Figure 9.1. The growth patterns of nuclei, determinants, determinant categories and
specification patterns for all the data.
Table 9.3. The relative number of types of the four elements of observation for each
conceptual category as calculated at a sample size 1.5 times that of the original.
Category                                    NUC   DET   CDET  SPEC
ME111   People (types)                      0.33  0.91  0.55  0.16
ME12    Organisations                       0.28  0.91  0.68  0.54
ME2111  Documents (types)                   0.53  0.96  0.92  0.76
ME212   Non-documents                       0.37  0.79  0.55  0.74
ME22    Machines and implements             0.38  0.84  0.67  0.58
RE1     Broad representational entities     0.45  0.89  0.73  0.80
RE2     Linguistic entities                 0.50  1.01  0.85  0.82
RE311   Documentation entities (complex)    0.33  0.98  0.81  0.75
RE312   Documentation entities (simple)     0.32  0.99  0.89  0.78
RE4     Software entities                   0.71  0.75  0.81  0.45
AE1     Information entities                0.50  0.95  0.61  0.84
AE21    Subject fields                      0.54  0.74  0.72  0.40
AE22    Methodologies                       0.14  0.72  0.67  0.31
CE      Classificatory entities             0.48  0.87  0.61  0.88
QL1     Labels of types of qualities        0.58  0.92  0.81  0.27
RL1     Labels of types of relations        0.47  0.99  0.79  0.49
RL2     Values of relations                 0.39  1.05  0.61  0.34
AC1     Units of activities                 0.35  0.82  0.55  0.17
AC21    Action                              0.38  0.82  0.71  0.45
AC22    Transference                        0.41  0.71  0.63  0.65
AC23    Production                          0.61  0.88  0.54  0.11
AC24    State change                        0.81  1.03  1.00  0.25
Average                                     0.45  0.89  0.71  0.52
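The discussions of individual categories repeatedly compare a category's values in Table 9.3 with the Average row. A minimal sketch of that comparison is given below for three rows copied from the table. The function name and the optional tolerance parameter are hypothetical.

```python
# Values copied from Table 9.3 (NUC, DET, CDET, SPEC).
AVERAGE = {"NUC": 0.45, "DET": 0.89, "CDET": 0.71, "SPEC": 0.52}
ROWS = {
    "RE2": {"NUC": 0.50, "DET": 1.01, "CDET": 0.85, "SPEC": 0.82},
    "RE4": {"NUC": 0.71, "DET": 0.75, "CDET": 0.81, "SPEC": 0.45},
    "AE22": {"NUC": 0.14, "DET": 0.72, "CDET": 0.67, "SPEC": 0.31},
}

def compare_to_average(row, average=AVERAGE, tolerance=0.0):
    """Label each element as above, below or about the average."""
    labels = {}
    for element, value in row.items():
        difference = value - average[element]
        if difference > tolerance:
            labels[element] = "above"
        elif difference < -tolerance:
            labels[element] = "below"
        else:
            labels[element] = "about average"
    return labels

comparison = {category: compare_to_average(row) for category, row in ROWS.items()}
```

For these rows, RE2 lies above the average on all four elements, AE22 lies below on all four, and RE4 is above the average for nuclei and determinant categories but below it for determinants and specification patterns.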
Table 9.4. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in ME111.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 26 8 6 9.17 0.231 0.271
Determinants 27 24 22 35.28 0.815 0.338
Determinant categories 27 13 5 14.97 0.185 0.187
Specification patterns 23 2 0 2.04 0.000 0.002
Figure 9.2. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for ME111
terms.
Panel (a) shows that the value of the determinants is consistently higher
than the values of the other elements. This is followed, from highest to
lowest, by the values of the determinant categories, the nuclei and the
specification patterns⁵. The difference between the number of determinants and
the number of nuclei is rather large. The number of specification patterns
converges at a very early stage of the graph, and the increase in the number
of determinant categories slows down towards the end. The numbers of
5
As mentioned above, the comparison of the number of these elements is in itself not
meaningful, as the four elements consist of different classes of items (other than the
determinants and the nuclei). The number of determinant categories is by definition not larger
than the number of determinants, and the total number of conceptual specification patterns
is only 21. However, by comparing the order of the elements' values and the discrepancies
among them with those of all the terms in Figure 9.1, we can intuitively observe the
characteristics of the growth of terms in this category. This is why we refer to the order of the
values of the four elements in panel (a). In the following, we will refer to the order of the
numbers of nuclei, determinants and determinant categories in the same spirit.
Quantitative Dynamics in Term Formation 213
Table 9.5. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in ME12.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 38 10 6 12.88 0.158 0.235
Determinants 39 34 31 49.30 0.795 0.339
Determinant categories 39 19 10 23.53 0.256 0.225
Specification patterns 37 7 2 7.76 0.054 0.132
Figure 9.3. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for ME12
terms.
Panel (c) shows that the increase in the number of nuclei is much
smaller (even smaller than that of the ME111 terms) than the increase in
the number of nuclei for all the terms, becoming less than one-third at
the end of the graph. This can be attributed partly to the use of
(book=house; library) as the nucleus in many terms (see 6.1.1.2). The
increase in the number of determinants is about the same as that of all the
terms, while the relative number of determinant categories becomes smaller
when the number of terms becomes greater. The graph also shows that
about 60 per cent of all the specification patterns are used in the terms of
this category.
The terms representing ME12 concepts are constructed from a small
number of nuclei and a limited number of conceptual specification
patterns, with a wide variety of determinants whose conceptual categories are
slightly limited. In the case of growth, the number of new nuclei should
increase steadily, though at a slow rate, while new determinants are expected
to occur at a high rate, though the rate itself will decrease slowly. New
conceptual specification patterns are expected, but they will not be many.
7
Recall that we argued that the number of nuclei will surpass that of determinant
categories for ME111 terms. The ME2111 terms, whose number is much larger than that of
ME111 terms, somehow realise that forecast, although, of course, the structure is different.
Table 9.6. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in ME2111.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 136 51 37 68.45 0.272 0.278
Determinants 112 95 84 135.09 0.750 0.336
Determinant categories 112 45 20 54.08 0.179 0.194
Specification patterns 102 13 2 13.31 0.020 0.088
Figure 9.4. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for ME2111
terms.
pace of this increase will slow down gradually. The number of determinants
is expected to increase rapidly, but the pace of this increase, too, will slow
down. The number of conceptual specification patterns will not increase
after 13 or 14 patterns occur.
8
The ratio of simple terms is about 15 per cent for ME212 terms, so new nuclei are
expected to keep occurring under the assumption that the ratio of simple and complex terms
is constant irrespective of the size of the data. Here, however, the number of nuclei is
expected to be finite. This implies that at a certain stage no new nuclei will occur, which
in turn means that no new simple terms will occur. This contradiction always exists for
categories whose number of nuclei is expected to be limited but whose ratio of simple terms
is not zero. This makes invalid the assumption that the ratio of simple and complex terms as
well as the ratio of complex terms with lengths of 2, 3, etc. are constant. This, however, is a
problem to be addressed in the next stage of research and is put aside in the present study.
Table 9.7. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in ME212.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 54 20 9 23.19 0.167 0.208
Determinants 47 36 31 50.96 0.660 0.325
Determinant categories 47 19 7 21.16 0.149 0.181
Specification patterns 46 10 2 11.23 0.044 0.104
Figure 9.5. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for ME212
terms.
Table 9.8. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in ME22.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 60 21 12 26.00 0.200 0.237
Determinants 64 51 44 71.50 0.688 0.330
Determinant categories 64 24 13 30.27 0.203 0.216
Specification patterns 52 7 4 9.06 0.077 0.217
Figure 9.6. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for ME22
terms.
number in the long run, while the number of specification patterns and
determinant categories is expected to grow slowly but steadily.
Table 9.9. Basic quantities of nuclei, determinants, determinant categories and specification
patterns in RE1.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 51 21 13 26.78 0.255 0.245
Determinants 38 33 29 46.88 0.763 0.332
Determinant categories 38 20 11 24.52 0.290 0.236
Specification patterns 36 10 3 11.29 0.083 0.144
Figure 9.7. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RE1
terms.
Panel (c) shows that the relative number of types for all four elements
is higher than the average (see Table 9.3) and that that of determinants is
even increasing.
The RE2 terms are constructed from a fair number of nuclei, with a
rich variety of determinants whose conceptual categories vary widely. A
limited but reasonable number of conceptual specification patterns is used.
In the case of growth, the number of nuclei is expected to keep increasing
steadily. The number of determinants and determinant categories will also
increase, but the pace of increase will slow down. No new specification
patterns are expected to occur.
Table 9.10. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RE2.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 45 20 14 26.64 0.311 0.279
Determinants 30 29 28 42.75 0.933 0.354
Determinant categories 30 19 13 24.47 0.433 0.275
Specification patterns 29 11 1 11.08 0.035 0.108
Figure 9.8. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RE2 terms.
Table 9.11. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RE311.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 113 32 14 37.39 0.124 0.194
Determinants 97 86 75 121.08 0.773 0.336
Determinant categories 97 37 17 44.85 0.175 0.192
Specification patterns 96 12 2 13.01 0.021 0.070
Figure 9.9. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RE311
terms.
zero towards the end of the graph, so the increase in nuclei in panel (a)
will level off if extrapolated further. The growth rate of determinants keeps
decreasing, while that of determinant categories becomes flat.
Panel (c), together with Table 9.3, shows that the developmental curves
of the relative number of determinants, their categories and specification
patterns flatten out above the average. The relative number of nuclei is
below average and keeps decreasing.
The RE311 terms are therefore constructed from a comparatively small
number of nuclei, with a reasonably rich variety of determinants and
determinant categories. A fair number of conceptual specification patterns are
used. In the case of growth, the number of nuclei is expected to converge,
and no new specification patterns are expected to occur. The number of
determinants is expected to keep increasing, but the pace will slow down,
while the number of determinant categories is expected to keep increasing
slowly, with the pace slowing down slightly.
Table 9.12. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RE312.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 60 18 9 21.92 0.150 0.207
Determinants 55 51 47 73.89 0.855 0.346
Determinant categories 55 30 17 37.33 0.309 0.239
Specification patterns 52 11 3 12.04 0.058 0.124
Figure 9.10. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RE312
terms.
Panel (b) shows that the growth rate of specification patterns becomes
almost zero at the end of the graph. The decrease in the growth rate of
the nuclei levels off. The growth rate of determinants keeps decreasing
steadily. The growth rate of determinant categories also keeps decreasing,
though less steeply towards the end of the graph.
Panel (c) shows that the developmental curves of the relative number
of determinants, determinant categories and specification patterns become
stable, above the average values (see Table 9.3). It is only the nuclei whose
relative number of types is below the average; it keeps decreasing.
The RE312 terms are constructed from a comparatively small number
of nuclei, with a reasonably rich variety of determinants and determinant
categories. A fair number of conceptual specification patterns are used.
In the case of growth, the number of nuclei is expected to grow slowly
but steadily. Recall that in constructing RE312 terms, nuclei representing
different categories are used (see Figure 6.4). The steady increase in the
number of nuclei may correspond to this qualitative feature. The number of
determinants is expected to keep increasing, but the pace will slow down.
The number of determinant categories will also keep increasing, but at a
slower pace which will become progressively slower. Few new conceptual
specification patterns are expected to occur when the number of terms
increases further.
The overall growth pattern of RE312 terms is similar to the pattern of
RE311 terms. However, in the RE311 terms the number of nuclei is most
probably limited, while in the RE312 terms it is expected that new nuclei
will keep occurring, though at a slow rate.
Table 9.13. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RE4.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 40 26 19 34.20 0.475 0.291
Determinants 32 24 20 33.66 0.625 0.314
Determinant categories 32 18 13 24.52 0.406 0.274
Specification patterns 28 6 0 6.00 0.000 0.053
Figure 9.11. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RE4 terms.
Panel (c), together with Table 9.3, shows that the relative number of
determinants and specification patterns is smaller than the average; they
keep decreasing slowly. The relative number of nuclei is much higher than
the average, though it also keeps decreasing. The relative number of
determinant categories is higher than the average, and it is not expected to
decrease.
The RE4 terms are constructed from a very wide variety of nuclei, with
a comparatively small number of determinants whose categories, however,
vary. A small number of conceptual specification patterns is used. In the
case of growth, the determinants and their categories are expected to keep
increasing steadily, while the increase in the number of nuclei is expected
to slow down. No new specification patterns will occur.
The diversity of the nuclei in this category may be a reflection of the
fact that this category consists of three related but different types of
subcategories. It may also be because the terms of this category are peripheral to
the field of documentation, and only those representing core concepts are
recognised as terms of this field.
Table 9.14. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AE1.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 60 24 20 34.00 0.333 0.304
Determinants 42 38 35 55.12 0.833 0.342
Determinant categories 42 20 8 21.73 0.191 0.204
Specification patterns 42 11 4 12.41 0.095 0.147
Figure 9.12. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AE1 terms.
expected to converge to zero when the data becomes larger. On the other
hand, the growth rate of nuclei converges to about 0.35, thus new nuclei
keep appearing at a fixed rate. Perhaps this corresponds to the fact that
this category consists of three rather hybrid subcategories. The growth rate
of determinants keeps decreasing, though this decrease is expected to level
off.
Panel (c) and Table 9.3 show that the developmental curves of the
relative number of determinants, specification patterns and nuclei are stable,
all of them above the average values. The relative number of specification
patterns is especially high. The relative number of determinant categories
keeps decreasing and is expected to decrease to well below the average.
The AE1 terms are constructed from a variety of nuclei and
determinants whose categories, however, are limited, with a wide variety of
specification patterns. In the case of growth, the number of determinants and
nuclei is expected to keep increasing steadily, while the increase in the
number of specification patterns is expected to slow down and eventually level
off. Few new determinant categories are expected to occur.
Table 9.15. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AE21.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 25 13 9 17.41 0.360 0.261
Determinants 21 17 13 22.71 0.619 0.303
Determinant categories 21 13 8 16.44 0.381 0.253
Specification patterns 20 4 2 4.75 0.100 0.210
Figure 9.13. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AE21
terms.
at the end of the observation range, keeps decreasing steadily. It is only the
nuclei whose relative number is above average at 1.5 times the sample size.
The AE21 terms are constructed from a variety of nuclei with a
comparatively limited number of determinants, determinant categories and
specification patterns. In the case of growth, new nuclei are expected to
occur at a fixed rate. The number of determinants and determinant
categories will also increase, though this pace will slow down. Only a few new
specification patterns will occur when the size of the data becomes larger.
These tendencies correspond to the qualitative nature of the terms in
this category, i.e. only those whose determinants belong to a certain
conceptual category are identified as being relevant to the field of
documentation. The range of nuclei is also limited, but when more specific subject
fields are identified, complex nuclei are used, and these new (complex)
nuclei will keep occurring.
Table 9.16. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AE22.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 109 11 8 14.81 0.073 0.279
Determinants 182 110 91 153.28 0.500 0.317
Determinant categories 182 39 19 47.14 0.104 0.203
Specification patterns 108 4 3 5.50 0.028 0.275
Figure 9.14. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AE22
terms.
Panel (b) shows that the growth rates of nuclei, determinant categories
and specification patterns suddenly drop at the very beginning of the graph.
The developmental curve of the growth rate of nuclei seems to have
flattened out at a small but non-zero value. The curve of the growth rate of
specification patterns also becomes almost flat, with a very small but still
non-zero value. The growth rate of determinant categories keeps decreasing.
From panel (c) and Table 9.3, it can be observed that the relative number
of all four elements is below the average, though the number of
determinant categories and specification patterns increases towards the end of
the graph. The small relative number of nuclei is particularly notable.
The AE22 terms are thus constructed from a very small number of
nuclei with a comparatively limited number of determinants and their
categories, and fairly limited specification patterns. The conceptual variety is
almost exclusively supported by determinants. In the case of growth, the
number of determinants is expected to keep increasing rapidly compared
to the other elements, though the pace of this increase is expected to slow
down. New nuclei and specification patterns will occur at a very slow but
fixed rate.
Table 9.17. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in CE.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 50 23 13 28.19 0.260 0.238
Determinants 35 30 26 42.71 0.743 0.328
Determinant categories 35 16 9 19.71 0.257 0.239
Specification patterns 34 10 5 12.31 0.147 0.195
Figure 9.15. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for CE terms.
will occur with a relatively high ratio. New specification patterns will keep
occurring steadily, though with a low ratio.
Table 9.18. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in QL1.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 57 29 20 37.76 0.351 0.280
Determinants 46 40 36 57.75 0.783 0.336
Determinant categories 46 26 13 30.47 0.283 0.231
Specification patterns 46 4 0 4.03 0.000 0.069
Figure 9.16. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for QL1 terms.
Table 9.19. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RL1.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 49 21 14 27.04 0.286 0.268
Determinants 41 38 36 56.31 0.878 0.349
Determinant categories 41 23 13 28.12 0.317 0.242
Specification patterns 41 7 2 7.24 0.049 0.159
Figure 9.17. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RL1 terms.
Panel (b) shows that the developmental curves of the growth rates of
specification patterns and determinants flatten out, the former at the value
of zero and the latter at around 0.9. Thus, no new specification patterns
are expected to occur, while new determinants are expected to keep
occurring at a high, constant rate of around 0.9. The growth rates of nuclei and
determinant categories keep decreasing. The pace of the decrease of the
determinant categories is noteworthy; it may converge to zero when the data
becomes larger.
Panel (c) and Table 9.3 show that the relative numbers of nuclei and
specification patterns are more or less average, while the relative numbers
of determinants and of determinant categories are above average.
The RL1 terms are thus constructed from a reasonable number of
nuclei, with a comparatively wider variety of determinants and their
categories. As in QL1 terms, the variety of nuclei may be attributable to the
introduction of complex nuclei for more specific terms. Conceptual
specification patterns are limited, but not so much when compared to other
categories. In the case of growth, the number of determinants is expected to
increase steadily, while no new specification patterns are expected to
occur. The number of nuclei and that of determinant categories will increase,
though the pace will slow down, especially for the determinant categories.
Although the relation between the number of determinant categories
and the number of nuclei is reversed between QL1 and RL1, the general
dynamic patterns of growth of the four elements are rather similar. This
reflects the qualitative affinity between these two categories.
Table 9.20. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in RL2.
N V(N) V(1,N) E[V(1.5N)] P(N) CL
Nuclei 26 9 8 13.00 0.308 0.321
Determinants 20 20 20 30.00 1.000 0.358
Determinant categories 20 10 7 13.27 0.350 0.263
Specification patterns 20 3 2 4.00 0.100 0.239
Figure 9.18. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for RL2 terms.
number of the other elements is smaller than the average, though the value
of specification patterns increases towards the end of the graph.
The RL2 terms are constructed from a comparatively small number of
nuclei, with a rich variety of determinants whose categories, however, are
comparatively limited. A comparatively small number of specification
patterns is used. In the case of growth, almost all the determinant tokens used
are expected to be new determinant types. New nuclei and new specification
patterns also keep occurring steadily, though at a much lower rate.
there are no simple terms in this category. The developmental curve of the
number of specification patterns completely flattens out at an early stage
of the graph. The increase in the number of determinant categories slows
down when the number of terms becomes larger.
Table 9.21. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AC1.
                         N   V(N)  V(1,N)  E[V(1.5N)]  P(N)   cL
Nuclei                   20  7     4       9.25        0.200  0.216
Determinants             22  18    16      25.69       0.727  0.327
Determinant categories   22  11    5       12.75       0.227  0.203
Specification patterns   20  2     0       2.01        0.000  0.000
Figure 9.19. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AC1 terms.
Panel (b) shows that the growth rate of specification patterns becomes
zero at a very early stage. The instability of the developmental curves of the
growth rates of the other three elements at the right-hand side of the graph
may be due to a problem in numerical calculations, caused by the small size
of the data. Still, we can observe rough tendencies, i.e. the growth rate of
determinant categories approaches zero, while the decrease in the growth
rates of determinants and nuclei is expected to level off.
From panel (c) together with Table 9.3, it can be observed that the
relative numbers of all four elements are below average and keep decreasing.
The small value of the relative number of specification patterns is
particularly notable.
The AC1 terms are thus constructed from a comparatively small variety
of nuclei, determinants and determinant categories, with very limited
types of conceptual specification patterns. In the case of growth, no new
specification patterns are expected to occur, while fewer and fewer new
determinant categories will be observed. Determinants and nuclei are
expected to increase steadily.
Table 9.22. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AC21.
                         N   V(N)  V(1,N)  E[V(1.5N)]  P(N)   cL
Figure 9.20. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AC21
terms.
Table 9.23. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AC22.
                         N   V(N)  V(1,N)  E[V(1.5N)]  P(N)   cL
Nuclei                   22  9     6       11.77       0.273  0.253
Determinants             18  13    11      18.28       0.611  0.312
Determinant categories   18  10    6       12.66       0.333  0.242
Specification patterns   16  5     4       7.00        0.250  0.285
Figure 9.21. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AC22
terms.
terns vary rather widely. The wider variety of specification patterns is
particularly notable if compared with other AC2 terms; this may be because
many of the activity concepts represented by the AC23 nuclei are
subject-specific and thus can take a wider variety of modification. In the case of
growth, all four elements, including specification patterns, are expected to
increase more or less steadily.
Table 9.24. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AC23.
                         N   V(N)  V(1,N)  E[V(1.5N)]  P(N)   cL
Nuclei                   21  14    8       16.96       0.381  0.252
Determinants             11  10    9       14.61       0.818  0.326
Determinant categories   11  7     3       7.58        0.273  0.213
Specification patterns   11  1     0       1.00        0.000  0.000
Figure 9.22. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AC23
terms.
Table 9.25. Basic quantities of nuclei, determinants, determinant categories and
specification patterns in AC24.
                         N   V(N)  V(1,N)  E[V(1.5N)]  P(N)   cL
Nuclei                   26  19    16      26.91       0.615  0.313
Determinants             14  14    14      21.00       1.000  0.354
Determinant categories   14  12    10      16.50       0.714  0.315
Specification patterns   14  2     1       2.50        0.071  0.177
Figure 9.23. The number of types, the growth rate and the relative number of types of nuclei,
determinants, determinant categories and specification patterns for AC24
terms.
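The E[V(1.5N)] column gives the number of types expected when the data grow
to one and a half times their size. The sketch below assumes one standard
estimator for such extrapolation, an alternating series over the frequency
spectrum V(m,N); it is offered as an illustration and is not necessarily the
exact procedure used to compute the tables. The spectra are not listed in
Table 9.25; they are inferred here from N, V(N) and V(1,N) for the three rows
in which those values determine the spectrum uniquely.

```python
def extrapolated_types(spectrum: dict[int, int], ratio: float) -> float:
    """Expected number of types at sample size ratio * N, with 1 <= ratio <= 2.

    spectrum maps a frequency m to V(m,N), the number of types that occur
    exactly m times in the observed N tokens.
    """
    if not 1.0 <= ratio <= 2.0:
        raise ValueError("the series is only used here for 1 <= ratio <= 2")
    t = ratio - 1.0
    observed = sum(spectrum.values())  # V(N)
    # Alternating series: +t*V(1,N) - t^2*V(2,N) + t^3*V(3,N) - ...
    gain = sum((-1) ** (m + 1) * t**m * v_m for m, v_m in spectrum.items())
    return observed + gain


# Spectra inferred (an assumption) from Table 9.25, AC24 terms.
determinants = {1: 14}                 # 14 tokens, 14 types, all occurring once
determinant_categories = {1: 10, 2: 2}  # 14 tokens, 12 types, 10 occurring once
specification_patterns = {1: 1, 13: 1}  # 14 tokens, 2 types, 1 occurring once

for name, spec in [
    ("Determinants", determinants),
    ("Determinant categories", determinant_categories),
    ("Specification patterns", specification_patterns),
]:
    print(f"{name}: E[V(1.5N)] = {extrapolated_types(spec, 1.5):.2f}")
```

With these inferred spectra the sketch yields 21.00, 16.50 and 2.50, matching
the corresponding rows of the table. The nuclei row is omitted because its
spectrum cannot be recovered from the three summary values alone.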
The AC23 terms are constructed from a fair variety of nuclei and
determinants, whose categories however do not vary much. Only a very small
number of specification patterns is used. In the case of growth, the increase
in nuclei is expected to slow down rapidly, perhaps to the stage where no
new nuclei will be observed. Few new determinant categories will be
introduced, and no new specification patterns will be used.
Panel (b) confirms these points. The growth rates of nuclei,
determinants and specification patterns converge to non-zero values (though that of
the specification patterns is very close to zero). Only the growth rate of the
determinant categories keeps decreasing.
Panel (c) and Table 9.3 show that the developmental curves of the
relative number of all four elements flatten out. The relative numbers of nuclei
and determinant categories are the highest among all the categories. The
relative number of determinants is also much higher than the average (the
second highest among all the categories). The relative number of
specification patterns, on the other hand, is much lower than the average.
The AC24 terms are constructed from a very wide variety of nuclei and
determinants, whose categories also vary widely. In contrast, only a very
small number of specification patterns is used. In the case of growth, all
four elements are expected to keep increasing in number, though the growth
rates will vary. Only the increase in the number of determinant categories
is expected to slow down as the data become larger. The other elements
will keep increasing linearly. Given that the growth rate of determinants
converges to a higher value than that of nuclei, determinants are expected
to outnumber nuclei in the long run.
adding a fourth type: (iv) The number of elements does not increase at all
(the growth rate converges to zero at a very early stage of the observation).
Table 9.26 summarises the growth patterns on the basis of these general
types, where ↑ indicates (i), indicates (ii), → indicates (iii) and ↓
indicates (iv). As in Table 9.3, NUC indicates the nuclei, DET the
determinants, CDET the determinant categories and SPEC the conceptual
specification patterns. These tendencies are not rigid in the statistical sense but
show general trends. Table 9.26 and Table 9.3 give the overall tendencies
of the growth patterns of terms in each conceptual category in the field of
documentation. As a natural extension of the present study, examining the
conceptual motivations that lead to the dynamic tendencies described in
this chapter would be an important research topic for the next stage of the
research cycle.
Conclusions
Chapter 10
Having carried out concrete analyses of the patterns of term formation and
of terminological growth for the terminology of documentation, and having
shown how conceptual descriptions can be complemented by quantitative
descriptions within the structural framework, it is time to look back and
examine what has been achieved and to clarify what still remains to be
done.
We start by summarising and examining what has been done so far in
the present study. In the process, the technical problems to be addressed in
future work will also be clarified. After this, we will illustrate the wider
perspective of research in term formation and terminological growth in which
the present study can be situated.
The present study makes two major contributions to the study of
terminology. Firstly, we examined the requirements for a theory of terminology (in
Chapters 1 and 2) and provided a concrete theory of the structural dynamics
of terminology based on the example of the terminology of documentation
(in Chapters 6 and 9). Our principal theoretical concern was to establish a
theoretico-descriptive framework for the study of terminology and not just
a study about exemplar terms. Secondly, we consolidated both the
conceptual and the quantitative methodologies for the establishment of a theory of
the structural dynamics of terminology. Let us briefly review these points
in turn.
1
Nevertheless, methodologically, the structural approach was adopted, assuming the
existence of the terminological sphere as distinct from the textual sphere or discourse.
Although it was emphasised that the "structural" approach in the present study is a
methodological concept and does not contradict the perception that terminology belongs to the
sphere of parole, this methodological choice still limited the range of the investigation. The
consequence of this will be examined further in the next section.
course in which individual terms are used is linked with the flexible aspect
of terminology. From the point of view of individual terms, this distinction
corresponds to the dual, i.e. classificatory and descriptive, role of a term.
On the one hand, a term distinguishes a concept it represents from other
concepts represented by other terms and demonstrates the position and
status of the concept within the overall conceptual structure represented by the
terminology. On the other hand, a term describes the concept it represents
within the descriptive structure of the discourse in which the term is used.
Based upon this distinction, the systematic patterns of term formation
and terminological growth, as observed in the overall system of
terminology, were investigated, assuming that some systemic/systematic factors in
the existing terminology of a domain direct the formation of new terms and
the growth of terminology. As shown especially in Chapter 6, the actual
description of documentation terms demonstrated the conceptual
systematicity of term formation patterns in many conceptual categories. So the
assumption of systematicity of terminology has empirically proved to be
valid to a significant extent.
All in all, even though some of these assumptions are not completely
correct, they can be justified given the current state of terminological
research and because the central objective of the present study is (i) to move
away from establishing a theory which only describes exemplar terms to a
genuine theory of terminology, and (ii) to show the possibility of describing
the formation and growth patterns of terminology, not just of some
exemplar terms, with due granularity. Both, we contend, are necessary for
theories of terminology, irrespective of what kind of assumptions one adopts
with respect to the nature of terms and terminology.
the conceptual specification patterns. Empirically they are little more than
ordinary intra-term relations, but theoretically they constitute an important
mechanism for incorporating a classificatory, as opposed to a descriptive,
aspect of term formation. Note that the classificatory aspect of term
formation is essentially linked with our assumption that the formation of terms is
systematic with respect to the structure of terminology, as opposed to the
use of terms in discourse.
The actual descriptions of the conceptual patterns of term formation,
carried out in Chapters 4 to 6, have shown that (i) these devices were useful
in revealing the systematic patterns of term formation vis-à-vis the
totality of the terminology of a domain and that (ii) a considerable degree of
systematicity was actually observed in the formation patterns of
documentation terms representing various types of concepts. It was therefore shown
that the basic assumptions of the present study as well as the conceptual
approach were useful as a first approximation to the theorisation of term
formation and terminological growth.
Under the current framework and assumptions, a few technical
problems remain nevertheless (see 6.3 as well for reflections on more technical
details). Firstly, we did not explicitly distinguish the general part and the
domain-dependent part of the conceptual system. Although this can be
justified for the immediate purpose of describing the formation patterns of the
terminology of documentation, it would be desirable to clarify the
distinction between general and domain-dependent parts of conceptual systems.
By doing this, it would become possible to use the general part of a
conceptual system as a methodological resource for the analysis of
terminologies of other domains, which would in turn contribute to the clarification of
the conceptual factors of term formation of terminologies across different
domains.
The second problem is related to the simplicity of the conceptual
system introduced in the present study. Formally, the concept system
introduced in Chapter 4 has a simple hierarchical structure. However, as is
widely acknowledged, conceptual structures are essentially
multidimensional (Bowker 1997; Kageura 1997b), which would lead to a lattice of
categories or would better be defined by the combination of features. In fact,
the multidimensional nature of concepts manifests itself in some of the
descriptions in Chapter 6. For instance, the "horizontal" category shifts as
observed in Figure 6.4 reveal that some cross-categorial conceptual
characteristics are involved. The affinity between formation patterns of quality terms
and relation terms also showed the existence of conceptual dimensions that
cut across these two categories. Although we referred to cross-categorial
relationships and the multidimensionality of the viewpoints of concept
categorisation wherever necessary, we did not fully explore this aspect. In this
respect, there remains further room for refining and re-formalising
conceptual systems and the descriptions of term formation patterns.
Lastly, we used the conceptual categories defined within the conceptual
system in two different ways. First, by attributing the categories to terms,
we used the conceptual categories as a conceptual "field" for observing the
systematicity of term formation. Second, by attributing the categories to
morphemes, we used the conceptual categories to observe the regularity of
combination patterns. For the latter purpose, it might have been better to
use the detailed conceptual features rather than to use the same
conceptual categories as defined for classifying terms. We did not elaborate this
because the main purpose of the current study is to describe the
formation patterns of terms rather than the combination restrictions or tendencies
of morphemes in terms. In the analyses in Chapters 6 and 9, we retained
the bottom-level categories assigned to morphemes, even though we did
not go into such a detailed level of analyses. This is because the
bottom-level categories will provide a good starting point for the exploitation of
the combination restrictions or tendencies of morphemes, which would be
better pursued in parallel with the type of study elaborated in this book.
creases), no existing terms disappear (or no new terms appear). This also
means that no existing constituent elements or specification patterns are
assumed to disappear when the number of terms increases. This contrasts
with what may be called a generation/decline model, which assumes that
some terms, and thus possibly morphemes and specification patterns,
may disappear even when the number of terms increases.
The fact that the descriptions in this study assume the
accumulation/disaccumulation model is explicit in the quantitative model3. This is
not so obvious in the conceptual descriptions, simply because the
conceptual descriptions are not sensitive enough to reveal this point explicitly. One
might think that replacing the randomness assumption of the distribution of
morphemes and specification patterns with a more complex model would
contribute to establishing a generation/decline model. This is not the case,
however, because the quantitative model is first and foremost constructed
over the distribution of morphemes, while the essential feature of a
generation/decline model should theoretically be attributed to the generation and
disappearance of terms.
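The accumulation assumption can be illustrated with binomial interpolation
under the randomness assumption. The sketch below assumes the standard
interpolation formula E[V(M)] = sum over m of V(m,N) * (1 - (1 - M/N)^m) for
M <= N; the function name and the small frequency spectrum are invented for
illustration and are not taken from the documentation data.

```python
def interpolated_types(spectrum: dict[int, int], fraction: float) -> float:
    """Expected number of types in a sample of size fraction * N (0 <= fraction <= 1).

    spectrum maps a frequency m to V(m,N), the number of types that occur
    exactly m times in the full sample of N tokens. Each token is assumed
    to be retained independently with probability `fraction`.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    # A type with frequency m survives unless all m of its tokens are dropped.
    return sum(v_m * (1.0 - (1.0 - fraction) ** m) for m, v_m in spectrum.items())


# Hypothetical spectrum: 10 types occurring once, 2 types occurring twice.
example = {1: 10, 2: 2}

for fraction in (0.25, 0.5, 0.75, 1.0):
    print(f"{fraction:.2f} N: {interpolated_types(example, fraction):.2f} types")
```

The expected number of types never falls as the sample grows, which is the
sense in which this kind of model describes accumulation only: nothing in the
formula lets a type that is present in a smaller sample be absent from a
larger one.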
In fact, as was clear from Chapters 6 and 9, the increase and decrease in
the terms is an explanatory parameter, not the phenomenon to be explained
in itself. What was to be explained was the systematic conceptual patterns
and quantitative structure which govern the formation of possible terms and
the growth of terminology when the terminology data size increases or
decreases. If we take a diachronic perspective, this corresponds to regarding
terminology growth as the addition of new terms to the existing terms,
without being concerned with the disappearance of existing terms. However, it
should be recalled that, in the first place, we explicitly stated that we are
concerned with the dynamic potentiality (of creating new terms) observed
in the synchronic slice of the internal structure of terminology (see Chapter
2, especially Figure 2.2). It would therefore be misleading to interpret the
descriptions of the present study from a diachronic perspective.
How, then, can the contributions of the present study be located within
the overall perspective of term formation and terminological growth? Even
if the study of terminology should first and foremost be concerned with the
terminologies of individual domains, is it not the case that actual
terminologies of different domains sometimes interact with one another? How
can these be incorporated into the framework of the study of synchronic
dynamics? What is more, if terms belong to the sphere of parole, is it not
necessary after all to relate the structural dynamics of terminology to what
is actually happening to the terminology, i.e. evolution and change, in the
real world along a real time scale?
3
Technically, binomial interpolation leaves room to estimate that an item occurring only
once in the original sample size will occur more than twice in a smaller sample size. This
is due to the fact that we used the binomial approximation to the hypergeometric model
(see Baayen [2001: 65-66] for the technical discussion concerning this point) and does not
mean that the binomial model incorporates the generation/decline type of dynamism.
Now that we have obtained the descriptions of the dynamic
potentiality of the terminology of documentation as seen from the synchronic
viewpoint, we can widen the perspective to what was illustrated in Figure 2.2.
Let us conclude the present study by elaborating this and suggesting further
directions of research in the dynamics of terminology.
In the study of the terminologies of more than one domain, there are
two aspects which belong to different levels that must be considered. The
first, a natural extension of the type of study we have carried out, is a
comparative analysis of the dynamics of terminologies of different domains.
The second is the study of the interaction of terminologies.
In the first type of study, at least two levels of analyses are possible:
1. A comparative study which takes into account the overlap of concepts
and lexical items represented by terms in different domains (or
languages). For instance, when Pugh (1984) compares the terminology
of information processing in English, French and Spanish, the
interlingual overlap of concepts represented by terms is taken into account.
From the point of view of the study of the dynamics of terminology, it
is important to carry out the comparison not only between lexical items,
conceptual categories or conceptual specification patterns, but also
between the interactions of these aspects; the comparison of
terminological data alone falls short of the comparison of the dynamics of
terminologies. In that sense, it is essential to describe the patterns of term
formation and terminological growth for the terminologies of individual
domains. Though there are some comparative studies of terminology
across domains or languages (e.g. Miyajima 1981; Kageura 1994), what
they compare is the data rather than the dynamic characteristics of the
data.
2. A comparative study at the level of the formal structure of
terminologies, without taking into account the overlap of concepts, lexical items
or conceptual specification patterns. Let us explain this by means of a
simplified example. Take, for instance, the following three
terminologies, T(A), T(B) and T(C), of the domains A, B and C:
T(A) = {a, ab, abc, bc, bd, acd}
T(B) = {a, b, , e, . abce}
T(C) = {p, pq, pqr, qr, qs, prs}
where elements separated by commas, such as a, ab, etc., in the curly
brackets indicate terms and individual lower-case letters, such as a, b,
etc., indicate constituent elements or morphemes of terms. If we take
into account the overlap of lexical items and concepts represented by
them, then T(A), the terminology of the domain A, is much closer to
T(B) than to T(C) because there are many common lexical items, i.e.
Note that, although JSLS (1997) was the best choice for validation at
the time of writing this book, it has a different editing policy from Wersig
& Neveling (1984) and the field covered is not completely the same (library
science vs. documentation). Thus, the result of the observations cannot be
attributed solely to the gap between the synchronic dynamics and the actual
diachronic change of the terminology. Also, for some conceptual categories
established in Chapter 4, the number of terms in JSLS (1997) is smaller
than that in Wersig & Neveling (1984). This shows that regarding JSLS
(1997) as the result of monotonic accumulation from Wersig & Neveling
(1984) is not completely justified. This said, the comparison between
Wersig & Neveling (1984) and JSLS (1997) would still be interesting for the
immediate purpose of obtaining the general relation between the dynamic
potentiality and the actual change of terminology.
As a rigid examination is neither possible nor very useful given the
differences in the two sets of data, we only examine qualitatively the tendencies
of the conceptual specification patterns for the terms representing three
categories: material entity - animate - people - types (111);
representational entity - documentation entities - simple units (RE312); and quality
(QL). 111 was chosen because the analyses showed that the terms of
this category are expected to follow very coherent formation and growth
patterns. RE312 was chosen because it is one of the central conceptual
categories of the field of documentation and library science, and at the same
time the patterns show reasonable diversity. QL was chosen as a
representation of a non-entity category.
10.2.2.1.1 Material entity - animate - people - types (111)
In JSLS (1997), there are 31 (4 simple, 24 two-item and 3 three-item) terms
that belong to 111 (proper names were excluded). Table 10.1 shows
these terms with their conceptual specification patterns. The terms marked
with an asterisk (*) are also in Wersig & Neveling (1984)5. As observed in
6.1.1.1 (Table 6.1), specifications of functional link (FFUN) and of whole
or affiliations constitute the majority; the formation of 111 terms in
5
Tables 10.1, 10.2 and 10.3 show that the number of terms listed in Wersig & Neveling
(1984) is not large. This seems to reflect the difference in the field of coverage and editorial
policy rather than the creation and/or disappearance of terms. For this reason, we did not
make a quantitative analysis of the relationship between the terms in JSLS (1997) and the
terms in Wersig & Neveling (1984), nor did we merge these two to keep the data in line
with the monotonic accumulation model.
Table 10.1. Material entity - animate - people - types (111) terms in JSLS (1997).
Table 10.2. Representational entities - documentation entities - simple units (RE312) terms
in JSLS (1997).
the prediction made in 9.3.2.3, where a few new patterns were predicted to
occur when new terms of this category are created.
10.2.2.1.3 Quality (QL)
There are 23 (2 simple and 21 two-item) terms that belong to QL. Table 10.3
shows these terms with their conceptual specification patterns. Here again,
the major pattern, the specification of attributed concept (IACO: 18 in total),
is the same as that observed in the QL terms in Wersig & Neveling (see
6.1.5 and Table 6.18). Other than IACO, only two specification patterns
are observed, of which specification of use (FUSE) was observed also in
Wersig & Neveling (1984). On the other hand, the quantitative analysis
in 9.3.5 predicted that few new specification patterns would occur, which
contradicts the data here; the specification of nature (INAT) occurs twice in
JSLS (1997). As in 111 terms, the general tendency of the specification
patterns observed in 6.1.5 holds for the terms in JSLS (1997), but not the
detailed predictions.
sig & Neveling (1984) explain the tendencies observed in JSLS (1997),
though many terms are not listed in Wersig & Neveling (1984). This
proves that the basic starting hypothesis of the present study, formulated
in 2.2.1, is, in principle, valid. We can now formulate the following as an
empirically-valid observation.
On the other hand, there are cases which violate the predictions of the
structural model. There are two possible reasons for this. Firstly, as
mentioned above, the editing policies and the fields of coverage of Wersig &
Neveling (1984) and JSLS (1997) are different. Secondly, even if Wersig &
Neveling (1984) and JSLS (1997) had the same editing policy and the same
coverage (and we thus would have been able to compare the equivalent data
for 1976 and for 1997), it is still expected that the model which is based on
idiosynchronic terminological data would not be able to predict completely
the actual diachronic changes of the terminology. This is because the
former only describes the realistic potentiality as expected from an existing
structure of terminology, while the actual changes of terminology are also
affected by other factors related to language facts or parole. The dynamic
potentiality observed in the existing terminology provides a necessary
condition or a basic restriction in which actual changes of terminology take
place, but does not uniquely determine the formation of new terms.
In order to explore the study of the dynamics of terminology beyond
what has been carried out so far, it is technically necessary to overcome the
first obstacle of obtaining theoretically equivalent data at different times6.
Assuming that the problem of data collection is solved, it is necessary to
6
This causes a serious problem in the study of lexicology. While textual data is
considered to be a "natural" product, lexicological data can be consolidated only as a result of
secondary processing of the "natural" language data. Though theoretically speaking, the
lexicological sphere is not necessarily secondary to the textual sphere, lexicological data
is. One possible way of obtaining equivalent terminological data for different times is to
apply some automatic measures, as proposed in, for instance, Frantzi & Ananiadou (1999),
Hisamitsu, Niwa & Tsujii (2000) or Nakagawa (2001), to the textual data of different times
which are themselves obtained by consistent sampling with proper consideration of such
factors as register variation (Biber 1995).
nised from the use of terms in discourse. These may include, among others,
the following:
1. How microscopic variations of individual terms in discourse affect the
actual change of terminology over time.
2. How the difference in terms across different register variations of text
and discourse affects the change of terminology over time.
3. How inter-domain interaction in discourse affects the structuring,
formation and change of terms and terminology of one domain7.
Figure 10.2. The overall framework of the study of the dynamics of terminology.
At the moment, only a few studies deal with the dynamics of
terminology in relation to the factors related to discourse. The pioneering work
by Tartier (2001) tries to explore the first aspect of the above three,
taking full advantage of the recent availability of corpora and the development
of computational tools. But the potential of her approach has yet to be
fully exploited. Studies such as Shaw & Gaines (1993), Polanco, Grivel
& Royauté (1995) and Ibekwe-SanJuan (1998) touch upon the relation
between terms (and their variations) and variations of texts or interactions of
different domains. But their aim is to clarify the patterns of evolution or
interactions of scientific domains through the observation of terminology; the
direction is opposite. Other studies such as Beaugrande (1996) or Ahmad
(1996) touch on some of these aspects, but they remain mostly speculative.
In addition, presently it is not at all clear how the factors in the sphere of
discourse and in the terminological space can be integrated into a single
modelling of the dynamics of terminology. This may perhaps become clear
only through concrete individual studies, such as Tartier (2001), that deal
with such aspects. Figure 10.2 illustrates the overall framework of the study
of the dynamics of terminology. At the moment, the schema necessarily
remains rough, as the concrete integration of the different types of study is
yet to be established.
7
Note that the interaction of terminology mentioned at the end of 10.2.1 can be studied
at this stage.
The extension of the study may go beyond the linguistically-bound
concept of discourse into the social and cultural sphere. This is the natural
consequence of the fact that terminology is a concept consolidated in the
sphere of parole. From the wider perspective of the study, therefore, the
vast space of the dynamics of terminology remains unexplored.
Looking back from here, it is clear that what is addressed in this book is
only a small step towards the full exploration of the complex world of term
formation and terminological growth. It was, however, an essential step, for
two reasons. Firstly, because the descriptions given in this study constitute
a core part of the overall study of the dynamics of terminology. Secondly,
because it tried to establish the study of the dynamics of terminology as
distinct from the study of the dynamics of data which happen to be a set
of terms. Though this sounds like a mere truism, it has so far largely been
overlooked in much of the otherwise interesting research in terminology.
Appendices
Appendix A
List of Conceptual Categories
AC Activity
AC1 Units of Activities/Activity Indicators
AC2 Specific Activities
AC21 — Action
AC211 Instant Activity
AC212 Static Durative Activity
AC213 Dynamic Durative Activity
AC22 — Transference
AC221 General Transfer
AC2211 Instant Activity
AC2212 Dynamic Durative Activity
AC222 Representational Entity (Information) Transfer
AC2221 Instant Activity
AC2222 Dynamic Durative Activity
AC223 Material Entity (Document) Transfer
AC2231 Dynamic Durative Activity
AC23 — Production
AC231 General Production
AC2311 Instant Activity
AC2312 Static Durative Activity
AC2313 Dynamic Durative Activity
AC232 Representational Entity Production
AC2321 Instant Activity
AC2322 Dynamic Durative Activity
EJXT Juxtaposition
FCOM DET specifies the complementary elements of NUC as a function, by AFO,
PRO, MEA and/or MAN
FDES DET specifies the use of NUC by DES
FFUN DET specifies the functional link of NUC, by FUN, AFO, PRO, MEA and/or
MAN
FROL DET specifies the role of NUC by ROL
FUSE DET specifies the use of NUC, by USE and/or MAN/MEA
IACO DET specifies the concept to which NUC is attributed by
IFOA DET specifies the formal attributes of NUC by FOA
IMAN DET specifies the static manner of NUC by MAN
INAT DET specifies the nature of NUC by NAT
IQUA DET specifies the quantity of NUC by QUA
ODTA DET differentiates NUC by DTA
PCON DET specifies the constituent elements of NUC by CON
PICR DET specifies the information contents and representation of NUC by ICR
PPAR DET specifies the whole or affiliations of NUC by PAR
RLOC DET specifies the location of NUC by LOC
RORI DET specifies the origin of NUC by ORI and MAN or FOA
RSCO DET specifies the scope of NUC by SCO
RSTA DET specifies the status of NUC by STA
RSUB DET specifies the subject of NUC by SSP
RTIM DET specifies the time of NUC by TIM
Appendix C
List of Terms by Conceptual Categories
Each entry consists of (i) conceptual category tag, (ii) Japanese term, and
(iii) English term. "=" shows embedded combinations.
Appendix D
List of Morphemes by Conceptual Categories
Each entry consists of (i) conceptual category tag, (ii) Japanese morpheme
(* is added if the morpheme is itself a term), (iii) English translation and
(iv) occurrence in the corpus.
Bibliography
DeGroot, M. 1984. Probability and Statistics (2nd ed). Reading: Addison Wesley.
Deleuze, G. 1973. "A Quoi Reconnait-on le Structuralisme?" In Châtelet, F. (ed.)
La Philosophie au XXe Siècle. Paris: Hachette. [Nakamura, Y. (trans.) 1998.
"Kouzousyugi ha naze sou yobarerunoka," Nakamura, Y. (trans. & ed.) Nijusseiki
no Tetsugaku (Seiyou Tetsugaku no Chi VIII). Tokyo: Hakusuisya. p. 332-371.]
Desmet, I. and Boutayeb, S. 1994. "Terms and words: Propositions for
terminology," Terminology 1(2), p. 303-325.
Downing, P. A. 1977. "On the creation and use of English compound nouns,"
Language 53(4), p. 810-842.
Ducrot, O. and Todorov, T. 1979. Encyclopedic Dictionary of the Sciences of
Language. Baltimore: Johns Hopkins University Press. [Porter, C. (trans.)]
EDR, 1989. Gainen Jisyo (TR-012). 2nd ed. Tokyo: Japan Electronic Dictionary
Research Institute.
Fabre, C. 1996. "Interpretation of nominal compounds: Combining domain-
independent and domain-specific information," COLING'96, p. 364-369.
Felber, H. 1984. Terminology Manual. Paris: Unesco and Infoterm.
Fellbaum, C. (ed.) 1998a. WordNet: An Electronic Lexical Database. Cambridge:
MIT Press.
Fellbaum, C. 1998b. "A semantic network of English verbs," In Fellbaum, C. (ed.)
WordNet: An Electronic Lexical Database. Cambridge: MIT Press. p. 69-104.
Finin, T. 1980. The Semantic Interpretation of Compound Nominals (TR -96).
Illinois: University of Illinois at Urbana-Champaign.
Foucault, M. 1968. "Sur l'archéologie des sciences: Réponse au cercle
d'épistémologie," Cahiers pour l'Analyse 9 (Généalogie des Sciences), p. 9-40.
[Ishida, H. (trans.) "Kagaku no koukogaku," Michel Foucault Shikou Syusei
3. Tokyo: Chikuma. p. 100-143.]
Frantzi, T. K. and Ananiadou, S. 1999. "The C-value/NC-value method for ATR,"
Journal of Natural Language Processing 6(3), p. 145-179.
Fujii, T. 1976. " 'Doushi+teiru' no imi," In Kindaichi, H. (ed.) Nihongo Doushi
no Asupekuto. Tokyo: Mugisyobo. p. 97-116.
Fujita, K. 1984. Nihongo Meishi Renzoku Fukugougo no Kouzou Kaiseki. MSc
Thesis, Kyoto University.
Fujiwara, S. and Fujiwara, Y. 1987. Sougou Gakujutsu Yougosyu. Tokyo:
Kinokuniya.
Gale, W. A. and Sampson, G. 1995. "Good-Turing frequency estimation without
tears," Journal of Quantitative Linguistics 2(3), p. 217-237.
Geeraerts, D. 1994. "Lexicology," In Asher, R. E. (ed.), The Encyclopedia of
Language and Linguistics 4. Oxford: Pergamon Press. p. 2189-2192.
Good, I. J. 1953. "The population frequencies of species and the estimation of
population parameters," Biometrika 40(3-4), p. 237-264.
Good, I. J. and Toulmin, G. H. 1956. "The number of new species, and the increase
in population coverage, when a sample is increased," Biometrika 43(1), p. 45-63.
Greimas, A. J. 1966. Sémantique Structurale: Recherche de Méthode. Paris:
Larousse. [Tajima, H. and Torii, M. (trans.) 1988. Kouzou Imiron: Houhou
. Tokyo: Kinokuniya.]
Guha, R. V. and Lenat, D. B. 1990. "Cyc: A mid-term report," AI Magazine 18(3),
p. 32-59.
Hatcher, A. G. 1960. "An introduction to the analysis of English noun
compounds," Word 16(3), p. 356-373.
Hayashi, C. 1993. Koudou Keiryougaku Josetsu. Tokyo: Asakura.
Hempel, C. G. 1952. Fundamentals of Concept Formation in Empirical Science.
Chicago: University of Chicago Press.
Herdan, G. 1960. Type-Token Mathematics: A Textbook of Mathematical
Linguistics. 's-Gravenhage: Mouton.
Hisamitsu, T., Niwa, Y. and Tsujii, J. 2000. "A method of measuring term
representativeness - baseline method using co-occurrence distribution,"
COLING'2000, p. 320-326.
Hockett, C. F. 1958. A Course in Modern Linguistics. New York: Macmillan.
Hoffmann, L. 1979. "Towards a theory of LSP," Fachsprache 1(1/2), p. 12-17.
Ibekwe-SanJuan, F. 1998. "Terminological variation, a means of identifying
research topics from texts," COLING-ACL'98, p. 564-570.
Ishii, M. 1986. "Fukugou meishi no gokouzou bunseki ni tsuite no ichi kousatsu
— gakujutsu yougo wo rei ni —," Kokugogaku 144, p. 13-26.
Ishii, M. 1987a. "Fukugou meishi no kouzou to kinou," Mizutani, S., Tajima, K.,
Satake, H., Nomura, M., Ishii, M., and Kabashima, T. Moji-Hyouki to Gokousei
(Asakura Nihongo Shinkouza 1). Tokyo: Asakura. p. 145-173.
Ishii, M. 1987b. "Economy in Japanese scientific terminology," In Czap, H. and
Galinski, C. (eds.) Terminology and Knowledge Engineering. Frankfurt: Indeks
Verlag. p. 123-136.
Ishikawa, T., Sakamoto, Y. and Sato, M. 1986. "Mu projekuto ni okeru imi ma-ka
no gainen to taikei," Jouhou Syori Gakkai u Houkoku Shizengengo Syori
Kenkyukai Houkoku 55-1. np.
ISO 5127-1 1983. Documentation and information— Vocabulary — Part I: Basic
Concepts. Geneva: International Organization for Standardization.
ISO 704 1987. Principles and Methods of Terminology. Geneva: International
Organization for Standardization.
Jackendoff, R. 1983. Semantics and Cognition. Cambridge, Mass: MIT Press.
Jackendoff, R. 1987. Consciousness and the Computational Mind. Cambridge,
Mass: MIT Press.
Jackendoff, R. 1990. Semantic Structures. Cambridge, Mass: MIT Press.
Vachek, J. 1983. "On some less known aspects of the early Prague linguistic
school," In Vachek, J. and Duskova, L. (eds.) Praguiana: Some Basic and Less
Known Aspects of the Prague Linguistic School. Amsterdam: John Benjamins.
Warren, B. 1978. Semantic Patterns of Noun-Noun Compounds. Lund: Acta
Universitatis Gothoburgensis.
Way, E. C. 1991. Knowledge Representation and Metaphor. Dordrecht: Kluwer
Academic.
Wersig, G. and Neveling, U. (eds.) 1976. Terminology of Documentation. Paris:
Unesco.
Wersig, G. and Neveling, U. (eds.) 1984. UNESCO Jouhou Kanri Yougosyuu.
Tokyo: JICST and NIPDOC. [Japanese Version of Wersig, G. and Neveling, U.
(eds.) 1976]
Wierzbicka, A. 1996. Semantics: Primes and Universals. Oxford: Oxford
University Press.
Wüster, E. 1959/60. "Das Worten der Welt, schaubildlich und terminologisch
dargestellt," Sprachforum 3(3), p. 183-204.
Yoshikane, F., Tsuji, F., Kageura, K. and Jacquemin, C. 1999. "Detecting Japanese
term variation in textual corpus," The Fourth International Workshop on
Information Retrieval with Asian Languages (IRAL'99), p. 97-108.
Young, H. (ed.) 1983. ALA Glossary of Library and Information Science.
Chicago: American Library Association.
Young, H. (ed.) 1988. ALA Tosyokan Jouhougaku Jiten. Tokyo: Maruzen.
[Japanese Version of Young, H. (ed.) 1983]
Yule, G. U. 1944. The Statistical Study of Literary Vocabulary. Cambridge:
Cambridge University Press.
Yumoto, S. 1977. "Awasemeishi no imikijutsu wo megutte," Tokyo Gaikokugo
Daigaku Ronsyuu 27. p. 31-46.
Yumoto, S. 1979. 'Awasemeishi no kouzou — n + n taipu no wago meishi no baai
—," In Gengogaku Kenkyukai (ed.) Gengo no u. Tokyo: Mugisyobo. p.
367-395.
Zawada, B. and Swanepoel, P. 1994. "On the empirical inadequacy of
terminological concept theories: A case for prototype theory," Terminology 1(2),
p. 253-275.