Grammatical Idioms
It has been pointed out on more than one occasion (see especially Čermák 1998, 2000) that multi-word
lexemes (MLs) exist for all word classes, forming an important extension and continuation of single-word
items. Functionally, MLs take over the role of the single-word items of a given word class where these are
insufficient, while the function itself remains stable and in continued demand. Moreover, it can easily be shown
that these MLs amount to idioms of a special type, as there is little freedom in their formation and a pronounced
anomaly in their structure. Compared with the traditional, better-studied areas of idioms, they represent a field
that remains very much in the shadow.
In most cases, grammars do not take their existence and special nature into consideration, and if they do, they
use all sorts of labels for them, which suggests that MLs enjoy no uniform interpretation. Thus, Quirk's
grammar (Quirk et al. 1972) usually has no use for „compound/complex“ grammar words; where it does, the
terms „compound subordinator“ and „correlative subordinator“ (11.9) are used. It thus avoids the term
„compound conjunction“, which is what cases such as as if, as soon as, so that, in order that, as far as undoubtedly are.
The only other case where grammar MLs are taken into account and their special quality is recognized is that of
prepositions, which Quirk et al., however, choose to call „prepositional phrase“ and, on one
occasion, „complex preposition“, cf. except for, with the exception of, apart from, etc. (Quirk 6.48, 10.24). There
is no mention of MLs functioning as pronouns or particles (the latter being due to a tradition of grammatical
description that does not recognize particles in English), nor as any other word class.
A very similar approach, with no unified view or label, is to be found elsewhere, too. More recently, Biber et
al. (1999) mention the rather non-committal term „multi-word lexical units“ (pp. 58-9, 85-6), which is later
replaced by „complex preposition“, „correlative coordinator“ and „complex subordinator“. For
other word classes, only „phrasal and prepositional verbs“ is offered. In the case of nouns, adjectives, etc.,
quasi-generative terms are used, such as „noun phrases, prepositional phrases, wh-questions“, etc. However, it is
not quite clear whether these cover text constructions only or stable system units, too.
Czech grammars do not do any better in this, multi-word grammatical lexemes not being recognized at all (for
a recent grammar, see P. Karlík et al., 1995).
It is not very difficult to list idioms that are functionally equivalent to all major word classes, such as verbs,
nouns or adverbs (for Czech, see Čermák 1998), as there seem to be, on the subsentential level, exactly as
many idiom types as there are word classes, cf. rub someone's nose in it, change horses in midstream,
skeleton in the cupboard, an Indian summer, with hands down, in the middle of nowhere. These stand, in pairs
and respectively, for a verb, a noun and an adverb; evidently, such idioms form the bulk of all idioms and draw
most of the descriptive attention. Grammar idioms, equivalent to grammar words, namely prepositions,
conjunctions, particles (if recognized) and pronouns, are much rarer.
Potential candidates for inclusion can be seen in English and Czech cases such as
preposition: as to, on behalf of, with the exception of; na úkor někoho/něčeho, s výjimkou někoho/něčeho,
conjunction: as if, even though, in order that, as long as; i když, jak – tak, v souvislosti s tím, že,
particle: all right, as well; že by, jen jestli, co kdyby,
pronoun: anyone who, those who; ten kdo, to co, kdokoliv který,
although one may easily see an interjection in cases such as All right!, For God's sake!, too. Numerals are rather
special due to their well-defined meaning; however, some traces of idiom formation can be found here, too (Čermák
1998 and 2000). This is basically where the attention paid to these cases, if any, stops. However, one can easily
delve deeper and suggest, for example, that English question tags be viewed as a kind of particle, too, since they are
paradigmatic in their structure and mostly express an attitudinal function, as in John has got the book, hasn't he?
This is just one example of how easily the area may be extended, despite what the grammatical tradition says.
Generally, the functionality of the above types is due to their use in the same functions or functional slots as
those occupied by their better-known single-word counterparts. Compare on behalf of in a sentence such as
Thank you for the bouquet of flowers sent to me on behalf of all our local branch,
where the basic syntactic structure and function is that of linking a VERB (thank) and NOUN (branch), with the
simple, though not identical, use of for to be found in the same example, where for occupies the same position
and has the function of a preposition.
A theory of idioms (Čermák 1994a, 1994b, 2001), which has been used for an extensive description of the
Czech language for decades now and which may be considered general, may be summed up by a working
definition: an idiom is a unique and fixed combination of at least two elements, at least some of which do
not function in the same way in any other combination or combinations of the kind, or occur in a highly
restricted number of them, or in a single one only. One consequence of this approach, which is based on
commutation (of an idiom's components) and uses anomaly as the basic criterion, is the possibility of
testing combinations of items as to their idiomatic character, class membership, restricted collocability,
idiom identification (Čermák 1998b) and, finally, inclusion into the system of idioms. In the following, this
approach will be illustrated on multi-word idiomatic prepositions, a rather familiar type of
Grammar Idioms, listed quite often in handbooks (though more often in dictionaries than in grammars).
These prepositions seem to be quite general, at least in Indo-European languages, and in wide use. Thus,
equivalents of the English prepositions in comparison with/to (in contrast to/with) are easily found
elsewhere, cf. French en comparaison de, par comparaison à/avec, Italian in confronto a, Spanish en
comparación con, Czech ve srovnání s, Russian в сравнении с, Polish w porównaniu z, Serbian u poređenju sa,
Swedish i jämförelse med, German im Vergleich mit, Dutch in vergelijking met, Greek συγκρινόμενος με, etc.
However, languages that prefer postpositions to prepositions, such as Finnish, use, in accordance with their
different structure, a functionally equivalent postposition: johonkin verrattaessa/verrattuna.
Quite a lot of what might be considered multi-word prepositions has been gathered by hand until now (see
recently Klégr 2002), sometimes under different labels (see, e.g., the New Oxford English Dictionary, which calls
them just phrases). Since one can never be sure whether such a list is exhaustive, an attempt has been made at
their automatic discovery and subsequent recognition in a corpus. Looking at the structure of a typical
preposition of this type, such as in comparison with/to, two classes of constituents are evident: single-word
preposition(s) and an abstract noun. While it is easy to list the prepositions, making a list of abstract nouns for a
test requires a frequency dictionary, as only nouns of a rather high frequency seem to occur here. The
procedure has been applied to a list of Czech and, to a lesser degree, English nouns, using large corpora of
the same size, the Czech National Corpus (CNC, Čermák 1997) and the British National Corpus (BNC).
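The discovery step described above can be pictured as a search for preposition, noun, preposition sequences. The following is only a minimal sketch under stated assumptions: the function name `count_candidates`, the short word lists and the toy sentence are hypothetical, and the study itself worked on tagged corpora and frequency-ranked abstract nouns rather than on raw token strings.

```python
from collections import Counter
from itertools import product

def count_candidates(tokens, first_preps, nouns, second_preps):
    """Count prep-noun-prep trigrams in a list of tokens.

    Simplification: articles between the first preposition and the noun
    are not handled here, so only bare sequences are matched.
    """
    candidates = set(product(first_preps, nouns, second_preps))
    counts = Counter()
    for trigram in zip(tokens, tokens[1:], tokens[2:]):
        if trigram in candidates:
            counts[trigram] += 1
    return counts

# Toy input (an invented sentence, not corpus data).
tokens = "in comparison with last year , prices rose in comparison with wages".split()
counts = count_candidates(
    tokens,
    first_preps={"in", "by"},
    nouns={"comparison", "accordance"},
    second_preps={"with", "to"},
)
print(counts.most_common())
```

Candidates found this way would still need the manual check for stability and prepositional function.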
Results obtained from a tagged corpus had to be hand-checked to make sure that the combinations of the noun
and preposition(s) were what one was looking for, namely stable prepositions, and not mere text entities and free
combinations (see Čermák 1998a). Generally, high frequency is a good criterion for inspecting and determining
candidates. Of course, the overall functional framework of the preposition has to be
safeguarded, i.e. one in which a preposition serves as a link either
Next to introspection, which is difficult to apply in less frequent cases (where it might fail), useful support
may be found in statistics, such as the MI-score. However, a major problem, which for now can only be solved
manually, is to distinguish between cases of the noun used as part of a genuine multi-word preposition and
cases of its use in free combinations. Cases such as
Usually I keep quiet in case people think I'm crazy.
or a curt reply of the kind
Just in case, sir.
can hardly be included among the uses of a multi-word preposition. The same holds for Czech cases such as
Tento manévr se užívá v případě, kdy nelze zjistit cíl (This manoeuvre is used in the case when it is impossible
to find out the aim.).
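The text does not say which variant of the MI-score was used. The sketch below assumes the common pointwise formulation, MI = log2(f(x,y) · N / (f(x) · f(y))), where f(x,y) is the frequency of the combination, f(x) and f(y) the frequencies of its parts and N the corpus size. The function name `mi_score` and the round figures in the usage line are invented for illustration and are not taken from the BNC or the CNC.

```python
import math

def mi_score(pair_freq, freq_x, freq_y, corpus_size):
    """Pointwise mutual information (base 2) of a two-part combination."""
    if min(pair_freq, freq_x, freq_y, corpus_size) <= 0:
        raise ValueError("all frequencies must be positive")
    return math.log2(pair_freq * corpus_size / (freq_x * freq_y))

# Invented figures: the pair occurs 8 times, its parts 16 and 32 times,
# in a corpus of 1024 tokens.
example = mi_score(pair_freq=8, freq_x=16, freq_y=32, corpus_size=1024)
print(example)
```

The score rises when the noun occurs mostly inside the combination, and it is known to favour rare items. This is one reason why such a score can support, but not replace, the manual check.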
The search may be made easier if the constituent noun has a severely restricted collocability and is found inside
such an ML preposition only. However, there are not many nouns that are used in the given
prepositional construction exclusively or almost exclusively, such as the Czech úkor in na úkor (at the expense of)
or the English accordance in in accordance with.
It is easy to see (1) a unified pattern used for the bulk of multi-word (or complex) prepositions, represented by
the English in comparison with/to and its Czech equivalent ve srovnání s, and (2) a much smaller group
formed by diverse means and ways, represented by according to, co do (=as to/for), etc. Evidently, the former
group, which follows a clear pattern and is probably an open class (in the long run), is formed
rather regularly and has a pronounced paradigmatic character, which is likely to be preferred for any further
formation. In contrast to these (1) paradigmatic prepositions, the rest are non-paradigmatic. Non-paradigmatic
prepositions (2), such as opposite to, depending on, according to, as to, next to, along with, prior to, relative to,
due to and, in Czech, co do, počínaje od, spolu s, vzhledem k, tváří v tvář, etc., are anomalous in that there is no
regularity in their formation at all, being variously based on adjectives, verbs, conjunctions,
prepositions, etc. The general paradigmatic character of group (1), which follows a common pattern in its
formation, does not, however, contradict the anomaly of its individual idioms. Thus, following the test mentioned
above, it is not possible to modify the combination in comparison with so as to arrive at other meaningful
prepositions, such as *in comparison against/for/on/through, etc. Equally, other combinations, such as
through/about/before comparison with, etc., are mere syntactic constructions having no prepositional function.
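The commutation test applied above to in comparison with can be pictured as a small substitution routine. This is a hedged sketch only: the function name `commute_final` is hypothetical, and the set of attested forms is entered by hand from the variants named in the text (in comparison with/to), whereas in practice attestation would have to come from corpus evidence and manual inspection.

```python
def commute_final(preposition, substitutes, attested):
    """Swap the final component for each substitute and report attestation."""
    head = preposition.split()[:-1]
    outcome = {}
    for substitute in substitutes:
        variant = " ".join(head + [substitute])
        outcome[variant] = variant in attested
    return outcome

# Attested variants as named in the text; assumed exhaustive for this toy run.
attested = {"in comparison with", "in comparison to"}
result = commute_final(
    "in comparison with",
    ["to", "against", "for", "on", "through"],
    attested,
)
for variant, ok in result.items():
    print(("" if ok else "*") + variant)
```

Only the variant with to survives, which mirrors the asterisked forms given in the text.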
Evidently, paradigmatic multi-word prepositions (type (1) above) use the same general structure in Czech
(Čermák 1996) and English, with some minor variation, namely
prep - NOUNabstr - (prep)
where prep is a single-word preposition and NOUNabstr an abstract noun. The variation consists in the omission
of the second preposition in Czech (the missing link being supplied by a valency case), as in z pohledu,
resulting in two subtypes here (A, B), or, as in English, in variation of both the number of the noun and the use
of the definite/indefinite article in front of it, as in on the side of, by the side of, on the sides of (BNC examples).
Prepositions used in the structure are invariably those having a core place in the language and a very high
frequency (see Appendix 2).
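The English variation in article and number can be captured, very roughly, by a regular expression built over the skeleton of first preposition, optional article, abstract noun and second preposition. The sketch below is illustrative only: `paradigm_pattern` is a hypothetical helper, the plural is approximated naively by an optional -s, and the test sentence is invented around the three BNC forms quoted in the text.

```python
import re

def paradigm_pattern(first_preps, noun, second_preps):
    """Build a regex for: prep (article)? noun(s)? prep."""
    first = "|".join(map(re.escape, first_preps))
    second = "|".join(map(re.escape, second_preps))
    return re.compile(
        rf"\b(?:{first})\s+(?:(?:the|a|an)\s+)?{re.escape(noun)}s?\s+(?:{second})\b",
        re.IGNORECASE,
    )

pattern = paradigm_pattern(["on", "by"], "side", ["of"])
# Invented sentence containing the three forms cited in the text.
text = "He stood on the side of the road, by the side of a car, on the sides of the hill."
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
```

A pattern of this kind only finds surface candidates and cannot tell a stable preposition from a free combination.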
The search for paradigmatic prepositions has been limited to the inspection of those abstract nouns (broadly, those
with a non-concrete denotatum) that rank rather high in the frequency dictionaries used (Čermák-Křen 2004,
Leech-Rayson-Wilson 2001). To draw a line, an arbitrary decision has been made not to go beyond the first one
hundred abstract nouns (see below). It seems that the majority of multi-word prepositions are to be found just there.
The first 100 abstract nouns show a remarkable similarity in both languages. While Czech uses 37 of these nouns
(out of 100) for multi-word preposition formation, English uses 45. However, the 37 Czech nouns
take part in 72 prepositions of this kind, while the 45 English nouns participate in 68 prepositions. For Czech,
10 of these nouns (such as případ=case) give over 10% of their total frequency to this use,
while in English the situation is somewhat different: only 5 nouns (such as form) surrender over
10% of their total frequency to this collocationally or, rather, idiomatically bound prepositional use.
The Czech abstract nouns having this function and being from the given frequency section include rozdíl
(difference), strana (side), oblast (area), případ (case), centrum (centre), začátek (beginning), konec (end),
úroveň (level), doba (time), pomoc (aid/help), while the English ones are represented by result, view, need, form,
end, use, centre, course, support, event, etc. See the table in Appendix 1.
The collocational extremes mentioned above may be illustrated by the English nouns accordance, 1939 of whose
1965 occurrences in the BNC belong to the multi-word idiomatic preposition in accordance with
(99%), and conjunction, 1225 of whose 1415 occurrences are used in in conjunction with (86.5%). Czech
examples are similar: úkor (1670 of 1945 occurrences) as in na úkor (85%) and přihlédnutí (493 of 557
occurrences) as in s přihlédnutím k (88%).
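The percentages quoted for these collocational extremes are simple shares of a noun's total frequency. The following sketch recomputes them from the occurrence counts given in the text; the function name `bound_share` and the dictionary layout are illustrative choices, not part of the original procedure.

```python
def bound_share(prepositional_uses, total_occurrences):
    """Percentage of a noun's occurrences bound in a multi-word preposition."""
    return 100 * prepositional_uses / total_occurrences

# (uses inside the preposition, total occurrences of the noun), from the text.
figures = {
    "in accordance with": (1939, 1965),
    "in conjunction with": (1225, 1415),
    "na úkor": (1670, 1945),
    "s přihlédnutím k": (493, 557),
}
shares = {
    prep: round(bound_share(uses, total), 1)
    for prep, (uses, total) in figures.items()
}
for prep, share in shares.items():
    print(f"{prep}: {share}%")
```

The recomputed values fall within about one percentage point of the figures cited, which appear to have been rounded.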
While for English this has not been researched, it seems to hold for Czech that in order to take part in such a
preposition the noun has to have a frequency higher than or equal to 500; if the frequency of the noun is lower, it
tends to form part of a different type of idiom, usually that of the verb type, such as postavit na roveň s něčím (to
put it on a par with, noun roveň).
It seems that certain types of nouns tend not to participate in these prepositional structures.
Thus, work, week and nouns denoting the names of months are not used in this function in either Czech
or English.
Some of the English types are paradigmatic, including the following basic structures, each of which forms one or
more multi-word prepositions. Here, only a skeleton structure can be given (illustrated by a single example only),
while the abstract noun (N) is left out. However, the figures showing the numbers of prepositional cases we found
in the BNC are much higher than the number of those that the BNC tags as prepositions; only some 60 paradigmatic
prepositions are marked as prepositions, while many more could in fact be included. Thus,
In practical life, these structures are represented by various specific ML prepositions. Thus, for example, for
the structure in - N - of , one can easily recall prepositions such as in the centre of, in the course of, in the event
of, in the form of, in use of, in the view of, etc.
However, these figures should be viewed with some reservation, since they are based on probes in the BNC only.
Out of a suspected 400 prepositions or more in the given span, the BNC specifically mentions just a few, offering no
information about the bulk of similar combinations with a prepositional function (in all, Klégr 2002 records over
1000 of what are traditionally called complex prepositions and candidates for these). A counter-search has
been made, as indicated above, namely through the collocations of abstract nouns selected on the basis of
their frequency, such as result: as – of, of – of, of – in, of – from and so on. The only real multi-word preposition
here seems to be the one using the first structure, as a result of, with a frequency of 5498, thus accounting for
25% of the almost 22 000 occurrences of result (more exactly, 25.1% of 21934 occurrences). Other prepositions or
candidate prepositions have a much lower frequency in the BNC, below 200.
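The counter-search starts from the noun and collects the preposition frames around it, such as as – of for result. A minimal sketch of this idea follows. It assumes a plain token list, skips at most one article before the noun, and uses a hypothetical function name `frames_for_noun` and an invented toy text, so it stands in for, rather than reproduces, the BNC query actually used.

```python
from collections import Counter

ARTICLES = {"a", "an", "the"}

def frames_for_noun(tokens, noun, prepositions):
    """Count (left preposition, right preposition) frames around a noun."""
    frames = Counter()
    for i, token in enumerate(tokens):
        if token != noun or i + 1 >= len(tokens):
            continue
        j = i - 1
        if j >= 0 and tokens[j] in ARTICLES:
            j -= 1  # step over a single article before the noun
        if j >= 0 and tokens[j] in prepositions and tokens[i + 1] in prepositions:
            frames[(tokens[j], tokens[i + 1])] += 1
    return frames

# Invented toy text; the last "result of" has no preposition on its left.
tokens = (
    "as a result of the delay and as a result of the strike "
    "the result of the vote"
).split()
frames = frames_for_noun(tokens, "result", {"as", "of", "in", "from"})
print(frames.most_common())
```

Dividing the count of the dominant frame by the noun's total frequency gives the kind of share reported for as a result of.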
Moving on, other combinations, such as in - N - of and under - N - of may be viewed as very potent and, in
fact, productive structures, too. It is interesting to note that those using the preposition in in the first position
collocate with abstract nouns of highest frequency, while those using the preposition under all use nouns whose
frequency is much lower, under 100, such as aegis, auspices, direction, guise, influence, etc.
A similar survey of prepositional structures has been made for Czech, too, i.e. through a search of the CNC. Here,
cases having no second preposition (replaced instead by a mere case ending of a noun or pronoun) are distinguished
(A, see above) from those that have it (B). Again, the skeleton structures do not give the abstract nouns (N), and
the abbreviations stand for case endings (Gen-Genitive, Dat-Dative, Acc-Accusative, Loc-Locative,
Instr-Instrumental).
Subtype A
For the Czech language, which in this respect has been explored more, additional figures can be offered
showing case distribution, too. Thus, the highest number of multi-word prepositions (103) use Locative case
after the noun and the primary preposition v (=in) in the first position of the structure (as in v oblasti Loc „in the
area of“), followed by those using the preposition pod (=under) and Instrumental case (45), na (=on) and
Accusative case after the noun (43), z (=from) and genitive case (31), za (=for) and genitive (25) and s (=with)
and Instrumental case (19).
Subtype 1B
It has the following primary prepositions in its initial position: bez, do, k, na, o, od, po, pod, pro, při, s, u, v, z,
za, while the final position, following a noun, uses somewhat different prepositions (not to be found in subtype
A) do, k, na, o, od, pro, proti, s, za. This makes the following combinations (illustrated by a single example,
however):
Prepositions v (in) and na (on/at) are highest on the frequency list in Czech as well as in CNC. Hence their
overwhelming share in the construction of ML prepositions, too.
4. Conclusions
Obviously, there is an area of idioms that, due to their function, might be called Grammar idioms, based on the
same type of anomaly as any other type of idiom, where this general label stands for multi-word prepositions,
conjunctions, etc. Accordingly, all of these classes should be viewed and labeled in the same way. Since the classes
of their constituents are closed paradigms, to use Louis Hjelmslev's term (except for nouns, that is), they might
easily be researched in their combinations and checked against corpus data. However, some of the combinatory
results, as the case of ML prepositions shows, naturally go down in frequency, and it might be difficult to discern
them. A discovery procedure using, for example, the MI-score tells us that this depends very much on the
frequency of the constituent single-word prepositions and on the share of the total frequency of the constituent
noun taken up by its prepositional use. Although the decision where to ultimately draw the line is somewhat
arbitrary, there is no avoiding the frequency of the structures researched. Hopefully, such an approach might help
one to pin down exhaustively the candidates for inclusion in the grammar idiom list, the structures of these items
and, perhaps, the shape of the border area where grammar and lexicon are in touch and seem to merge.
Other types of Grammar idioms may not have the kind of regularity in their dominant structures that ML
prepositions have. If so, at least a combinatory calculus of their constituents, whose number tends to be finite,
can be set up and checked against a corpus and, perhaps, regularities of a different type found.
In conclusion, a preliminary calculation may be given showing that Czech and English are rather similar. In
the ML prepositions, the ratio between paradigmatic and non-paradigmatic types seems to be
Paradigmatic : Non-paradigmatic
Note: the frequencies of the English prepositions are lower as they are partly included in the frequencies of
multi-word preposition lemmas, such as in front of etc.
●●●●●
Summary:
Grammatical Idioms
For all word classes there exist important extensions of their single-word representatives in the form of multi-word
phraseological combinations; both types are naturally linked by the same function. The existence of these idioms of
a grammatical nature, which display all the basic properties of idioms, is due to the need for further, new members
of an old functional class with a specialized meaning.
In both the English and the Czech grammatical tradition, their existence is acknowledged only in part and under
various labels, or it is not acknowledged at all and grammars pretend that they do not exist.
In general, at the level of word classes, one can find as many types of (collocational) idioms as there are word classes.
Besides the well-known idioms functionally equivalent to autosemantic words, one can also identify idioms
functionally equivalent to synsemantic words, cf. the English and Czech examples
preposition: as to, on behalf of, with the exception of; na úkor někoho/něčeho, s výjimkou někoho/něčeho,
conjunction: as if, even though, in order that, as long as; i když, jak – tak, v souvislosti s tím, že,
particle: all right, as well; že by, jen jestli, co kdyby,
pronoun: anyone who, those who; ten kdo, to co, kdokoliv který,
The theoretical framework used here is the same as that used elsewhere for Czech phraseology. That this phenomenon
is by no means limited to two languages is shown by a comparison of the Czech ve srovnání s něčím/někým with a few
other languages:
Eng. in comparison with/to, Fr. en comparaison de, par comparaison à/avec, It. in confronto a, Sp. en
comparación con, Rus. в сравнении с, Pol. w porównaniu z, Serb. u poređenju sa, Swed. i jämförelse med, Ger. im
Vergleich mit, Du. in vergelijking met, Gr. συγκρινόμενος με, etc., and also the analogous Finnish form, where,
however, a postposition is necessarily used instead of a preposition, cf. johonkin verrattaessa/verrattuna.
An open question is to what extent an automatic method of identifying these idioms can be found if a corpus is used.
It seems that, by combining methods, this is possible to a certain degree in some cases. In what follows, we
concentrate on prepositions.
These can be divided into (1) paradigmatic ones, of the type ve srovnání s, in comparison with, and (2)
non-paradigmatic ones, of the type opposite to, depending on, according to, as to, next to, along with, prior to,
relative to, due to; co do, počínaje od, spolu s, vzhledem k, tváří v tvář, whose formation follows no precedent and
which are therefore more anomalous than the former. Yet the first type also shows clear anomaly: compare the single
idiom ve srovnání s with the possible, but non-established and functionally different, combinations pro srovnání s,
na srovnání s, ze srovnání s, or the impossible *pod srovnáním pro, *ke srovnání za, etc. The structure of
paradigmatic prepositions is thus, as can be seen, prep - NOUNabstr - (prep).
Of the first 100 abstract nouns selected by frequency, 37 enter this construction in Czech, while in English the
figure is 45. If we generalize the structures and the actual prepositions found in the BNC and the CNC, and record
them by number only, we get for English and Czech (for the latter, first without the second preposition and then with it)
Structure               Example                     Number of prepositions
after - N - of          after the date of           10
against - N - of        against the backdrop of     4
as - N - of             as a result of              38
at - N - of             at the time of              46
behind - N - of         behind the mask of          3
between - N - of        between the ages of         8
beyond - N - of         beyond the reach of         17
by - N - of             by the time of              21
by - N - with           by comparison with          5
for - N - of            for the use of              19
for - N - in            for the use in              2
from - N - of           from the centre of          18
in - N - of             in case of                  47
in - N - with           in line with                19
in - N - to             in addition to              14
into - N - with         into line with              6
on - N - of             on the side of              42
on - N - to             on the way to               4
out of - N - of         out of sight of             6
out of - N - with       out of touch with           5
over - N - of           over a period of            8
through - N - of        through the use of          6
to - N - of             to the power of             20
under - N - of          under the control of        52
up to - N - of          up to the value of          8
upon - N - of           upon the basis of           4
with - N - to           with regard to              16
with - N - of           with the development of     24
within - N - of         within the framework of     25
without - N - of        without the support of      17
It is clear that the method used needs to be elaborated further, in particular as regards the extent to which it is
applicable beyond prepositions.
In conclusion, the ratios of paradigmatic to non-paradigmatic prepositions in both languages, cf.
Paradigmatic : Non-paradigmatic
Czech abstract noun   Noun frequency   Multi-word preposition   Preposition frequency   Share of noun frequency (%)
rozdíl                23087            na rozdíl od             8937                    49
                                       bez rozdílu              302
strana                130835           ze strany                126                     30.5
                                       stranou od               4097
oblast                44806            v oblasti                9850                    22
případ                78228            v případě                13056                   17.5
                                       pro případ               686
centrum               28962            do centra                397                     15.3
                                       v centru                 3113
začátek               24655            na začátku               3724                    15
konec                 57888            na konci                 6538                    14.4
                                       ke konci                 1854
úroveň                26162            na úrovni                2319                    11.6
                                       na úroveň                726
doba                  111053           11 prepositions          11069                   10.4
pomoc                 28962            s pomocí                 1633                    10
                                       za pomoci                1210