CHAPTER SEVEN
OPTIMALITY THEORY
Introduction
Optimality Theory, also known as OT, is relatively new among the
family of established phonological theories such as Generative
phonology (Chomsky and Halle, 1968), Natural generative phonology
(Hooper, 1976), Natural phonology (Stampe 1979), Autosegmental
phonology (Goldsmith, 1976, 1979), Lexical phonology (Strauss,
1982), Metrical phonology (Liberman, 1985), Dependency phonology
(Anderson, Ewen & Staun, 1985) and Prosodic phonology (Nespor
and Vogel, 1986). Although OT started off as a phonological theory
and has had its widest applications in phonology, it has also been
extended to other aspects of language; hence, it also has applications
in syntax, semantics and sociolinguistics (McCarthy, 2007). However,
these applications remain rare in comparison with its use in
phonology. The central argument in favour of OT is its explanatory
adequacy compared with that of rule-based theories.
A number of studies prior to the advent of OT had noted that rule-
based phonology lacked the mechanism to explain relationships
between the outputs of phonological rules. Kisseberth (1970) was the
first of such studies. It argued that SPE’s rule-ordering and bracketing
conventions did not always select the sets of rules which had a natural
relationship with one another, because such a relationship was based
on the similarity between the outputs that the rules produced.
Blumenfeld (2006, p. 1) notes that: “Over the years, Kisseberth’s
original insight was developed, culminating in the realisation that the
weight and importance of output-based conspiracies was too great for
standard rule-based theory to handle.” Moreover, by the close of the
1980s, a consensus had emerged among leading scholars concerning
the importance of output constraints, although it was not yet clear what
the nature and functions of these constraints were. The foregoing
arguments became the springboard for the emergence of OT.
Originators of Optimality Theory
OT was originally proposed by Alan Prince and Paul Smolensky in
1993, growing out of a course they taught at the Summer Institute
of the Linguistic Society of America. Although Paul Smolensky was
one of the two persons that proposed OT, a phenomenal theory of
language, he did not have a background in linguistics. Paul was born
on May 5, 1955. He was educated at Harvard University, where he
received an A.B. in Physics (1976), and at Indiana University
Bloomington, where he received an M.S. in Physics (1977) and a
Ph.D. in Mathematical Physics
(1981). He received the Rumelhart Prize in 2005 for his pursuit of the
ICS Architecture, a model of cognition that aims to unify
Connectionism and Symbolism.
The theory was later expanded by McCarthy and Prince (1995) and
McCarthy (2001). This expanded version was applied in McCarthy’s
(2008) book titled Doing Optimality Theory: Applying Theory to
Data.
The Rudiments of OT
OT has three basic components which it assumes to be universal.
These are: GEN (from generator), CON (from constraint) and EVAL
(from evaluator). GEN generates the list of potential outputs or
possible candidates. CON provides the criteria, that is, violable
constraints, used to decide between candidates; while EVAL chooses
the optimal candidate based on the constraints.
           CON-A   CON-B   CON-C   CON-D
  Cand1     *!               *
☞ Cand2                              *
  Cand3     *!       *       *
  Cand4                      *
Table 1.1: A hypothetical OT tableau
In OT convention, an asterisk (*) in a candidate’s column means that
the candidate violates the given constraint; an exclamation mark (!)
means a fatal violation, leading to the exit of that candidate from the
‘competition’; while the pointing finger (☞) signifies the winning
candidate. The winner does not have to be a ‘saint’: even if it commits
some ‘sins’ (violations), they must be lesser sins (lower ranked
constraints) compared to the rest of the candidates in the competition.
Hence, in the tableau, Cand2 emerged the winner despite being the only
one that violated constraint D (the lowest ranked constraint). It still
emerged the winner because, unlike Cand1 and Cand3, it did not
commit a fatal violation. Moreover, Cand4, which was its strongest
competitor, violated CON-C, a higher ranked constraint than
CON-D. The import of the foregoing is this: all constraints are
violable; and all candidates, including the winner, are capable of
violating at least one constraint in any given hierarchy. The winner
does not have to satisfy all constraints; it only has to satisfy them
better than the rest.
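This ranking logic amounts to comparing violation profiles lexicographically, highest-ranked constraint first. A minimal Python sketch, using the hypothetical candidates and constraints of Table 1.1 (the violation counts are illustrative, not a real grammar):

```python
# A minimal sketch of EVAL over the hypothetical tableau in Table 1.1.
RANKING = ["CON-A", "CON-B", "CON-C", "CON-D"]   # highest-ranked first

candidates = {
    "Cand1": {"CON-A": 1, "CON-C": 1},
    "Cand2": {"CON-D": 1},
    "Cand3": {"CON-A": 1, "CON-B": 1, "CON-C": 1},
    "Cand4": {"CON-C": 1},
}

def profile(violations):
    """A candidate's violation marks, ordered by the hierarchy."""
    return tuple(violations.get(con, 0) for con in RANKING)

def evaluate(cands):
    """The optimal candidate has the lexicographically smallest profile:
    a mark on a high-ranked constraint is fatal, however clean the
    candidate is elsewhere."""
    return min(cands, key=lambda name: profile(cands[name]))

print(evaluate(candidates))   # Cand2: its only 'sin' is on the lowest rank
```

Note that Cand2 beats Cand4 even though both carry a single mark: Cand4's mark sits on the higher-ranked CON-C, so its profile compares as larger.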
By way of analogy, imagine a family screening suitors against ranked
criteria. This particular family ranks religion (Catholic) higher than all
other constraints; hence, any violation of it is a fatal violation. The
constraints hierarchy can thus be represented as follows: R (Catholic)
>>E (Igbo) >>F (rich) >>C (honest) >>S (tall). Consequently,
candidates Musa (a Hausa Muslim) and Chucks (an Igbo Anglican)
commit the fatal violation. As for Candidate Yemi, he violates just one
constraint, that is, constraint E (Igbo) which is ranked higher than
constraints F (rich), C (honest) and S (tall). However, Candidate
Chibuzor emerges winner despite violating more constraints (three)
than all the other candidates. This is because the three constraints are
the lowest ranked constraints in the hierarchy. As earlier mentioned, “the
winner does not have to be a ‘saint’: even if it commits some ‘sins’,
they must be lesser sins compared to the rest of the candidates in the
competition.”
OT’s ‘richness of the base’ principle holds that the set of possible
inputs is universal, common to all languages. Likewise, GEN and
CON are universal across languages. However, although EVAL is
also universal, its modus
operandi varies from language to language. According to Prince and
Smolensky (2004, p. 6), “The account of interlinguistic differences is
entirely tied to the different ways the constraint-system H-eval can be
put together, given UG.” (i.e. universal grammar). This is due to the
relative variation of the ranking of constraints among languages; while
a language ranks a particular constraint low, another may rank it high
on its hierarchy. Hence, constraint ranking is the only language-
specific component; acquiring a language therefore requires learning
the appropriate constraint hierarchy of that language. This
process is referred to as grammar learning algorithm (Tesar and
Smolensky, 2000; Boersma and Hayes, 2001; Prince and Tesar, 2004;
Pater, 2005). The algorithm, according to Biros (2000, p. 10), “is
expected to return a hierarchy that produces the correct outputs for the
given underlying forms”. Note that Prince and Smolensky refer to
EVAL as H-eval. According to them, “The function H-eval evaluates
the relative Harmony of the candidates, imposing an order on the entire
set.” (p. 6). This order imposed by EVAL leads to the determination of
the optimal candidate. They further add that “An optimal output is at
the top of the harmonic order on the candidate set; by definition, it best
satisfies the constraint system.” (p. 6).
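Because only the ranking is language-specific, permuting the same universal constraints models different grammars. The sketch below is hypothetical: the forms, constraint names and violation counts are invented purely for illustration:

```python
# The same universal constraints, differently ranked, pick different
# winners: a toy model of H-eval's language-specific ranking.
def winner(cands, ranking):
    # Smaller violation profiles (read in ranking order) are more harmonic.
    key = lambda name: tuple(cands[name].get(con, 0) for con in ranking)
    return min(cands, key=key)

cands = {
    "pat": {"NOCODA": 1},   # faithful candidate, but it has a coda
    "pa":  {"MAX": 1},      # coda deleted, so it violates MAX
}

print(winner(cands, ["MAX", "NOCODA"]))  # 'pat': deletion is the worse sin
print(winner(cands, ["NOCODA", "MAX"]))  # 'pa': codas are the worse sin
```

The first ranking sketches a language that tolerates codas; the second, one that deletes to avoid them.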
Figure 1.1: The basic Optimality Theory architecture (UR → GEN →
set of candidates → EVAL, with ranked constraints Con3, Con2, Con1
→ SR)
The underlying representation (UR) supplies the input, from where the
GEN generates a set of candidates. These candidates are then
subjected to EVAL which evaluates them and returns the optimal
candidate as the output (surface representation). EVAL is like a filter
conduit where the constraints filter out the candidates that are not
harmonic with the grammar. Samek-Lodovici and Prince (1996, p. 6)
define a candidate as “an atomic, unanalyzed notion”. The candidates
are competitors competing for the optimal slot. Constraints are the
eliminators; they eliminate weak competitors by assigning violation
marks (asterisks), and candidates with a higher number of asterisks
leave the arena to the one with the fewest.
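The filter metaphor can be made concrete: each constraint, taken in ranked order, keeps only the candidates with the fewest marks on it, until one survivor remains. The candidates and marks below are invented for illustration:

```python
# EVAL as a cascade of filters over a candidate pool. At each
# constraint, only the candidates with the fewest marks survive.
def eval_filter(cands, ranking):
    pool = dict(cands)
    for con in ranking:
        fewest = min(marks.get(con, 0) for marks in pool.values())
        pool = {name: marks for name, marks in pool.items()
                if marks.get(con, 0) == fewest}
        if len(pool) == 1:          # a unique winner: stop early
            break
    return list(pool)

cands = {
    "Cand1": {"CON-A": 1, "CON-C": 1},
    "Cand2": {"CON-D": 1},
    "Cand3": {"CON-A": 1, "CON-B": 1, "CON-C": 1},
    "Cand4": {"CON-C": 1},
}

print(eval_filter(cands, ["CON-A", "CON-B", "CON-C", "CON-D"]))  # ['Cand2']
```

This successive elimination is equivalent to comparing whole violation profiles, but it mirrors the ‘filter conduit’ image directly: fatal violations knock candidates out at the constraint where they occur.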
Types of Constraints
Generally speaking, there are two types of constraints in all languages.
These are faithfulness constraints and markedness constraints (Prince
and Smolensky, 1993). This position is corroborated by Kager (1999)
who adds that this classification is premised on the fact that every
constraint evaluates one specific aspect of output markedness or
faithfulness. Moreover, McCarthy (2007, p. 14) categorically states
that those two are the only constraints domiciled in CON: “CON
contains only markedness and faithfulness constraints”.
Among the major families of constraints, the three most prominent and
commonly used constraint types in OT are Markedness, Faithfulness
and Alignment. Let us consider each of them in some detail.
Faithfulness Constraints
Faithfulness constraints are the class of constraints that require the
output structure to strictly resemble the input. McCarthy (2007)
considers them to be inherently conservative because they resist any
form of change(s) to the input structure. Prince and Smolensky (1993,
2004) note that faithfulness constraints act to preserve the input. They
identify two basic types: MAX and DEP. The MAX (that is, ‘maximise
the input’) sub-group penalises deletion: MAX is violated whenever
any input segment is deleted in the output. DEP (that is, ‘depend on the
input for all material’) penalises insertion: the constraint is violated
whenever any epenthetic material is added to the input.
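As a rough illustration, MAX, DEP and IDENT violations can be counted from an input–output alignment. The sketch below uses Python’s difflib as a stand-in for a correspondence relation; a real OT analysis would state the correspondences explicitly, so this is an approximation only:

```python
from difflib import SequenceMatcher

def faithfulness(inp, out):
    """Count MAX, DEP and IDENT violations for an input-output pair,
    using difflib's alignment as a stand-in for correspondence."""
    marks = {"MAX": 0, "DEP": 0, "IDENT": 0}
    for op, i1, i2, j1, j2 in SequenceMatcher(a=inp, b=out).get_opcodes():
        if op == "delete":            # input segment with no correspondent
            marks["MAX"] += i2 - i1
        elif op == "insert":          # epenthetic output segment
            marks["DEP"] += j2 - j1
        elif op == "replace":         # corresponding segments that differ
            marks["IDENT"] += max(i2 - i1, j2 - j1)
    return marks

print(faithfulness("sku", "suku"))   # epenthesis: one DEP violation
print(faithfulness("plant", "lan"))  # two deletions: two MAX violations
```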
McCarthy & Prince (1995), in addition to MAX and DEP, propose
IDENT(F) as a third member of the basic families of faithfulness
constraints. The constraint prohibits any alteration to the value of
feature F. IDENT is from the word ‘identical’. The names of each of
the constraints may be suffixed with ‘-IO’ or ‘-BR’. ‘-IO’ represents
input/output, while ‘-BR’ stands for base/reduplicant. ‘-BR’ is
essentially used in analysis of reduplication. However, the ‘F’
in IDENT(F) represents the distinctive feature of the segment being
analysed. Therefore, it can be substituted with the name of any
distinctive feature as appropriate. For example, IDENT-IO(V) for voice,
or IDENT-IO(P) for place. MAX and DEP seem to have replaced an
earlier pair of constraints, PARSE and FILL, proposed by Prince
& Smolensky (1993). PARSE states that ‘underlying segments must be
parsed into syllable structure’ and FILL states that ‘syllable positions
must be filled with underlying segments’ (p. 94).
PARSE and FILL perform essentially the same functions
as MAX and DEP. The difference between them is that while
PARSE and FILL evaluate only the output, MAX and DEP evaluate the
relation between the input and the output. Hence, PARSE’s and FILL’s
function is rather similar to that of markedness constraints (McCarthy,
2008, p. 209).
This function originates in Prince & Smolensky’s
containment theory, a model which assumes that any input segment
not realised in the output is not deleted but rather ‘left unparsed’ by a
syllable (Kager, 1999, pp. 99-100). However, correspondence theory,
another model proposed by McCarthy & Prince (1995, 1999), has
subsequently replaced it as the standard framework (McCarthy, 2008,
p. 27). Correspondence theory is a proposal of a relation that holds
between root nodes. For instance, the root node /n/ in the input /ɑni/
can be in a correspondence relation with the root node /ŋ/ in the
output /ŋɑni/. de Lacy (2010), however, argues that in the course of processing
an output candidate from an input, GEN has the freedom of generating
correspondence relations. As a result, there will also be a candidate
consisting of the input /ɑni/ and even an output /ŋɑni/ where
/n/ corresponds to /ŋ/.
Markedness Constraints
The term ‘markedness’ refers to those characteristics of languages that
are considered to be universally more complex and/or rarer across
languages. For instance, /p/ and /b/ contrast in English: /b/ is
characterised by the presence of voicing, while /p/ lacks it. Likewise,
in Thai, /pʰ/ and /p/ contrast: /pʰ/ has aspiration, while /p/ lacks it.
Thus, the member of such a pair that is characterised by the presence
of a mark is said to be ‘marked’, while that which lacks it is said to be
‘unmarked’. Potts and Pullum
(2002) regard markedness constraints as the simplest OT constraints
from the model-theoretic perspective. They claim that the constraints
“place conditions on individual candidate outputs, i.e. on structures
that are, in most theories, trees” (p. 366). Markedness constraints relate
to rules about structure. These include constraints against structures
that are difficult to pronounce, such as consonant clusters, as well as
alignment-related constraints. Prince and Smolensky (1993) propose a
list of markedness constraints which operate across languages. These
include:
i. NUC: Syllables must have nuclei.
ii. –CODA (NOCODA): Syllables must not have codas.
iii. ONS (ONSET): Syllables must have onsets.
iv. HNUC (from ‘harmonic nucleus’): A higher-sonority nucleus is
more harmonic than one of lower sonority.
v. *COMPLEX (i.e. no complex syllable is allowed): A syllable must
be V, CV or VC.
vi. CODACOND (CODACONDITION): Coda consonants cannot have
place features that are not shared by an onset consonant.
vii. NONFIN (NONFINALITY): A word-final syllable (or foot) must
not bear stress.
viii. FTBIN (FOOTBINARITY): A foot must be two syllables (or
moras).
ix. PK-PROM (PEAKPROMINENCE): Light syllables must not be
stressed.
x. WSP (WEIGHT-TO-STRESS PRINCIPLE): Heavy syllables must
be stressed.
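Several of these constraints can be evaluated mechanically once a candidate is syllabified. The sketch below assumes a toy representation, invented for illustration, in which each syllable is an (onset, nucleus, coda) triple of segment strings:

```python
def markedness_marks(syllables):
    """Count ONSET, NOCODA and *COMPLEX violations for a candidate
    given as (onset, nucleus, coda) triples. Toy representation only."""
    marks = {"ONSET": 0, "NOCODA": 0, "*COMPLEX": 0}
    for onset, nucleus, coda in syllables:
        if not onset:                        # ONSET: syllables need onsets
            marks["ONSET"] += 1
        if coda:                             # NOCODA: no codas allowed
            marks["NOCODA"] += 1
        if len(onset) > 1 or len(coda) > 1:  # *COMPLEX: no clusters
            marks["*COMPLEX"] += 1
    return marks

# 'extra' syllabified as /ek.stra/: an onsetless first syllable with a
# coda, and a complex onset in the second syllable
print(markedness_marks([("", "e", "k"), ("str", "a", "")]))
```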
Alignment Constraints
Alignment is a phenomenon in which languages show a preference for
certain linguistic features to be aligned with other linguistic features. It
refers to the tendency for certain linguistic features to coincide in a
language. Such features include the location of primary word stress
word-initially, or a question marker word-finally. Alignment was
initially proposed as a constraint in OT by Prince and Smolensky
(1993) in order to explain infixation. It was later developed by
McCarthy and Prince (1993). According to McCarthy (2011),
alignment constraints require that the edges of linguistic structures
coincide. He adds that alignment constraints have the capacity to
discriminate among candidates that are imperfectly aligned; but this is
when they are evaluated gradiently. Alignment is used to explain word
stress patterns in a number of languages such as Polish and Garawa. It
is also used to explain loanword phenomena in Japanese (McCarthy
and Prince, 1993; Kager, 2001).
Categorical Constraints
According to McCarthy and Prince (1993), categorical constraints
constitute the majority of proposed OT constraints. When candidates
are being evaluated in the course of a phonological analysis, this class
of constraints makes categorical judgements: a constraint is either
satisfied or violated outright. They assign not more than one violation
mark to a candidate, unless the form under evaluation has several
violating structures in it. McCarthy (2003, pp. 75-76) asserts that categorical
constraints “never assign more than one violation-mark, unless the
candidate under evaluation contains more than one instance of the
marked structure or the unfaithful mapping that the constraint
proscribes.”
Gradient Constraints
Unlike categorical constraints which assign not more than one violation
mark to a candidate in their evaluation, gradient constraints evaluate the
range of deviation. Thus, gradient constraints can assign multiple
violation marks even when there is only one instance of the non-
conforming structure. McCarthy argues that gradient constraints are
predominantly of the alignment family. He, however, proposes that all
relevant constraints in OT must be categorical. He is of the opinion that
the different gradient constraints that have been proposed are
unnecessary as most of them have undesirable consequences on
phonological analysis. Hence, he argues that “OT's universal constraint
component CON permits only categorical constraints.” (McCarthy,
2003 p. 76).
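The difference can be made concrete with ALL-FT-LEFT. In the sketch below a word is reduced to the syllable indices at which its feet begin (0 being the left edge); the parsing and foot positions are hypothetical:

```python
def all_ft_left(foot_starts, gradient=True):
    """ALL-FT-LEFT violations for a word whose feet begin at the given
    syllable indices. Gradiently, each foot earns one mark per syllable
    separating it from the left edge; categorically, each misaligned
    foot earns exactly one mark."""
    if gradient:
        return sum(foot_starts)
    return sum(1 for start in foot_starts if start > 0)

# a four-syllable word parsed into two feet, (σσ)(σσ): feet at 0 and 2
print(all_ft_left([0, 2]))                   # gradient: 2 marks
print(all_ft_left([0, 2], gradient=False))   # categorical: 1 mark
```

The gradient version can thus distinguish degrees of misalignment that the categorical version collapses, which is exactly the discriminating power McCarthy argues CON should not have.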
ALL-FT-LEFT (align foot left): Every foot stands at the left edge of
the Prosodic Word; the left edge of the word must match the left edge
of the head foot.
FAITH V: The vowels in the input must be the same as those in the
output.
IDENT(central): Outputs must have a central vowel identical to that of
the input.
The constraints are adapted from Kager (1999), de Lacy (2002) and
McCarthy (2007).
Both candidates (a) and (c) commit the same number of violations and
on the same lower ranked constraints (ALL-FT-LEFT and NON-
FINALITY). However, candidate (b) commits a fatal violation of the
highest ranked constraint - UNEVEN-IAMB and that automatically
drops it out of the race. The other two candidates are tied. The
constraints ranking is: UNEVEN-IAMB >> WSP >> ALL-FT-LEFT
>> NON-FINALITY. It however remains for us to find a winner
between the two candidates that are tied in the competition. To resolve
this, another constraint or two have to be introduced while we drop at
least one of the constraints in which they tie. Hence, we bring in
FAITH V and IDENT(central) and then drop NON-FINALITY.
In this tableau, candidates (a) and (b) both violate the highest ranked
constraints, ALL-FT-R and FT-BN, as well as two lower ranked
constraints. Having both committed fatal violations, they are
automatically out of the competition. The two surviving candidates (‘c’
and ‘d’) are however tied on the violation rating, having both violated
NON-FINALITY. The constraints ranking here is: ALL-FT-R >> FT-
BN >> UNEVEN-IAMB >> NON-FINALITY. To resolve the conflict,
two other constraints have to be brought into the hierarchy. These are
*IDENT(Syllable) and MAXLex. This is expressed in Tableau 5.
From the tableau, candidate (a) does not violate the lower ranking
constraints on the hierarchy as well as WSP which is a higher ranking
constraint, but it violates UNEVEN-IAMB, another higher ranking
constraint, which attracts a fatal penalty. Likewise, candidate (c)
commits a fatal violation of UNEVEN-IAMB along with the violation
of the lower ranking constraints on the hierarchy. Consequently, both
candidates (‘a’ and ‘c’) are disqualified from the competition, leaving
candidate (b) as the only candidate left in the race. Thus, it becomes
the optimal candidate. The constraints ranking here is: UNEVEN-
IAMB >> WSP >> ALL-FT-LEFT >> NON-FINALITY.
5.0 Discussion
From the Optimality analysis, it is evident that the most prominent
constraint among educated Yoruba NE (eYNE) speakers is UNEVEN-
IAMB which prefers the assignment of stress on light-heavy syllables
(syllables made up of light mora + heavy mora) to light-light (syllables
made up of two successive light moras) or heavy (made up of heavy
mora) syllables. Trisyllabic nouns are usually stressed finally or
penultimately; hence, NON-FINALITY ranks low among eYNE
speakers. In the same vein, ALL-FT-LEFT ranks low because of the
tendency for rightward stress placement among eYNE speakers.
However, both WSP and UNEVEN-IAMB rank higher because heavy
syllables are those that usually attract stress among eYNE speakers
(hence, WSP); and since eYNE speakers’ words are not usually
stressed along the left edge of the word, UNEVEN-IAMB is preferred
to both ALL-FT-LEFT and NON-FINALITY. Hence the constraints
ranking for ‘workmanship’ is UNEVEN-IAMB, WSP >> ALL-FT-
LEFT >> NON-FINALITY. Consequently, we may conclude that NE
is iambic in nature; unlike sBE which, according to Tremblay (2008)
“is generally analysed as having a trochaic (i.e., stressed-unstressed)
foot whose right edge is aligned with the right edge of the prosodic
word.” Despite the variation in word stress placement between
educated Yoruba NE speakers and sBE speakers, there exists a level of
correlation between the two: where sBE has a final stressed syllable,
educated Yoruba NE speakers agree with sBE, probably due to its
tendency to align stress rightwards.
6.0 Findings
The findings in this study are as follows:
1. The most prominent constraint among eYNE speakers is
UNEVEN-IAMB.
2. sNE is iambic in nature; unlike sBE which is trochaic.
However, despite the variation in word stress placement
between sNE and sBE, there exists a level of correlation
between the two: where sBE has a final stressed syllable, sNE
agrees with sBE, probably due to its tendency to align stress
rightwards.
3. The study also demonstrates the adequacy of OT to explicate
word stress as used by eYNE speakers.
References