2lexical Semantics and WSD - 080115
Lexical Semantics
What is lexical? Lexical means relating to the words or vocabulary of
a language. A lexical item can be a single word, a part of a word, or a
chain of words.
What is semantics?
Semantics is the study of the meaning of words, phrases and
sentences; more generally, it is the study of the meaning of
linguistic expressions, such as morphemes, words, phrases,
clauses, and sentences.
Semantics is concerned with denotation (primary or objective
meaning), not with connotation (associated ideas or feelings).
There are two general types of semantics:
a. Lexical semantics, which deals with the meaning of words
(the study of words in general and of related words).
b. Structural semantics, which deals with the meaning of
utterances (vocal sound) larger than words.
Cont…
What is a word?
Definitions we’ve used in class: types, tokens, stems, roots,
uninflected forms, etc.
Lexeme: an entry in the lexicon that pairs a form with a single
meaning representation. It includes:
an orthographic representation
a phonological form
a symbolic meaning representation or sense
Lexicon: A collection of lexemes (mental dictionary).
Dictionary: is a kind of lexicon where meanings are expressed through
definitions and examples
Red (‘red) n: the color of blood
Blood (‘bluhd) n: the red liquid that circulates in the heart, arteries
and veins of animals
Right (‘right) adj: located nearer the right hand esp. being on
the right when facing the same direction as the observer
Left (‘left) adj: located nearer to this side of the body than the
right
• What can we learn from dictionaries?
– Relations between words:
• Oppositions, similarities, hierarchies
Relationships between word meanings (lexical relations)
Homonyms: words with the same form but different, unrelated
meanings, or senses (multiple lexemes)
Homonymy is a relation between words that have the same form and the
same PoS, but unrelated meanings
A bank holds investments in a custodial account in the client’s
name.
As agriculture is burgeoning on the east bank, the river will shrink
even more
Homonymy makes the interpretation of a sentence ambiguous, since it
defines a set of different lexemes with the same orthographic form
(bank1, bank2, ...)
Related properties are homophony (same pronunciation but different
orthography, e.g. be/bee) and homography (same orthography but
different pronunciation, e.g. lead/lead)
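The distinctions above can be sketched with a tiny lexicon in which each lexeme pairs a form with one sense. This is only an illustration: the `Lexeme` class, the informal pronunciation respellings, and the glosses are assumptions made for the example, not part of any real lexical resource.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    form: str           # orthographic representation
    pronunciation: str  # phonological form (informal respelling, assumed)
    sense: str          # short gloss standing in for a meaning representation


# Hypothetical entries chosen to mirror the examples in the text.
LEXICON = [
    Lexeme("bank", "bank", "financial institution"),
    Lexeme("bank", "bank", "sloping land beside a river"),
    Lexeme("be", "bee", "to exist"),
    Lexeme("bee", "bee", "insect that makes honey"),
    Lexeme("lead", "leed", "to guide"),
    Lexeme("lead", "led", "a heavy metal"),
]


def group_by(lexemes, key):
    """Group lexemes by a key and keep only groups with more than one member."""
    groups = defaultdict(list)
    for lexeme in lexemes:
        groups[key(lexeme)].append(lexeme)
    return {k: v for k, v in groups.items() if len(v) > 1}


# Same spelling, several lexemes: candidates for homonymy/homography.
same_spelling = group_by(LEXICON, lambda lx: lx.form)
# Same pronunciation, several lexemes: includes homophones such as be/bee.
same_sound = group_by(LEXICON, lambda lx: lx.pronunciation)
```

Here `same_spelling` groups the two "bank" lexemes and the two "lead" lexemes, while `same_sound` groups "bank" and the be/bee pair. Deciding whether two senses are unrelated (homonymy) or related (polysemy) still needs human judgment or etymological information.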
Cont…
Homonymy causes problems for NLP applications
General semantic interpretation
Machine translation
Spelling correction
Speech recognition
Text to speech
Same orthographic form but different phonological form
Bass vs bass
Bow vs bow
Record vs record
Information retrieval
Different meanings same orthographic form
Cont…
Polysemy: a word with multiple but related meanings (same
lexeme)
It happens when a lexeme has several related meanings
When two senses are related semantically, we call it polysemy (rather
than homonymy)
The distinction depends on the word's etymology (unrelated meanings
usually have different origins), e.g. bank/data bank/blood bank
They rarely serve red meat.
He served as U.S. ambassador.
He might have served his time in prison.
What’s the difference between polysemy and homonymy?
Homonymy:
Distinct, unrelated meanings
Polysemy:
Distinct, but related meanings
Idea bank, blood bank, bank bank
Cont…
Synonymy: Substitutability: different lexemes with the
same meaning
It is a relationship between two distinct lexemes with the same
meaning (i.e. they can be substituted for one another in a given
context without changing its meaning and correctness) – e.g. I
received a gift/present
How big is that plane?
How large is that plane?
How big are you? Big brother is watching.
Substitutability may not hold in every context because of
small semantic differences (e.g. price/fare of a service: the bus
fare/the ticket price)
In general, substitutability depends on the “semantic
intersection” of the senses of the two lexemes and, in some cases,
also on social factors (father/dad)
Cont…
Hyponymy: a relationship between two lexemes (more precisely,
two senses) such that one denotes a subclass of the other
car, vehicle – shark, fish – apple, fruit
The relationship is not symmetric
The more specialized concept is the hyponym of the more general one
The more general concept is the hypernym of the more specialized
one
Hyponymy (hypernymy) is the basis for defining a taxonomy (a
tree structure that defines inclusion relationships in an object
ontology), even though it is not strictly a taxonomy itself
The definition of a formal taxonomy would require a more
uniform/rigorous formalism in the interpretation of the inclusion
relationship
However, the relationship defines an inheritance mechanism: a concept
inherits the properties of its ancestors in the hierarchy
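The asymmetry of hyponymy and the inheritance of properties can be illustrated with a minimal sketch. The hypernym links and the properties below are toy data invented for the example (they extend the car/vehicle, shark/fish and apple/fruit pairs from the text), and the function names are hypothetical.

```python
# Toy hypernym links (hyponym -> hypernym); illustrative data only.
HYPERNYM = {
    "car": "vehicle",
    "shark": "fish",
    "apple": "fruit",
    "fish": "animal",
}

# Properties attached directly to a concept (assumed examples).
PROPERTIES = {
    "vehicle": {"transports people or goods"},
    "animal": {"is alive"},
    "fish": {"lives in water"},
    "fruit": {"grows on plants"},
}


def hypernym_chain(concept):
    """Return the ancestors of a concept, nearest ancestor first."""
    chain = []
    seen = {concept}
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        if concept in seen:  # guard against accidental cycles in the data
            break
        seen.add(concept)
        chain.append(concept)
    return chain


def is_hyponym_of(specific, general):
    """True if `general` is an ancestor of `specific` in the hierarchy."""
    return general in hypernym_chain(specific)


def inherited_properties(concept):
    """Collect the concept's own properties plus those of all its ancestors."""
    props = set(PROPERTIES.get(concept, set()))
    for ancestor in hypernym_chain(concept):
        props |= PROPERTIES.get(ancestor, set())
    return props
```

With this data, `is_hyponym_of("car", "vehicle")` holds but the reverse does not, and "shark" inherits properties from both "fish" and "animal".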
Cont…
General: hypernym (superordinate)
Needed in:
Approaches
Several approaches to WSD have been proposed
Knowledge Based Approaches
Supervised Approaches
Semi-supervised Approaches
Unsupervised Approaches
Hybrid Approaches
Cont…
Knowledge Based Approaches
Probabilistic/Statistical models.
Hybrid Approaches
Use corpus evidence as well as semantic relations from WordNet.
WSD using selectional preferences
Sense 1: This airline serves dinner in the evening flight.
serve (Verb)
agent
object – edible
Sense 2: This airline serves the sector between Jima & AA.
serve (Verb)
agent
object – sector
Requires exhaustive enumeration of:
Argument-structure of verbs.
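A minimal sketch of the idea, assuming a hand-written sense inventory for "serve" and a hand-written mapping from nouns to semantic classes. Both tables and the function name `disambiguate_serve` are hypothetical; a real system would need the exhaustive enumeration mentioned above.

```python
# Hypothetical sense inventory: each sense of "serve" states the semantic
# class it expects for its object.
SERVE_SENSES = {
    "serve#1 (provide food)": "edible",
    "serve#2 (operate on a route)": "sector",
}

# Hypothetical semantic classes for object nouns (assumed, hand-written).
NOUN_CLASS = {
    "dinner": "edible",
    "lunch": "edible",
    "sector": "sector",
    "route": "sector",
}


def disambiguate_serve(object_noun):
    """Pick the sense whose object preference matches the noun's class.

    Returns None when the noun is unknown or no sense accepts its class.
    """
    noun_class = NOUN_CLASS.get(object_noun)
    if noun_class is None:
        return None
    for sense, required_class in SERVE_SENSES.items():
        if required_class == noun_class:
            return sense
    return None
```

For the two example sentences, the object "dinner" selects the first sense and the object "sector" selects the second. An object noun missing from the table yields no decision, which shows why coverage is the main cost of this approach.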
Supervised Approaches
WSD can be approached as a classification task
The correct sense is the class to be predicted
The word is represented by a set (vector) of features to be
processed as the classifier input
Usually the feature vector includes a representation of the word to
be disambiguated (target) and of its context (a given number
of words to the left and the right of the target word)
The word itself, the word stem, the word PoS can be
exploited as features
The classifier can be learnt from examples given a labeled
dataset
Different models can be exploited to implement the classifier
(Naïve Bayes, neural networks, decision trees…)
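As an illustration of WSD as classification, here is a small Naïve Bayes sketch with add-one smoothing over bag-of-words context features. The training examples, the sense labels, and the function names are invented for the example; a real system would train on a sense-tagged corpus and use richer features (stems, PoS, word positions).

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (context_words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for word in words:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def classify(model, words):
    """Return the sense with the highest log posterior for the context words."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior P(sense)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in words:
            if word in vocab:  # words never seen in training are ignored
                # add-one smoothed log likelihood P(word | sense)
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical labeled contexts for the target word "bank".
TRAIN = [
    (["account", "money", "client"], "bank/finance"),
    (["loan", "money", "deposit"], "bank/finance"),
    (["river", "water", "east"], "bank/river"),
    (["river", "shore", "fishing"], "bank/river"),
]
model = train_naive_bayes(TRAIN)
```

Calling `classify(model, ["money", "deposit"])` selects the financial sense, and `classify(model, ["river", "water"])` selects the river sense.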
Semi-supervised Approaches
Step1: Train the Decision List algorithm using a small amount of seed
data.
Step2: Classify the entire sample set using the trained classifier.
Step3: Create new seed data by adding those members which are
tagged as Sense-A or Sense-B with high probability.
Identify words that are tagged with low confidence and label them
with the sense which is dominant for that document
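The bootstrapping loop in Steps 1 to 3 can be sketched as follows. This is a simplified illustration, not the full algorithm: the decision list ranks single context words by a smoothed log-likelihood ratio, the smoothing constant `alpha`, the confidence `threshold`, and all function names are assumptions, and the final per-document relabeling step is omitted.

```python
import math
from collections import Counter, defaultdict


def train_decision_list(labeled, senses=("A", "B"), alpha=0.1):
    """Rank context words by smoothed log-likelihood ratio between two senses."""
    counts = defaultdict(Counter)
    for words, sense in labeled:
        for word in set(words):
            counts[word][sense] += 1
    a, b = senses
    rules = []
    for word, c in counts.items():
        llr = math.log((c[a] + alpha) / (c[b] + alpha))
        rules.append((abs(llr), word, a if llr > 0 else b))
    rules.sort(reverse=True)  # strongest evidence first
    return rules


def classify(rules, words):
    """Apply the strongest matching rule; return (sense, confidence)."""
    present = set(words)
    for strength, word, sense in rules:
        if word in present:
            return sense, strength
    return None, 0.0


def bootstrap(seeds, unlabeled, threshold=1.0, max_rounds=10):
    """Grow the labeled set with confidently classified instances."""
    labeled = list(seeds)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        rules = train_decision_list(labeled)           # Step 1
        confident, rest = [], []
        for words in pool:                             # Step 2
            sense, confidence = classify(rules, words)
            if sense is not None and confidence >= threshold:
                confident.append((words, sense))
            else:
                rest.append(words)
        if not confident:
            break
        labeled.extend(confident)                      # Step 3
        pool = rest
    return train_decision_list(labeled), labeled


# Hypothetical seed and unlabeled contexts for two senses of "bank".
seeds = [(["money", "deposit"], "A"), (["river", "water"], "B")]
unlabeled = [["money", "loan"], ["river", "fishing"],
             ["loan", "interest"], ["fishing", "boat"]]
rules, labeled = bootstrap(seeds, unlabeled)
```

In this toy run, "loan" and "fishing" are learned in the first round from contexts that share a seed word, which in turn lets the second round label the contexts containing "interest" and "boat".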
Unsupervised Approaches
Unsupervised approaches to sense disambiguation eschew (avoid)
the use of sense tagged data of any kind during training.
In these approaches, feature-vector representations of unlabeled
instances are taken as input and are then grouped into clusters
according to a similarity metric.
These clusters can then be represented as the average of their
constituent feature-vectors, and labeled by hand with known word
senses.
Unseen feature-encoded instances can be classified by assigning
them the word sense from the cluster to which they are closest
according to the similarity metric.
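The last two steps, representing each cluster by the average of its vectors and assigning an unseen instance to the closest cluster, might look like this. Cosine similarity is used here as one possible similarity metric; the feature order, the vectors, and the hand-assigned sense labels are made up for the example.

```python
import math


def centroid(vectors):
    """Average the feature vectors of a cluster component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_sense(instance, labeled_centroids):
    """Return the label of the centroid most similar to the instance."""
    return max(labeled_centroids,
               key=lambda sense: cosine(instance, labeled_centroids[sense]))


# Hypothetical clusters over the features [money, loan, river, water],
# already labeled by hand with word senses.
clusters = {
    "bank/finance": [[2, 1, 0, 0], [1, 1, 0, 0]],
    "bank/river": [[0, 0, 1, 2], [0, 0, 2, 1]],
}
centroids = {sense: centroid(vecs) for sense, vecs in clusters.items()}
```

An unseen instance such as `[1, 0, 0, 0]` is then assigned to the financial sense, because its vector points in nearly the same direction as that cluster's centroid.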
Cont…
Fortunately, clustering is a well-studied problem with a wide
number of standard algorithms that can be applied to inputs
structured as vectors of numerical values.
The most frequently used technique in language applications is
known as agglomerative clustering.
In this technique, each of the N training instances is initially
assigned to its own cluster.
New clusters are then formed in a bottom-up fashion by
successively merging the two clusters that are most similar.
This process continues until either a specified number of clusters
is reached, or some global goodness measure among the clusters is
achieved.
In cases where the number of training instances makes this method
too expensive, random sampling can be used on the original training
set to achieve similar results.
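A minimal sketch of bottom-up agglomerative clustering, stopping when a specified number of clusters is reached. It treats the smallest average Euclidean distance between members as "most similar" (average linkage), which is one common choice among several; the function names and sample points are illustrative.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_link(c1, c2):
    """Average pairwise distance between the members of two clusters."""
    total = sum(euclidean(u, v) for u in c1 for v in c2)
    return total / (len(c1) * len(c2))


def agglomerative(vectors, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    num_clusters = max(1, num_clusters)
    clusters = [[v] for v in vectors]  # each instance starts in its own cluster
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters


points = [[0, 0], [0, 1], [5, 5], [5, 6]]
result = agglomerative(points, 2)
```

On these four points the procedure ends with the two nearby pairs as clusters. Each merge step compares every pair of clusters, so the cost grows quickly with the number of training instances, which is why sampling is suggested above for large training sets.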
Cont…
Of course, the fact that these unsupervised methods do not
make use of hand-labeled data poses a number of
challenges for evaluating the goodness of any clustering
result.
The following problems are among the most important
ones that have to be addressed in unsupervised approaches.
The correct senses of the instances used in the
training data may not be known.
The clusters are almost certainly heterogeneous
with respect to the senses of the training instances
contained within them.
The number of clusters is almost always different
from the number of senses of the target word being
disambiguated.
Hybrid Approaches
Uses semantic relations (synonymy and hypernymy) from WordNet.