
Natural Language Processing

Course Outline
- What is Linguistics?
- History of the study of language
- Modern Linguistics and Traditional Arabic Grammar
- Principles of Linguistic Analysis
Symbolic NLP
- Machine translation was the first non-numeric application for the digital computer
- Linguistic knowledge is central in NLP applications
- Symbolic NLP performs tasks that parallel the linguistic levels: morphological analysis, syntactic parsing, etc.
- Problems of symbolic NLP that led to the rise of statistical NLP
Statistical NLP
- Capitalizes on probability theory
- Proved to be very successful in speech recognition
- Needs large amounts of data, which have become available
- Achieved great success in machine translation
- Well-defined technology: Hidden Markov Models, Support Vector Machines, etc.
- Problems of statistical NLP
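To make "capitalizes on probability theory" concrete, here is a minimal sketch of a bigram model estimated by relative frequency. The toy corpus, the function name `bigram_probabilities`, and the `<s>` / `</s>` boundary markers are assumptions chosen for illustration; they do not come from the course material.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Estimate P(next word | previous word) by relative frequency."""
    previous_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Assumed convention: mark sentence boundaries with <s> and </s>.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for previous, following in zip(tokens, tokens[1:]):
            previous_counts[previous] += 1
            bigram_counts[(previous, following)] += 1
    return {
        pair: count / previous_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical three-sentence corpus.
corpus = ["the boy hit the ball", "the boy saw the ball", "the ball hit the boy"]
probs = bigram_probabilities(corpus)
print(probs[("the", "boy")])  # 0.5: "the" is followed by "boy" in 3 of 6 cases
```

With a corpus this small most word pairs are never observed, which is one reason statistical methods need large amounts of data.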
What is Linguistics?
- Linguistics deals with language
- Traditional Grammar:
1. Focused on words. Is there a relationship between a
word and what it means?
2. Notional definitions of parts of speech
3. Literary language is the best language
4. Spoken language is full of errors
Modern Linguistics
1. Identifying language families and genetic
relationships among languages
2. Primacy of speech
3. No value judgment on any variety of language
4. A child is born with a tabula rasa
5. Learning a language is a matter of habit
6. Analyzing language through discovery procedure
7. Empirical study of language – link with behavioral
psychology
Generative Grammar
1. Rational approach to scientific inquiry
2. Mentalistic Communication Model
3. Distinction between competence and performance
4. Linguistic knowledge is localized in the brain
5. Language is infinite
6. Grammar is finite but generates an infinite number of
sentences (recursiveness)
7. Phrase structure rules + Transformations
8. Deep and surface structure
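The claim in point 6, that a finite grammar generates an infinite number of sentences, can be illustrated with a single recursive rule. The sketch below assumes two rules, a base sentence and a rule that embeds a sentence inside another one; the wording "Mary said that" and the function name `embed` are hypothetical choices, not part of the course material.

```python
def embed(depth):
    """Build a sentence with `depth` levels of embedding.

    Assumed rules:
        S -> "the boy hit the ball"      (base case)
        S -> "Mary said that" S          (recursive case)
    """
    if depth == 0:
        return "the boy hit the ball"
    return "Mary said that " + embed(depth - 1)


for depth in range(3):
    print(embed(depth))
```

Every value of `depth` gives a different sentence, so two rules are enough to describe an unbounded set.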
Principles of Linguistic Analysis
Levels of Language
- Phonological Levels
Linguistic Analysis
Phonology
There is a universal set of sounds that are used by languages.

Each language selects a subset of this universal set to make meaningful distinctions between words.
Phonological Analysis
- The phonology of a language determines which sounds (consonants and vowels) are used in that language for meaning contrasts (the phonemes of the language)
- Phonetics deals with the description of the actual sounds used in language, whether for meaningful distinctions or not

Example: /p/ and /b/ in English versus [p] and [b] in Arabic. In English the two sounds distinguish words, so they are separate phonemes; in Arabic they do not.
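One common way to show that two sounds contrast in a language is to find a minimal pair: two words that differ in exactly one segment. The sketch below checks that condition on segment lists; the function name `is_minimal_pair` and the broad transcriptions are assumptions made for illustration.

```python
def is_minimal_pair(word_a, word_b):
    """Return True if two equal-length transcriptions differ in exactly one segment."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# Hypothetical broad transcriptions, one list item per segment.
pat = ["p", "æ", "t"]
bat = ["b", "æ", "t"]
bad = ["b", "æ", "d"]

print(is_minimal_pair(pat, bat))  # True: only the first segment differs
print(is_minimal_pair(pat, bad))  # False: two segments differ
```

A pair such as "pat" and "bat" supports treating /p/ and /b/ as separate phonemes in English.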
Classification of Sounds
Articulatory Phonetics
Vowels, consonants and diphthongs
Acoustic Phonetics

Auditory Phonetics
Sound Interaction
Consonants and Vowels
و (waw) and ي (ya)
When do we pronounce them as vowels and when as consonants?

Assimilation rules in Arabic


Suprasegmental Features
Features that go over more than one segment of speech

Stress

Intonation
Morphology
Difficulty in defining what a word is

Hence comes the morpheme

Definition: The morpheme is a minimal linguistic unit that has a meaning
Types of morphemes
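Free morphemes
A morpheme that can occur on its own as a word
ولد (walad, "boy"), كتاب (kitāb, "book"), مع (maʿa, "with")
Bound morphemes
A morpheme that must attach to another morpheme
ون (-ūn, masculine plural), ات (-āt, feminine plural), ها (-hā, "her")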
Types of Morphology
Derivational Morphology

Bound morphemes that change the part of speech

Example: nation + al → national (adjective)

national + ize → nationalize (verb)

nationalize + ation → nationalization (noun)


Inflectional Morphology
Bound morphemes that change grammatical information but do not change the part of speech

Examples: play + ed (past tense)
boy + s (plural)
dish + es (plural)
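The two types of morphology can be contrasted with a small lookup table. This is only a sketch: the table `SUFFIXES` covers just the suffixes from the examples, the labels are simplified (for instance, "s" is treated only as a noun plural), and the function name `describe` is hypothetical.

```python
# Assumed toy table: suffix -> (type of morphology, what the suffix contributes).
SUFFIXES = {
    "al": ("derivational", "adjective"),
    "ize": ("derivational", "verb"),
    "ation": ("derivational", "noun"),
    "ed": ("inflectional", "past tense"),
    "s": ("inflectional", "plural"),
    "es": ("inflectional", "plural"),
}


def describe(stem, suffix):
    """Describe a stem + suffix combination using the toy table above."""
    kind, label = SUFFIXES[suffix]
    return f"{stem} + {suffix}: {kind} ({label})"


print(describe("nation", "al"))  # nation + al: derivational (adjective)
print(describe("play", "ed"))    # play + ed: inflectional (past tense)
```

For derivational suffixes the label is the new part of speech; for inflectional suffixes it is the grammatical information added.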
Syntax
Every sentence is a sequence of words

Not every sequence of words is a sentence


Grammaticality is language specific
and based on native speakers’ intuitions
Syntactic Rules
Syntactic categories

S (sentence)
NP (noun phrase)
VP (verb phrase)
PrepP (prepositional phrase)
……
Production Rules
The boy hit the ball

S → NP VP
NP → Det N
VP → V NP
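The three production rules can be tried out with a small recursive-descent recognizer. This is a sketch under stated assumptions: the lexicon that maps words to Det, N, and V is inferred from the example sentence and is not given in the course material, and the names `parse` and `parse_sentence` are hypothetical.

```python
# Production rules from the example.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}
# Assumed lexicon: only the example sentence is given in the source.
LEXICON = {"the": "Det", "boy": "N", "ball": "N", "hit": "V"}


def parse(symbol, words, start):
    """Try to derive `symbol` from words[start:].

    Returns (tree, next_position) on success, or None on failure.
    """
    if symbol not in RULES:  # lexical category such as Det, N, V
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            return (symbol, words[start]), start + 1
        return None
    for expansion in RULES[symbol]:
        children, position = [], start
        for child_symbol in expansion:
            result = parse(child_symbol, words, position)
            if result is None:
                break
            subtree, position = result
            children.append(subtree)
        else:  # every symbol in the expansion was matched
            return (symbol, children), position
    return None


def parse_sentence(sentence):
    """Return a parse tree if the whole sentence is derivable from S, else None."""
    words = sentence.lower().split()
    result = parse("S", words, 0)
    if result is not None and result[1] == len(words):
        return result[0]
    return None


print(parse_sentence("The boy hit the ball"))
print(parse_sentence("Hit the the boy ball"))  # None: not derivable from S
```

The nested tuples returned for the first sentence mirror the hierarchy of the rules: S contains an NP and a VP, and the VP in turn contains a V and another NP.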
Properties of Syntactic Rules
1. Hierarchical Relationships

2. Meaningful constituency

3. Resolution of ambiguity
Semantics
Semantics defines the meaning of language

Meaning is compositional
The meaning of the whole = the meaning of the parts
and the way they are put together
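A small sketch can show why "the way they are put together" matters: the same three word meanings give different sentence meanings when they fill different roles. The role labels `agent` and `patient` and the function name `sentence_meaning` are assumptions introduced for illustration.

```python
def sentence_meaning(subject, verb, obj):
    """Combine word meanings according to their position in the sentence."""
    return {"event": verb, "agent": subject, "patient": obj}


first = sentence_meaning("boy", "hit", "ball")   # the boy hit the ball
second = sentence_meaning("ball", "hit", "boy")  # the ball hit the boy

print(first)
print(first == second)  # False: same parts, combined differently
```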
Semantic Features
Boy

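+ Human, + Male, < 13 years old

Semantic Relations
Ambiguity
مدير البنك الجديد ("the new bank manager": either the new manager of the bank or the manager of the new bank)
Paraphrase
من المهم بناء الاقتصاد المصري الآن ("It is important to build the Egyptian economy now")
بناء الاقتصاد المصري مهم حاليا ("Building the Egyptian economy is important at present")
Implicature
سينتخب المصريون رئيسا جديدا قريبا ("The Egyptians will elect a new president soon")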
