Content 16366190101064033023

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 16

Languages

Alphabet
A finite, nonempty set of symbols.
Symbol: Σ
Examples:
The binary alphabet: Σ = {0, 1}
The set of all lower-case letters: Σ = {a, b, . . . , z}
The set of all ASCII characters
String
• A string (or sometimes a word) is a finite sequence of symbols chosen
from some alphabet
• Example: 01101 and 111 are strings from the binary alphabet Σ = {0,
1}
• Empty string: the string with zero occurrences of symbols This string is
denoted by ǫ and may be chosen from any alphabet whatsoever.
• Length of a string: the number of positions for symbols in the string
Example: 01101 has length 5
• Notation of length of w: |w| Example: |011| = 3 and |ǫ| = 0
Power of an alphabet
• IF Σ is an alphabet, we can express the set of all strings of a certain
length from that alphabet by using the exponential notation:
• Σ k: the set of strings of length k, each of whose is in Σ
• If Σ = { 0, 1 }, then:
1. Σ 1 = { 0, 1 }
2. Σ 2 = {00, 01, 10, 11 }
3. Σ 3 = {000, 001, 010, 011, 100, 101, 110, 111 }
Concatenation
Define the binary operation . called concatenation on Σ∗ as follows:
If a1a2a3 . . . an and b1b2 . . . bm are in Σ∗, then
a1a2a3 . . . an. b1b2 . . . bm = a1a2a3 . . . an. b1b2 . . . bm
Thus, strings can be concatenated yielding another string:

If S and T are subsets of Σ∗, then S.T = {s.t | s ∈ S, t ∈ T}


Language
• If Σ is an alphabet, and L ⊆ Σ∗, then L is a (formal) language over Σ.
• Language: A (possibly infinite) set of strings all of which are chosen
from some Σ∗
• Examples: Programming language C
• English or French
Other Language
Important Operator on Language: Union
Important Operator on Language:
Concatenation
Important Operator on Language: Closure
Language
For lexical analysis we care about regular languages. Regular languages can be described using regular
expressions. Each regular expression is a notation for a regular language (a set of words).
If A is a regular expression, we write L(A) to refer to language denoted by A.
Regular Expression

• A regular expression (RE) is define as


a ordinary character from Σ
Λ the empty string
Regular Expression
[abc] a|b|c (any of listed)
[a-z] a|b|…|z (range)
Regular Expression
Some more regex
RE Strings in L(R)
Hello contains{Hello}
gray | grey contains(gray,grey}
gr(a|e) y contains{gray,grey}
gr[ae] y contains{gray,grey}
b[aeiou]bble contains {babble, bebble, bibble, bobble, bubble}
[b-chm-pp]at|ot contains {bat, cat, hat, mat, nat, oat, pat, Pat, ot}
go*gle contains {ggle, gogle, google, gooogle, goooogle, ...}
z{3} contains {zzz}
z{3,6} contains {zzz, zzzz, zzzzz, zzzzzz}
^hello Begins with “hello”
hello$ Ends with “hello”
Notations in Programming Language

You might also like