
Experiment in

Compiler Construction
Lexical analysis
Nguyen Ngoc Duong
Nguyen Huu Duc
Faculty of Information Technology
Hanoi University of Technology

Experiment in compiler
construction - Scanner design

What is a scanner?

In a compiler, the component that performs lexical analysis is called the scanner.


Tasks of a scanner

Skip meaningless characters such as spaces, tabs, newlines, and comments.
Detect invalid characters.
Recognize tokens:

identifiers
keywords
numbers
character constants / string constants
special characters
...

Tasks of a scanner (continued)

Pass the tokens, one at a time, to the syntax analyzer (parser).


The KPL alphabet

Letters: a-z, A-Z, _
Digits: 0-9
Special symbols:

+, -, *, /, >, <, !, =, [space], [comma], ., :, ;, ', (, )


Tokens of the KPL language

Keywords
PROGRAM, CONST, TYPE, VAR, PROCEDURE,
FUNCTION, BEGIN, END, ARRAY, OF, INTEGER,
CHAR, CALL, IF, THEN, ELSE, WHILE, DO, FOR, TO
Operators
:= (assignment), + (addition), - (subtraction), * (multiplication), /
(division), = (equal to), != (not equal to), > (greater than),
< (less than), >= (greater than or equal to),
<= (less than or equal to)


Tokens of the KPL language (continued)

Special symbols
; (semicolon), . (period), : (colon), , (comma), (
(left parenthesis), ) (right parenthesis),
' (single quote)
And
(. and .) to mark an array index
(* and *) to mark the beginning and the end
of a comment
In addition
identifiers, numbers, character constants

Recognizing KPL tokens

The tokens of KPL form a regular language and can be described by a regular syntax diagram.
They can therefore be recognized by a deterministic finite automaton (DFA).
The scanner is a deterministic finite automaton.


Recognizing KPL tokens (continued)


[Figure: state transitions of the automaton while scanning the input "car >= 30"]

Input: car >= 30

car: S0, S1, S1, S1
Token type: identifier
Token value: car

>=: S0, S0, S4, S5
Token type: greater-than-or-equal comparison operator
Token value: >=

30: S0, S0, S2, S2
Token type: number
Token value: 30

Each time a token has been fully recognized, the automaton returns to state 0.
When an error occurs (for example, a character outside the alphabet is encountered), the automaton moves to state -1.
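
The behavior described above can be sketched as a small hand-written automaton. This is only an illustration for a tiny subset of KPL (identifiers, numbers, > and >=). The state numbers S0, S1, S2, S4, S5 and -1 are taken from the slide. The names nextState and countTokens are hypothetical, and the rule that a transition back to S0 means "token complete" is an assumption of this sketch.

```c
#include <ctype.h>

/* State numbers follow the slide's example; all other states are omitted. */
enum { S_ERR = -1, S0 = 0, S1 = 1, S2 = 2, S4 = 4, S5 = 5 };

/* Hypothetical helper: one step of the automaton.
   Returning S0 from a non-start state means "the current token is complete". */
int nextState(int state, int c) {
    switch (state) {
    case S0:
        if (c == ' ') return S0;                /* skip blanks */
        if (isalpha(c) || c == '_') return S1;  /* start of an identifier */
        if (isdigit(c)) return S2;              /* start of a number */
        if (c == '>') return S4;
        return S_ERR;                           /* not handled by this subset */
    case S1: return (isalnum(c) || c == '_') ? S1 : S0;
    case S2: return isdigit(c) ? S2 : S0;
    case S4: return (c == '=') ? S5 : S0;
    case S5: return S0;
    default: return S_ERR;
    }
}

/* Hypothetical driver: counts the tokens in input, or returns -1 on error. */
int countTokens(const char *input) {
    int state = S0, count = 0;
    const char *p = input;
    for (;;) {
        int c = (unsigned char)*p;
        /* The end of the input is treated like a blank so the last token ends. */
        int next = nextState(state, c == '\0' ? ' ' : c);
        if (next == S_ERR) return -1;
        if (state != S0 && next == S0) {
            count++;              /* token finished; c is not consumed */
            state = S0;
            if (c == '\0') break;
            continue;             /* examine c again, now from state 0 */
        }
        if (c == '\0') break;
        state = next;
        p++;
    }
    return count;
}
```

With this sketch, countTokens("car >= 30") visits the same states as the example and finds three tokens. A character outside the handled subset, such as #, drives the automaton to -1.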

Building the scanner - Structure

No.  File name                Contents
1    Makefile                 Project
2    scanner.c                Main file
3    reader.h, reader.c       Reading the source code
4    charcode.h, charcode.c   Character classification
5    token.h, token.c         Token classification and recognition, keywords
6    error.h, error.c         Error reporting


Building the scanner - reader

// Read one character from the input stream
int readChar(void);
// Open the input stream
int openInputStream(char *fileName);
// Close the input stream
void closeInputStream(void);
// Current line and column numbers
int lineNo, colNo;
// Current character
int currentChar;
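
One possible implementation of these declarations is sketched below. The function and variable names come from the slide; everything else is an assumption: the private inputStream handle, the return convention of openInputStream (1 on success, 0 on failure), the column-counting rule, and the choice to pre-load the first character when the stream is opened.

```c
#include <stdio.h>

int lineNo, colNo;
int currentChar;
static FILE *inputStream = NULL;   /* hypothetical private stream handle */

/* Reads the next character and keeps lineNo/colNo up to date. */
int readChar(void) {
    currentChar = getc(inputStream);
    colNo++;
    if (currentChar == '\n') {     /* the next character starts a new line */
        lineNo++;
        colNo = 0;
    }
    return currentChar;
}

/* Opens the file and pre-loads the first character.
   Assumed convention: returns 1 on success, 0 on failure. */
int openInputStream(char *fileName) {
    inputStream = fopen(fileName, "r");
    if (inputStream == NULL) return 0;
    lineNo = 1;
    colNo = 0;
    readChar();
    return 1;
}

/* Closes the stream if it is open. */
void closeInputStream(void) {
    if (inputStream != NULL) {
        fclose(inputStream);
        inputStream = NULL;
    }
}
```

Because currentChar always holds the character that has not been consumed yet, the scanner can look at it before deciding which token to read.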


Building the scanner - charcode


typedef enum {
  CHAR_SPACE,         // Blank
  CHAR_LETTER,        // Letter
  CHAR_DIGIT,         // Digit
  CHAR_PLUS,          // +
  CHAR_MINUS,         // -
  CHAR_TIMES,         // *
  CHAR_SLASH,         // /
  CHAR_LT,            // <
  CHAR_GT,            // >
  CHAR_EXCLAIMATION,  // !
  CHAR_EQ,            // =
  CHAR_COMMA,         // ,
  CHAR_PERIOD,        // .
  CHAR_COLON,         // :
  CHAR_SEMICOLON,     // ;
  CHAR_SINGLEQUOTE,   // '
  CHAR_LPAR,          // (
  CHAR_RPAR,          // )
  CHAR_UNKNOWN        // Character outside the alphabet
} CharCode;


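Building the scanner - charcode (continued)

charcode.c defines a table charCodes that maps each character of the ASCII table to one of the CharCode values defined above.
Note: the character-reading function getc may return EOF, a negative integer (typically -1) that lies outside the ASCII table, so it must not be used as an index into charCodes.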
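
A minimal sketch of such a table is shown below. The CharCode enumeration and the name charCodes come from the slides. The run-time initializer initCharCodes and the guard charCodeOf are hypothetical helpers, and mapping tab, newline and carriage return to CHAR_SPACE is an assumption. The loops over letters and digits rely on ASCII ordering.

```c
#include <stdio.h>   /* EOF */

typedef enum {
    CHAR_SPACE, CHAR_LETTER, CHAR_DIGIT, CHAR_PLUS, CHAR_MINUS, CHAR_TIMES,
    CHAR_SLASH, CHAR_LT, CHAR_GT, CHAR_EXCLAIMATION, CHAR_EQ, CHAR_COMMA,
    CHAR_PERIOD, CHAR_COLON, CHAR_SEMICOLON, CHAR_SINGLEQUOTE, CHAR_LPAR,
    CHAR_RPAR, CHAR_UNKNOWN
} CharCode;

CharCode charCodes[256];

/* Hypothetical initializer: fills the table at run time instead of
   spelling out all 256 entries by hand. */
void initCharCodes(void) {
    int c;
    for (c = 0; c < 256; c++) charCodes[c] = CHAR_UNKNOWN;
    for (c = 'a'; c <= 'z'; c++) charCodes[c] = CHAR_LETTER;
    for (c = 'A'; c <= 'Z'; c++) charCodes[c] = CHAR_LETTER;
    charCodes['_'] = CHAR_LETTER;
    for (c = '0'; c <= '9'; c++) charCodes[c] = CHAR_DIGIT;
    /* Assumption: every kind of white space is classified as CHAR_SPACE. */
    charCodes[' '] = charCodes['\t'] = CHAR_SPACE;
    charCodes['\n'] = charCodes['\r'] = CHAR_SPACE;
    charCodes['+'] = CHAR_PLUS;
    charCodes['-'] = CHAR_MINUS;
    charCodes['*'] = CHAR_TIMES;
    charCodes['/'] = CHAR_SLASH;
    charCodes['<'] = CHAR_LT;
    charCodes['>'] = CHAR_GT;
    charCodes['!'] = CHAR_EXCLAIMATION;
    charCodes['='] = CHAR_EQ;
    charCodes[','] = CHAR_COMMA;
    charCodes['.'] = CHAR_PERIOD;
    charCodes[':'] = CHAR_COLON;
    charCodes[';'] = CHAR_SEMICOLON;
    charCodes['\''] = CHAR_SINGLEQUOTE;
    charCodes['('] = CHAR_LPAR;
    charCodes[')'] = CHAR_RPAR;
}

/* Hypothetical guard: never index the table with EOF or any other
   value outside 0..255. */
CharCode charCodeOf(int c) {
    if (c < 0 || c > 255) return CHAR_UNKNOWN;
    return charCodes[c];
}
```

Checking for EOF before the lookup, or going through a guard like charCodeOf, avoids reading outside the array when the input ends.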


Building the scanner - token

typedef enum {
  TK_NONE,    // Represents an error
  TK_IDENT,   // Identifier
  TK_NUMBER,  // Number
  TK_CHAR,    // Character constant
  TK_EOF,     // End of program
  // Keywords
  KW_PROGRAM, KW_CONST, KW_TYPE, KW_VAR,
  KW_INTEGER, KW_CHAR, KW_ARRAY, KW_OF,
  KW_FUNCTION, KW_PROCEDURE,
  KW_BEGIN, KW_END, KW_CALL,
  KW_IF, KW_THEN, KW_ELSE,
  KW_WHILE, KW_DO, KW_FOR, KW_TO,
  // Special symbols
  SB_SEMICOLON, SB_COLON, SB_PERIOD, SB_COMMA,
  SB_ASSIGN, SB_EQ, SB_NEQ, SB_LT, SB_LE, SB_GT, SB_GE,
  SB_PLUS, SB_MINUS, SB_TIMES, SB_SLASH,
  SB_LPAR, SB_RPAR, SB_LSEL, SB_RSEL
} TokenType;


Building the scanner - token (continued)

// Storage structure of a token
typedef struct {
  char string[MAX_IDENT_LEN + 1];
  int lineNo, colNo;
  TokenType tokenType;
  int value;
} Token;
// Check whether a string is a keyword
TokenType checkKeyword(char *string);
// Create a new token with the given type and position
Token* makeToken(TokenType tokenType, int lineNo, int colNo);
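
The two functions declared above could be implemented along the following lines. This is a sketch under several assumptions: MAX_IDENT_LEN is set to 15 because the slides do not give its value, the keywords table is a hypothetical helper, keywords are matched exactly in upper case, and checkKeyword returns TK_NONE when the string is not a keyword.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_IDENT_LEN 15   /* assumed value; not given in the slides */

typedef enum {
    TK_NONE, TK_IDENT, TK_NUMBER, TK_CHAR, TK_EOF,
    KW_PROGRAM, KW_CONST, KW_TYPE, KW_VAR,
    KW_INTEGER, KW_CHAR, KW_ARRAY, KW_OF,
    KW_FUNCTION, KW_PROCEDURE,
    KW_BEGIN, KW_END, KW_CALL,
    KW_IF, KW_THEN, KW_ELSE,
    KW_WHILE, KW_DO, KW_FOR, KW_TO,
    SB_SEMICOLON, SB_COLON, SB_PERIOD, SB_COMMA,
    SB_ASSIGN, SB_EQ, SB_NEQ, SB_LT, SB_LE, SB_GT, SB_GE,
    SB_PLUS, SB_MINUS, SB_TIMES, SB_SLASH,
    SB_LPAR, SB_RPAR, SB_LSEL, SB_RSEL
} TokenType;

typedef struct {
    char string[MAX_IDENT_LEN + 1];
    int lineNo, colNo;
    TokenType tokenType;
    int value;
} Token;

/* Hypothetical lookup table pairing each keyword with its token type. */
static const struct { const char *text; TokenType type; } keywords[] = {
    {"PROGRAM", KW_PROGRAM}, {"CONST", KW_CONST}, {"TYPE", KW_TYPE},
    {"VAR", KW_VAR}, {"INTEGER", KW_INTEGER}, {"CHAR", KW_CHAR},
    {"ARRAY", KW_ARRAY}, {"OF", KW_OF}, {"FUNCTION", KW_FUNCTION},
    {"PROCEDURE", KW_PROCEDURE}, {"BEGIN", KW_BEGIN}, {"END", KW_END},
    {"CALL", KW_CALL}, {"IF", KW_IF}, {"THEN", KW_THEN}, {"ELSE", KW_ELSE},
    {"WHILE", KW_WHILE}, {"DO", KW_DO}, {"FOR", KW_FOR}, {"TO", KW_TO}
};

/* Returns the keyword's token type, or TK_NONE if string is not a keyword. */
TokenType checkKeyword(char *string) {
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(keywords[i].text, string) == 0)
            return keywords[i].type;
    return TK_NONE;
}

/* Allocates a token with the given type and position; the caller frees it. */
Token* makeToken(TokenType tokenType, int lineNo, int colNo) {
    Token *token = (Token *)malloc(sizeof(Token));
    if (token == NULL) return NULL;
    token->string[0] = '\0';
    token->tokenType = tokenType;
    token->lineNo = lineNo;
    token->colNo = colNo;
    token->value = 0;
    return token;
}
```

A function such as readIdentKeyword can first collect the letters and digits of a word, then call checkKeyword and fall back to TK_IDENT when the result is TK_NONE.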


Building the scanner - error

// List of errors that can occur during lexical analysis
typedef enum {
  ERR_ENDOFCOMMENT,
  ERR_IDENTTOOLONG,
  ERR_INVALIDCHARCONSTANT,
  ERR_INVALIDSYMBOL
} ErrorCode;
// Error messages
#define ERM_ENDOFCOMMENT "End of comment expected!"
#define ERM_IDENTTOOLONG "Identification too long!"
#define ERM_INVALIDCHARCONSTANT "Invalid const char!"
#define ERM_INVALIDSYMBOL "Invalid symbol!"
// Error-reporting function
void error(ErrorCode err, int lineNo, int colNo);
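
A possible body for error is sketched below. The error codes and messages are those listed above. The helper errorMessage, the "line-column: message" output format and the choice to print to stderr without terminating the program are assumptions of this sketch.

```c
#include <stdio.h>

typedef enum {
    ERR_ENDOFCOMMENT,
    ERR_IDENTTOOLONG,
    ERR_INVALIDCHARCONSTANT,
    ERR_INVALIDSYMBOL
} ErrorCode;

#define ERM_ENDOFCOMMENT "End of comment expected!"
#define ERM_IDENTTOOLONG "Identification too long!"
#define ERM_INVALIDCHARCONSTANT "Invalid const char!"
#define ERM_INVALIDSYMBOL "Invalid symbol!"

/* Hypothetical helper: maps an error code to its message. */
const char *errorMessage(ErrorCode err) {
    switch (err) {
    case ERR_ENDOFCOMMENT:        return ERM_ENDOFCOMMENT;
    case ERR_IDENTTOOLONG:        return ERM_IDENTTOOLONG;
    case ERR_INVALIDCHARCONSTANT: return ERM_INVALIDCHARCONSTANT;
    case ERR_INVALIDSYMBOL:       return ERM_INVALIDSYMBOL;
    }
    return "Unknown error!";
}

/* Prints the position and the message (assumed format) to stderr. */
void error(ErrorCode err, int lineNo, int colNo) {
    fprintf(stderr, "%d-%d: %s\n", lineNo, colNo, errorMessage(err));
}
```

Keeping the code-to-message mapping in one helper means a new error kind only needs one extra case.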


Building the scanner - scanner

// Read one token, starting at the current position
Token* getToken(void) {
  Token *token;
  int ln, cn;

  if (currentChar == EOF)
    return makeToken(TK_EOF, lineNo, colNo);

  switch (charCodes[currentChar]) {
  case CHAR_SPACE: skipBlank(); return getToken();
  case CHAR_LETTER: return readIdentKeyword();
  case CHAR_DIGIT: return readNumber();
  case CHAR_PLUS:
    token = makeToken(SB_PLUS, lineNo, colNo);
    readChar();
    return token;
  // ... (the remaining cases are not shown on the slide)
  }
}


Assignment

Complete the following functions in scanner.c:

void skipBlank();
void skipComment();
Token* readIdentKeyword(void);
Token* readNumber(void);
Token* readConstChar(void);
Token* getToken(void);
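
As a starting point, three of these functions can be sketched in simplified form. To keep the example runnable on its own, readChar is replaced by a string-backed stand-in and openString is a hypothetical helper. skipCommentBody and readNumberValue are simplified variants with different return types from the skipComment and readNumber required by the assignment, which would call error and build a Token instead.

```c
#include <ctype.h>
#include <stdio.h>   /* EOF */

/* String-backed stand-in for reader.c so the sketch runs on its own. */
static const char *src = "";
int currentChar;

int readChar(void) {
    currentChar = (*src != '\0') ? (unsigned char)*src++ : EOF;
    return currentChar;
}

/* Hypothetical helper: start scanning an in-memory string. */
void openString(const char *text) {
    src = text;
    readChar();
}

/* Skips spaces, tabs and newlines. */
void skipBlank(void) {
    while (currentChar != EOF && isspace(currentChar))
        readChar();
}

/* Skips the body of a comment; call it after the opening "(*" has been
   consumed. Returns 1 if the closing "*)" was found and 0 if the input
   ended first (where a full scanner would report ERR_ENDOFCOMMENT). */
int skipCommentBody(void) {
    int sawStar = 0;                     /* was the previous character '*'? */
    while (currentChar != EOF) {
        if (sawStar && currentChar == ')') {
            readChar();                  /* consume ')' */
            return 1;
        }
        sawStar = (currentChar == '*');
        readChar();
    }
    return 0;
}

/* Reads a run of digits and returns its value (overflow is not checked). */
int readNumberValue(void) {
    int value = 0;
    while (currentChar != EOF && isdigit(currentChar)) {
        value = value * 10 + (currentChar - '0');
        readChar();
    }
    return value;
}
```

Each function leaves currentChar on the first character it did not consume, which is the convention getToken relies on when it dispatches on charCodes[currentChar].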
