Lex & Yacc: Lex - A Lexical Analyzer Generator

The Lex & Yacc Page


The asteroid to kill this dinosaur is still in orbit.
- Lex Manual Page 


OVERVIEW
A compiler or interpreter for a programming language is often decomposed into two parts:

1. Read the source program and discover its structure.


2. Process this structure, e.g. to generate the target program.

Lex and Yacc can generate program fragments that solve the first task.

The task of discovering the source structure is in turn decomposed into two subtasks:

1. Split the source file into tokens (Lex).


2. Find the hierarchical structure of the program (Yacc).

 A First Example: A Simple Interpreter 

LEX
Lex - A Lexical Analyzer Generator
M. E. Lesk and E. Schmidt

Lex helps write programs whose control flow is directed by instances of regular expressions in
the input stream. It is well suited for editor-script type transformations and for segmenting input
in preparation for a parsing routine.

Lex source is a table of regular expressions and corresponding program fragments. The table is
translated to a program which reads an input stream, copying it to an output stream and
partitioning the input into strings which match the given expressions. As each such string is
recognized the corresponding program fragment is executed. The recognition of the expressions
is performed by a deterministic finite automaton generated by Lex. The program fragments
written by the user are executed in the order in which the corresponding regular expressions occur
in the input stream.
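
As a rough illustration of the behaviour described above, the following C sketch is a hand-written stand-in for the kind of scanner Lex would generate. It is not Lex output. It assumes a single rule, a run of digits, whose action replaces the match with the text "<NUM>", and it copies every other character to the output unchanged. The function name scan_numbers and its parameters are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-written stand-in for a Lex-generated scanner.
 * Rule 1: a run of digits, [0-9]+, is replaced by the text "<NUM>".
 * Default: any other character is copied to the output unchanged.
 * Returns the number of times rule 1 fired. */
int scan_numbers(const char *in, char *out, size_t out_size)
{
    size_t n = 0;       /* bytes written to out so far */
    int matches = 0;    /* how often the digit rule matched */

    assert(out_size > 0);
    while (*in != '\0') {
        if (isdigit((unsigned char)*in)) {
            /* Longest match: consume the whole run of digits. */
            while (isdigit((unsigned char)*in))
                in++;
            /* Action attached to rule 1. */
            const char *repl = "<NUM>";
            size_t len = strlen(repl);
            if (n + len >= out_size)
                break;              /* output buffer full: stop early */
            memcpy(out + n, repl, len);
            n += len;
            matches++;
        } else {
            /* No rule matched: copy the character through. */
            if (n + 1 >= out_size)
                break;
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return matches;
}
```

Called on the input "a12b 7" with a sufficiently large buffer, this sketch fires the digit rule twice, once per run of digits and in the order the runs occur in the input. A real Lex specification would express the same rule as a regular expression paired with a C fragment, and Lex would generate the matching automaton.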

 Online Manual
 PostScript
 Lex Manual Page
YACC
Yacc: Yet Another Compiler-Compiler
Stephen C. Johnson

Computer program input generally has some structure; in fact, every computer program that does
input can be thought of as defining an ``input language'' which it accepts. An input language may
be as complex as a programming language, or as simple as a sequence of numbers. Unfortunately,
usual input facilities are limited, difficult to use, and often are lax about checking their inputs for
validity.

Yacc provides a general tool for describing the input to a computer program. The Yacc user
specifies the structures of his input, together with code to be invoked as each such structure is
recognized. Yacc turns such a specification into a subroutine that handles the input process;
frequently, it is convenient and appropriate to have most of the flow of control in the user's
application handled by this subroutine.

 Online Manual
 PostScript
 Yacc Manual Page

FLEX
Flex, A fast scanner generator
Vern Paxson

flex is a tool for generating scanners: programs which recognize lexical patterns in text. flex
reads the given input files, or its standard input if no file names are given, for a description of a
scanner to generate. The description is in the form of pairs of regular expressions and C code,
called rules. flex generates as output a C source file, `lex.yy.c', which defines a routine `yylex()'.
This file is compiled and linked with the `-lfl' library to produce an executable. When the
executable is run, it analyzes its input for occurrences of the regular expressions. Whenever it
finds one, it executes the corresponding C code.

 Online Manual
 PostScript
 Flex Manual Page
 Download Flex from ftp://prep.ai.mit.edu/pub/gnu/

BISON
Bison, The YACC-compatible Parser Generator
Charles Donnelly and Richard Stallman

Bison is a general-purpose parser generator that converts a grammar description for an LALR(1)
context-free grammar into a C program to parse that grammar. Once you are proficient with
Bison, you may use it to develop a wide range of language parsers, from those used in simple
desk calculators to complex programming languages.

Bison is upward compatible with Yacc: all properly-written Yacc grammars ought to work with
Bison with no change. Anyone familiar with Yacc should be able to use Bison with little trouble.

 Online Manual
 PostScript
 Bison Manual Page
 Download Bison from ftp://prep.ai.mit.edu/pub/gnu/

TOOLS

Other tools for compiler writers:

 Compiler Construction Kits


 Lexer and Parser Generators
 Attribute Grammar Systems
 Transformation Tools
 Backend Generators
 Program Analysis and Optimisation
 Environment Generators
 Infrastructure, Components, Tools
 Compiler Construction with Java

BOOKS

Yacc
From Wikipedia, the free encyclopedia

Yacc is a computer program for the Unix operating system. It is an LALR parser generator: from an
analytic grammar written in a notation similar to BNF, it generates an LALR parser, the part of a
compiler that tries to make syntactic sense of the source code. Yacc itself used to be available as
the default parser generator on most Unix systems, though it has since been supplanted as the
default by more recent, largely compatible, programs.

Description
YACC is an acronym for "Yet Another Compiler Compiler". It is an LALR parser generator: from an
analytic grammar written in a notation similar to BNF, it generates an LALR parser, the part of a
compiler that tries to make syntactic sense of the source code.[1] It was originally
developed in the early 1970s by Stephen C. Johnson at AT&T Corporation and written in the B
programming language, but soon rewritten in C.[2] It appeared as part of Version 3 Unix,[3] and a full
description of Yacc was published in 1975.[4]

The input to Yacc is a grammar with snippets of C code (called "actions") attached to its rules. Its
output is a shift-reduce parser in C that executes the C snippets associated with each rule as soon
as the rule is recognized. Typical actions involve the construction of parse trees. Using an example
from Johnson, if the call node(label, left, right) constructs a binary parse tree node with
the specified label and children, then the rule

expr : expr '+' expr { $$ = node('+', $1, $3); }

recognizes summation expressions and constructs nodes for them. The special identifiers $$, $1
and $3 refer to items on the parser's stack.[4]
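
To make the action in the rule above more concrete, here is a minimal C sketch of what node and the reduction step might look like. The source only specifies the call node(label, left, right). The Node structure, the helper reduce_sum, the explicit value stack and free_tree are all assumptions made for illustration, and the stack handling is a simplification of what a Yacc-generated parser does internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type: the source names only the call
 * node(label, left, right), not the structure behind it. */
typedef struct Node {
    int label;              /* operator character such as '+', or a leaf tag */
    struct Node *left;
    struct Node *right;
} Node;

/* Construct a binary parse tree node with the given label and children. */
Node *node(int label, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();            /* out of memory: give up in this sketch */
    n->label = label;
    n->left = left;
    n->right = right;
    return n;
}

/* Simplified emulation of reducing
 *     expr : expr '+' expr { $$ = node('+', $1, $3); }
 * The three topmost value-stack slots hold $1, $2 and $3.
 * They are popped and replaced by the single value $$. */
Node *reduce_sum(Node **stack, int *top)
{
    assert(*top >= 3);
    Node *lhs = stack[*top - 3];            /* $1: left operand  */
    Node *rhs = stack[*top - 1];            /* $3: right operand */
    Node *result = node('+', lhs, rhs);     /* $$                */
    *top -= 3;
    stack[(*top)++] = result;
    return result;
}

/* Release a tree built with node(). */
void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

In this sketch the slot for $2 belongs to the '+' token and is simply ignored, which mirrors the way the action in the rule uses only $1 and $3.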

Yacc and similar programs (largely reimplementations) have been very popular. Yacc itself used to
be available as the default parser generator on most Unix systems, though it has since been
supplanted as the default by more recent, largely compatible, programs such as Berkeley
Yacc, GNU bison, MKS Yacc and Abraxas PCYACC. An updated version of the original AT&T
version is included as part of Sun's OpenSolaris project. Each offers slight improvements and
additional features over the original Yacc, but the concept and syntax have remained the same.
Yacc has also been rewritten for other languages, including OCaml,[5] Ratfor, ML, Ada, Pascal,
Java, Python, Ruby, Go[6] and Common Lisp.[7]

Yacc produces only a parser (phrase analyzer). Full syntactic analysis also requires an
external lexical analyzer to perform the first stage, tokenization (word analysis), which is then
followed by the parsing stage proper.[4] Lexical analyzer generators, such as Lex or Flex, are widely
available. The IEEE POSIX P1003.2 standard defines the functionality and requirements for both
Lex and Yacc.[8]
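
The division of labour between the two stages can be sketched in C. By convention, a Yacc-generated parser obtains tokens by calling a function named yylex, which returns an integer token code (0 at end of input) and leaves the token's value in yylval. The version below is a hand-written stand-in rather than Lex output. The token code NUMBER, the helper set_input and the string-based input source are assumptions made for illustration. In a real build, Yacc would normally generate the token codes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token code; Yacc would normally generate this definition. */
enum { NUMBER = 258 };

/* Hypothetical input source: a cursor into a NUL-terminated string. */
static const char *input_cursor = "";

/* Semantic value of the most recently returned token. */
int yylval;

/* Point the lexer at a new input string (helper invented for this sketch). */
void set_input(const char *text)
{
    assert(text != NULL);
    input_cursor = text;
}

/* Return the next token code, or 0 at end of input. */
int yylex(void)
{
    /* Skip blanks and tabs between tokens. */
    while (*input_cursor == ' ' || *input_cursor == '\t')
        input_cursor++;

    if (*input_cursor == '\0')
        return 0;                       /* end of input */

    if (isdigit((unsigned char)*input_cursor)) {
        /* Accumulate the decimal value of a run of digits. */
        yylval = 0;
        while (isdigit((unsigned char)*input_cursor))
            yylval = yylval * 10 + (*input_cursor++ - '0');
        return NUMBER;
    }

    /* Any other character is its own token, for example '+'. */
    return (unsigned char)*input_cursor++;
}
```

After set_input("12 + 30"), successive calls to yylex in this sketch yield NUMBER with yylval set to 12, then '+', then NUMBER with yylval set to 30, and finally 0. A parser consumes exactly this stream and never looks at individual characters.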
