(F)lex & Bison/Yacc

Language Tools for C/C++


CS 550 Programming Languages

Alexander Gutierrez
Lex and Flex Overview

● Lex/Flex is a scanner generator for C/C++
● It reads pairs of regular expressions and code and creates a lexical analyzer
(scanner) written in C/C++
● Lex was the original generator, released under a proprietary license
● Flex is a separate project that reimplements Lex as open-source software
● Lex was originally the standard tool, but Flex is now the preferred version
● The two are practically the same and Lex is harder to obtain, so we will refer to
Flex

Yacc and Bison Overview

● Yacc/Bison is a parser generator for C/C++
● As a compiler-compiler (parser generator), it is used to create a parser
● It reads an LALR(1) grammar and creates a parser
● This parser can be used as a component of a compiler by feeding it tokens
generated by a lexical analyzer
● In this case, we will use Flex to generate the tokens for Bison
● Similar to the Lex/Flex history, Bison was created as an open-source version of
Yacc
● We will refer to Bison for this presentation

Which to use?

● We will use Flex & Bison
○ http://flex.sourceforge.net/
○ http://www.gnu.org/software/bison/
● These are freely available (Flex under a BSD license, Bison under the GNU GPL);
the original Lex & Yacc are not (AT&T proprietary)
● Lex & Yacc were formerly standard on Unix machines, but have largely been
superseded by Flex & Bison
● Since they are essentially the same, we only care about Flex & Bison
● Flex & Bison are installed on tux

Lex/Flex on tux.cs.drexel.edu

● Only Flex is available
● Command name: flex

● Why am I able to type lex and it seems to work?
On tux, lex is a symlink to flex (lex -> flex)

Yacc/Bison on tux.cs.drexel.edu

● Only Bison is available


● Command name: bison

● Typing yacc seems to work? It is just a symlink, too?


Mostly, yes…

● On tux, invoking yacc calls a script that runs bison in yacc-compatibility mode

● Why is there a yacc-compatibility mode for bison if they are basically the
same?
To account for some POSIX differences and minor quirks that we don’t really
care about

● Just use bison

The Bigger Picture

● We can use Flex and Bison to (relatively) easily implement our own
programming language

● To do this, we need to write the input specifications (the "instruction
manuals") for Flex and Bison

● For Flex, we need to determine what tokens our language consists of and
how each token can be described using a regular expression

● For Bison, we need to create an LALR(1) grammar that describes how these
tokens combine, with C/C++ code attached to each rule that says what to do
when the rule is recognized

● Both Flex and Bison produce C/C++ source code, which we
compile using an appropriate C/C++ compiler

Balanced Parentheses Example

● The code for this example can be found at:
○ https://www.cs.drexel.edu/~jjohnson/2012-13/spring/cs550/programs/grammars/
○ Files: paren.l, paren.y

● This example looks at the language of balanced parentheses
● First, we will look at the regular expression file we give to Flex
● Next, we will look at the grammar we give to Bison
● Finally, we will compile and test our compiler

paren.l

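%{
/* Token codes (LEFTPAREN, RIGHTPAREN) come from the header written by "bison -d paren.y". */
#include "paren.tab.h"
%}

%%
\( { return LEFTPAREN; }
\) { return RIGHTPAREN; }
.|\n { return 0; /* any other character, including newline, ends the input */ }
%%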
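
To show what these three rules amount to, here is a hand-written C function that behaves roughly like the scanner Flex generates from paren.l. It is only a sketch: the name next_token, the string-cursor interface, and the numeric token values are assumptions made for illustration (the real scanner is called yylex, reads from standard input, and takes its token codes from paren.tab.h).

```c
/* Hypothetical token codes; the real values come from paren.tab.h. */
enum token { END_OF_INPUT = 0, LEFTPAREN = 258, RIGHTPAREN = 259 };

/*
 * Hand-written stand-in for the scanner generated from paren.l.
 * Looks at the character under *cursor, advances past it, and
 * returns the matching token code.
 */
int next_token(const char **cursor)
{
    char c = **cursor;

    if (c == '\0')
        return END_OF_INPUT;   /* nothing left to read */

    (*cursor)++;
    switch (c) {
    case '(':
        return LEFTPAREN;      /* rule: \(   */
    case ')':
        return RIGHTPAREN;     /* rule: \)   */
    default:
        return END_OF_INPUT;   /* rule: .|\n returns 0 */
    }
}
```

Returning 0 is how a scanner tells a Bison parser that the input has ended, which is why the last rule makes a newline (or any stray character) finish the parse.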

paren.y
%{
#include <string.h>
#include <stdio.h>
%}

%token LEFTPAREN RIGHTPAREN

%%
S0: S1 S0 { printf("S0 => S1 S0\n"); }
  | S1 { printf("S0 => S1\n"); }
  ;

S1: LEFTPAREN S2 RIGHTPAREN { printf("S1 => (S2)\n"); }
  | LEFTPAREN RIGHTPAREN { printf("S1 => ()\n"); }
  ;

S2: S1 S2 { printf("S2 => S1 S2\n"); }
  | S1 { printf("S2 => S1\n"); }
  ;
%%
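
To make the grammar concrete, here is a minimal hand-written recognizer for the same language, offered as a sketch. It uses recursive descent rather than the table-driven LALR(1) parser that Bison generates, it works on characters directly instead of on tokens from Flex, and the names parse_s1, parse_sequence and is_balanced are made up for this illustration.

```c
/*
 * Grammar from paren.y, read top-down:
 *   S0 -> S1 S0 | S1
 *   S1 -> ( S2 ) | ( )
 *   S2 -> S1 S2 | S1
 * S0 and S2 both mean "one or more S1 groups in a row".
 */

static int parse_s1(const char **s);

/* Parse one or more consecutive S1 groups (covers both S0 and S2). */
static int parse_sequence(const char **s)
{
    if (!parse_s1(s))
        return 0;
    while (**s == '(') {
        if (!parse_s1(s))
            return 0;
    }
    return 1;
}

/* Parse a single S1: "(" followed by an optional S2 and then ")". */
static int parse_s1(const char **s)
{
    if (**s != '(')
        return 0;
    (*s)++;
    if (**s == '(' && !parse_sequence(s))   /* S1 -> ( S2 ) */
        return 0;
    if (**s != ')')
        return 0;
    (*s)++;
    return 1;
}

/*
 * Returns 1 if text starts with a non-empty balanced string that is
 * followed by the end of the string or by a non-parenthesis character
 * (paren.l treats any such character as end of input), else 0.
 */
int is_balanced(const char *text)
{
    const char *s = text;

    if (!parse_sequence(&s))
        return 0;
    return *s != '(' && *s != ')';
}
```

With this sketch, is_balanced("(())") returns 1 and is_balanced("(()()(") returns 0.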
Compiling on tux

● All we need are these two files, paren.l and paren.y, in our directory:

$ ls
paren.l paren.y

● We can compile using the following sequence of commands (NOTE: ORDER
IS VERY IMPORTANT)

$ bison -d paren.y
$ flex paren.l
$ gcc paren.tab.c lex.yy.c -ly -lfl

● Further explanation follows...

Running Bison

● We run bison first because it produces the token definitions that our lexical analyzer
needs: paren.l includes the header that bison generates

$ bison -d paren.y

● The ‘-d’ option tells bison to write a header file containing these definitions in addition
to the parser source

● Remember this line in paren.l :

#include "paren.tab.h"

● paren.tab.h is a header file that bison creates with this option


● Our directory now looks like:

$ ls
paren.l paren.tab.c paren.tab.h paren.y
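
For orientation, the header that bison -d writes usually boils down to something like the following sketch. This is not the literal generated file: the exact contents vary between Bison versions, the include-guard name is invented, and the numeric values are an assumption (Bison typically numbers named tokens from 258 upward so that they cannot collide with plain character codes).

```c
/* Sketch of what paren.tab.h typically declares; not the generated file itself. */
#ifndef PAREN_TAB_H_SKETCH
#define PAREN_TAB_H_SKETCH

/* One code per %token in paren.y. Values are assumed, not guaranteed. */
enum yytokentype {
    LEFTPAREN = 258,
    RIGHTPAREN = 259
};

/* Recent Bison versions also declare the parser entry point here. */
int yyparse(void);

#endif
```

Because paren.l includes this header, the scanner and the parser agree on which integer stands for which token.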

Running Flex

● Now we can simply run flex to produce our lexical analyzer:

$ flex paren.l

● This produces another piece of code, ‘lex.yy.c’ :

$ ls
lex.yy.c paren.l paren.tab.c paren.tab.h paren.y

● Next we can compile the whole thing and try it out.

Compiling the compiler

● Now, we use the last command mentioned earlier:

$ gcc paren.tab.c lex.yy.c -ly -lfl

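● Here, we use gcc to compile the generated code and link it against the bison
(yacc) library, -ly, and the flex library, -lfl.

● The order of the arguments matters: the libraries must come after the source
files that use them, and -ly comes before -lfl so that the main function from the
yacc library, which calls the parser, is the one that gets linked.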

● As usual with the GNU C/C++ compilers, the result is an executable named
a.out by default

Using Our New Language

● We can test to make sure it works by running the executable and giving it
input.

$ ./a.out
(())
S1 => ()
S2 => S1
S1 => (S2)
S0 => S1

● I entered a string that is in the language, (()), and the parser executed the
associated code. In this case, the code attached to the grammar rules consists of
the printf statements we saw earlier.

● In other words, this interpreter simply prints the grammar rules it applies
while parsing.
Using Our New Language (cont.)

● Another example input:

$ ./a.out
(()()(
S1 => ()
S1 => ()
syntax error

● In this case, I gave it a malformed program. The input is not in the
recognized language because its parentheses are unbalanced, so the parser
reported a syntax error.

● The grammar that we gave it is being enforced.

Summary

● Use flex and bison on tux (already installed)

● Design your own language by creating tokenization instructions via regular
expressions for Flex and a grammar for Bison

● Implement the language by giving Flex and Bison these instructions to
generate a lexical analyzer and parser, respectively

● Compile with a C/C++ compiler to realize your very own programming
language

Reference

John R. Levine, flex & bison, O'Reilly & Associates.

● This book can be found through Drexel’s library website for free.

● flex & bison is basically an updated version of the old lex & yacc book
because they are practically the same utilities.
