
Interpreter Assignment 1

Due Date: February 15th, 2022

As part of this class, we are going to practice creating lexical rules and grammars. These
concepts are important in the design of programming languages and are a little wacky, meaning
they take some practice to get the hang of. The sequence of these assignments will be as
follows:

1. Set up a development environment to use the programming languages Antlr and Java. We will
use this environment, and the IntelliJ IDE, to create an expression parser and have it interpret
programs. This will involve writing some lexical rules and one grammar rule.
2. In the second assignment, students will design several grammar rules and base their
decisions both on technical factors and on human factors evidence. In that assignment,
students will add several more grammar rules.

Setup for Assignments 1 - 2


In many of your computer science courses until now, you have focused on using Linux and
the console to write programs, but in practice such a procedure leaves something to be desired.
Modern development environments are not perfect, but they are explicitly designed for
developer productivity. For this reason, we are going to practice using more modern tools when
we write our code. While many are available for the type of work we are doing, the programming
language Antlr works best in IntelliJ, compared to NetBeans or others, because that
development team has focused on it more. You may not use the console, emacs, or other
editors for this assignment. Part of the lesson is in learning new environments. I have included
in this document instructions for getting everything set up.

Download and Setup IntelliJ


First, we need to download our Integrated Development Environment, which we can do here:

https://www.jetbrains.com/idea/

Download and install it. My screenshots are all on Mac, so it may look different if you use it on a
different operating system, but when first opened IntelliJ allows you to import older settings if
you wish. I did not have a previous version on the machine I was using, so I said no. In practice,
unless you already use IntelliJ, you can likely do the same.
Next, IntelliJ requires you accept a privacy policy. If you, for some reason I cannot imagine,
actually read the privacy policy and have legal objections to it, please contact the company, or a
lawyer, and ask them about it. Otherwise, click accept and move on:

Next, IntelliJ asks if they can collect data about your use of the system. Many development
environments do this and the data can be helpful for development teams trying to make the
tools better. However, there may also be privacy concerns. Please use your own best
judgement as to whether you accept or reject:
Next, IntelliJ wants to know if you would like to customize it. Do so if you wish, but I said no
and accepted the defaults:

For our Antlr project, we are now going to create a new Java Project. We do this because Antlr
is going to compile the files we create into Java files, which will then get compiled by the Java
Compiler. This might sound a little weird, and it is, but it is commonplace for programming
languages that generate other programming languages. Here's the window for creating a
project:

Before we make a project, we need to install the Antlr plugin. To do that, we click the Configure
button and click "Plugins." From there, this window opens:
Type Antlr into the box. There are two that pop up and the one we want is the Antlr v4 grammar
plugin (not the Antlrworks one). It is shown here:
Click install and the following window will pop up:

Antlr is run by an academic group and seems pretty safe to me. If you have concerns, again,
contact the author. Otherwise, just click accept. When this happens, it will ask you to restart the
IDE, so say yes to that too. Next, we are going to create a new Java Project:
Click next and then select the Java Hello World Application:
We are then going to give our project the name of Interpreter 1, like so:
Ok cool, we now have a Java project up and running. The development environment will now
look like this:
There may be a tip window that shows up when this screen appears. Feel free to keep it turned
on if you like the tips. Otherwise, feel free to turn it off. To run our program, we click the little
green triangle at the top, which is adorably small for some reason:
The output of our program shows at the bottom of the screen. Notice that it says Hello, World.
We are good to go to the next step, which is getting Antlr set up as part of our project.

Setting up Antlr in IntelliJ


We already set up the plugin for Antlr in IntelliJ, but we also have to hook it up to our project.
To do this, we are going to start by creating a grammar file in the main folder of src (right click
on it):

We will name it Interpreter.g4, where .g4 is the file extension for version 4 of Antlr. At the
bottom of the window is a button named Antlr Preview, which we click. When we do that, notice
it pops up an error:
Let's add some default code to get ourselves started. As a beginning template, put in the
following code:

grammar Interpreter;

start :
    expression EOF
    ;

expression
    :
    | INT
    | expression (PLUS | MINUS) expression
    ;

PLUS : '+';
MINUS : '-';
INT : '0'..'9'+;

Next, we are going to "test" our grammar to make sure it is doing what we expect. We do this by
right clicking on the word "start" in the text editor and then selecting Test Rule Start from the
context menu.

This will pull up a very handy window for testing and learning about grammars. It looks like this:
In the lower left box, type in a phrase like "1+2+5" and observe the parse tree that Antlr creates.
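
To make the shape of that tree more concrete, here is a minimal sketch in plain Java (no Antlr involved) that hand-parses the same kind of input and prints how the pieces are grouped. It assumes binary operators associate to the left, which is Antlr 4's default for a left-recursive rule like expression. The class TreeSketch and its methods are hypothetical names for illustration, not part of anything Antlr generates.

```java
public class TreeSketch {
    // Hypothetical helper: returns the index just past the run of digits starting at 'from'.
    static int endOfInt(String input, int from) {
        int i = from;
        while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
            i++;
        }
        if (i == from) {
            throw new IllegalArgumentException("expected INT at position " + from);
        }
        return i;
    }

    // Groups INT ((+|-) INT)* from left to right, mirroring how a
    // left-associative parse nests each new operator around the tree so far.
    static String group(String input) {
        int i = endOfInt(input, 0);
        String tree = input.substring(0, i);
        while (i < input.length()) {
            char op = input.charAt(i);
            if (op != '+' && op != '-') {
                throw new IllegalArgumentException("unexpected '" + op + "' at position " + i);
            }
            int start = i + 1;
            i = endOfInt(input, start);
            tree = "(" + tree + " " + op + " " + input.substring(start, i) + ")";
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(group("1+2+5")); // prints ((1 + 2) + 5)
    }
}
```

The nesting in the printed string corresponds to the nesting of expression nodes in the preview window: the subtree for 1+2 sits underneath the node that adds 5.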

Generating Antlr Into Java

The interpreter built into IntelliJ's Antlr plugin is helpful for understanding our grammar, but
ultimately we want to be able to create files written in our programming language and execute
them. This is the last part of our setup and then we can get to our first assignment. To do this,
we need to tell Antlr where to generate our files. We first right click on our grammar again,
but this time we select Configure Antlr. That pulls up this window:
In this window, we need to specify the output directory (basically where to generate the files)
and the Java package, and we need to tell Antlr to generate the code in Java. Antlr generates
into many programming languages (it calls them targets), so this is important. Finally, leave the
checkbox on "generate parse tree listener (default)." That impacts the way the code is
generated and is, in my opinion, easier to program.
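
For a rough idea of what a "listener" means here, the sketch below shows the pattern in plain Java: a walker visits a tree depth-first and calls an enter method before a node's children and an exit method after them. Everything in it (ListenerSketch, Node, Listener, walk, trace) is a hypothetical stand-in. The code Antlr generates has its own listener types with enter and exit methods named after each grammar rule, and it treats leaf tokens separately, which this simplified version does not.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerSketch {
    // A bare-bones tree node: a label plus zero or more children.
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Callbacks fired by the walker, in the spirit of enter/exit listener methods.
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    // Depth-first walk: enter a node, walk its children in order, then exit it.
    static void walk(Node node, Listener listener) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(child, listener);
        }
        listener.exit(node);
    }

    // Records the order of callbacks for a hand-built tree of "1+2".
    static List<String> trace() {
        Node tree = new Node("expression", new Node("1"), new Node("+"), new Node("2"));
        List<String> events = new ArrayList<>();
        walk(tree, new Listener() {
            public void enter(Node n) { events.add("enter " + n.label); }
            public void exit(Node n) { events.add("exit " + n.label); }
        });
        return events;
    }

    public static void main(String[] args) {
        // The first event is "enter expression" and the last is "exit expression".
        trace().forEach(System.out::println);
    }
}
```

The appeal of this style is that the walking logic is written once, and the code that reacts to the tree lives entirely in the callbacks.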
Warning: Note that Java must have a capital J, not a lowercase J. If you accidentally type a
lowercase j, you will obtain the following error:

error(31): ANTLR cannot generate java code as of version 4.7.2

If everything went well, there should be a folder called interpreter with a bunch of stuff in it. Feel
free to look at it if you want, but it is not critical for this assignment. The environment will now
look like so:

Run Java

Once Antlr is generating, we now have to run Java code to tell our program to execute. But, to
compile our generated code, there is one final step: we need to add a dependency for Antlr to
our Java project. To start, we can download the Antlr Jar from this page:

https://www.antlr.org/download.html

It is a little hard to see it, but the actual download link we want is this one:
https://www.antlr.org/download/antlr-4.9.1-complete.jar

That file is a "Jar" file, which basically just means it is a compressed file that contains code we
can execute, all bundled up in a way Java understands. We need to place this file somewhere.
For the sake of argument, I put mine here:

/Users/stefika/IdeaProjects/antlr-4.9.1-complete.jar

To add it into our project, we then go to File -> Project Structure. From there, we get a window
like so:
In this window, we click the little plus and select "JARs or directories..." and browse to where we
stored the file we downloaded from the Antlr site. We will then have the dependency listed.

If we run the code, it will compile everything correctly, but still just say Hello, World. Let us fix
this problem now, giving us a way to execute our grammar. We will use the following code as a
starting point:

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import Interpreter.InterpreterParser;
import Interpreter.InterpreterLexer;

public class Main {

    public static void main(String[] args) {
        // Turn the input text into a stream of tokens using our lexer rules.
        InterpreterLexer lexer = new InterpreterLexer(CharStreams.fromString("1+2+5"));
        // Feed those tokens to the parser generated from our grammar rules.
        InterpreterParser parser = new InterpreterParser(new CommonTokenStream(lexer));

        // Parse the input, beginning at the start rule.
        parser.start();
        System.out.println("My parser has executed Order 66");
    }
}

We can then run our grammar and it will look like the following in IntelliJ:

Notice that it does not "look like" the parser did anything. While this is weird, it did run, and
the parser had no errors, so having no output is typical for Antlr. As a final step, add an illegal
statement to our code, in this case some extra white space in the string (spaces). If we add two
spaces, we will obtain two errors in Antlr, which both say "token recognition error at: ' '". This
error is telling us that Antlr is rejecting the input, because our lexer has no rule that matches a
space. In lay terms, this is not saying that there is a problem with our grammar. It is saying that
the program we fed to the grammar was not accepted. There may in fact be a problem with our
grammar, but for now the easiest way to think about this is that our parser has given us a
"compiler error." The point is, however, that this is a compiler error in our language, and it does
not tell us whether our language is good or bad.
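
To picture why a space is rejected, it can help to imagine what a lexer does with each character. The sketch below is plain Java, not the lexer Antlr generates: the class MiniLexer and its tokenize method are hypothetical, and the error text only imitates the style of Antlr's message. It knows the same three token rules as our grammar (PLUS, MINUS, and INT) and records an error for every character that matches none of them.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniLexer {
    // One message per character that no token rule could match.
    final List<String> errors = new ArrayList<>();

    // Splits the input into token names, skipping (and reporting) unknown characters.
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '+') {
                tokens.add("PLUS");
                i++;
            } else if (c == '-') {
                tokens.add("MINUS");
                i++;
            } else if (c >= '0' && c <= '9') {
                // INT: consume one or more digits.
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("INT(" + input.substring(start, i) + ")");
            } else {
                // No rule matches this character, so report it and move on.
                errors.add("token recognition error at: '" + c + "'");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        MiniLexer lexer = new MiniLexer();
        System.out.println(lexer.tokenize("1 + 2+5")); // prints [INT(1), PLUS, INT(2), PLUS, INT(5)]
        System.out.println(lexer.errors.size());       // prints 2
    }
}
```

Note that the remaining tokens still come through after the errors are reported. The input was rejected because it contained characters outside the language, not because the rules themselves were malformed.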

Assignment 1

Now that we have completed our setup, we can move on to doing the part of the assignment
that you turn in. Before you begin this piece, however, please be sure you have successfully
completed all of the steps above and have a working program. If you haven't, you
realistically can't do this part of the assignment. Please do not underestimate the amount of time
it takes to get this kind of stuff up and running. Configuration can be a bit of a pain in the field of
computer science. Even as someone with a Ph.D. who has a lot of experience, I get stuck too!
It's normal and the steps above will at least provide guidance.

For the assignment itself, we are going to focus on fleshing out our expression and adding a few
lexical rules to our grammar. So far, here is our grammar again:

grammar Interpreter;

start :
    expression EOF
    ;

expression
    :
    | INT
    | expression (PLUS | MINUS) expression
    ;

PLUS : '+';
MINUS : '-';
INT : '0'..'9'+;

In this case, start and expression are both rules for a context free grammar (sort of, Antlr is
complicated). The capital letter rules PLUS, MINUS, and INT are lexer rules and are
approximately equivalent to finite automata. We will be adding a series of lexer rules and
grammar rules.
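
As a rough illustration of the finite automaton idea, the INT rule can be pictured as a machine with two states. The sketch below is plain Java with hypothetical names (IntAutomaton, accepts). It is not how Antlr implements its lexer internally, only a picture of the kind of machine that a rule like '0'..'9'+ describes.

```java
public class IntAutomaton {
    // States: 0 = start (nothing read yet), 1 = accepting (one or more digits read).
    static boolean accepts(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                state = 1;      // a digit moves to, or stays in, the accepting state
            } else {
                return false;   // no transition exists for this character: reject
            }
        }
        return state == 1;      // accept only if at least one digit was read
    }

    public static void main(String[] args) {
        System.out.println(accepts("125")); // prints true
        System.out.println(accepts(""));    // prints false
        System.out.println(accepts("12a")); // prints false
    }
}
```

The machine needs no memory beyond its current state, which is what separates lexer rules from grammar rules such as expression, where nesting has to be tracked.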

Write some Rules


I would like you to add the following lexer rules:

1. BOOLEAN primitives
2. NUMBER primitives
3. WHITE SPACE
4. COMMENTS

Next, I would like you to modify your expression grammar rule to include:

1. BOOLEANS
2. NUMBERS
3. Multiplication and Division
4. Parentheses

Once you have completed your rules, I would like you to put them into Antlr. They do not have
to be perfect, but they should parse correctly and you should try to figure them out on your own:
do not dig around on the Internet yet. I am deliberately not telling you what these rules should
do: that is what you should think about and design. Do not just copy something from
somewhere. Think it through and design your rules so that they make sense and fit with
common expectations you have learned in previous courses and from math.

Internet Search

Next, I would like you to find three grammars written in Antlr. For each, I want you to make a list
of its lexer and grammar rules, covering only the pieces included in your language. This might
sound easy, but some grammars can be large. I want you to read them and figure out what
parts match yours (e.g., this is how language X does an integer or a number) and what parts
do not.

Then write a short paragraph or two, in Microsoft Word, comparing and contrasting the
grammars with each other and with the solution you created. Finally, adapt your solution as you
see fit, after looking at the wide variety of grammars that already exist. This source is excellent
and you are expected to use it:

https://github.com/antlr/grammars-v4

Here is a list of what you turn in for this part:

1. Your original solution, before an Internet search or any help, whatsoever (5 points)
2. A list of the three grammars you chose from the source above (3 points)
3. For each grammar, place in a document the rules that are similar to yours. DO NOT
COPY THE ENTIRE GRAMMAR. IF YOU DO, YOU WILL RECEIVE ZERO POINTS
FOR THE ASSIGNMENT. The purpose is to look through the grammar and learn what
regions of it match to the kinds of grammar rules you are working with so far, not to just
blindly copy something from the Internet. (5 points)
4. For each rule you copied in, discuss whether, and how, it is different from your attempt
at the problem that you tried before you looked things up. (3 points)
5. Finally, rewrite your grammar rules to what you think, subjectively, is the best solution
from amongst the various languages. Write a paragraph stating what you changed and
why you changed it. (5 points)
6. Important: As one final inclusion, have a section in your Word document that says what
problems, if any, you had while completing this assignment. If you had none, that's ok. If
you had difficulties, please state them in no more than 1 paragraph. The TA will compile
the responses. (4 points)

To turn it in, include one zip file with your project in it (uncompiled, do a clean) and one Word
document describing the assignment.

25 Points Possible
