CMPT-225 Assignment2
A simple compiler.
Due on Monday March 10th, at 23:59.
PLEASE READ THIS PAGE CAREFULLY BEFORE STARTING THE ASSIGNMENT
Problem Statement
Ready to write your first compiler. Don’t panic, it’s a very simple compiler. I am sure you will enjoy writing it and will learn a lot. Well actually it’s not exactly a compiler. A compiler (such as gcc) takes a text file containing a number of statements in in a particular language (such as c++) and produce an executable file (i.e. an equivalent program in machine language) as output. The program that you will write gets a program file as input and executes it directly line by line (it does not produce a machine code file). Such programs are called interpreters. The programming language that we use in this assignment is a very simple language called “expression language”.
The expression language.
Statements: each line of a program in expression language contains a single statement. There are three types of statements in the expression language.
1. The assignment statement: the syntax of an assignment statement is variable=infix-expr. When an assignment statement is executed the infix-expr is evaluate and its value is assigned to the variable.
2. The input statement: the syntax of an input statement is read variable. When an input statement is executed the program asks the user to enter an integer and assigns the entered value to the specified variable.
3. The output statement: the syntax of an output statement is print infix-expr. When an output statement is executed the infix-expr is evaluate and its value is printed on the screen.
Variables: the variables in expression language are one letter characters: “a” to “z” and “A” to “Z”. The language is case sensitive so “a” and “A” refer to two different variables. Also Read or PRINT are not valid keywords.
Infix-expr: the infix-expr’s are used in assignment or print statements. An infix-expre is an algebraic expression in infix format. The operators in an infix-expr include: “+”, “-”, “*”, and “/”. The operators are integer binary operators (there is no unary operator such as y = –x). The operands are either variables or positive integer constants. You can assume that there is exactly one space between operands and operators (including parenthesis) in an infix expression. This means that your program will be tested for the programs that include statements with exactly one space between their operands and operators. Below are some examples of valid infix-expr:
Y, X + 2 * y, ( 0 — 45 ), ( x — y ) * ( x + y ), G * ( ( m * n ) / ( r * r ) ),
3 * r * r, 2 * 3 * r.
and here some invalid infix-expr’s:
2y //2y is not a variable.
- 45 * u + ( - Z ) //no unary operator is allowed.
( ( y + 7 ) * z //unbalanced parenthesis.
Sum = J + I //sum is not a valid variable (variables consist of one letter only).
[ x + y ] //invalid character ]
Below is a simple program:
read x
read y
z = x * x + y * y
print z
print ( z + x ) / 2
Interpreter output:
Enter an integer value for x: 2
Enter an integer value for y: 3
13
7
How does the interpreter work?
Given an input program file your interpreter must read its statements one by one and execute them properly. If a statement contains error, the program must print an appropriate error message and the statement that contains the error and then terminate the execution of the input program.
There can be two types of errors in a program statement:
1. Syntax error: if a statement does not follow the rules of the expression language it has a syntax error. A syntax error can be an unbalanced parenthesis expression, an incorrect infix expression (such as y+u- ) , an invalid variable name (such as x2), an invalid character (such as %, ^, &, ...). For such errors it’s enough that your program prints “syntax error” along with the line number and the statement that contains the error.
2. Un-initialized variable error: If a statement tries to access the value of a variable that is not initialized before, the interpreter must print an un-initialized variable error message. For instance if the first statement of a program is
“print x” the interpreter must print:
1. print x
error: variable x is not initialized.
Here 1. is the line number.
Here are some erroneous programs and the appropriate error messages by the interpreter.
Program #1
read x
x = x % 2
print x
Interpreter output:
Enter an integer value for x: 25
2. x = x % 2
Syntax error.
Program #2
x = 24
read y
print x + ( -y )
Interpreter output:
Enter an integer value for y: -34
3. print x + ( -y )
Syntax error.
Program #3
read p
read q
d = p * p + q * q
print d
d = c * d
print d
Interpreter output:
Enter an integer value for p: 2
Enter an integer value for q: -3
13
5. d = c * d
error: variable c is not initialized.
Implementation guidelines
¨ Handling variables
You can define a class variable in your program to store the value for each variable. The class variable should have an initialized flag member that indicates whether the variable has been initialized or not.
class variable{
private int value;
private boolean initialized;
public variable(){
initialized = false;
}
public int setValue(int v){
value=v;
initialized=true;
}
public int getValue() throws UninitializedException{
if (!initialized)
throw new UnintializedException();
else
return value;
}
}
Note that the getValue() method throws an exception if we try to get the value of an un-initialized variable.
Now you can create an array of variables as follows:
memory variable[]=new variable[‘z’];
for(int i=0; i<‘z’; i++)
memory[i]=new variable();
thus to set a new value for variable ‘G’ for instance, you can write:
memory[‘G’].setValue(0);
and to get the value of variable ‘G’ you can write:
try{
int value = memory[‘G’].getValue();
...
}
catch (UninitalizedException E){
//the variable has not initialized yet. Print an error message and exit
}
¨ Evaluating expressions
There are two options to evaluate expressions in a program.
1. Converting Infix expressions to equivalent postfix expressions and then evaluate the postfix expressions. Both algorithms are explained in the text book (p. 351) and in the lecture notes. The advantage of this method is that if there’s any error in the infix expression it can be detected in the first step.
2. Evaluating the infix expressions directly. You can use the algorithm in the programming problem number 5 of chapter 7 (page 375) to evaluate an infix expression directly. However the algorithm assumes that the infix expression is syntactically correct. Thus you should modify it without this assumption (problem 8. page 376).
The main class name
You must call the main class (the class that contains the main method) exprInterpreter.java
Input program file
The interpreter must get its input file name as argument. If you are using the eclipse compiler go to the Run menu and click on “Open Run dialogue”. Select the “Arguments” tab and type the name of the input file in the Program arguments box. If you are using the command line java compiler, type the name of the input file after the main class. For instance if the input program is myprog.expr then type java exprInterpreter myprog.expr in the linux prompt. This way the arg[0] value in the main method below will be the string “myprog.expr”.
public static void main(String arg[]){
//arg[0] is the first argument which should be the name of the input program.
...
}
Bonus 2%
You will get an extra 2% mark if your program can handle the for loops. The syntax of a for loop is
for variable from constant1 to constant2
statement-list
endfor
variable is the index of the loop, constant1 is the initial value of the index and constant2 is the boundary value. You can assume that there is no nested loop.
Example:
s = 0
for i from 1 to 5
s = s + i
print i
endfor
print s
Output
1
2
3
4
5
15
Important. If your program handles for loops, you must specify that in a comment in the beginning of the main file i.e exprInterpreter.java .
Important Remarks
It’s very important that you make sure your program compiles without any problem (you may get 0 if your program didn't compile-see the submission section below).
Submission
Submit your source files only (for java users: *.java and for c++ users: *.cpp and *.hpp or *.h). You can use any compiler that you want but you need to make sure that your program will compile under the Linux. The java programs should be compiled under the Linux without any problem. However, sometimes the c++ programs written in windows won’t compile directly under the Linux (especially if you are using Visual C++). We recommend that you use the eclipse compiler as it makes less problem with Linux. To test whether your program compiles under the Linux copy all your files in a directory and type the command g++ *.cpp. If the program compiles successfully it will create an executable file called a.out . To run the file type ./a.out . You can also download a windows version of the g++ compiler from
http://www.claremontmckenna.edu/math/ALee/g++/g++.html.
|