Compiler Design and Construction COMP2932
We want to develop a parser for a simple procedural language which we will call SAL (for Simple Algorithmic Language). The language syntax is a hybrid of C, Pascal and other things. The language has two simple data types: the integer and the real. The language supports the following statements:
An assignment statement which starts with the keyword set, but this keyword can be dropped (i.e. it is optional).
An input statement that is used to read in data from the keyboard.
An output statement that is used to print out data to the console.
A traditional if else statement.
Two types of iteration (loop) statements, the repeat … times and the repeat while statements, both of which must start with the keyword repeat. The repeat … times loop repeats the body of the loop a fixed number of times determined by an integer value (expression) inserted between the repeat and times keywords. The repeat while loop is just like the while loop in C.
In addition to the four arithmetic operators (+, -, *, and /) the language has a built-in operator for exponentiation (the ^ operator). The ^ operator has higher precedence than the * and / operators which in turn have higher precedence than + and -.
Notice that there is no need for a semicolon at the end of each statement in this language, and that every statement must start with a keyword (the ‘set’ keyword is optional however). The language supports a built-in function for the square root operation, called sqrt. Single line comments start with a double exclamation mark (with no space in between). A program is comprised of one single file having a .sal extension.
The following are some example programs in this language:
!! Find the area and circumference of a rectangle
variable Length , Width: real
variable Area, Circum : real
output “Enter the Length and Width of the rectangle: ”
input Length , Width
set Area = Length * Width
set Circum = (Length + Width) * 2
output “Area = ” , Area
output “Circumference = ” , Circum
!! Solve a quadratic equation of the general form:
!! Ax^2 + Bx + C = 0
variable A , B , C : realvariable delta : real
variable x1 , x2 : real
input A , B , C
set delta = B^2 – 4 * A * C
if (delta < 0)
{
output “Cannot be solved! ”
stop
}
set x1 = (-B – sqrt (delta) ) / (2 * A)
set x2 = (-B + sqrt (delta) ) / (2 * A)
output x1 , x2
!! reads the score (mark) of a student (on a scale from 0 -100) and prints the student’s assessment
!! note: pass mark is 60
input M
if (M > 100 )
output “Wrong Number”
else
if (M >= 90)
output “Excellent”
else
if (M >= 80)
output “Very Good”
else
if (M >= 70)
output “Good”
else
if (M >= 60)
output “Not Bad”
else
output “Failed ”
!! Calculate the sum of a series of integer numbers
variable NN: integer !! The number of numbers in the series
variable N : integer !! one of the numbers of the series
variable sum : integer !! holds the sum of the numbers
output “How many numbers are there in the series :”
input NN
set sum =0
repeat NN times
{
output “Enter a number: ”
input N
set sum = sum + N
}
output “The sum = ” , sum!! find the minimum number in a series of integer numbers
variable NN, N, Min, n : integer
output “How many numbers ? ”
input NN
input N
Min = N
n = 0
repeat while (n < NN-1)
{
output “Enter a number: ”
input N
if (N < Min)
{
Min = N
}
n = n + 1
}
output “The Min =” , Min
!! The factorial of a integer number
variable n , fact , w : integer
input n
if (n < 0)
output “Error”
else
{
fact = 1
w = n
repeat while (w > 0)
{
fact = fact * w
w = w – 1
}
output “The factorial of ” , n , “=” , fact
}
Examine the above examples then:
Write a CFG (context free grammar), or an Extended CFG for the this language.
(20 marks)
Using the C programming language, implement a lexer for this language. The lexer should expose itself through two main functions: getToken, which returns the next token in the input file, and peekToken which looks at the next token without consuming it (i.e. without removing it from the input file). Use the above examples to test your lexer.
(20 marks)
Based on the above grammar, implement a Recursive Descent Parser for this language (using C as well) that accepts a single input file containing a SAL program then parses the program and prints appropriate syntax error messages (if any) or prints ‘OK’ if the program is free of errors. To avoid the complexities of error recovery, the parser can terminate after encountering and reporting the first syntax error in the input file. Use the above examples to test your parser.
(20 marks)
Write a short report (3-5 pages excluding cover page) detailing the grammar you have devised, and briefly explaining your implementation of the lexer and parser.