Starting from:

$30

CS314-Project 1 A Compiler for the TinyL Language Solved

1.1         The tinyL language
tinyL is a simple expression language that allows assignments and basic I/O operations.

<program
::=
<stmt list !
<stmt list
::=
<stmt <morestmts
           <morestmts         ::=       ; <stmt list

<stmt
::=
<assign | <read | <print
<assign
::=
<var = <expr
<read
::=
% <var
<print
::=
$ <var
<expr
::=
<arith expr |

<logical expr |

<var |

<digit
<arith

<logical expr
::=
& <expr <expr |

| <expr <expr
<var
::=
a | b | c | d | e | f
<digit
::=
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Two examples of valid tinyL programs:

1.    %a;%b;c=&3*ab;d=+c1;$d!

2.    %a;b=-*+1+2a58;$b!

1.2         Target Architecture
The target architecture is a simple RISC machine with virtual registers, i.e., with an unbounded number of registers. All registers can only store integer values. A RISC architecture is a load/store architecture where arithmetic instructions operate on registers rather than memory operands (memory addresses). This means that for each access to a memory location, a load or store instruction has to be generated. Here is the machine instruction set of our RISC target architecture. Rx , Ry , and Rz represent three arbitrary, but distinct registers.

instr. format
description
semantics
 
memory instructions
 
LOADI Rx #<const
load constant value #<const into register Rx
Rx ← <const
LOAD Rx <id
load value of variable <id into register Rx
Rx ← <id
STORE <idRx
store value of register Rx into variable <id
<id ← Rx
 
arithmetic instructions
 
ADD Rx
Ry
Rz
add contents of registers Ry and Rz , and store result into register Rx
Rx
← Ry
+ Rz
SUB Rx
Ry
Rz
subtract contents of register Rz from register

Ry , and store result into register Rx
Rx
← Ry
− Rz
MUL Rx Ry Rz
multiply contents of registers Ry and Rz , and store result into register Rx
Rx ← Ry ∗ Rz
 
logical instructions
 
AND Rx
Ry Rz
apply bit wise AND operation to contents of registers Ry and Rz , and store result into register Rx
Rx ← Ry & Rz
OR Rx Ry Rz
apply bit wise OR operation to contents of registers Ry and Rz , and store result into register Rx
Rx ← Ry | Rz
 
I/O instructions
 
READ <id
read value of variable <id from standard input
read( <id )
WRITE <id
write value of variable <id to standard output
print( <id )
 
 
 
 
 
 
 
 
1.3         Dead Code Elimination
Our tinyL language does not contain any control flow constructs (e.g.: jumps, if-then-else, while). This means that every generated instruction will be executed. However, if the execution of an operation or instruction does not contribute to the input/output behavior of the program, the instruction is considered “dead code” and therefore can be eliminated without changing the semantics of the program. Please note that READ instructions may not be eliminated although the read value may have no impact on the outcome of the program.

All READ instructions are kept since the overall input/output program behavior needs to be preserved.

Your dead-code eliminator will take a list of RISC instructions as input. For example, in the following code

LOADI Rx #c1 LOADI Ry #c2

LOADI Rz #c3

ADD R1 Rx Ry

MUL R2 Rx Ry

STORE a R1

WRITE a

the MUL instruction and the third LOADI instruction can be deleted without changing the

semantics of the program. Therefore, your dead-eliminator should produce the code:

LOADI Rx #c1

LOADI Ry #c2

ADD R1 Rx Ry

STORE a R1

WRITE a

2           Project Description
The project consists of two components:

1.    Complete the partially implemented recursive descent LL(1) parser that generates RISCmachine instructions.

2.    Write a dead-code eliminator that recognizes and deletes redundant, i.e., dead RISCmachine instructions.

The project represents an entire programming environment consisting of a compiler, an optimizer, and a virtual machine (RISC machine interpreter). You are required to implement the compiler and the optimizer (described in Section 2.1 and 2.2). The virtual machine is provided to you (described in Section 2.3).

2.1         Compiler
The recursive descent LL(1) parser implements a simple code generator. You should follow the main structure of the code as given to you in file Compiler.c. As given to you, the file contains code for function digit, assign, and print, as well as partial code for function expr. As is, the compiler is able to generate code only for expressions that contain “+” operations and constants. You will need to add code in the provided stubs to generate correct RISC machine code for the entire program. Do not change the signatures of the recursive functions. Note: The left-hand and right-hand occurrences of variables are treated differently.

2.2         Dead Code Elimination Optimization
The dead code elimination optimizer expect the input file to be provided at the standard input (stdin), and will write the generated code back to standard output (stdout).

The basic algorithm identifies “crucial” instructions. The initial crucial instructions are all READ and WRITE instructions. For all WRITE instruction, the algorithm has to detect all instructions that contribute to the value of the variable that is written out. First, you will need to find the store instruction that stores a value into the variable that is written out. This STORE instruction is marked as critical and will reference a register, let’s say Rm. Then you will find the instructions on which the computation of the register Rm depends on and mark them as critical. This marking process terminates when no more instructions can be marked as critical. If this basic algorithm is performed for all WRITE instructions, the instructions that were not marked critical can be deleted.

Instructions that are deleted as part of the optimization process have to be explicitly deallocated using the C free command in order to avoid memory leaks. You will implement your dead code elimination pass in the file Optimizer.c. All of your “helper” functions should be implemented in this file.

2.3         Virtual Machine
The virtual machine executes a RISC machine program. If a READ <id instruction is executed, the user is asked for the value of <id from standard input (stdin). If a WRITE <id instruction is executed, the current value of <id is written to standard output (stdout). The virtual machine is implemented in file Interpreter.c. DO NOT MODIFY this file. It is there only for your convenience so that you may be able to copy the source code of the virtual machine, for instance, to your laptop and compile it there. Be aware that you are welcome to work on your project using your own computer, however, you need to make sure your code will eventually compile and run on the ilab cluster. All grading will be done on ilab.

The virtual machine assumes that an arbitrary number of registers are available (called virtual registers), and that there are only six memory locations that can be accessed using variable names (‘a’ ... ‘e’ ‘f’). In a real compiler, an additional optimization pass maps virtual registers to the limited number of physical registers of a machine. This step is typically called register allocation. You do not have to implement register allocation in this project. The virtual machine (RISC machine language interpreter) will report the overall number of executed instructions for a given input program. This allows you to assess the effectiveness of your optimization component.

More products