CS39003 Assignment 3: Lexer for tinyC

1         Preamble – tinyC
This assignment follows the lexical specification of the C language given in the International Standard ISO/IEC 9899:1999 (E). We choose a subset of that specification, given below, and refer to the resulting language as tinyC. A later assignment will specify its grammar from the Phrase Structure Grammar given in the C Standard.

The lexical specification quoted here is written using a precise yet compact notation typically used for writing language specifications. We first outline the notation and then present the Lexical Grammar that we shall work with.

2         Notation
In the convention that has been followed, syntactic categories (non-terminals) are indicated by italic type, and literal words and character set members (terminals) by bold type. A colon (:) following a non-terminal introduces its definition. Alternative definitions are listed on separate lines, except when prefixed by the phrase “one of”. An optional symbol is indicated by the subscript “opt”, so that the following indicates an optional expression enclosed in braces.

{ expression_opt }

3         Lexical Grammar of tinyC
1.   Lexical Elements

token:
    keyword
    identifier
    constant
    string-literal
    punctuator

2.   Keywords

keyword: one of
    break      float     static
    case       for       struct
    char       goto      switch
    continue   if        typedef
    default    int       union
    do         long      void
    double     return    while
    else       short
    extern     sizeof
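
Since every keyword also has the shape of an identifier, a lexer usually needs some way to tell the two apart. The following is a minimal sketch in C, assuming a plain linear search over the 25 keywords listed above; the names TINYC_KEYWORDS and is_tinyc_keyword are hypothetical and not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 25 tinyC keywords from the list above (hypothetical table name). */
static const char *const TINYC_KEYWORDS[] = {
    "break", "case", "char", "continue", "default", "do", "double",
    "else", "extern", "float", "for", "goto", "if", "int", "long",
    "return", "short", "sizeof", "static", "struct", "switch",
    "typedef", "union", "void", "while"
};

/* Returns 1 if lexeme is exactly one of the keywords, 0 otherwise.
 * The comparison is case-sensitive, so "While" is not a keyword. */
int is_tinyc_keyword(const char *lexeme)
{
    assert(lexeme != NULL);
    size_t n = sizeof TINYC_KEYWORDS / sizeof TINYC_KEYWORDS[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(lexeme, TINYC_KEYWORDS[i]) == 0)
            return 1;
    }
    return 0;
}
```

In a flex specification the same effect is commonly obtained by listing the keyword patterns before the identifier pattern, because flex prefers the earlier rule when two rules match the same longest text.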
 
3.   Identifiers

identifier:
    identifier-nondigit
    identifier identifier-nondigit
    identifier digit

identifier-nondigit: one of
    _ a b c d e f g h i j k l m
    n o p q r s t u v w x y z
    A B C D E F G H I J K L M
    N O P Q R S T U V W X Y Z

digit: one of
    0 1 2 3 4 5 6 7 8 9

4.   Constants

constant:
    integer-constant
    floating-constant
    character-constant

integer-constant:
    nonzero-digit
    integer-constant digit

nonzero-digit: one of
    1 2 3 4 5 6 7 8 9

floating-constant:
    fractional-constant exponent-part_opt
    digit-sequence exponent-part

fractional-constant:
    digit-sequence_opt . digit-sequence
    digit-sequence .

exponent-part:
    e sign_opt digit-sequence
    E sign_opt digit-sequence

sign: one of
    + -

digit-sequence:
    digit
    digit-sequence digit

character-constant:
    ' c-char-sequence '

c-char-sequence:
    c-char
    c-char-sequence c-char

c-char:
    any member of the source character set except the single-quote ', backslash \, or new-line character
    escape-sequence

escape-sequence: one of
    \' \" \? \\
    \a \b \f \n \r \t \v
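
The identifier and numeric-constant productions above can be checked by hand with a few small recognizers. The following is a rough sketch in C, not a prescribed solution: the function names are hypothetical, each function tests a whole NUL-terminated string, the underscore is treated as an identifier-nondigit as in C99, and the `<ctype.h>` classifiers are assumed to run in the default "C" locale. Note that, read literally, the integer-constant production starts with a nonzero-digit, so the sketch rejects a lone 0.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Advances *p past a digit-sequence; returns the number of digits consumed. */
static size_t skip_digits(const char **p)
{
    size_t n = 0;
    while (isdigit((unsigned char)**p)) {
        (*p)++;
        n++;
    }
    return n;
}

/* identifier: an identifier-nondigit followed by nondigits or digits.
 * Assumption: '_' counts as an identifier-nondigit, as in C99. */
int is_identifier(const char *s)
{
    assert(s != NULL);
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return 0;
    for (s++; *s != '\0'; s++) {
        if (!(isalnum((unsigned char)*s) || *s == '_'))
            return 0;
    }
    return 1;
}

/* integer-constant: a nonzero-digit followed by zero or more digits. */
int is_integer_constant(const char *s)
{
    assert(s != NULL);
    if (*s < '1' || *s > '9')
        return 0;
    skip_digits(&s);
    return *s == '\0';
}

/* floating-constant:
 *     fractional-constant exponent-part_opt
 *     digit-sequence exponent-part            */
int is_floating_constant(const char *s)
{
    assert(s != NULL);
    size_t before = skip_digits(&s);
    int has_dot = 0;
    int has_exp = 0;

    if (*s == '.') {
        has_dot = 1;
        s++;
        size_t after = skip_digits(&s);
        if (before == 0 && after == 0)
            return 0;           /* a lone "." has no digits on either side */
    } else if (before == 0) {
        return 0;               /* neither digits nor a dot */
    }

    if (*s == 'e' || *s == 'E') {
        has_exp = 1;
        s++;
        if (*s == '+' || *s == '-')
            s++;
        if (skip_digits(&s) == 0)
            return 0;           /* exponent needs at least one digit */
    }

    if (!has_dot && !has_exp)
        return 0;               /* plain digits are not a floating-constant */
    return *s == '\0';
}
```

Because a keyword such as int also satisfies is_identifier, a caller would normally test for keywords first.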

5.   String literals

string-literal:
    " s-char-sequence_opt "

s-char-sequence:
    s-char
    s-char-sequence s-char

s-char:
    any member of the source character set except the double-quote ", backslash \, or new-line character
    escape-sequence
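
To make the string-literal rule concrete, here is a small sketch in C that scans one literal from the start of a buffer. It assumes only the simple escape sequences listed for tinyC are legal after a backslash; match_string_literal and is_simple_escape are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if c may follow a backslash in a tinyC escape-sequence. */
static int is_simple_escape(char c)
{
    /* strchr would also match the terminating NUL, so exclude it first. */
    return c != '\0' && strchr("'\"?\\abfnrtv", c) != NULL;
}

/* If s begins with a string-literal, returns its length in characters,
 * counting both double-quotes; otherwise returns 0. */
size_t match_string_literal(const char *s)
{
    assert(s != NULL);
    if (s[0] != '"')
        return 0;
    size_t i = 1;
    while (s[i] != '\0' && s[i] != '\n') {
        if (s[i] == '"')
            return i + 1;       /* closing double-quote found */
        if (s[i] == '\\') {
            if (!is_simple_escape(s[i + 1]))
                return 0;       /* backslash not followed by a listed escape */
            i += 2;             /* skip the whole escape-sequence */
        } else {
            i += 1;             /* ordinary s-char */
        }
    }
    return 0;                   /* new-line or end of input before the close */
}
```

An empty literal "" is accepted because the s-char-sequence is optional, whereas a character constant requires at least one c-char.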

6.   Punctuators

punctuator: one of
    [ ] ( ) { } . ->
    ++ -- & * + - ~ !
    / % << >> < > <= >= == != ^ | && ||
    ? : ; ...
    = *= /= %= += -= <<= >>= &= ^= |=
    , #
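
Several punctuators are prefixes of longer ones (for example <, << and <<=), so a lexer has to prefer the longest match. The sketch below shows one way to do that in C, by trying longer punctuators before shorter ones. The table includes ->, >>, >, >= and >>= as in the C99 punctuator list, and the names PUNCTUATORS and match_punctuator are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Punctuators ordered longest first, so the first prefix match found
 * is also the longest one (hypothetical table name). */
static const char *const PUNCTUATORS[] = {
    "...", "<<=", ">>=",
    "->", "++", "--", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
    "*=", "/=", "%=", "+=", "-=", "&=", "^=", "|=",
    "[", "]", "(", ")", "{", "}", ".", "&", "*", "+", "-", "~", "!",
    "/", "%", "<", ">", "^", "|", "?", ":", ";", "=", ",", "#"
};

/* Returns the length of the longest punctuator that s starts with,
 * or 0 if s does not start with a punctuator. */
size_t match_punctuator(const char *s)
{
    assert(s != NULL);
    size_t n = sizeof PUNCTUATORS / sizeof PUNCTUATORS[0];
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(PUNCTUATORS[i]);
        if (strncmp(s, PUNCTUATORS[i], len) == 0)
            return len;
    }
    return 0;
}
```

A flex-generated scanner applies the longest-match rule on its own, so in the .l file the order of punctuator rules matters only between patterns that match text of the same length.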

7.   Comments

(a)    Multi-line Comment

Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it. Thus, /* ... */ comments do not nest.

(b)   Single-line Comment

Except within a character constant, a string literal, or a comment, the characters // introduce a comment that includes all multibyte characters up to, but not including, the next new-line character. The contents of such a comment are examined only to identify multibyte characters and to find the terminating new-line character.
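
The two comment rules can be illustrated with a short scanner. This is a sketch in C under the assumption that the caller invokes it only outside character constants and string literals; match_comment is a hypothetical name. Because the search for */ stops at the first occurrence, the non-nesting behaviour described in (a) falls out directly.

```c
#include <assert.h>
#include <stddef.h>

/* If s begins with a comment, returns the number of characters it spans.
 * Returns 0 if s does not begin with a comment, or if a multi-line
 * comment is never terminated. */
size_t match_comment(const char *s)
{
    assert(s != NULL);
    if (s[0] != '/')
        return 0;

    if (s[1] == '/') {
        /* Single-line comment: runs up to, but not including, the new-line. */
        size_t i = 2;
        while (s[i] != '\0' && s[i] != '\n')
            i++;
        return i;
    }

    if (s[1] == '*') {
        /* Multi-line comment: ends at the first occurrence of the pair * and /. */
        size_t i = 2;
        while (s[i] != '\0') {
            if (s[i] == '*' && s[i + 1] == '/')
                return i + 2;
            i++;
        }
        return 0;               /* reached end of input without a terminator */
    }

    return 0;                   /* a lone '/' is the division punctuator */
}
```

For the input `/* /* */ */` the function reports a comment of 8 characters, leaving the trailing `*/` to be scanned as ordinary tokens.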

4         The Assignment
a)   Write a flex specification for the language of tinyC using the above lexical grammar. Name the file ass3_roll.l; it should not contain the function main().

b)   Write your main() function in a separate file, ass3_roll.c, to test your lexer.

c)   Prepare a Makefile to compile the specifications and generate the lexer.

d)   Prepare a test input file, ass3_roll_test.c, that tests all the lexical rules you have coded.
