Introduction > 33 ? <= 34 ?

Introduction to Language Theory and Compiling

Shubham Batra, Nathan Liccardo
November 26, 2017

We Will Write a Custom Essay Specifically
For You For Only $13.90/page!


order now

1

Contents

1  Part 1 : Grammar modifications 3

1.1  Unreachableandunproductivevariables …………….. 4

1.2  Non-ambigousgrammar …………………….. 4

1.3  LeftRecursionandLeftFactoring………………… 4

1.3.1 LeftRecursion………………………. 4
1.3.2 LeftFactoring………………………. 5
1.3.3 Algorithms ……………………….. 5

2  Part 2 : Grammar analyses 6

2.1  First1(X)…………………………….. 6

2.2  Follow1(X)……………………………. 7

2.3  Actiontable…………………………… 8

3 Part 3 : Parser 8

3.1  Recursivedescentparser …………………….. 9

3.2  Implementation…………………………. 10

3.3  ParserTree …………………………… 10

2

1 Part 1 : Grammar modifications

For the first part, we were asked to transform the IMP grammar in order to :
• Remove unreachable and/or unproductive variables, if any;

• Make the grammar non-ambiguous by taking into account the priority and the
associativity of the operators.

• Remove left-recursion and apply factorisation where need be.
Here is the original grammar :

1 ? begin end
2 ? ?
3 ?
4 ?
5 ? ;
6 ?

7 ?
8 ?
9 ?
10 ?
11 ?
12 ? VarName :=
13 ? VarName
14 ? Number
15 ? ( )
16 ? –
17 ?
18 ? +
19 ? –
20 ? *
21 ? /
22 ? if then endif
23 ? if then else endif
24 ?
25 ? not
26 ?
27 ?
28 ? and
29 ? or
30 ? =
31 ? >=
32 ? >
33 ? <= 34 ? < 35 ? <>
36 ? while do done
37 ? for VarName from by
to do done
38 ? for VarName from to do done
39 ? print(VarName)
40 ? read(VarName)

3

1.1 Unreachable and unproductive variables

Firstly, we were asked to find the unreachable and unproductive variables in the gram-
mar, but we do not find any unproductive symbol or inaccessible symbol in the gram-
mar.

1.2 Non-ambigous grammar

Secondly we need to make the grammar non-ambigous by taking into account the fol-
lowing table of priority and associativity of the operators (decreasing order of priority) :

Operators

Associativity

-, not
?, /
+, –
>, <, >=, <=, =, <>
and

or

right
left
left
left
left
left

To do it, we just needed to transform a little bit our grammar. The trick is to detect
firstly the symbols with the lowest priority (or) and to finish by the highest priority (-,
not). The “or” and “and symbols can only appears in condition. We so need to check
that we will firstly detect those elements. This step will be done by transforming the
following rules :

1 ?
2 ? and
3 ? or

by :

6 ? ?
This transforation was aslo applied to +, ? and ?, / operators.

1.3 Left Recursion and Left Factoring

1.3.1 Left Recursion

Left recursion often poses problems for parsers, either because it leads them into infinite
recursion or because they expect rules in a normal form that forbids it Therefore, a
grammar is often pre-processed to eliminate the left recursion. In our current grammar,
we will ilustrated removing left-recursion with the following example :

1 ? or
2 ?
3 ? ?
4 ? and
5 ?

4

1 ? or
2 ?
3 ? ?

To remove it, we firstly transform the and then create the and
rules :

1 ?
2 ?
3 ? or
4 ? ?

The same trick was applied to each left-recursion in the grammar.

1.3.2 Left Factoring

Left factoring is a grammar transformation that is useful for producing a grammar
suitable for predictive parsing. The basic idea is that when it is not clear which of two
alternative productions to use to expand a non-terminal A, we may be able to rewrite
the A-productions to defer the decision until we have seen enough of the input to make
the right choice. For example,

1 ? for VarName from by
to do done
2 ? for VarName from to do done

becomes :

1 ? for VarName from by
to do done
2 ? by
3 ? ?

Again, the same trick was applied to each left-factoring in the grammar.

1.3.3 Algorithms

For information, you can find here the pseudo-code to generate both left factoring and
recursion :

5

2 Part 2 : Grammar analyses

Secondly, we were mandatory to give the action table of an LL(1) parser for the trans-
formed grammar. The construction of a predictive parser is aided by two functions
associated with a grammar G. These functions, FIRST and FOLLOW, allow us to
fill in the entries of a predictive parsing table for G, whenever possible. Sets of tokens
yielded by the FOLLOW function can also be used as synchronizing tokens during error
recovery.

2.1 First1(X)

If A is any symbol of the grammar, let FIRST1(A) be the set of terminals that begin
the strings derived from A. If the A rules contain ? then it is also an element of the
result set. To compute FIRST1(X), we applied to following steps for each identifiers :

RecursiveUpdate :
rules ? Grammar.getRulesWithIdentifier
for each rules :

if first element of rule = terminal :
set ? first element of rule

else if first element of rule = identifier :
RecursiveUpdate(first element of rule)

Which can be resumed to :

If X is terminal or ? , then FIRST(A) ? {X}.

If X is nonterminal and X ? Y1 Y2 … Yk is a production, then call FIRST1(Y1)
and put it on FIRST1(X).

In the project, the First1(X) function is represented by the class VariableSetFirst. This
class is helped by some other utils classes which was only created to make the project
more simple. You will find two importants functions used to make the first set which
are :

firstCalculation : used to iterate each identifiers.

recursiveUpdate : used to find every first values.

6

2.2 Follow1(X)

Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can appear
immediately to the right of A. To compute FOLLOW1(A) for all nonterminals A, we
applied the following code which explain each steps :

private void followCalculation() {
_checked = new ArrayList<>();
for (VariableIdentifier var : _grammar.getIdentifiers()) {

_var = var;

recursive3(var);
}

}

private void recursive3(VariableIdentifier variable) {
_checked.add(variable);
ArrayList rules = _grammar.getNotRulesWith(variable);

for (Rule rule: rules) {

if (_var.equals(variable)) _currentRule = rule;

for (int index: rule.indexs(variable)) {
// CASE 1.1

}
}

if (notLastAndTerminal(rule, index))
caseNotLastAndTerminal(rule, index);

// CASE 1.2
else if (notLastAndIdentifier(rule, index))

caseNotLastAndIdentifier(rule, index);
// CASE 2

else
caseLastElement(rule);

_checked.remove(_checked.size()-1);
}

private boolean notLastAndTerminal(Rule rule, int index) {
if (index+1 < rule.getSize()) { Variable current = rule.getRuleIndex(index+1); if (current instanceof VariableTerminal) return true; } return false; } private boolean notLastAndIdentifier(Rule rule, int index) { if (index+1 < rule.getSize()) { Variable current = rule.getRuleIndex(index+1); if (current instanceof VariableIdentifier) return true; } return false; } private void caseNotLastAndTerminal(Rule rule, int index) { VariableTerminal terminal = (VariableTerminal) rule.getRuleIndex(index+1); 7 _set.addStart(_var, terminal, _currentRule); } private void caseNotLastAndIdentifier(Rule rule, int index) { VariablePairSet set = _first.getFirst(rule.getRuleIndex(index+1)); if(set.removeAllTerminal(EPS) && !(_checked.contains(rule.getIdentifier()))) recursive3(rule.getIdentifier()); for (VariablePair var: set) { _set.addStart(_var, var.getStart(), _currentRule); } } private void caseLastElement(Rule rule) { if (!(_checked.contains(rule.getIdentifier()))) recursive3(rule.getIdentifier()); } 2.3 Action table To create the action table, we need the FIRST and FOLLOW sets for each non-terminal in the grammar. We dram a n*m table from the input we receive. n denotes the num- ber of non-terminals we have in the grammar and we determine m by counting all the terminals in the grammar. Then in each cell, we add the number of the rule that allows a non-terminal to access to a terminal variable. In the project, the action table is constructed using the actionLine and actionLine- Tuple classes. An array is constructed of all the variables in a line with line tuples in it. Integers are only inserted where needed and rest spaces are left blank which helps us to reduce the memory size. Less memory is needed to store the data. The first and follow are needed to construct the table. In fact, we use the in the first time the FIRST1 function to calculate each first values. If we found an ? in the FIRST1 set, then we apply the FOLLOW1 function. Those steps are explained bellow : ActionTable { create ActionTableInstance for each non-terminal X : create ActionTableLineInstance set ? getAllFirst(X) if ? in set : set ? set.removeEps set ? getFollow ActionTableLine.add(set) } 3 Part 3 : Parser For the third part, we were asked to write, in Java, a recursive descent LL(1) parser for this grammar. 8 3.1 Recursive descent parser Recursive descent is a top-down parsing technique that constructs the parse tree from the top and the input is read from left to right. It uses procedures for every terminal and non-terminal entity. This parsing technique recursively parses the input to make a parse tree, which may or may not require back-tracking. But the grammar associated with it (if not left factored) cannot avoid back-tracking. A form of recursive-descent parsing that does not require any back-tracking is known as predictive parsing. Predictive parsing is possible only for the class of LL(k) grammars, which are the context-free grammars for which there exists some positive integer k that allows a re- cursive descent parser to decide which production to use by examining only the next k tokens of input. The LL(k) grammars therefore exclude all ambiguous grammars, as well as all grammars that contain left recursion. Any context-free grammar can be transformed into an equivalent grammar that has no left recursion, but removal of left recursion does not always yield an LL(k) grammar. A predictive parser runs in linear time. Recursive descent with backtracking is a technique that determines which production to use by trying each production in turn. Recursive descent with backtracking is not limited to LL(k) grammars, but is not guaranteed to terminate unless the grammar is LL(k). Even when they terminate, parsers that use recursive descent with backtracking may require exponential time. Although predictive parsers are widely used, and are frequently chosen if writing a parser by hand, programmers often prefer to use a table-based parser produced by a parser generator, either for an LL(k) language or using an alternative parser, such as LALR or LR. This is particularly the case if a grammar is not in LL(k) form, as transforming the grammar to LL to make it suitable for predictive parsing is involved. Predictive parsers can also be automatically generated, using tools like ANTLR. Predictive parser is a recursive descent parser, which has the capability to predict which production is to be used to replace the input string. The predictive parser does not suffer from backtracking. To accomplish its tasks, the predictive parser uses a look-ahead pointer, which points to the next input symbols. To make the parser back- tracking free, the predictive parser puts some constraints on the grammar and accepts only a class of grammar known as LL(k) grammar. Predictive parsing uses a stack and a parsing table to parse the input and generate a parse tree. Both the stack and the input contains an end symbol $ to denote that the stack is empty and the input is consumed. The parser refers to the parsing table to take any decision on the input and stack element combination. 9 3.2 Implementation The implementation of the parser is quite simple. We firstly instantiated a stack and then read every symbol of the input to change the stack states. Note that the function stop if any error occurs. In the other case, it returns true which means that the input is accepted by the grammar. public boolean parse() { // Step One : init stack _stack.push(_grammar.getRule(0)); // Step Two : parse while (!(_stack.isEmpty()) && !(_input.isEmpty())) { // Pop the first variable of the stack Variable current = _stack.pop(); // Get the curent variable in the input VariableTerminal terminal = new VariableTerminal((_input.get(0)).getType()); // If the poped variable is a terminal if (current instanceof VariableTerminal) if (!(caseTerminal((VariableTerminal) current , terminal))) return false; // If the poped variable is an Id if (current instanceof VariableIdentifier) caseIdentifier(terminal, (VariableIdentifier) current); } // Step Three : check the stack System.out.println(); return (_stack.isEmpty()); } 3.3 Parser Tree A parse tree is a graphical depiction of a derivation. It is convenient to see how strings are derived from the start symbol. The start symbol of the derivation becomes the root of the parse tree. In a parse tree: • All leaf nodes are terminals. • All interior nodes are non-terminals. • In-order traversal gives original input string. A parse tree depicts associativity and precedence of operators. The deepest sub-tree is traversed first, therefore the operator in that sub-tree gets precedence over the operator which is in the parent nodes. In the project we added the possibility to show the parse tree. This can be done by printing the number of the called rule at each recursion. 10