Introduction then else endif [24]
Introduction to Language Theory and Compiling
Shubham Batra, Nathan Liccardo
November 26, 2017
1
Contents
1 Part 1 : Grammar modifications 3
1.1 Unreachableandunproductivevariables …………….. 4
1.2 Non-ambigousgrammar …………………….. 4
1.3 LeftRecursionandLeftFactoring………………… 4
1.3.1 LeftRecursion………………………. 4
1.3.2 LeftFactoring………………………. 5
1.3.3 Algorithms ……………………….. 5
2 Part 2 : Grammar analyses 6
2.1 First1(X)…………………………….. 6
2.2 Follow1(X)……………………………. 7
2.3 Actiontable…………………………… 8
3 Part 3 : Parser 8
3.1 Recursivedescentparser …………………….. 9
3.2 Implementation…………………………. 10
3.3 ParserTree …………………………… 10
2
1 Part 1 : Grammar modifications
For the first part, we were asked to transform the IMP grammar in order to :
• Remove unreachable and/or unproductive variables, if any;
• Make the grammar non-ambiguous by taking into account the priority and the
associativity of the operators.
• Remove left-recursion and apply factorisation where need be.
Here is the original grammar :
1 end
2 ? ?
3 ?
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22 endif
23 else
endif
24
25
26
27
28
29
30
31
32
33
36 done
37
to done
38 done
39
40
3
1.1 Unreachable and unproductive variables
Firstly, we were asked to find the unreachable and unproductive variables in the gram-
mar, but we do not find any unproductive symbol or inaccessible symbol in the gram-
mar.
1.2 Non-ambigous grammar
Secondly we need to make the grammar non-ambigous by taking into account the fol-
lowing table of priority and associativity of the operators (decreasing order of priority) :
Operators
Associativity
-, not
?, /
+, –
>, =,
and
or
right
left
left
left
left
left
To do it, we just needed to transform a little bit our grammar. The trick is to detect
firstly the symbols with the lowest priority (or) and to finish by the highest priority (-,
not). The “or” and “and symbols can only appears in condition. We so need to check
that we will firstly detect those elements. This step will be done by transforming the
following rules :
1
2
3
by :
6
This transforation was aslo applied to +, ? and ?, / operators.
1.3 Left Recursion and Left Factoring
1.3.1 Left Recursion
Left recursion often poses problems for parsers, either because it leads them into infinite
recursion or because they expect rules in a normal form that forbids it Therefore, a
grammar is often pre-processed to eliminate the left recursion. In our current grammar,
we will ilustrated removing left-recursion with the following example :
1
2
3
4
5
4
1
2
3
To remove it, we firstly transform the
rules :
1
2
3
4
The same trick was applied to each left-recursion in the grammar.
1.3.2 Left Factoring
Left factoring is a grammar transformation that is useful for producing a grammar
suitable for predictive parsing. The basic idea is that when it is not clear which of two
alternative productions to use to expand a non-terminal A, we may be able to rewrite
the A-productions to defer the decision until we have seen enough of the input to make
the right choice. For example,
1
to done
2 done
becomes :
1
to done
2 ? by
3 ? ?
Again, the same trick was applied to each left-factoring in the grammar.
1.3.3 Algorithms
For information, you can find here the pseudo-code to generate both left factoring and
recursion :
5
2 Part 2 : Grammar analyses
Secondly, we were mandatory to give the action table of an LL(1) parser for the trans-
formed grammar. The construction of a predictive parser is aided by two functions
associated with a grammar G. These functions, FIRST and FOLLOW, allow us to
fill in the entries of a predictive parsing table for G, whenever possible. Sets of tokens
yielded by the FOLLOW function can also be used as synchronizing tokens during error
recovery.
2.1 First1(X)
If A is any symbol of the grammar, let FIRST1(A) be the set of terminals that begin
the strings derived from A. If the A rules contain ? then it is also an element of the
result set. To compute FIRST1(X), we applied to following steps for each identifiers :
RecursiveUpdate :
rules ? Grammar.getRulesWithIdentifier
for each rules :
if first element of rule = terminal :
set ? first element of rule
else if first element of rule = identifier :
RecursiveUpdate(first element of rule)
Which can be resumed to :
If X is terminal or ? , then FIRST(A) ? {X}.
If X is nonterminal and X ? Y1 Y2 … Yk is a production, then call FIRST1(Y1)
and put it on FIRST1(X).
In the project, the First1(X) function is represented by the class VariableSetFirst. This
class is helped by some other utils classes which was only created to make the project
more simple. You will find two importants functions used to make the first set which
are :
firstCalculation : used to iterate each identifiers.
recursiveUpdate : used to find every first values.
6
2.2 Follow1(X)
Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can appear
immediately to the right of A. To compute FOLLOW1(A) for all nonterminals A, we
applied the following code which explain each steps :
private void followCalculation() {
_checked = new ArrayList();
for (VariableIdentifier var : _grammar.getIdentifiers()) {
_var = var;
recursive3(var);
}
}
private void recursive3(VariableIdentifier variable) {
_checked.add(variable);
ArrayList
for (Rule rule: rules) {
if (_var.equals(variable)) _currentRule = rule;
for (int index: rule.indexs(variable)) {
// CASE 1.1
}
}
if (notLastAndTerminal(rule, index))
caseNotLastAndTerminal(rule, index);
// CASE 1.2
else if (notLastAndIdentifier(rule, index))
caseNotLastAndIdentifier(rule, index);
// CASE 2
else
caseLastElement(rule);
_checked.remove(_checked.size()-1);
}
private boolean notLastAndTerminal(Rule rule, int index) {
if (index+1