LL(1) Parsing
采用显式的堆栈,从上而下,从开始符号开始推导到终结符。
Example
需要选择正确的匹配方式,如何做到?
*反序压入堆栈
Parsing Table:(不需要程序之间的递归调用)
Parsing Table and Algorithm
文法上可以出现在A后面的符号
The constructing-process of the above table:
(1) 对于生产:S→(S)S,条目M[S,(]
(2) 对于生产:S→ε,S$=>*(S)S$。其中a=)或a=$。因此,将选择S→ε加到M[S,)]和M[S,$]中。
Definition of LL(1) Grammar:
- A grammar is an LL(1) grammar if the associated LL(1) parsing table has at most one production in each table entry.
- An LL(1) grammar cannot be ambiguous.
LL(1) 代码:
push the start symbol onto the top the parsing stack; while the top of the parsing stack ≠ $ and the next input token ≠ $ do if the top of the parsing stack is terminal a and the next input token = a then (* match *) pop the parsing stack; advance the input; else if the top of the parsing stack is non-terminal A and the next input token is terminal a and parsing table entry M[A,a] contains production A→X1X2…Xn then (* generate *) pop the parsing stack; for i:=n downto 1 do push Xi onto the parsing stack; else error; if the top of the parsing stack = $ and the next input token = $ then accept else error.
Example:
- statement → if-stmt | other
- if-stmt → if (exp) statement else-part
- else-part → else statement | ε
- exp → 0 | 1
left Recursion Removal and Left Factoring
【直接左递归】
Case 1: A → Aα| β 替换为:
- A → βA’
- A’ → αA’| ε
- 左递归变为了右递归。
Case 2:A → A α1 | A α2 | … | A αn |β1|β2|…|βm
每个α都不等于空,每个β都不一以A开头。
替换为:
- A →β1A’|β2A’| …|βmA’
- A’ → α1A’| α2A’| … | αnA’| ε
- 实现了任意多直接左递归的改造。
【简洁、直接左递归】
一个文法消除左递归的条件:
- 不含以ε为右部的产生式
- 不含回路A=>+A
把候选中开头的非终结符替换为它的定义,不断替换。
3. 化简消除永远无法到达的规则。
Left factoring
- 引入非终结符,提取左公因子
Syntax Tree Construction
How to compute the arithmetic value of the expression.
(1) Use a separate stack to store the intermediate values of the computation, called the value stack;
(2) Schedule two operations on that stack: A push of a number; The addition of two numbers.
(3) PUSH can be performed by the match procedure, and
(4) ADDITION should be scheduled on the stack, by pushing a special symbol (such as #) on the parsing stack.
- the rule for E’:
- E → n E’
- E’ →+n#E’|ε
#代表输入串的结束。
First and Follow Sets
【消除回溯】
First Sets
Definition:
Let X be a grammar symbol ( a terminal or non-terminal) or ε. Then First(X) is a set of terminals or ε, which is defined as follows:
1. If X is a terminal or ε, then First(X) = {X};
2. If X is a non-terminal, then for each production choice X→X1 X2 … Xn, First(X) contains First(X1)-{ε}. if for some i<n, all the sets First(X1)….First(Xi) contain ε ,then First(X) also contains First(Xi+1)- {ε}. If all the set First(X1)….First(Xn) contain ε, the First(X) contains ε.
Let α be a string of terminals and non-terminals, X1X2…Xn. First(α) is defined as follows:
1.First(α) contains First(X1)-{ε};
2.For each i=2,…,n, if for all k=1,..,i-1, First(Xk) contains ε, then First(α) constains First(Xk)-{ε}.
3. If all the set First(X1)..First(Xn) contain ε, the First(α) contains ε.
Example:
Grammar for statement sequences:
stmt-sequence →stmt
stmt-seq’ stmt-seq’ →; stmt-sequence|ε
stmt→s
The First sets are as follows:
First(stmt-sequence)={s}
First(stmt-seq’)={;, ε}
First(stmt)={s}
Follow Sets
Given a non-terminal A, the set Follow(A) is defined as follows.
(1) if A is the start symbol, the $ is in the Follow(A).
(2) if there is a production B→ αAγ, then First(γ)-{ε} is in Follow(A).
(3) if there is a production B→ αAγ, such that ε in First(γ), then Follow(A) contains Follow(B).
Note: (1)The symbol $ is used to mark the end of the input.
(2)The empty “pseudotoken” ε is never an element of a follow set.
(3)Follow sets are defined only for non-terminal.
LL(1)文法判定:
Follow sets work “on the right” in production while First sets work “on the left” in the production.
Given a grammar rule A →αB, Follow(B) will contain Follow(A),the opposite of the situation for first sets, if A →Bα, First(A) contains First(B),except possibly for ε.
0 条评论