Tree Parsing


The Tree Patterns

The tree patterns describe a set of derivations for trees. They are based on the ranked alphabet of symbols represented by tree nodes and also on a finite set of nonterminals. The ranked alphabet and the set of nonterminals are disjoint.

Each nonterminal represents a relevant interpretation of a node. For example, if the tree parser was intended to select machine instructions to implement expression evaluation, the nonterminal IntReg might be used to represent the interpretation "an integer value in a register". A derivation could interpret either a leaf describing an integer constant or a node describing an addition operation in that way. Another derivation could interpret the same addition node as "a floating-point value in a register" (possibly represented by the nonterminal FltReg).

Each rule characterizes a context in which a specific action is to be performed. For code selection there might be one rule characterizing an integer addition instruction and another characterizing a floating-point addition instruction. An integer addition instruction that required both of its operands to be in registers and delivered its result to a register would be characterized by a rule involving only IntReg nonterminals.

Most rules characterize contexts consisting of single tree nodes. Some contexts, however, do not involve any tree nodes at all. Suppose that a node is interpreted as leaving an integer value in a register, and there is an instruction that converts an integer value in a register to a floating-point value in a register. If the original node is the child of a node demanding a floating-point value in a register, the tree parser can supply the implied conversion instruction by using the rule characterizing its context in the derivation.

It is also possible to write a rule characterizing a context consisting of several nodes. Some machines have complex addressing functions that involve summing the contents of two registers and a constant and then accessing a value at the resulting address. In this case, a single rule with a pattern containing two addition operations and placing appropriate interpretations on the operands would characterize the context in which the addressing function action was performed.

The set of patterns is generally ambiguous. To disambiguate them, each rule has an associated cost. Costs are non-negative integer values, and default to 1 if left unspecified. The tree parser selects the derivation having the lowest total cost. The examples in this chapter do not specify costs, so every rule has the default cost of 1 (see Summary of the Specification Language).

Rules Describing Tree Nodes

A rule describing a single tree node has the following general form:

N0 ::= s(Ni,aj)

Here N0 is a nonterminal, s an element of the ranked alphabet, Ni a (possibly empty) list of nonterminals, and aj a (possibly empty) list of attribute types. If one of Ni and aj is empty then the comma separating them is omitted; if both are empty both the comma and parentheses are omitted.
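
The punctuation rule above can be illustrated with a minimal Python sketch. The class NodeRule and its field names are assumptions made for this illustration only; they are not part of Eli or of the specification language.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NodeRule:
    """One rule of the form N0 ::= s(Ni,aj); all names here are illustrative."""
    lhs: str                          # N0, the nonterminal on the left-hand side
    symbol: str                       # s, an element of the ranked alphabet
    children: Tuple[str, ...] = ()    # Ni, one nonterminal per child of s
    attrs: Tuple[str, ...] = ()       # aj, the attribute types stored at the node

    def __str__(self) -> str:
        # Joining both lists with commas yields the separating comma only
        # when neither list is empty.
        operands = ",".join(self.children + self.attrs)
        # With no operands at all, the parentheses are omitted as well.
        rhs = f"{self.symbol}({operands})" if operands else self.symbol
        return f"{self.lhs} ::= {rhs}"


print(NodeRule("IntReg", "IntegerVal", attrs=("int",)))
print(NodeRule("IntReg", "Plus", children=("IntReg", "IntReg")))
print(NodeRule("N0", "s"))
```

The three calls print a rule with attribute types only, a rule with nonterminals only, and a rule with neither.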

Recall that trees describing simple arithmetic expressions could be based upon the following ranked alphabet:

IntegerVal   FloatingVal   IntegerVar   FloatingVar
Negative
Plus         Minus         Star         Slash

Suppose that the tree parser is to select machine instructions that evaluate the expression described by the tree being parsed. Assume that the target machine has a simple RISC architecture, in which all operands must be loaded into registers and every operation leaves its result in a register.

One context relevant to instruction selection is that of an IntegerVal leaf. This context corresponds to the selection of an instruction to load an integer constant operand into a register. It could be characterized by the following rule:

IntReg ::= IntegerVal(int)

This rule describes a single node, and has the form N0 ::= s(a1). N0 is the nonterminal IntReg, which places the interpretation "an integer value in a register" on the node. IntegerVal is the element s of the ranked alphabet. Since IntegerVal has arity 0, no nonterminals may appear between the parentheses. As discussed above (see Decorating Nodes), the leaf has a single associated attribute to specify the value it represents. This value is a string table index of type int, so the rule contains the type identifier int.

Another context related to instruction selection is that of a Plus node. This context corresponds to the selection of an instruction to add the contents of two registers, leaving the result in a register. It could be characterized by the following rule:

IntReg ::= Plus(IntReg,IntReg)

This rule describes a single node, and has the form N0 ::= s(N1,N2). N0 is the nonterminal IntReg, which places the interpretation "an integer value in a register" on the node. Plus is the element s of the ranked alphabet. Since Plus has arity 2, two nonterminals must appear between the parentheses. IntReg is the appropriate nonterminal in this case, because it places the interpretation "an integer value in a register" on both children and the machine's integer addition instruction requires both of its operands in registers.

If the target machine had floating-point operations as well as integer operations, a complete set of rules characterizing the relevant contexts in trees describing simple arithmetic expressions might be:

IntReg ::= IntegerVal(int)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= Negative(IntReg)
IntReg ::= Plus(IntReg,IntReg)
IntReg ::= Minus(IntReg,IntReg)
IntReg ::= Star(IntReg,IntReg)
IntReg ::= Slash(IntReg,IntReg)

FltReg ::= FloatingVal(int)
FltReg ::= FloatingVar(DefTableKey)
FltReg ::= Negative(FltReg)
FltReg ::= Plus(FltReg,FltReg)
FltReg ::= Minus(FltReg,FltReg)
FltReg ::= Star(FltReg,FltReg)
FltReg ::= Slash(FltReg,FltReg)

It is important to remember that the tree to be parsed involves only the nodes representing the symbols of the ranked alphabet (IntegerVal, Plus, etc.). The tree parser constructs a derivation of that tree in terms of the tree patterns. That derivation consists of applications of the rules, and those rules must be applied consistently with respect to the nonterminals. For example, recall the tree describing `k-3':

`k-3'
Minus(IntegerVar,IntegerVal)

This tree could be derived by applying the following rules:

IntReg ::= IntegerVar(DefTableKey)
IntReg ::= IntegerVal(int)
IntReg ::= Minus(IntReg,IntReg)
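
The consistency requirement can be made concrete with a small Python sketch that searches for a derivation using the node rules listed above. The tuple encoding of trees, the RULES table and the function derive are assumptions of this sketch; the tree parser that Eli generates is not claimed to work this way.

```python
# Node rules from the text, keyed by (left-hand side, symbol). Each entry
# gives the nonterminals required of the children and the attribute types.
RULES = {}
for reg, val, var in (("IntReg", "IntegerVal", "IntegerVar"),
                      ("FltReg", "FloatingVal", "FloatingVar")):
    RULES[(reg, val)] = ((), ("int",))
    RULES[(reg, var)] = ((), ("DefTableKey",))
    RULES[(reg, "Negative")] = ((reg,), ())
    for op in ("Plus", "Minus", "Star", "Slash"):
        RULES[(reg, op)] = ((reg, reg), ())


def derive(tree, goal):
    """Return the rules applied (children first) to interpret `tree` as
    `goal`, or None if the node rules admit no such derivation."""
    symbol, *kids = tree
    entry = RULES.get((goal, symbol))
    if entry is None or len(entry[0]) != len(kids):
        return None
    wanted, attrs = entry
    steps = []
    for kid, nonterminal in zip(kids, wanted):
        sub = derive(kid, nonterminal)   # each child must fit the rule's demand
        if sub is None:
            return None
        steps.extend(sub)
    steps.append(f"{goal} ::= {symbol}({','.join(wanted + attrs)})")
    return steps


# The tree for k-3: a symbol followed by its subtrees.
k_minus_3 = ("Minus", ("IntegerVar",), ("IntegerVal",))
for step in derive(k_minus_3, "IntReg"):
    print(step)
```

Asking for derive(k_minus_3, "FltReg") returns None, because no node rule interprets an IntegerVar leaf as FltReg.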

Chain Rules

A chain rule has the following general form:

N0 ::= N1

Here N0 and N1 are both nonterminals.

A chain rule is used in the derivation of a tree when the interpretation of a node differs from the interpretation required by its parent. It does not describe any tree node, but simply indicates that the difference in interpretations is allowed.

The patterns in the last section (see Rules Describing Tree Nodes) cannot derive the tree for the expression `k-2.3':

IntReg ::= IntegerVar(DefTableKey)
FltReg ::= FloatingVal(int)
IntReg ::= Minus(IntReg,IntReg)   /* Fails */
FltReg ::= Minus(FltReg,FltReg)   /* Fails also */

Both rules describing the Minus node demand operands of the same interpretation, and in this tree the operands have different interpretations.

Suppose that it is possible to convert an IntReg to a FltReg without loss of information. If this is true, then the value of `k' could be converted to a floating-point value and the result used as the first child of the Minus node. The possibility of such a conversion is indicated by adding the following chain rule to the patterns given in the last section:

FltReg ::= IntReg

If this chain rule is one of the patterns then the derivation of `k-2.3' would be:

IntReg ::= IntegerVar(DefTableKey)
FltReg ::= IntReg
FltReg ::= FloatingVal(int)
FltReg ::= Minus(FltReg,FltReg)
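
The effect of the chain rule can be sketched in Python by computing, bottom-up, the set of nonterminals as which each node can be interpreted. The names NODE_RULES, CHAIN_RULES and interpretations, and the tuple encoding of trees, are assumptions of this sketch rather than anything defined by Eli.

```python
# Node rules from the previous section as (lhs, symbol, child nonterminals).
NODE_RULES = []
for reg, val, var in (("IntReg", "IntegerVal", "IntegerVar"),
                      ("FltReg", "FloatingVal", "FloatingVar")):
    NODE_RULES += [(reg, val, ()), (reg, var, ()), (reg, "Negative", (reg,))]
    NODE_RULES += [(reg, op, (reg, reg))
                   for op in ("Plus", "Minus", "Star", "Slash")]

CHAIN_RULES = [("FltReg", "IntReg")]    # FltReg ::= IntReg


def interpretations(tree, chain_rules):
    """Return the set of nonterminals as which `tree` can be interpreted."""
    symbol, *kids = tree
    kid_sets = [interpretations(kid, chain_rules) for kid in kids]
    # A node rule applies if every child admits the interpretation it demands.
    found = {lhs for lhs, sym, wanted in NODE_RULES
             if sym == symbol and len(wanted) == len(kids)
             and all(nt in ks for nt, ks in zip(wanted, kid_sets))}
    # A chain rule N0 ::= N1 adds N0 wherever N1 is already possible.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in chain_rules:
            if rhs in found and lhs not in found:
                found.add(lhs)
                changed = True
    return found


k_minus_2_3 = ("Minus", ("IntegerVar",), ("FloatingVal",))
print(interpretations(k_minus_2_3, []))            # no chain rule: set()
print(interpretations(k_minus_2_3, CHAIN_RULES))   # with it: {'FltReg'}
```

Without the chain rule the Minus node has no interpretation at all; with it, the IntegerVar leaf can also be interpreted as FltReg, so the floating-point rule for Minus applies.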

Now consider the expression `k-3' from the last section. With the addition of the chain rule, more than one derivation is possible. Two of them are:

IntReg ::= IntegerVar(DefTableKey)
IntReg ::= IntegerVal(int)
IntReg ::= Minus(IntReg,IntReg)

IntReg ::= IntegerVar(DefTableKey)
FltReg ::= IntReg
IntReg ::= IntegerVal(int)
FltReg ::= IntReg
FltReg ::= Minus(FltReg,FltReg)

Remember, however, that each rule has an associated cost. That cost defaults to 1 when it isn't specified, so each of the rules in this example has cost 1. The cost of a derivation is simply the sum of the costs of the rules from which it is constituted. Thus the cost of the first derivation above is 3 and the cost of the second is 5. The tree parser always selects the derivation with the lowest cost, so the derivation of `k-3' will be the first of the two given.
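
Cost-based selection can be sketched in Python as a bottom-up computation of the cheapest interpretation of every node. This is only an illustration under stated assumptions: the names NODE_RULES, CHAIN_RULES and cheapest and the tuple encoding of trees are invented here, and the sketch does not claim to reproduce the algorithm used by the generated tree parser.

```python
DEFAULT_COST = 1    # the cost of a rule whose cost is left unspecified

# Node rules as (lhs, symbol, child nonterminals, cost).
NODE_RULES = []
for reg, val, var in (("IntReg", "IntegerVal", "IntegerVar"),
                      ("FltReg", "FloatingVal", "FloatingVar")):
    NODE_RULES += [(reg, val, (), DEFAULT_COST),
                   (reg, var, (), DEFAULT_COST),
                   (reg, "Negative", (reg,), DEFAULT_COST)]
    NODE_RULES += [(reg, op, (reg, reg), DEFAULT_COST)
                   for op in ("Plus", "Minus", "Star", "Slash")]

CHAIN_RULES = [("FltReg", "IntReg", DEFAULT_COST)]    # FltReg ::= IntReg


def cheapest(tree):
    """Map each possible nonterminal to the lowest total cost of any
    derivation that interprets `tree` as that nonterminal."""
    symbol, *kids = tree
    kid_costs = [cheapest(kid) for kid in kids]
    best = {}
    for lhs, sym, wanted, cost in NODE_RULES:
        if sym != symbol or len(wanted) != len(kids):
            continue
        if all(nt in kc for nt, kc in zip(wanted, kid_costs)):
            # Cost of this rule plus the cheapest cost of each child.
            total = cost + sum(kc[nt] for nt, kc in zip(wanted, kid_costs))
            if total < best.get(lhs, float("inf")):
                best[lhs] = total
    changed = True
    while changed:    # apply chain rules for as long as they lower a cost
        changed = False
        for lhs, rhs, cost in CHAIN_RULES:
            if rhs in best and best[rhs] + cost < best.get(lhs, float("inf")):
                best[lhs] = best[rhs] + cost
                changed = True
    return best


k_minus_3 = ("Minus", ("IntegerVar",), ("IntegerVal",))
k_minus_2_3 = ("Minus", ("IntegerVar",), ("FloatingVal",))
print(cheapest(k_minus_3))      # IntReg costs 3
print(cheapest(k_minus_2_3))    # FltReg costs 4
```

For `k-3' the sketch reports cost 3 for IntReg, the all-integer derivation. It also reports cost 4 for FltReg, reached by applying the chain rule once at the Minus node after the integer subtraction; that is cheaper than converting both operands (cost 5) but still more expensive than the all-integer derivation.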

Rules Describing Tree Fragments

The right-hand side of a rule describing a tree fragment defines that fragment with nonterminal leaves. Some examples are:

N0 ::= s(t(N1),N2)
N0 ::= s(N1,t(N2))
N0 ::= s(t(N1),u(N2))
N0 ::= s(t(s(N1,N2)),N3)

Here N0 is a nonterminal, s, t and u are elements of the ranked alphabet, and N1, N2 and N3 are nonterminals. No attribute types are allowed in a rule describing a tree fragment.

Recall the tree used to describe a C conditional expression:

`i>j ? i-j : j-i'
Conditional(
  Greater(IntegerVar,IntegerVar),
  Alternatives(
    Minus(IntegerVar,IntegerVar),
    Minus(IntegerVar,IntegerVar)))

The following rules might be used to describe the tree fragment resulting from the conditional:

IntReg ::= Conditional(IntReg,Alternatives(IntReg,IntReg))
FltReg ::= Conditional(IntReg,Alternatives(FltReg,FltReg))

If these tree fragment rules (and appropriate rules for Greater) are part of the specification then the derivation of `i>j ? i-j : j-i' would be:

IntReg ::= IntegerVar(DefTableKey)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= Greater(IntReg,IntReg)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= Minus(IntReg,IntReg)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= IntegerVar(DefTableKey)
IntReg ::= Minus(IntReg,IntReg)
IntReg ::= Conditional(IntReg,Alternatives(IntReg,IntReg))

Notice that there are no derivation steps corresponding to the components of the tree fragment resulting from the conditional; there is only a single derivation step corresponding to the entire fragment.
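
How one rule can cover several nodes is illustrated by the following Python sketch, which matches a fragment pattern against a tree and returns the subtrees left for further derivation. The encoding (tuples for nodes, plain strings for nonterminal leaves) and the names PATTERN, TREE and match are assumptions of this sketch.

```python
def match(pattern, tree):
    """Return (nonterminal, subtree) pairs for the pattern's nonterminal
    leaves, left to right, or None if the fragment does not fit `tree`."""
    if isinstance(pattern, str):
        # A nonterminal leaf accepts any subtree here; whether that subtree
        # can be interpreted as the nonterminal is decided by other rules.
        return [(pattern, tree)]
    symbol, *sub_patterns = pattern
    if tree[0] != symbol or len(tree) - 1 != len(sub_patterns):
        return None
    bound = []
    for sub_pattern, kid in zip(sub_patterns, tree[1:]):
        result = match(sub_pattern, kid)
        if result is None:
            return None
        bound.extend(result)
    return bound


# IntReg ::= Conditional(IntReg,Alternatives(IntReg,IntReg))
PATTERN = ("Conditional", "IntReg", ("Alternatives", "IntReg", "IntReg"))

# The tree for i>j ? i-j : j-i
VAR = ("IntegerVar",)
TREE = ("Conditional",
        ("Greater", VAR, VAR),
        ("Alternatives", ("Minus", VAR, VAR), ("Minus", VAR, VAR)))

for nonterminal, subtree in match(PATTERN, TREE):
    print(nonterminal, "<-", subtree[0])
```

The single match consumes both the Conditional node and the Alternatives node, leaving only the Greater subtree and the two Minus subtrees to be derived separately.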

