Eli   Documents

General Information

 o Eli: Translator Construction Made Easy
 o Global Index
 o Frequently Asked Questions
 o Typical Eli Usage Errors

Tutorials

 o Quick Reference Card
 o Guide For new Eli Users
 o Release Notes of Eli
 o Tutorial on Name Analysis
 o Tutorial on Scope Graphs
 o Tutorial on Type Analysis
 o Typical Eli Usage Errors

Reference Manuals

 o User Interface
 o Eli products and parameters
 o LIDO Reference Manual
 o Typical Eli Usage Errors

Libraries

 o Eli library routines
 o Specification Module Library

Translation Tasks

 o Lexical analysis specification
 o Syntactic Analysis Manual
 o Computation in Trees

Tools

 o LIGA Control Language
 o Debugging Information for LIDO
 o Graphical ORder TOol

 o FunnelWeb User's Manual

 o Pattern-based Text Generator
 o Property Definition Language
 o Operator Identification Language
 o Tree Grammar Specification Language
 o Command Line Processing
 o COLA Options Reference Manual

 o Generating Unparsing Code

 o Monitoring a Processor's Execution

Administration

 o System Administration Guide

Mail Home

LIDO -- Computations in Trees

Next Chapter Table of Contents


Tree Structure

The central data structure of a specified language processor is a tree. It usually represents the abstract structure of the particular input text and is built by actions of the scanner and parser. The trees a language processor operates on are specified by a context-free grammar, the tree grammar. It is part of the specification in LIDO. Figure 1 shows a tree grammar for simple expressions that consist of numbers and binary operators.

   RULE:    Root ::= Expr               END;
   RULE:    Expr ::= Expr Opr Expr      END;
   RULE:    Expr ::= Number             END;
   RULE:    Opr  ::= '+'                END;
   RULE:    Opr  ::= '*'                END;
Figure 1: Expression Tree Grammar

Root, Expr, and Opr are the nonterminals of this context-free grammar. Number, '+' and '*' are its terminals. Trees are built such that their nodes represent occurrences of nonterminals of the tree grammar. Terminals are not represented in the tree. Each production specifies that the symbol on the left-hand side has a sequence of subtrees according to the nonterminals on the right-hand side. In our example the first production specifies one, the second three, and the others no subtree. Expr and Opr have two alternative productions each.

Figure 2 shows an example for a tree specified by this grammar, which may represent the input expression 1 + 2 * 3. (The terminals in the bottom line do not belong to the tree.)

                           Root
                            |
                            |
                           Expr
                            |
                 -----------|-----------
                 |          |          |
               Expr         Opr      Expr
                 |          |          |
                 -          -     ------------
                                  |    |     |
                                Expr  Opr   Expr
                                  |    |     |
                                  -    -     -

               Number       +  Number  *   Number

Figure 2: An Expression Tree

A tree node together with its immediate descendent nodes represents the application of a production, called a rule context. In Figure 2, there are for example two instances of the rule context for the second rule of the grammar. The two productions for Opr describe different rule contexts, although both have no subtrees.

If we consider a node of a tree, then it connects two adjacent contexts, an upper context and a lower context. For example the upper context of an Expr node may be an application of the first or the second rule, and the lower context may be an application of the second or third rule. Rule contexts and adjacent contexts are the central concepts for association of computation, for dependencies between computations, and for the tree walk executing them.

Two kinds of terminals are distinguished: Literal terminals like '+' and '*' do not carry any information. They are only used to identify the production and to relate it to the concrete grammar. Named terminals like Number may carry some information usually computed from the input token by the scanner, e. g. the value of the number. That information may be used in computations of the rule context where the terminal occurs.

When designing a tree grammar one often needs to specify that a certain kind of nodes has an arbitrary number of subtrees. For example a block may consist of a sequence of definitions and statements. That can be expressed using LISTOF productions, like

        RULE: BLOCK ::= '{' Sequence '}' END;
        RULE: Sequence LISTOF Definition | Statement END;
Here, a Sequence node is specified to have an arbitrary number (including zero) of subtrees rooted by nodes of type Definition or Statement.

The LISTOF productions abstract from the fine-grained tree structure used to compose the elements of the sequence. The root context and the elements contexts of the sequence are not adjacent. Hence, when we associate computations to these contexts, they can not refer directly to each other; techniques of remote access have to be used instead (see See Remote Dependencies in Trees).

The above LISTOF production can be considered as an abbreviation of the following set of tree grammar productions.

        RULE: Sequence ::=  S  END;
        RULE: S :: = S Definition  END;
        RULE: S :: = S Statement  END;
        RULE: S :: = END;
This is also one of the forms of productions that could be specified in the concrete grammar for the parser. Several other forms, for example right recursive productions, would match to the LISTOF rule as well.


Next Chapter Table of Contents