General Information

	Eli: Translator Construction Made Easy
	Global Index
	Frequently Asked Questions
	Typical Eli Usage Errors

Tutorials

	Quick Reference Card
	Guide For new Eli Users
	Release Notes of Eli
	Tutorial on Name Analysis
	Tutorial on Scope Graphs
	Tutorial on Type Analysis
	Typical Eli Usage Errors

Reference Manuals

	User Interface
	Eli products and parameters
	LIDO Reference Manual
	Typical Eli Usage Errors

Libraries

	Eli library routines
	Specification Module Library

Translation Tasks

	Lexical analysis specification
	Syntactic Analysis Manual
	Computation in Trees

Tools

	LIGA Control Language
	Debugging Information for LIDO
	Graphical ORder TOol

FunnelWeb User's Manual

	Pattern-based Text Generator
	Property Definition Language
	Operator Identification Language
	Tree Grammar Specification Language
	Command Line Processing
	COLA Options Reference Manual

Generating Unparsing Code

Monitoring a Processor's Execution

Administration

System Administration Guide

LIDO -- Computations in Trees

Tree Structure

The central data structure of a specified language processor is a tree. It usually represents the abstract structure of the particular input text and is built by actions of the scanner and parser. The trees a language processor operates on are specified by a context-free grammar, the tree grammar. It is part of the specification in LIDO. Figure 1 shows a tree grammar for simple expressions that consist of numbers and binary operators.

   RULE:    Root ::= Expr               END;
   RULE:    Expr ::= Expr Opr Expr      END;
   RULE:    Expr ::= Number             END;
   RULE:    Opr  ::= '+'                END;
   RULE:    Opr  ::= '*'                END;

Figure 1: Expression Tree Grammar

Root, Expr, and Opr are the nonterminals of this context-free grammar. Number, '+' and '*' are its terminals. Trees are built such that their nodes represent occurrences of nonterminals of the tree grammar. Terminals are not represented in the tree. Each production specifies that the symbol on the left-hand side has a sequence of subtrees according to the nonterminals on the right-hand side. In our example the first production specifies one, the second three, and the others no subtree. Expr and Opr have two alternative productions each.

Figure 2 shows an example for a tree specified by this grammar, which may represent the input expression 1 + 2 * 3. (The terminals in the bottom line do not belong to the tree.)

                           Root
                            |
                            |
                           Expr
                            |
                 -----------|-----------
                 |          |          |
               Expr         Opr      Expr
                 |          |          |
                 -          -     ------------
                                  |    |     |
                                Expr  Opr   Expr
                                  |    |     |
                                  -    -     -

               Number       +  Number  *   Number

Figure 2: An Expression Tree

A tree node together with its immediate descendent nodes represents the application of a production, called a rule context. In Figure 2, there are for example two instances of the rule context for the second rule of the grammar. The two productions for Opr describe different rule contexts, although both have no subtrees.

If we consider a node of a tree, then it connects two adjacent contexts, an upper context and a lower context. For example the upper context of an Expr node may be an application of the first or the second rule, and the lower context may be an application of the second or third rule. Rule contexts and adjacent contexts are the central concepts for association of computation, for dependencies between computations, and for the tree walk executing them.

Two kinds of terminals are distinguished: Literal terminals like '+' and '*' do not carry any information. They are only used to identify the production and to relate it to the concrete grammar. Named terminals like Number may carry some information usually computed from the input token by the scanner, e. g. the value of the number. That information may be used in computations of the rule context where the terminal occurs.

When designing a tree grammar one often needs to specify that a certain kind of nodes has an arbitrary number of subtrees. For example a block may consist of a sequence of definitions and statements. That can be expressed using LISTOF productions, like

        RULE: BLOCK ::= '{' Sequence '}' END;
        RULE: Sequence LISTOF Definition | Statement END;

Here, a Sequence node is specified to have an arbitrary number (including zero) of subtrees rooted by nodes of type Definition or Statement.

The LISTOF productions abstract from the fine-grained tree structure used to compose the elements of the sequence. The root context and the elements contexts of the sequence are not adjacent. Hence, when we associate computations to these contexts, they can not refer directly to each other; techniques of remote access have to be used instead (see See Remote Dependencies in Trees).

The above LISTOF production can be considered as an abbreviation of the following set of tree grammar productions.

        RULE: Sequence ::=  S  END;
        RULE: S :: = S Definition  END;
        RULE: S :: = S Statement  END;
        RULE: S :: = END;

This is also one of the forms of productions that could be specified in the concrete grammar for the parser. Several other forms, for example right recursive productions, would match to the LISTOF rule as well.