General Information

	Eli: Translator Construction Made Easy
	Global Index
	Frequently Asked Questions
	Typical Eli Usage Errors

Tutorials

	Quick Reference Card
	Guide For new Eli Users
	Release Notes of Eli
	Tutorial on Name Analysis
	Tutorial on Scope Graphs
	Tutorial on Type Analysis
	Typical Eli Usage Errors

Reference Manuals

	User Interface
	Eli products and parameters
	LIDO Reference Manual
	Typical Eli Usage Errors

Libraries

	Eli library routines
	Specification Module Library

Translation Tasks

	Lexical analysis specification
	Syntactic Analysis Manual
	Computation in Trees

Tools

	LIGA Control Language
	Debugging Information for LIDO
	Graphical ORder TOol

FunnelWeb User's Manual

	Pattern-based Text Generator
	Property Definition Language
	Operator Identification Language
	Tree Grammar Specification Language
	Command Line Processing
	COLA Options Reference Manual

Generating Unparsing Code

Monitoring a Processor's Execution

Administration

System Administration Guide

Name Analysis Reference Manual

Tree Grammar Preconditions

The ScopeGraphs specification module provides computational roles to be inherited by nonterminal symbols of the abstract syntax tree. To make effective use of these roles, the abstract syntax tree must be designed with them in mind. Generally speaking, that means that the abstract syntax tree structure will differ somewhat from the structure of the parse tree. Eli offers tools that aid in mapping a parse tree to an appropriate abstract syntax tree (see The Relationship Between Phrases and Tree Nodes of Syntactic Analysis).

A developer can often use the parsing grammar provided by the language designer, and define simple mapping rules to obtain the desired abstract syntax tree (see Syntax development hints of Syntactic Analysis). Sometimes, however, a change in the parsing grammar is required. That can lead to difficulties, depending on the parsing algorithm (see How to Resolve Parsing Conflicts of Syntactic Analysis).

Representation of identifiers

An identifier is a basic symbol of the language being analyzed, and will correspond to a named terminal symbol of the concrete grammar describing the input text. (Typically the terminal symbol is named `Identifier' or `Ident'.) Named terminal symbols do not contribute to the trees specified by the tree grammar, but a value derived from the basic symbol may be used in computations associated with the rule of a production or with the symbol on the left-hand side of a production (see Productions of LIDO -- Reference Manual). A unique integer value is typically derived from each identifier by the token processor mkidn (see Available processors of Lexical Analysis).

Identifiers usually play different roles in different syntactic contexts. We recommend that these distinctions not be made in the concrete syntax. The concrete syntax should use the same named terminal (e.g. Ident) to represent identifiers in every context. Syntactic contexts should be distinguished in the abstract syntax by LIDO RULEs.

Consider the concrete productions

   ObjDecl:        TypeDenoter Ident.
   TypeDenoter:    Ident.
   Variable:       Ident.

We distinguish the different roles of identifiers by introducing new symbol names in the corresponding LIDO RULEs:

   RULE: ObjDecl     ::= TypeDenoter DefIdent END;
   RULE: TypeDenoter ::= TypeUseIdent END;
   RULE: Variable    ::= UseIdent END;

The new symbol names must, of course, be defined:

   RULE: DefIdent     ::= Ident END;
   RULE: UseIdent     ::= Ident END;
   RULE: TypeUseIdent ::= Ident END;

Representation of ranges

Recall that a range is a subtree of the abstract syntax tree that encompasses a scope. Most programming languages allow nested scopes, and allow the meaning of an identifier in an inner scope to differ from the meaning in an outer scope. In that case, we say that the identifier's binding from the outer scope is not visible in the inner scope. The developer's goal for the abstract syntax tree should be that each range encompasses a scope containing all occurrences of identifiers defined in that scope but not containing any defining occurrence whose binding is visible outside of that scope.

For example, consider the following grammar fragment:

ProcedureDeclaration:
  'procedure' ProcedureHeading ProcedureBody /
  Type 'procedure' ProcedureHeading ProcedureBody .
ProcedureHeading:
  ProcedureIdentifier FormalParameterPart ';'
    ValuePart SpecificationPart .
ProcedureIdentifier: Identifier .
ProcedureBody: Statement .

A ProcedureDeclaration is nested inside a block (not shown) and the semantics of the language make the ProcedureIdentifier visible in the range containing that block. FormalParameterPart, ValuePart, SpecificationPart, and Statement all belong to a single range whose bindings are visible within the ProcedureDeclaration. An abstract syntax tree corresponding to this grammar does not meet our goal: ProcedureDeclaration is the only region containing all identifier occurrences defined in the procedure, but it also contains the defining occurrence ProcedureIdentifier that is visible outside of the region.

The solution is to add a nonterminal ProcedureRange to the grammar:

ProcedureDeclaration:
  'procedure' ProcedureIdentifier ProcedureRange /
  Type 'procedure' ProcedureIdentifier ProcedureRange .
ProcedureIdentifier: Identifier .
ProcedureRange:
  ProcedureHeading ProcedureBody .
ProcedureHeading:
  FormalParameterPart ';' ValuePart SpecificationPart .
ProcedureBody: Statement .

Here the ProcedureRange subtree contains all occurrences of identifiers defined in the procedure, and does not contain the defining occurrence ProcedureIdentifier.

The ProcedureHeading can be considered to be the interface to the procedure, and the ProcedureBody to be the implementation of that interface. Suppose that the language allows the interface and its implementation to be widely separated, instead of packaged into a single declaration:

ProcedureInterface:
  'procedure' ProcedureIdentifier ProcedureHeading /
  Type 'procedure' ProcedureIdentifier ProcedureHeading .
ProcedureImplementation:
  'implementation' ProcedureIdentifier ProcedureBody .

The ProcedureInterface and ProcedureImplementation are both nested within the same block (not shown) and the semantics of the language make the ProcedureIdentifier visible in the range containing that block.

The ProcedureHeading and ProcedureBody are now completely disjoint subtrees; there is no ProcedureRange to contain them. Together, they contain all occurrences of identifiers defined in the procedure and therefore together play the same role as the ProcedureRange subtree in the previous example.

We have defined a scope as a contiguous sequence of program text, and said that the developer needs to map each scope to an abstract syntax subtree encompassing that scope (see Fundamentals of Name Analysis). In order to deal with separate interface and implementation, it is convenient to generalize these concepts by allowing a scope to be a set of disjoint contiguous sequences of program text, and requiring the developer to map these regions into an abstract syntax tree subforest that encompasses those sequences. The subforest then has the desired properties of a range: it contains all of the occurrences in the scope, and does not contain any defining occurrence whose binding is visible outside.