next up previous
Next: Abstract syntax Up: An Analyzer for Java Previous: Support code


Tree Structure

An attribute grammar is used to describe both the abstract syntax tree and the computations carried out over it. The fragment describing the overall structure of a Java AST serves to illustrate the notation:

Tree.lido[1]:

RULE: Goal    ::=    Cluster         END;
RULE: Cluster LISTOF CompilationUnit END;

RULE: CompilationUnit ::=
  PackageDeclarationOpt ImportJavaLang
  ImportDeclarationsOpt TypeDeclarationsOpt
END;

RULE: ImportJavaLang ::= END;

RULE pknm: PackageDeclarationOpt ::= 'package' PackageName ';' END;
RULE nonm: PackageDeclarationOpt ::=                           END;

RULE: ImportDeclarationsOpt LISTOF
  SingleTypeImportDeclaration | TypeImportOnDemandDeclaration
END;

RULE: TypeDeclarationsOpt LISTOF TypeDeclaration END;

RULE: TypeImportOnDemandDeclaration ::= 'import' QualInhName '.' '*' END;
RULE: SingleTypeImportDeclaration   ::= 'import' QualInhName         END;

RULE: QualInhName ::=                 InhBaseId END;
RULE: QualInhName ::= QualInhName '.' InhQualId END;

RULE: InhBaseId ::= Identifier END;
RULE: InhQualId ::= Identifier END;

Abstract syntax[2]
This macro is attached to a product file.

Each rule describes a node of the AST, and corresponds to a class in an object-oriented implementation. The identifier following the keyword RULE names the class of the object that would represent such a node. These are the only classes that can be instantiated. (It is not necessary to name the rules in a LIDO specification, because Eli will generate unique names if none are given, but rules can be named to make it easier to discuss them.)

An identifier that precedes ::= or LISTOF in some rule is called a nonterminal; all other identifiers are terminals. Nonterminals following ::= represent subtrees, while terminals represent values that are not subtrees. (For example, Identifier is a terminal representing an identifier appearing in the source program.)

Terminals can also be thought of as class names, but these classes are defined outside of the LIDO specification. They do not represent tree nodes, but rather values that are components of a rule's object. Objects of class pgpr therefore have no children, but each stores a representation of an identifier appearing in the source program.

A nonterminal (such as PackageDeclarationOpt) names the abstract class that characterizes the contexts in which the construct can appear. Each rule class (such as pknm and nonm) is a subclass of the abstract class named by the nonterminal preceding ::= or LISTOF.

Each rule containing ::= describes a node with a fixed number of children and/or component values. Nonterminals following the ::= specify children, in left-to-right order. Each child must be represented by an object belonging to a subclass of the abstract class named by the nonterminal.

Each rule containing LISTOF describes a node with an arbitrary number of children (including none at all). Nonterminals following LISTOF (separated by vertical bars) specify the possible children. There may be any number of children corresponding to each nonterminal.

Literals do not represent information present in the abstract syntax tree; they are used solely to establish a correspondence between the abstract syntax tree nodes and the phrase structure of the input.

A CompilationUnit constitutes the text in a single file. Types declared in different compilation units can depend on each other, circularly. The compiler must compile such types all at the same time, which implies that it must deal with an arbitrary number of compilation units:



Subsections
next up previous
Next: Abstract syntax Up: An Analyzer for Java Previous: Support code
2008-09-11