Eli   Documents

General Information

 o Eli: Translator Construction Made Easy
 o Global Index
 o Frequently Asked Questions
 o Typical Eli Usage Errors

Tutorials

 o Quick Reference Card
 o Guide For new Eli Users
 o Release Notes of Eli
 o Tutorial on Name Analysis
 o Tutorial on Scope Graphs
 o Tutorial on Type Analysis
 o Typical Eli Usage Errors

Reference Manuals

 o User Interface
 o Eli products and parameters
 o LIDO Reference Manual
 o Typical Eli Usage Errors

Libraries

 o Eli library routines
 o Specification Module Library

Translation Tasks

 o Lexical analysis specification
 o Syntactic Analysis Manual
 o Computation in Trees

Tools

 o LIGA Control Language
 o Debugging Information for LIDO
 o Graphical ORder TOol

 o FunnelWeb User's Manual

 o Pattern-based Text Generator
 o Property Definition Language
 o Operator Identification Language
 o Tree Grammar Specification Language
 o Command Line Processing
 o COLA Options Reference Manual

 o Generating Unparsing Code

 o Monitoring a Processor's Execution

Administration

 o System Administration Guide

Mail Home

Tutorial for Name Analysis Using ScopeGraphs

Next Chapter Table of Contents


Kernel Language

This chapter considers the name analysis problem for a very simple subset of a language called NameLan. We begin with specifications of the phrase structure and basic symbols. A processor built from these specifications will accept a text in NameLan and build a tree. These specifications can then be augmented by others, enhancing that processor to make it solve the subset's name analysis problem. The sample text that we will analyze is:

machar.nl[1]==

int radix;
{ float a = 1.0;
  while (((a + 1.0) - a) - 1.0 == 0.0)
    a = a + a;

  float b = 1.0;
  while ((a + b) - a == 0.0)
    b = b + b;

  radix = (a + b) - a;
}
This macro is attached to a non-product file.

This program determines the radix of the floating point representation of the machine on which it runs, and stores that value in a global variable. NameLan does not require fully-parenthesized expressions, but parentheses are needed here to guarantee the order of evaluation within the expressions. (Optimizers do not generally violate the integrity of parenthesized expressions.)

Text structure

Text is considered to be made up of a sequence of comments and basic symbols. Comments are character sequences to be ignored, while basic symbols are character sequences that correspond to terminal symbols of the grammar describing the phrase structure of the text. Basic symbols can be grouped according to the information they carry:

Delimiter
A marker that serves to establish the structure of the source text. (Examples: while and (.)

Denotation
A representation of a source language entity. Different occurrences of the same denotation all represent the same entity. (Examples: 0.0 and int.)

Identifier
A name for some entity. Different occurrences of the same identifier may name different entities. (Examples: radix and a.)

Name analysis is concerned with establishing the entity named by each identifier in the text.

Since basic symbols correspond to terminal symbols of the grammar describing a language's phrase structure, it is generally simplest to start by specifying that grammar (see Context-Free Grammars and Parsing of Syntactic Analysis). The terminal symbols are then precisely the symbols that do not occur on the left-hand side of any grammar rule.

The concrete syntax of a language is a specification of the phrase structure of a program written in that language. Here is a concrete syntax for the NameLan subset of interest here:

Phrase structure[2]==

Program:        Declaration* Block.

Declaration:    VarDecl.

Block:          '{' DeclStmt* '}'.
DeclStmt:       VarDecl / Statement.

VarDecl:        Type VarDefs ';'.
VarDefs:        VarDef // ',' .
VarDef:         Ident [ '=' Expr ].

Type:           'int' / 'float' .

Statement:      Name '=' Expr ';' /
                'if' Expr Statement $'else' /
                'if' Expr Statement  'else' Statement /
                'while' '(' Expr ')' Statement /
                Block.

Expr:           AExpr [ Relop AExpr ] .
AExpr:          [Addop] Term / AExpr Addop Term .
Term:           Factor / Term Mulop Factor .
Factor:         Name / Integer / Real / '(' Expr ')' .

Relop:          '<' / '<=' / '==' / '!=' / '>=' / '>'.
Addop:          '+' / '-'.
Mulop:          '*' / '/'.

Name:           Ident.
This macro is defined in definitions 2, 25, 30, 32, 41, 50, 55,
   67, 76, 97, 118, and 124.
This macro is invoked in definition 3.

(See The effect of a $-modification of Syntactic Analysis, for an explanation of the if statement definition.)

NameLan.con[3]==

Phrase structure[2]
This macro is attached to a product file.

Notice that many of the terminal symbols of this grammar are given literally (for example, '=', ';', and 'if'). Clearly, these literal terminals need no further specification. We are left with only three terminals (and the comment) whose form the grammar leaves unspecified:

NameLan.gla[4]==

Ident:          C_IDENTIFIER
Integer:        C_INTEGER
Real:           C_FLOAT
                C_COMMENT
This macro is attached to a product file.

This specification indicates that the formats of these NameLan terminal symbols and comments are identical to those of their counterparts in C (see Top of Lexical Analysis).

Exercises

These exercises are based on files defined in the Tutorial. To obtain copies of those files in your current directory, enter Eli and give the following command:

-> $elipkg/Name/LearnSG%Kernel > .

None of these files will have write permission in your current directory. You will need to add write permission in order to do the exercises.

  1. Recall that we suggested that you begin by specifying the grammar that describes a language's phrase structure. File `NameLan.con' is such a grammar specification. Eli can't generate a processor from `NameLan.con' because that file does not specify the structure of the non-literal terminals. Nevertheless, ask Eli to do so and examine the resulting errors:

    -> NameLan.con :exe :error >
    

    Compare the error report with file `NameLan.gla'. How do they differ, and why?

  2. Eli is able to generate a processor from the two files `NameLan.con' and `NameLan.gla' together. In order to generate a processor from a number of specification files, Eli needs a single file of type `specs' listing those specifications:

    Text.specs[5]==

    NameLan.con
    NameLan.gla
    
    This macro is attached to a product file.
    

    Ask Eli to generate a processor from `Text.specs' and save the executable for that processor in the current directory:

    -> Text.specs :exe > kern
    

    Apply kern to `machar.nl':

    -> !./kern machar.nl
    
    (You can also exit Eli and run kern as you would any program.)

    1. Why is there no output?

    2. Delete the second line of `machar.nl' and again apply kern to it. Is the output what you expected?

    3. Briefly explain what your generated processor is actually doing.

Tree Structure

`NameLan.con' is the concrete syntax of our NameLan subset. It describes all of the phrases that can be used to construct a program like `machar.nl'. The processor generated from `NameLan.specs' in the last section verifies that a text file represents a syntactically-correct program in the NameLan subset.

An Eli-generated processor represents a program internally by a tree, and carries out tasks like name analysis by computing the values of attributes attached to the nodes of the tree (see Top of LIDO -- Computation in Trees). The abstract syntax of a language describes all of the possible trees that a generated processor could build to represent syntactically-correct programs in that language. Eli uses LIDO notation to define the abstract syntax of a language (see LIDO -- Reference Manual).

Eli can deduce an abstract syntax from a concrete syntax, but the trees described by that abstract syntax might not be the best ones for subsequent analysis. The rules describing expressions and operators in `NameLan.con' provide an excellent example. They implement the operator precedence and association rules of NameLan, and guarantee that the phrase structure reflects the relationship between operators and operands (see Using structure to convey meaning of Syntactic Analysis). Those rules determine the structure of the tree to be built.

Consider the expression in the second while statement of `machar.nl':

(a + b) - a == 0.0

According to the grammar, this is an Expr phrase with the following structure:

AExpr Relop AExpr
The interpretation of this structure is that the equality operator == compares the subexpression (a + b) - a with the number 0.0. It reflects the fact that addition and subtraction operations should be carried out before comparison operations, i.e. that they should have higher precedence (see Operator precedence of Syntactic Analysis). `NameLan.con' lists groups of operators in order of increasing precedence. This program text would therefore be represented internally by a tree with three subtrees: one representing (a + b) - a, one representing ==, and one representing 0.0.

The parentheses in the expression to the left of == indicate that a + b is the left operand of the subtraction. Suppose that those parentheses were omitted. According to the grammar, a + b - a is an AExpr phrase with the structure:

AExpr Addop Term
Thus the structure is the same as that of the parenthesized form.

The + and - operators have the same precedence; `NameLan.con' implements the rule that these operators associate to the left (see Operator associativity of Syntactic Analysis).

There is no need to retain the distinction between different levels of expression (e.g. AExpr and Term), or between different precedences of operators (e.g. Relop and Addop), in a tree representing `machar.nl' because the information they provided is already built into the tree structure. We therefore map all expression symbols to Expr and all operator symbols to Oper (see The Relationship Between Phrases and Tree Nodes of Syntactic Analysis).

Mappings from concrete symbols to abstract symbols[6]==

MAPSYM
Expr ::= AExpr Term Factor.
Oper ::= Relop Addop Mulop.
This macro is defined in definitions 6.
This macro is invoked in definition 7.

NameLan.map[7]==

Mappings from concrete symbols to abstract symbols[6]
This macro is attached to a product file.

`NameLan.map' removes the distinction between symbols of the concrete syntax. We also need to be able to introduce distinctions where a single symbol of the concrete syntax plays different roles in different contexts. For example, identifiers are always represented by the concrete symbol Ident. But two identifiers can require very different computations depending on the context in which they occur. Because computations are associated with symbols in the tree, we need different abstract syntax symbols to represent different identifier occurrences.

We recommend that, wherever possible, abstract syntax symbols needed to distinguish identifier contexts not be added to the concrete syntax (see Representation of identifiers of Name Analysis Reference Manual). The reason for this recommendation is that the parser often does not have sufficient contextual information to recognize the phrase, but that information is available when the tree is being built.

The abstract syntax derived from `Text.specs' contains the following rule:

RULE rule_024:
Name ::= Ident
END;
In order to distinguish the identifier context of a simple name, let's add the symbol SimpleName to the abstract syntax. This can be done by writing two explicit LIDO rules:

Abstract syntax of identifiers[8]==

RULE: Name       ::= SimpleName END;
RULE: SimpleName ::= Ident      END;
This macro is defined in definitions 8, 9, 26, 31, 33, 42, 51, and 120.
This macro is invoked in definition 10.

When the parser reports that it has recognized the phrase correponding to rule_024, the tree builder will automatically build two tree nodes described by these two new rules instead of building a node described by rule_024.

We use a similar strategy to distinguish the two contexts in which an identifier is declared:

Abstract syntax of identifiers[9]==

RULE: VarDef     ::= VarDefName          END;
RULE: VarDef     ::= VarDefName '=' Expr END;
RULE: VarDefName ::= Ident               END;
This macro is defined in definitions 8, 9, 26, 31, 33, 42, 51, and 120.
This macro is invoked in definition 10.

As we extend NameLan to demonstrate more complex name analysis issues, we will need to distinguish other identifier contexts.

Abstract syntax tree[10]==

Abstract syntax of identifiers[8]
This macro is defined in definitions 10, 18, 22, 38, 47, 52, 53,
   65, 69, 71, 88, 89, 92, 93, 95, 96, 101, 105, 108, 110, 111,
   112, 113, 122, 133, 134, 135, 136, 137, 144, and 153.
This macro is invoked in definition 11.

NameLan.lido[11]==

Abstract syntax tree[10]
This macro is attached to a product file.

We can now specify a processor that will take account of these mappings when deducing the abstract syntax, and produce a simpler tree from `machar.nl':

Specification files[12]==

NameLan.con
NameLan.gla
NameLan.map
NameLan.lido
This macro is defined in definitions 12, 16, 23, 27, 74, 78, 81,
   91, 117, 121, 139, 142, 146, and 152.
This macro is invoked in definition 13.

NameLan.specs[13]==

Specification files[12]
This macro is attached to a product file.

Exercises

These exercises are based on files defined in the Tutorial. To obtain copies of those files in your current directory, enter Eli and give the following command:

-> $elipkg/Name/LearnSG%Tree > .

None of these files will have write permission in your current directory.

  1. Extract the description of the abstract syntax deduced directly from the concrete syntax for the kernel language and open it in your editor:

    -> Text.specs :absyntax <
    

    Using these abstract syntax rules, draw the tree representing the following expression from `machar.nl':

    (a + b) - a == 0.0
    
    Label each node with the rule name, and indicate any additional information contained in a leaf. To get you started, the root is a rule_011 node and its second child is a rule_028 node. One of the leaves is a rule_015 node whose additional information is the internal representation of the denotation 0.0.

  2. Extract the description of the mapped abstract syntax for the kernel language and open it in your editor:

    -> NameLan.specs :absyntax <
    

    1. Draw the tree for (a + b) - a == 0.0 using these abstract syntax rules.

    2. Does this tree provide any less descriptive power than the tree drawn from the original abstract syntax? Explain briefly.

    3. Search for Expr ::= and Oper ::=. Verify that the nodes of the tree described by these rules can represent any possible expression in the language.

Basic name analysis

When you read `machar.nl', your programming experience tells you that that it is manipulating three variables. Two of these variables are named a and b, and are used to do floating point computations that explore the properties of the number representation. The third variable, named radix, is set to an integer value that is the radix of the representation.

In order to formalize this intuition, the designer of NameLan makes use of four concepts that relate identifiers to the entities they name (see Fundamentals of Name Analysis of Name Analysis Reference Manual).

  • A binding is a pair consisting of an identifier `i' and an entity `k'.

  • A scope is a contiguous sequence of program text associated with a set of bindings. We say that a binding in the set has the scope.

  • A defining occurrence is an occurrence of an identifier `i' that could legally be the only occurrence of that identifier in the program. A binding `(i,k)' is associated with each defining occurrence.

    • Language rules specify the scope of a defining occurrence, and thus a scope of the binding associated with that defining occurrence. Other language rules may specify further scopes of a binding that are not in the context of the defining occurrence.

  • An applied occurrence is an occurrence of an identifier `i' that is legal only if it identifies a binding `(i,k)' in some set associated with its context.

    • Language rules specify the set(s) in which an applied occurrence may identify bindings.

As language designers, we have decided to specify the rules for the kernel language as follows:

  • The scope of a defining occurrence VarDefName is the smallest enclosing Block or Program.

  • An applied occurrence that is a SimpleName `i' identifies the binding `(i,k)' having the smallest scope encompassing `i'.

A binding `(radix,k)', where `k' is the integer variable entity named by radix, is associated with the defining occurrence of radix in the first line of `machar.nl'. According to the first rule, the scope of that defining occurrence (and hence the scope of `(radix,k)') is the Program phrase -- the complete program text. Since the applied occurrence of radix in line 10 lies within that scope, it identifies `(radix,k)' by the second rule. We can therefore conclude that this applied occurrence names the integer variable named by the defining occurrence of radix in the first line of `machar.nl'. Similar statements can be made about the variables a and b, the scope of whose defining occurrences is the Block phrase.

Consider the following trivial program:

float x = 0;
{int x; x = 2;}
The scope of the defining occurrence of x on line 1 is the Program phrase, while that of the defining occurrence of x on line 2 is the Block phrase making up line 2. Suppose that the binding associated with the defining occurrence on line 1 is `(x,kf)' and the one associated with the defining occurrence on line 2 is `(x,ki)'. Which of these two bindings will the applied occurrence of x on line 2 identify?

Both of the scopes encompass the applied occurrence, but `(x,ki)' has the smaller scope. Therefore the applied occurrence will identify `(x,ki)'. The term usually used for this effect is hiding. We say that the binding in the Block phrase hides any binding for the same identifier in a surrounding context.

Name analysis is a computation carried out on the tree that is the internal representation of the program being analyzed (see Fundamentals of Name Analysis of Name Analysis Reference Manual). That computation is specified in terms of the abstract syntax, and takes the form of assignments of values to attributes of tree nodes (see Computations of LIDO - Reference Manual). The goal of the computation is to determine the entities named by the program's identifiers, based on the concepts and language rules discussed above.

Defining occurrences, applied occurrences, and scopes are concepts related to the program text; how are these concepts represented in the tree? Each identifier occurrence is a single basic symbol in the text, and is represented by a leaf of the tree. A scope, on the other hand, extends over some region of the text. The developer needs to map each such text region to an abstract syntax subtree encompassing that region. We call such a subtree a range; several scopes may map into the same range. A scope is represented by the root node of its range.

For example, consider the defining occurrence radix in the first line of `machar.nl'. According to the scope rules of the kernel language, the scope of that defining occurrence is the Program phrase. The only abstract syntax subtree encompassing that scope is the entire tree, rooted in the Program node. We therefore conclude that the scope of the defining occurrence radix is represented by the Program node.

Applying similar reasoning to the defining occurrence of a in `machar.nl', we see that its scope is encompassed both by a Block subtree and by the Program tree. By our definition, either of the Program node or the Block node could be chosen to represent the scope of the defining occurrence a. The developer should choose the range that reflects the semantics of the language, and will simplify the name analysis computations. In the case of the kernel language, the scope of a defining occurrence VarDefName is the smallest enclosing Block or Program phrase. Therefore the developer should choose the Block node to represent the scope of a. (For a more complex situation, see Position control.)

Specific identifiers are character sequences in an input text, and are represented internally by integer values (see Character String Storage of The Eli Library). Identifiers with the same spelling are represented by the same integer (see Available Processors of Lexical Analysis).

Conventionally we use the attribute Sym to hold the internal representation of an identifier. To cut down on the number of attribute declarations, we use an attribute name specification to specify that all attributes named Sym have type int (see Types and Classes of Attributes of LIDO Reference Manual). That allows us to use the attribute name Sym without having to write an individual declaration for each symbol to which it is attached:

Attribute representing an identifier[14]==

ATTR Sym: int;
This macro is invoked in definition 18.

An entity is represented internally by a definition table key, of type DefTableKey (see How to create and use definition table keys of Definition Table). Definition table keys must be passed around the tree and be stored as attributes at many different nodes, and we conventionally use the attribute name Key for this purpose:

Attribute referencing an entity[15]==

ATTR Key: DefTableKey;
This macro is invoked in definition 18.

Name analysis determines a value of the Key attribute for each identifier occurrence in the program.

The value of the Key attribute of a defining occurrence is known in the context of that defining occurrence. For example, the fact that radix names a distinct program entity is known at the defining occurrence because the context is a declaration. A DefTableKey value representing that distinct entity can therefore be assigned to the Key attribute of the corresponding tree node, without regard to any other computations (see Defining Occurrences of Name Analysis Reference Manual). The properties of that definition table key (e.g. the fact that the entity is an integer variable) depend on other computations, but the key itself does not (see Top of Property Definition Language).

The value of the Key attribute of an applied occurrence is determined by identifying a binding of that applied occurrence. When the applied occurrence is a simple name, it identifies the binding whose scope was mapped to the smallest range containing that applied occurrence.

Scope rules determining the relationships between identifiers and entities follow patterns that do not vary significantly among programming languages, and standard attribute computations have therefore been developed to implement name analysis. Eli has packaged those computations in a specification module (see Top of Introduction of specification modules). The developer can obtain them by instantiating that module:

Specification files[16]==

$/Name/ScopeGraphs.gnrc :inst
This macro is defined in definitions 12, 16, 23, 27, 74, 78, 81,
   91, 117, 121, 139, 142, 146, and 152.
This macro is invoked in definition 13.

An Eli specification module provides computational roles: constellations of attributes and operations that are relevant for particular contexts in the tree. For example, a tree node playing the role of a defining occurrence is given three attributes and a particular set of computations to be carried out at specific visits to that node (see Defining Occurrences of Name Analysis Reference Manual). Computational roles must be associated with appropriate tree nodes so that the computations they describe can be carried out as those nodes are visited during traversals of the tree. This is done by using LIDO inheritance (see Inheritance of Computations of LIDO - Reference Manual).

The ScopeGraphs module provides three roles that can be used to implement basic name analysis computations:

Our scope rule for the kernel language defines the scope of a VarDefName as the smallest enclosing Block, which we have mapped into a range rooted in a Block node. The appropriate specifications associating the name analysis roles with symbols of the kernel language abstract syntax are thus:

Tree nodes playing ScopeGraphs roles[17]==

SYMBOL Block      INHERITS RangeScope   END;
SYMBOL VarDefName INHERITS IdDefScope   END;
SYMBOL SimpleName INHERITS GCSimpleName END;
This macro is invoked in definition 18.

The VarDefName rule also describes Program as a possible scope. Program is the symbol at the root of the tree, which automatically inherits the RootScope role (see The RootScope role of Name Analysis Reference Manual). RootScope combines the behavior of RangeScope with attributes and operations that are specific to the root of the tree. Therefore Program should not inherit RangeScope. We also recommend that no symbol explicitly inherit RootScope.

The computations associated with the three roles use an internal data structure called a scope graph (see Fundamentals of Name Analysis of Name Analysis Reference Manual). Each scope graph node specifies a set of bindings; scope graph edges describe search paths used to find a binding for a given identifier.

Any tree node that plays the RangeScope or RootScope role is the root node of a range, and corresponds to a scope graph node. (There may also be scope graph nodes that do not correspond to tree nodes.) The set of bindings specified by a scope node corresponding to the root node of a range is exactly the set of bindings mapped to that range.

A basic understanding of the way in which the computations attached to the abstract syntax tree by computational roles use the scope graph will help you to understand why we do things in a certain way, and how to extend name analysis to more complex situations. We will therefore briefly look at the name analysis of `machar.nl'.

A computation inherited by the root node of the tree from the RootScope role creates a scope graph node for the Program range. Computations inherited from RangeScope by the Block node of the tree also create a scope graph node, and use the tree structure to determine that the newly-created node should be connected by a parent edge to the created ancestor node (see Scope graphs of Name Analysis Reference Manual). Parent edges represent the enclosure relation for ranges. The Block is enclosed by the Program in `machar.nl', so the parent edge is directed from the scope node being created (the tail of the parent edge) to the created ancestor scope node (the tip of the parent edge).

Computations inherited from the IdDefScope role use the tree structure to find the enclosing range of each VarDefName and specify the binding (VarDefName.Sym,VarDefName.Key) at the scope node corresponding to that range.

The search for a binding for the applied occurrence of radix in `machar.nl' begins in the scope node corresponding to the smallest range containing that SimpleName (see The generic lookup of Name Analysis Reference Manual). This scope node specifies two bindings, for a and b, but no binding for radix. Because this scope node is the tail of a parent edge, the search continues in the scope node that is the tip of that edge and is successful.

The complete specification of a processor that does name analysis on the kernel language requires the ScopeGraphs module instantiation, the concrete syntax and symbol mapping for NameLan, and the LIDO file connecting the abstract syntax to the module:

Abstract syntax tree[18]==

Attribute representing an identifier[14]
Attribute referencing an entity[15]
Tree nodes playing ScopeGraphs roles[17]
This macro is defined in definitions 10, 18, 22, 38, 47, 52, 53,
   65, 69, 71, 88, 89, 92, 93, 95, 96, 101, 105, 108, 110, 111,
   112, 113, 122, 133, 134, 135, 136, 137, 144, and 153.
This macro is invoked in definition 11.

A processor generated from this specification will build a tree and perform name analysis, but it will not produce any output. For the purpose of this tutorial, however, we want our processor to make the results of the name analysis visible. We want to be able to see the relationships between applied and defining occurrences so that we can check our understanding. Eli provides another module that we can instantiate for this purpose (see Name Analysis Testing of Name Analysis Reference Manual).

Bindings.specs[19]==

NameLan.specs
$/Name/SGProof.gnrc +referto=Ident :inst
This macro is attached to a product file.

Exercises

These exercises are based on files defined in the Tutorial. To obtain copies of those files in your current directory, enter Eli and give the following command:

-> $elipkg/Name/LearnSG%Basic > .

None of these files will have write permission in your current directory. You will need to add write permission in order to do the exercises.

  1. Extract the description of the abstract syntax on which name analysis is based, and open it in your editor:

    -> NameLan.specs :absyntax <
    

    1. Verify that the tree for `machar.nl' will contain nodes that have the GCSimpleName role.

    2. How can you distinguish rules that were deduced from the concrete syntax from rules that were added explicitly?

  2. Draw the scope graph that would be constructed for `machar.nl'. Use circles for the nodes, and write an identifier in the circle if that identifier has a binding specified by the node. Explain briefly how the generic lookup would use your scope graph to find an appropriate binding for the occurrence of radix in the last line of `machar.nl' (see The generic lookup of Name Analysis Reference Manual).

  3. Use `Bindings.specs' to generate a processor named `bind' that will show bindings for applied occurrences, executable for that processor in the current directory:

    -> Bindings.specs :exe > bind
    

    Apply bind to `machar.nl':

    -> !./bind machar.nl
    
    (You can also exit Eli and run bind as you would any program.)

    1. Does the output indicate that the name analysis is correct according to the language rules? Explain briefly.

    2. Delete the variable declaration from the first line of `machar.nl' and again apply bind to it. Is the output what you expected?

    3. Duplicate the variable declaration in the first line of `machar.nl' and again apply bind to it. Is the output what you expected?

Error reporting

If the first line were removed from `machar.nl', the applied occurrence of radix in the last line would not lie in the scope of any defining occurrence of radix. The scope rules of the previous section do not define the result when an applied occurrence does not lie in the scope of a defining occurrence of its identifier.

Recall that the search for the meaning of the applied occurrence of radix begins in the scope node for the Block, and follows the parent edge to the scope node for the Program. There is now no binding for radix in that node, and the node has no parent edge to follow. At that point, the computation sets the attribute SimpleName.Key to the value NoKey (see How to create and use definition table keys of Definition Table).

A NoKey result always indicates that the name analysis computation has been unable to find a suitable binding for an applied occurrence. Therefore the ScopeGraphs module provides a role, ChkIdUse, which can be inherited by an applied occurrence to provide an error report when the entity is NoKey:

Report an undefined identifier[20]==

SYMBOL SimpleName INHERITS ChkIdUse END;
This macro is invoked in definition 22.

Now suppose that the first line of `machar.nl' were:

int radix; float radix;
Both defining occurrences of radix have the Program range. The scope rules of the previous section yield an ambiguous result when several defining occurrences of the same identifier have the same range: the assignment in the last line of `machar.nl' could be setting either the integer or the floating point variable.

The ScopeGraphs module creates a new definition table key to represent the entity named by the first defining occurrence the attribute computation encounters in a specific range. (That defining occurrence may not be textually first; attribute computations are not necessarily carried out in textual order.) Any subsequent defining occurrence of the same identifier in that range will be represented by the same key.

For NameLan, we should report multiple defining occurrences bound to the same definition table key as an error. The ScopeGraphs module does not provide an error-reporting role for this condition because there are a number of situations in which it is legal. However, a simple computation based on the Unique role provided by the Eli module library allows us to produce an appropriate report (see Check for Unique Object Occurrences of Property Library). Since we will need to report multiply-defined identifiers in several contexts, we wrap this computation in a role named MultDefChk that VarDefName and other symbols can inherit:

Report a multiply-defined identifier[21]==

SYMBOL MultDefChk INHERITS Unique COMPUTE
  IF(NOT(THIS.Unique),
    message(
      ERROR,
      CatStrInd("Identifier is multiply defined: ", THIS.Sym),
      0,
      COORDREF));
END;

SYMBOL VarDefName INHERITS MultDefChk END;
This macro is invoked in definition 22.

(See Source Text Coordinates and Error Reporting of The Eli Library, for an explanation of the message routine.)

Combining these computations into a single specification that we add to those of the previous section, and instantiating the Unique module, gives us:

Abstract syntax tree[22]==

Report an undefined identifier[20]
Report a multiply-defined identifier[21]
This macro is defined in definitions 10, 18, 22, 38, 47, 52, 53,
   65, 69, 71, 88, 89, 92, 93, 95, 96, 101, 105, 108, 110, 111,
   112, 113, 122, 133, 134, 135, 136, 137, 144, and 153.
This macro is invoked in definition 11.

Specification files[23]==

$/Prop/Unique.gnrc :inst
This macro is defined in definitions 12, 16, 23, 27, 74, 78, 81,
   91, 117, 121, 139, 142, 146, and 152.
This macro is invoked in definition 13.

Exercises

These exercises are based on files defined in the Tutorial. To obtain copies of those files in your current directory, enter Eli and give the following command:

-> $elipkg/Name/LearnSG%Report > .

None of these files will have write permission in your current directory. You will need to add write permission in order to do the exercises.

  1. Ask Eli to generate a processor and save the executable for that processor in the current directory:

    -> NameLan.specs :exe > rept
    

    Apply rept to `machar.nl'.

    1. Why is there no output?

    2. Delete the variable declaration from the first line of `machar.nl' and again apply rept to it. Is the output what you expected?

    3. Duplicate the variable declaration in the first line of `machar.nl' and again apply rept to it. Is the output what you expected?

Procedures

Suppose that we want to be able to define and invoke procedures in our simple language. Here's an example of a program that defines a procedure to compute the greatest common divisor (gcd) of two integers. It also defines a global variable, and sets that variable to the gcd of two specific values:

gcd.nl[24]==

int gcd(int x, int y) {
  while (x != y) {
    if (x > y) x = x - y;
    else y = y - x;
  }
  return x;
}

int x;

{ x = gcd(9, 12); }
This macro is attached to a non-product file.

Our extension of the phrase structure reflects object-oriented terminology because we will eventually make NameLan an object-oriented language:

Phrase structure[25]==

Declaration:    MethodDecl.

MethodDecl:     Type Ident MethodBody.
MethodBody:     '(' Parameters ')' '{' DeclStmt* '}'.
Parameters:     [ Parameter // ',' ].
Parameter:      Type Ident.

Type:           'void'.

Statement:      Name '(' Arguments ')' ';' / 'return' [ Expr ] ';'.

Factor:         Name '(' Arguments ')'.
Arguments:      [ Expr // ',' ].
This macro is defined in definitions 2, 25, 30, 32, 41, 50, 55,
   67, 76, 97, 118, and 124.
This macro is invoked in definition 3.

We need abstract syntax symbols to distinguish the two new identifier contexts found in `Method.con' (see Representation of identifiers of Name Analysis Reference Manual).

Abstract syntax of identifiers[26]==

RULE: MethodDecl        ::= Type MethodDefName MethodBody END;
RULE: MethodDefName     ::= Ident                         END;

RULE: Parameter         ::= Type ParamDefName             END;
RULE: ParamDefName      ::= Ident                         END;
This macro is defined in definitions 8, 9, 26, 31, 33, 42, 51, and 120.
This macro is invoked in definition 10.

Our extension of NameLan introduces new defining occurrences, applied occurrences, and ranges. These changes require us to modify the scope rule for VarDefName from the kernel language, and provide additional rules:

  • The scope of a defining occurrence VarDefName is the smallest enclosing Block, MethodBody, or Program.

  • The scope of a defining occurrence MethodDefName is the smallest enclosing Program.

  • The scope of a defining occurrence ParamDefName is the smallest enclosing MethodBody.

This is the first section in which the exercises will ask you to create a specification file on your own. You will need to add the names of such specification files to those of the files supplied by the tutorial. We support that by adding the name of the file `Solutions.specs' to the tutorial's list of specification files:

Specification files[27]==

Solutions.specs
This macro is defined in definitions 12, 16, 23, 27, 74, 78, 81,
   91, 117, 121, 139, 142, 146, and 152.
This macro is invoked in definition 13.

`Solutions.specs' is not supplied by the tutorial, but by you. In the introdution to this manual, we suggested that you create a directory containing an empty file `Solutions.specs' before beginning the exercises. (If you decided to use a workbook, `Solutions.specs' should contain the name of the workbook file.)

`Solutions.specs' is entirely under your control, as is your workbook if you use one. When you need to create a specification to be used in generating a processor, simply add its name to your `Solutions.specs' or define it in your workbook.

Exercises

These exercises are based on files defined in the Tutorial. To obtain copies of those files in your current directory, enter Eli and give the following command:

-> $elipkg/Name/LearnSG%Procs > .

None of these files will have write permission in your current directory. You will need to add write permission in order to do the exercises.

  1. Consider the program `gcd.nl'.

    1. Describe the four ranges. Where does each begin and end? What are the enclosure relations?

    2. Identify the four defining occurrences. To which range does each belong?

    3. Identify one applied occurrence that has a defining occurrence in the smallest range containing that applied occurrence.

    4. Identify one applied occurrence that does not have a defining occurrence in the smallest range containing that applied occurrence,

  2. Write LIDO rules that use inheritance to connect the grammar in `NameLan.lido' to the scope rules for the language. Will any symbols need to inherit MultDefChk? Explain briefly.

  3. The LIDO rules that connect the grammar in `NameLan.lido' to the scope rules for the language constitute the first specification that you have created on your own. You need to combine those rules with the specification developed in the tutorial to generate a processor that analyzes NameLan procedures.

    If you are using a workbook, define a FunnelWeb output file named `Method.lido' containing your LIDO rules in that workbook (see Output Files of FunnelWeb).

    If you are not using a workbook, create a file named `Method.lido' containing your LIDO rules and add the name `Method.lido' to your `Solutions.specs' file.

    1. Could you have named your specification file `Procs.lido'? Explain briefly.

    2. Could you have named your specification file `Method.con'? Explain briefly.

  4. Use `Bindings.specs' to generate a processor named `p1' that will show bindings for applied occurrences, and apply it to `gcd.nl'. Do you get the output you expected? Explain briefly.

  5. Draw the scope graph that was built by `p1' for `gcd.nl'.

  6. Apply `p1' to the file `lcl.nl', which declares a local variable x in the body of gcd:

    lcl.nl[28]==

    int gcd(int x, int y) {
      while (x != y) {
        if (x > y) x = x - y;
        else y = y - x;
      }
      return x;
      int x;
    }
    
    int x;
    
    { x = gcd(9, 12); }
    
    This macro is attached to a non-product file.
    

    What is the effect of the local variable declaration? Briefly explain the output.

  7. Open the file `NameLan.con' in your editor and change the definition of a MethodBody to:

    MethodBody:     '(' Parameters ')' Block.
    

    Use `Bindings.specs' to generate another processor named `p2'.

    1. Apply `p2' to `gcd.nl'. Briefly explain the output.

    2. Apply `p2' to `lcl.nl'. Briefly explain why p1 and p2 give different results.

    3. The difference between `p1' and `p2' represents a language design choice. Briefly evaluate the two possibilities and state your reasons for preferring one over the other.


Next Chapter Table of Contents