Type Analysis
Type analysis is a complex process, involving several different kinds
of entity.
Each kind of entity has properties, which are stored in the definition
table under the entity's key.
Those properties are set and used in a variety of contexts.
The result is a collection of implicit dependence relations among the type
analysis computations, and these relations depend on the language being
analyzed.
The modules described in this document make the implicit relations
explicit, using void attributes and dependent expressions in LIDO
(see Dependent Expressions of LIDO - Reference Manual).
Although the explicit dependences work for a wide range of typical
programming languages, one or more of them must sometimes be overridden
because of the rules of a particular language.
This chapter explains the implicit dependences that must be made explicit,
how the various modules make them explicit, and some typical circumstances
in which the default treatment fails.
The void attributes that make these dependences explicit are summarized
here; the remainder of this chapter explains them in more detail:
TypeDenotation.GotType
- The new type key has been created, and any properties that are not
dependent on final types have been stored in the definition table
as properties of that key.
TypeDefDefId.GotDefer
- Information that can be used to find the final type
has been stored in the definition table as properties of the key
assigned to the identifier by the name analyzer.
RootType.GotUserTypes
- Computations for all type denotations have reached the state represented by
TypeDenotation.GotType and computations for all type identifier
definitions have reached the state represented by
TypeDefDefId.GotDefer .
RootType.GotAllTypes
- All final types have been determined.
TypedDefId.TypeIsSet
- Information that can be used to find the final type
has been stored in the definition table as properties of the key
assigned to the identifier by the name analyzer.
RootType.TypeIsSet
- The state represented by
RootType.GotAllTypes has been reached,
and computations for all typed identifier definitions have reached
the state represented by TypedDefId.TypeIsSet .
TypedUseId.TypeIsSet
- All information needed to find the final type of this typed identifier
is available.
OperatorDefs.GotOper
- All operator descriptions associated with this construct have been
entered into the operator data base.
RootType.GotAllOpers
- Computations for all symbols inheriting
OperatorDefs have reached
the state represented by OperatorDefs.GotOper .
Consider the following program, written in a C-like notation:
{ Measurement Length;
typedef Inches Measurement;
typedef int Inches;
Length = 3; printf("%d\n", Length + 7);
}
Suppose that the language definition states that type identifiers are
statically bound, with the scope of a declaration being the entire block.
Thus all of the type identifier occurrences have valid bindings.
(That would not be the case in C, because in C the scope of a
declaration is from the end of the declaration to the end of the block.)
The type analysis of each of the two occurrences of Length in the
last line of the program is described by the following specifications
discussed earlier:
SYMBOL ExpIdUse INHERITS TypedUseId, ChkTypedUseId END;
RULE: Expr ::= ExpIdUse COMPUTE
PrimaryContext(Expr,ExpIdUse.Type);
END;
The value of ExpIdUse.Type should be intType , the known
definition table key created for the language-defined integer type.
Recall that intType was associated with the int keyword
by the following specification:
RULE: Type ::= 'int' COMPUTE
Type.Type=intType;
END;
The problem is to make the intType value of the Type.Type
attribute in this context the value of the ExpIdUse.Type attribute
in the context quoted above.
A human has no trouble seeing how this problem could be solved:
-
The type definition rule sets the
TypIdDef.Type attribute of the
occurrence of Inches in the third line of the program to
intType .
-
The value of a property of the
Inches entity could be set from the
value of the TypIdDef.Type attribute in that context.
-
That property could be used to set the
TypIdUse.Type attribute of
the occurrence of Inches in the second line of the program.
-
The type identifier use rule sets the value of the
Type.Type
attribute in that context the value of the TypIdUse.Type attribute.
-
Similar reasoning results in the value of the
Type.Type attribute
in the variable definition context of the first line of the program
becoming intType .
-
Finally, a property of the
Length entity is set in the context of
the first line of the program and used to make intType the value
of the ExpIdUse.Type attributes in the two contexts of the last line.
Unfortunately, this solution is based on the human's ability to see the
dependence among the type identifiers and process the lines of the program
in an order determined by that dependence.
One cannot, for example, blindly process the lines in the order in which
they were written.
The dependence among the lines in our example is a result of our use of the
known key intType as the value of a property of the type identifier
entities.
This strategy is actually an example of a premature evaluation:
There is no need to know the key representing the type of
Length until the ExpIdUse.Type attribute is evaluated.
We can avoid the constraint on the order of rule processing by a
"lazy evaluation" strategy in which we use properties of the type
identifier entities to establish a means for determining the value of the
ExpIdUse.Type attribute rather than establishing the value itself.
Recall that there are three possible Type contexts: a keyword, a
type denotation, and a type identifier
(see Referring to a type).
In the first two, we can set the value of the Type.Type attribute to
the definition table key for the type itself.
In the third, however, the only information that we are guaranteed to have
is the definition table key for the type identifier.
However, this information is sufficient to find the definition table key
for the type once all of the type identifiers have been defined.
Thus we can simply set the value of the Type.Type attribute to the
definition table key for the type identifier itself in this context.
The computation provided by the Typing module for the
TypeDefDefId context sets a property of the type identifier entity
to the value of the TypeDefDefId.Type attribute
(see Type identifiers).
Effectively, this computation creates a linear list of type identifier
entities ending in a type entity.
When all of the entities corresponding to type identifiers have this
property set, the definition table key for a type should be the last
element of each list.
In our example, the value of this property of the identifier
Length 's definition table key would be the definition table key of
the identifier Measurement .
The value of its property would be the definition table key of the
identifier Inches , whose property would be intType .
There is no guarantee, of course, that the last element of the list is
actually a type.
For example, consider the following incorrect program:
{ Measurement Length;
typedef Inches Measurement;
int Inches;
Length = 3; printf("%d\n", Length + 7);
}
Here the last element of the list beginning at Measurement would be
Inches , a variable identifier.
The ChkTypeDefUseId role checks the IsType property of the
key that is the last element of the list to report errors of this kind
(see Verifying type identifier usage).
The void attribute RootType.GotUserTypes represents the state of the
computation at which all of the type denotations and type identifiers have
been formed into lists.
The structural equivalence computation must be carried out after all
Type.Type attributes have been set and linked as described in the
previous section, and
all of the possibly-equivalent types have been added to the appropriate
blocks of the initial partition
(see Computing equivalence classes).
The latter condition is represented by all of the void attributes
TypeDenotation.GotType having been set.
A user can override the computation of the void attribute
RootType.GotType to signal dependence of the structural equivalence
computation on any additional information.
RootType.GotAllTypes is the post-condition for the structural
equivalence algorithm.
After that computation is complete, however, some definition table keys
that were thought to represent types have had their properties changed
so that they represent type identifiers
(see Structural Type Equivalence).
Thus scanning a list of definition table keys to find the last one
is only meaningful after RootType.GotAllTypes has been established.
The TypeKey attributes of TypeDenotation ,
TypeDefDefId , and TypeDefUseId reflect this fact.
Sometimes a designer uses a C routine to access type properties.
If the keys defining the types have been obtained from Type
attributes of TypeDenotation , TypeDefDefId , or
TypeDefUseId (rather than from TypeKey attributes of those
nodes), then
FinalType can be used to obtain the key at the end of the list.
The C program must include the header file Typing.h , and the code
must enforce a dependence on RootType.GotAllTypes .
If that dependence is not enforced, the results of invoking
FinalType are undefined.
The operator identification database used for type analysis within
expressions is initialized from the specifications of language-defined
types, operators, and operator indications.
Database representations for function operators and operators associated
with user-defined types cannot be constructed until the pre-condition
RootType.GotAllTypes has been established.
Moreover, type analysis of expressions cannot be carried out until that
information has been entered into the database.
Computations that define operators must establish the GotOper
post-condition at their associated OperatorDefs nodes.
The computation of RootType.GotOper can be overridden to provide
dependence on computations not associated with an OperatorDefs node.
RootType.GotAllOpers represents the state in which the database has
been completely populated.
All expression analysis computations have RootType.GotAllOpers as a
pre-condition.
The computation provided for the TypedDefId context sets the
TypeOf property of that identifier's key to the value of
TypedDefId.Type .
TypedDefId.TypeIsSet is the post-condition for that computation.
RootType.AllTypesAreSet is the conjunction of all of the
TypedDefId.TypeIsSet post-conditions plus
RootType.GotAllTypes .
If other property values of the identifier's key are set by
user computations in the lower context of TypedDefId
that establish the postcondition SYNT.GotProp ,
then the setting of these properties is also guaranteed by the
post-condition TypedDefId.TypeIsSet .
(SYNT.GotProp defaults to the empty postcondition.)
Note that if any of these user computations depend on any results from type
analysis, a cycle will be created.
A computation supplied by the module sets the TypedUseId.Type
attribute to the value of the TypeOf property of that identifier's
definition table key.
TypedUseId.TypeIsSet is a precondition for that computation.
It must guarantee that the TypeOf property of the identifier
has actually been set.
The module provides a default computation for TypedUseId.TypeIsSet
in the lower context of the TypedUseId node, requiring the
pre-condition RootType.TypeIsSet .
Some languages provide initialized variable declarations, and allow the
user to omit either the type specification or the initializing expression
but not both.
If the type specification is omitted, the variable's type is the type
returned by the initializing expression.
Here are some examples of such declarations in Modula-3:
VAR Both: INTEGER := 3;
VAR NoType := Both + 7;
VAR NoInit: INTEGER;
The default computations for the TypeIsSet attributes in this
example lead to a cycle:
-
The
TypedDefId.Type attribute of NoType
depends on TypedUseId.Type for Both .
-
The computation of
TypedUseId.Type for Both
has the pre-condition TypedUseId.TypeIsSet .
-
TypedUseId.TypeIsSet depends on RootType.TypeIsSet in the
default computation.
-
RootType.TypeIsSet is the conjunction of all
TypedDefId.TypeIsSet attributes.
-
TypedDefId.TypeIsSet for NoType is the post-condition
for a computation involving the TypedDefId.Type attribute
of NoType .
If the language requires that the initialized declaration of a variable
precede any uses of that variable,
then we can override the default dependence as follows:
CHAIN TypeDepend: VOID;
CLASS SYMBOL ROOTCLASS COMPUTE
CHAINSTART HEAD.TypeDepend=THIS.GotType;
END;
RULE: VrblDecl ::= 'VAR' VarIdDef ':' Type ':=' Expr COMPUTE
VrblDecl.Type=Type.Type;
VrblDecl.TypeDepend=VarIdDef.TypeIsSet <- Expr.TypeDepend;
END;
RULE: VrblDecl ::= 'VAR' VarIdDef ':' Type COMPUTE
VrblDecl.Type=Type.Type;
VrblDecl.TypeDepend=VarIdDef.TypeIsSet <- Expr.TypeDepend;
END;
RULE: VrblDecl ::= 'VAR' VarIdDef ':=' Expr COMPUTE
VrblDecl.Type=Expr.Type;
VrblDecl.TypeDepend=VarIdDef.TypeIsSet <- Expr.TypeDepend;
END;
SYMBOL VarIdUse COMPUTE
SYNT.TypeIsSet=THIS.TypeDepend;
THIS.TypeDepend=SYNT.TypeIsSet;
END;
If there is no ordering requirement, then a fixed-point computation is
required to determine the variable types.
In addition, code generated from the initializing expressions must be
arranged to ensure that a variable's value is computed before it is used.
Finally, such out-of-order dependence makes the program hard to understand.
We strongly recommend that declaration before use be required if variables
are allowed to obtain their type from their initializers.
|