General Information
Tutorials
Reference Manuals
Libraries
Translation Tasks
Tools
Administration
|
Type AnalysisExpressionsAn expression node represents a program construct that yields a value, and an expression tree is a subtree of the abstract syntax tree made up entirely of expression nodes. Type analysis within an expression tree is uniform; additional specifications are needed only at the roots and leaves. (Note that these need not be roots and leaves in the sense of the abstract syntax tree.) A designer often chooses to represent a programming language expression by more than one expression tree, in order to capture special relationships within that expression. For example, each argument of a function call might be a separate expression tree because more type conversions are allowed in that context than in the context of an operator.
The
The expression module is usually instantiated by:
$/Type/Expression.gnrc :instFor a discussion of alternatives, see Selecting an operator at an expression node.
Type analysis of expression trees
The symbol on the left-hand side of a rule defining an expression node
characterizes the expression's result.
It inherits the
An expression node is type-correct if the type specified by its
In order to support incremental development, Rules defining expression nodes in an abstract syntax tree for a typical programming language describe constants, variables, and computations: SYMBOL Expr INHERITS ExpressionSymbol END; RULE: Expr ::= Number END; RULE: Expr ::= ExpIdUse END; RULE: Expr ::= Expr Operator Expr END; RULE: Expr ::= Expr '[' Subscript ']' END;
The first two rules describe leaves of expression subtrees.
Any node described by the first rule is a leaf of the abstract syntax tree
as well as a leaf of some expression subtree.
Nodes described by the second rule are not leaves of the abstract syntax
tree because each has an
A leaf of an expression subtree delivers a value whose type
must be determined from the context of that leaf according to the
definition of the language.
For example, the The type analyzer models most interior expression nodes by operators applied to operands:
For example, in the following rule, the indication is provided by the
RULE: Expr ::= Expr Operator Expr END;Usually, a set of several operators (such as { iAddOp ,
fAddOp }) is associated with that indication.
An operator is then selected from that set as discussed in the next section.
In the fourth rule, we might assume that each array type definition adds a dyadic access operator to an indication fixed by the rule (see Operator, function, and method definitions):
RULE: Expr ::= Expr '[' Subscript ']' END;The left operand of that operator is the array type, the right operand is the index type, and the result is the element type. The operator/operand model provides support for expression node semantics that are ubiquitous in programming languages. Several other models, useful in special circumstances, are supported and will be introduced in later sections of this chapter. It is clear, however, that there will be situations in which the semantics of an expression context do not fit any of the supported models. Our advice is to consider such a context as a place where several disjoint expression subtrees meet: The expression symbol on the left-hand side of the rule defining the context is a leaf of an expression tree above the context, and each expression symbol on the right-hand side is the root of an expression tree below the context.
Selecting an operator at an expression nodeIf an indication is associated with a singleton operator set, that operator is selected regardless of operand or result types. There are two standard algorithms for selecting an operator if the indication's set has more than one element. The simplest ignores the type required by the context. For each operator in the set, it checks whether each operand is acceptable as the type required by that operator. If more than one operator in the set satisfies this condition, the algorithm chooses the one requiring the fewest coercions. If no operator can be identified, then the unknown operator is chosen.
To select this algorithm, instantiate the
$/Type/Expression.gnrc :inst Ada is a language in which the selection of an operator depends on the type required by the context as well as the types delivered by the operands. In that case, a two-pass algorithm is required. Starting with the leaves of an expression tree and working towards the root of that tree, the algorithm determines a possible type set for each expression node. Every type in the possible type set at a node is either a leaf type or is a type associated with an operator whose result is that type, and whose operands are elements of the possible type sets of the node's children. The algorithm associates a cost with each type in a possible type set. OIL allows one to specify an arbitrary integer cost for each operator and coercion independently. If this specification is omitted, a cost of 1 is assigned to the operator or coercion. For the remainder of this section, we assume that all cost specifications have been omitted and therefore all costs are 1. Consider a very simple expression node, the integer constant 3. One element of the possible type set for this node is the language-defined integer type, which has cost 0 because no operations are needed to create a value of that type. If a coercion has been defined with the language-defined integer type as its operand and the language-defined floating-point type as its result, then another element of the possible type set of this node is the language-defined floating-point type. It has cost 1, the total number of operators required to produce it. Similarly, if there is a coercion from floating-point to double, double will be an element of the possible type set of the node and it will have cost 2.
When the algorithm computes the possible type set of an interior expression
node, it considers the operators in that node's indication set.
For each operator, it checks whether the type required by that
operator for a given argument is in the possible type set of the
corresponding operand.
Each operator meeting that condition is a possible operator at the node.
The cost of using an operator is one more than the sum of the costs
associated with its argument types in the possible type sets of its
operands.
Finally, the possible type set of the node is the set of all types Often a particular element of the possible type set of an interior node can be obtained in more than one way. For example, consider a node representing the sum of two integers. Assuming that the integer type is acceptable as the floating-point type, the possible type set of each child contains both integer and floating-point types. Thus both integer addition and floating-point addition are possible selections at this node. There are then two ways to obtain a floating-point result:
The cost of using an integer addition in this context is 1 because the cost of the integer elements of the possible type sets are both 0. Converting the result to floating-point costs one coercion, for a total cost of 2. The cost of using a floating-point addition, on the other hand, is 3 because the cost of the floating-point elements of the possible type sets are both 1.
The cost of obtaining a value of type
If the
The second pass starts with the root of the tree and works towards the
leaves.
At each node, the value of the
If the possible type set of an expression node does not contain an element
equal to the value of that node's
To select the two-pass algorithm, instantiate the
$/Type/Expression.gnrc +referto=Result :inst
Expression contexts without operators
Let `e1' be a grammar symbol playing the The constant and variable expressions in C are examples of primary contexts: SYMBOL Expr INHERITS ExpressionSymbol END; RULE: Expr ::= Number COMPUTE PrimaryContext(Expr,intType); END; SYMBOL ExpIdUse INHERITS TypedUseId END; RULE: Expr ::= ExpIdUse COMPUTE PrimaryContext(Expr,ExpIdUse.Type); END;The type of an integer constant is the language-defined integer type (see Language-defined types), and the type of a variable is the type with which it was declared (see Accessing the type of an entity).
Let `e1' and `e2' be grammar symbols playing the
The comma expression in C is an example of a transfer context:
RULE: Expr ::= Expr ',' Expr COMPUTE TransferContext(Expr[1],Expr[3]); END;
Notice that the left operand of the comma,
Let `e1', `e2', and `e3' be grammar symbols playing the
The conditional expression of C is an example of a balance context:
RULE: Expr ::= Expr '?' Expr ':' Expr COMPUTE BalanceContext(Expr[1],Expr[3],Expr[4]); Expr[2].Required=scalarType; END;
The condition, Some languages generalize the conditional expression to a case expression. For example, consider an ALGOL 68 computation of the days in a month:
begin int days, month, year; days := case month in 31, (year mod 4 and year mod 100 <> 0 or year mod 400 = 0 | 28 | 29), 31,30,31,30,31,31,30,31,30,31 esac end
The number of cases is not fixed, and the balancing process therefore
involves an arbitrary list.
This is the purpose of the SYMBOL CaseExps INHERITS BalanceListRoot END; SYMBOL CaseExp INHERITS BalanceListElem END; RULE: Expr ::= 'case' Expr 'in' CaseExps 'esac' COMPUTE TransferContext(Expr[1],CaseExps); END; RULE: CaseExps LISTOF CaseExp END; RULE: CaseExp ::= Expr COMPUTE TransferContext(CaseExp,Expr); END;
Notice that these rule computations simply interface with the
Operators with explicit operands
Tree symbols in the abstract syntax that correspond to operator symbols in
a source program usually inherit the
In order to support incremental development,
Let `e1', `e2', and `e3' all play the
Contexts with arbitrary numbers of operands are discussed in the next section. SYMBOL Operator INHERITS OperatorSymbol END; RULE: Expr ::= Expr Operator Expr COMPUTE DyadicContext(Expr[1],Operator,Expr[2],Expr[3]); END; RULE: Operator ::= '+' COMPUTE Operator.Indic=PlusInd; END;
The array access rule also fits the
Let `ind' be a definition table key representing an indication.
If the indication SYMBOL Subscript INHERITS ExpressionSymbol END; RULE: Expr ::= Expr '[' Subscript ']' COMPUTE DyadicContext(Expr[1],,Expr[2],Subscript); Indication(indexInd); END;Note that Expr[2] can be any expression yielding
an array value; it need not be a simple array name.
In some cases it is useful to know the name of the operator selected from
the indication set.
The
Operators with operand lists
Function calls and multidimensional array references are common examples
of expression contexts whose operators have operand lists
rather than explicit operands.
One symbol on the right-hand side of the rule defining such a context
characterizes the entire list of operands.
It inherits the
The symbol defining an operand in the list inherits the
Let `e' be a grammar symbol playing the If the language has multi-dimensional array references, they can be implemented using a strategy that differs from that of the previous section: SYMBOL Subscripts INHERITS OpndExprListRoot END; SYMBOL Subscript INHERITS OpndExprListElem END; RULE: Expr ::= Expr '[' Subscripts ']' COMPUTE ListContext(Expr[1],,Subscripts); Indication(GetAccessor(Expr[2].Type,NoKey)); END; RULE: Subscripts LISTOF Subscript END; RULE: Subscript ::= Expr COMPUTE TransferContext(Subscript,Expr); END;
This computation assumes that the indication is the value of the
Some languages have variadic operators -- operators taking a variable
number of operands.
The most common of these are For type checking purposes, the variadic operator can be considered to have a single operand, whose type is determined by balancing the elements of the list (see Expression contexts without operators). Of course this form of operand list must be distinguished syntactically from a normal list operand: SYMBOL VarRands INHERITS BalanceListRoot END; SYMBOL VarRand INHERITS BalanceListElem END; RULE: Expr ::= VarOper '(' VarRands ')' COMPUTE MonadicContext(Expr,VarOper,VarRands) END; RULE: VarRands LISTOF VarRand END; RULE: VarRand ::= Expr COMPUTE TransferContext(Actual,Expr); END;
Type conversion
The Additional type conversions such as those taking place on assignment can be modeled by specific conversion operators. An indication is associated with each context in which additional type conversions are possible, and the indication's set contains exactly the conversions allowable in that context.
Let `e1' and `e2' play the
In C, an actual argument to a function call may be implicitly converted to
the type of the corresponding formal parameter prior to the function call.
The same set of conversions can be used in assignment contexts, so assume
that the indication is SYMBOL Actual INHERITS OpndExprListElem END; RULE: Actual ::= Expr COMPUTE ConversionContext(Actual,,Expr); Indication(assignCvt); END;
Let `e1' and `e2' play the
The C cast expression is an example of a cast context.
Here we assume that RULE: Expr ::= '(' Type ')' Expr COMPUTE CastContext(Expr[1],,Expr[2],Type.Type); Indication(castInd); END;
Let `e2' play the
The C return statement is an example of a root context.
It is not itself an expression, but it has an expression operand.
That operand must yield the return type of the function,
which is inherent in the context of the return statement,
and can be obtained from the RULE: Statement ::= 'return' Expr COMPUTE RootContext(INCLUDING (Function.ResultType),,Expr); Indication(assignInd); END;
|