Abstract Syntax Tree Unparsing
Recall the example of the pretty-printer that was defined by the following type-`specs' file (see Using an Unparser):
example.fw
example.fw :idem
Add.fw
The first line is the name of a file defining a processor that builds a tree from a sentence in the expression language. The second line is a request to derive a textual unparser from the definition of the expression language. Finally, the third line is the name of a file containing the computation that outputs the unparsed tree. These three lines constitute the complete definition of the pretty-printer, which could be derived from this type-`specs' file in the usual way.
Here we are concerned only with the problem of deriving an unparser, exemplified by the second line above. Such a derivation always yields a FunnelWeb file that defines the desired unparser. Since the derivation occurs as a component of a type-`specs' file, the derived unparser becomes a component of the processor defined by that type-`specs' file.
All of the information needed to construct the unparser must be derivable from its basis (file `example.fw' in this case). Different derivations are applied to the basis to create different kinds of unparsers, to control the representation language of the unparsed text, and to obtain a definition of the output structure.
In the simplest case, the only information needed to derive an unparser is the tree grammar rules defining the set of trees to be unparsed.
Since the generated unparser will be a component of some processor, all of the rules defining trees to be unparsed must be valid rules of the tree grammar for that processor. The easiest way to satisfy this requirement is for the basis of the unparser derivation to define a complete tree grammar for the processor. This is the situation in our example; file `example.fw' defines the complete tree grammar for the expression language and therefore for the pretty-printer. (See Deriving structural unparsers, for applications in which unparsers are derived from parts of the tree grammar for a processor.)
Suppose that we were to create a file `evaluate.fw' containing computations that evaluate sentences in the expression language. A "desk calculator" could then be defined by a file `calculator.specs' with the content:
example.fw
evaluate.fw
In this case, `calculator.specs' still defines the complete tree grammar for the expression language. Thus the following type-`specs' file would define a processor that reads sentences in the expression language, evaluates them, and prints them in a standard format:
calculator.specs
calculator.specs :idem
Add.fw
The situation is more complex when some PTG patterns must be overridden to obtain the desired output. Overriding patterns must be specified as part of the basis from which the unparser is derived, and they will be incorporated into the generated unparser definition.
One way to include overriding PTG patterns in the basis of an unparser derivation is to make them a part of the overall processor specification. Thus, for example, they could be included in `example.fw' of the specifications above. Then either of the derivations shown (the one based on `example.fw' or the one based on `calculator.specs') would produce an unparser with the specified patterns overridden. It is important to note that the tree grammar and the PTG patterns are the only things defined by `calculator.specs' (or by `example.fw' in the earlier derivation) that are relevant to deriving an unparser. All other information is ignored. PTG patterns whose names do not match prefixed rule names from the tree grammar are also ignored.
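The selection rule described above can be pictured with a small Python sketch. This is only a model of the stated behavior, not Eli's implementation: the function `applicable_patterns' and the pattern names `Target_StarExp' and `Idem_Banner' are hypothetical, while the rule names are those of the expression language.

```python
# Hypothetical model of the rule "patterns whose names do not match
# prefixed rule names are ignored". Not part of Eli.
def applicable_patterns(rule_names, patterns, prefix="Idem"):
    """Keep only the patterns named <prefix>_<rule> for a rule of the grammar."""
    wanted = {f"{prefix}_{rule}" for rule in rule_names}
    return {name: body for name, body in patterns.items() if name in wanted}

# Rule names of the expression language's tree grammar.
rules = ["PlusExp", "StarExp", "Parens", "IdnExp", "IntExp", "CallExp", "ArgList"]

overrides = {
    "Idem_PlusExp": '$1 $2 "+" [Separator]',    # matches a prefixed rule name
    "Target_StarExp": '$1 $2 "*" [Separator]',  # wrong prefix for this derivation
    "Idem_Banner": '"-- listing --"',           # no rule called Banner
}

kept = applicable_patterns(rules, overrides)
print(sorted(kept))
```

Only `Idem_PlusExp' survives in this model; the other two patterns are skipped.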
It is often a violation of modularity to combine overriding patterns with the overall processor specification. For example, consider an unparser that outputs a postfix representation of a sentence in the expression language (see Overriding PTG patterns). The overriding patterns are specific to this particular processor, and have nothing to do with the definition of the expression language itself. Including them in `example.fw' would pollute the language specification, tying it to this application.
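We can easily avoid this violation of modularity by adding a `patterns' parameter to the basis of the unparser derivation. The overriding patterns are placed in a separate file, here `Postfix.ptg':
Idem_PlusExp: $1 $2 "+" [Separator]
Idem_StarExp: $1 $2 "*" [Separator]
Idem_Parens: $1
Idem_CallExp: $1 $2 [Separator]
This file is then supplied as the value of the `patterns' parameter.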
The unparser derivation would then be:
example.fw +patterns=(Postfix.ptg) :idem
A complete processor accepting a sentence in the expression language and printing its postfix equivalent in standard form would then be defined by the following type-`specs' file:
example.fw
example.fw +patterns=(Postfix.ptg) :idem
Add.fw
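To see the effect of overriding patterns, consider the following Python sketch. It is a toy stand-in for PTG pattern application, written under several assumptions: trees are nested tuples, identifier and integer leaves are plain strings, `[Separator]' is modelled by joining tokens with a single space, and the infix patterns in `DEFAULT' are invented for illustration rather than taken from a generated unparser.

```python
# Toy model of pattern-driven unparsing; not Eli's PTG implementation.
# An int n in a pattern stands for $n; a string is literal output.
DEFAULT = {                       # hypothetical infix defaults
    "PlusExp": [1, "+", 2],
    "StarExp": [1, "*", 2],
    "Parens": ["(", 1, ")"],
}
POSTFIX = {                       # mirrors the overrides in Postfix.ptg
    "PlusExp": [1, 2, "+"],
    "StarExp": [1, 2, "*"],
    "Parens": [1],
}

def unparse(node, patterns):
    """Return the list of output tokens for a tree node."""
    if isinstance(node, str):     # leaf: identifier or integer text
        return [node]
    rule, *children = node
    tokens = []
    for item in patterns[rule]:
        if isinstance(item, int): # $n: unparse the n-th child
            tokens += unparse(children[item - 1], patterns)
        else:                     # literal text from the pattern
            tokens.append(item)
    return tokens

tree = ("StarExp", ("Parens", ("PlusExp", "a", "b")), "c")
infix = " ".join(unparse(tree, DEFAULT))
# Overriding patterns replace the defaults that have the same rule name.
postfix = " ".join(unparse(tree, {**DEFAULT, **POSTFIX}))
print(infix)
print(postfix)
```

In this model the tree for `(a + b) * c' unparses to `( a + b ) * c' with the default patterns and to `a b + c *' once the postfix patterns take precedence.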
A basis may include any number of file-valued `patterns' parameters.
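Any unparser can be derived with a prefix other than `Idem'.
The desired prefix is supplied as the value of the `prefix' parameter.
All PTG pattern names in an unparser derived from a basis containing `+prefix=Target' would begin with `Target_' rather than `Idem_'. Overriding patterns for such an unparser must use the same prefix:
Target_PlusExp: $1 $2 "+" [Separator]
Target_StarExp: $1 $2 "*" [Separator]
Target_Parens: $1
Target_CallExp: $1 $2 [Separator]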
The basis of such an unparser consists of the specification file for the tree grammar, modified by the two parameters (which may be given in any order; see Parameterization Expressions of Eli User Interface Reference Manual):
example.fw +prefix=Target +patterns=(Postfix.ptg)
In the remainder of this document, `Basis' will be used to denote the
basis of an unparser derivation.
As we have seen in this section, `Basis' consists of a single file
defining a tree grammar, possibly parameterized by a set of overriding
PTG patterns and/or a prefix to replace the default `Idem'.
The result of the derivation `Basis :idem' is a FunnelWeb file defining a textual unparser. That FunnelWeb file contains:
A PostScript version of the unparser definition can also be derived for documentation purposes:
Basis :idem :fwTex :ps
The result of the derivation `Basis :tree' is a FunnelWeb file defining a structural unparser. That FunnelWeb file contains:
A PostScript version of the unparser definition can also be derived for documentation purposes:
Basis :tree :fwTex :ps
A structural unparser produces a description of the tree in some language.
Recall that a generic functional representation is used by default.
Any other standard representation language can be specified by supplying
an appropriate value of the `lang' parameter:
Basis +lang=XML :tree
See Languages describing tree structure, for a list of the standard representation languages.
When none of the standard representation languages is appropriate, you
can specify your own unparser generator.
This unparser generator can be invoked by supplying its executable file to
the derivation as the value of the `lang' parameter.
The most common way to specify a new unparser generator is to modify an
existing specification and then use Eli to produce an executable file from
that modified specification.
We have already given an example of this technique
(see Languages describing tree structure).
In that example, file
`M3.specs' defined a generator producing an
unparser that represents a tree by a Modula-3 program.
The executable version of that generator could be obtained in the usual way
by deriving the `:exe' product:
Basis +lang=(M3.specs :exe) :tree
Here the executable file supplied as the value of the `lang' parameter is itself the result of a derivation from `M3.specs'.
Structural unparser generators producing application-language code also deliver a definition of the data structure(s) described by that code. For example, an unparser generator producing tree descriptions in XML also delivers a "document type declaration" (DTD) file defining a grammar for those descriptions. Here's the DTD file for the expression language:
<!ENTITY % Axiom "(rule_000)">
<!ENTITY % Expression "(PlusExp | StarExp | Parens | IdnExp | IntExp | CallExp)">
<!ENTITY % Arguments "(ArgList)">

<!ELEMENT rule_000 (%Expression;)>
<!ELEMENT PlusExp (%Expression;, %Expression;)>
<!ELEMENT StarExp (%Expression;, %Expression;)>
<!ELEMENT Parens (%Expression;)>
<!ELEMENT IdnExp (#PCDATA)>
<!ELEMENT IntExp (#PCDATA)>
<!ELEMENT CallExp (#PCDATA, %Arguments;)>
<!ELEMENT ArgList ((%Expression;)*)>
This definition depends only on the tree grammar, not on any particular tree defined by that grammar. Thus it is built separately:
Basis +lang=XML :treeStruc
You can list the files in the set with the following request:
Basis +lang=XML :treeStruc :ls >
To obtain copies of the definition files, make a copy of the set itself (see Extracting and Editing Objects of Eli User Interface Reference Manual):
Basis +lang=XML :treeStruc > Structure
(This request copies the generated files into a sub-directory named `Structure' of the current directory; the destination name `Structure' could be replaced by any directory name. The directory must exist before the request is made.)
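As an illustration of the kind of document that the DTD for the expression language admits, the following Python sketch builds an XML description of the tree for `a + 2 * b' with the standard `xml.etree.ElementTree' module. The helper `to_xml' and the tuple encoding of trees are assumptions made for this example. The text actually written by a generated structural unparser may differ in layout and may carry an XML declaration or a document type declaration that this sketch omits.

```python
import xml.etree.ElementTree as ET

def to_xml(node):
    """Convert a (rule, child, ...) tuple into an XML element.

    String children become character data (#PCDATA in the DTD);
    tuple children become nested elements.
    """
    rule, *children = node
    elem = ET.Element(rule)
    for child in children:
        if isinstance(child, str):
            elem.text = (elem.text or "") + child
        else:
            elem.append(to_xml(child))
    return elem

# Tree for "a + 2 * b", using element names declared in the DTD.
tree = ("rule_000",
        ("PlusExp",
         ("IdnExp", "a"),
         ("StarExp", ("IntExp", "2"), ("IdnExp", "b"))))

text = ET.tostring(to_xml(tree), encoding="unicode")
print(text)
```

Each element name in the result is one of those declared by an `<!ELEMENT ...>' line, and each element's children follow the content model given there.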
Consider a translator that builds a target program tree corresponding to the source program presented to it. Perhaps we would like to make that translator output a listing of the source text formatted according to standard rules and also an XML file that defines the target program tree. This can be done by including two unparsers, one textual and the other structural.
To make the discussion concrete, let `Source_i.specs' define a processor that reads a sentence in language `i' and builds a corresponding decorated tree. `Translator.fw' specifies computations over such a source program tree that build a target program tree according to the structure defined by file `Target_j.specs'. File `Translator.specs', consisting of the following three lines, would then define a translator that would build a target program tree corresponding to a sentence in language `i':
Source_i.specs
Target_j.specs
Translator.fw
If the root of the tree grammar defined in `Source_i.specs'
RULE Axiom: Root ::= Source $ Target COMPUTE
  Target.GENTREE=Source.Code;
END;
This computation takes the target program tree that has been built as the
value of attribute `Source.Code' and makes it the computed subtree `Target' of the root.
Given `Translator.specs', one way to define a processor producing a listing of the source text formatted according to standard rules and also an XML file defining the target program tree would be to write the following type-`specs' file:
Translator.specs
Translator.specs :idem
Translator.specs +prefix=Target +lang=XML :tree
Add.fw
The first line of this file defines the translator itself, and the second
line defines a textual unparser computing
SYMBOL Source COMPUTE
  Sep_Out(THIS.IdemPtg);
END;

SYMBOL Target COMPUTE
  BP_OutFile("xml",THIS.TargetPtg);
END;
These computations will write the pretty-printed source program to the standard output, and the XML representation of the target program tree to file `xml'.
Suppose that the tree grammars defined by `Source_i.specs' and
`Target_j.specs' in the example of the previous section are disjoint.
In that case, the processor defined there will compute
Translator.specs
Source_i.specs :idem
Target_j.specs +prefix=Target +lang=XML :tree
Add.fw
Note that no other changes are needed in any of the files.
Each of the two tree grammars on which the unparsers are based defines a complete, rooted sub-tree of the complete tree. Moreover, because the tree grammar defined by `Target_j.specs' describes a tree created by attribution, each of its rules has been given an explicit name (see Tree Construction Functions of LIDO - Reference Manual).
The fact that no more than one of the tree grammars contains unnamed rules is crucial to the success of the complete processor derivation. Recall that an unparser definition contains the definition of the tree grammar on which it is based, and every rule in that tree grammar is named. If the names were not explicit in the unparser's basis, the names in the unparser definition were created as part of the unparser generation. The same name creation process is applied during every unparser generation; therefore, if two unparsers generated from disjoint tree grammars with unnamed rules are combined, there will be name clashes.
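The clash can be pictured with a short Python sketch. It assumes that generated names follow the pattern of `rule_000' seen in the DTD for the expression language and that numbering restarts for every generation. Both are assumptions made for illustration, as are the function `name_rules' and the sample productions; the actual name creation process is not described here.

```python
# Hypothetical model of rule naming during unparser generation.
def name_rules(rules):
    """Name every unnamed rule rule_000, rule_001, ... (assumed scheme)."""
    named, counter = {}, 0
    for name, production in rules:
        if name is None:          # unnamed rule: invent a name
            name = f"rule_{counter:03d}"
            counter += 1
        named[name] = production
    return named

# Two disjoint grammars, each with one unnamed rule (sample productions).
source = name_rules([(None, "Root ::= Source"),
                     ("PlusExp", "Expression ::= Expression Expression")])
target = name_rules([(None, "Program ::= Block"),
                     ("Assign", "Stmt ::= Var Expr")])
clashes = sorted(set(source) & set(target))

# The same target grammar with every rule explicitly named.
target_named = name_rules([("TargetRoot", "Program ::= Block"),
                           ("Assign", "Stmt ::= Var Expr")])
safe = sorted(set(source) & set(target_named))

print(clashes)
print(safe)
```

Under these assumptions both generations invent the name `rule_000' for different rules, whereas giving every rule of one grammar an explicit name leaves no name in common.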