Abstract Syntax Tree Unparsing
Available Kinds of Unparser
Eli is capable of generating specifications for the following kinds of unparsers:
Textual unparser
A textual unparser creates source text which, when parsed, results in the tree that was unparsed. For example, the pretty-printer described above accepted a sentence in the expression language and built a tree. It then unparsed that tree to produce an equivalent sentence in the expression language that was formatted in a particular way. If the resulting sentence were parsed according to the rules of the expression language, the result would be the tree from which it had been created.
Consider the tree representing the following sentence in the expression language:
a((b+c)*d, e)
None of the terminal symbols ( ) + * is stored explicitly in the tree. Thus the textual unparser must reconstruct these terminal symbols from the LIDO rules defining the tree nodes.
The terminal symbol , doesn't appear in any of the LIDO rules, and
therefore it cannot be automatically reconstructed by the textual unparser.
Additional information must be provided by the user to insert it into the
unparsed text.
This is a common consequence of using
Our definition of the tree grammar for the expression language contains the following rule:
RULE Parens: Expression ::= '(' Expression ')' END;
The purpose of this rule is to support the unparser by retaining
information about the presence of parentheses used to
override the normal operator precedence.
Such parentheses result in a
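The role of the Parens rule can be illustrated with a minimal C sketch. It is not code generated by Eli: the `Node` type, the `RuleKind` tags, and the `unparse` function are hypothetical names chosen for this illustration. The point it shows is that literals such as `(`, `)`, `+` and `*` are emitted because of the rule a node belongs to, not because they are stored in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node kinds, named after rules of the expression language. */
typedef enum { IDN_EXP, PLUS_EXP, STAR_EXP, PARENS } RuleKind;

typedef struct Node {
    RuleKind rule;
    const char *name;          /* text of the identifier, IDN_EXP only */
    const struct Node *left;   /* first child (the only child of PARENS) */
    const struct Node *right;  /* second child of the binary rules */
} Node;

/* Append s to the NUL-terminated text in buf, truncating if buf is full. */
static void append(char *buf, size_t size, const char *s) {
    size_t used;
    assert(size > 0);
    used = strlen(buf);
    if (used < size - 1)
        strncat(buf, s, size - 1 - used);
}

/* Unparse n into buf. Each literal is supplied by the rule, not the tree. */
void unparse(const Node *n, char *buf, size_t size) {
    switch (n->rule) {
    case IDN_EXP:
        append(buf, size, n->name);
        break;
    case PLUS_EXP:
        unparse(n->left, buf, size);
        append(buf, size, "+");
        unparse(n->right, buf, size);
        break;
    case STAR_EXP:
        unparse(n->left, buf, size);
        append(buf, size, "*");
        unparse(n->right, buf, size);
        break;
    case PARENS:
        append(buf, size, "(");
        unparse(n->left, buf, size);
        append(buf, size, ")");
        break;
    }
}
```

With a PARENS node wrapping `b+c` as the left operand of a STAR_EXP node, this sketch yields `(b+c)*d`. If the PARENS node were absent, the same operands would be written as `b+c*d`, which parses to a different tree.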
One important aspect of the textual form of the program that is missing from
the LIDO rules is how to separate the basic symbols.
For example, consider a
a(b, c)
There is no information in the LIDO rules about whether a space should precede and/or follow a ( or ,. Spacing is important for making the text readable, however, and cannot simply be ignored.
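To make the spacing question concrete, here is a minimal C sketch of one possible policy for deciding whether a space belongs between two adjacent pieces of output. The function `needs_space` and its rules are assumptions made for this illustration only; they are not the interface or the behavior of any Eli library module.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return nonzero if c could be part of an identifier or number. */
static int is_word_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical policy: decide from the last character already written
 * and the first character about to be written. */
int needs_space(const char *prev, const char *next) {
    char last, first;
    assert(prev != NULL && next != NULL);
    if (*prev == '\0' || *next == '\0')
        return 0;                       /* nothing to separate */
    last = prev[strlen(prev) - 1];
    first = next[0];
    if (last == ',')
        return 1;                       /* readability: space after a comma */
    if (is_word_char(last) && is_word_char(first))
        return 1;                       /* otherwise two words would merge */
    return 0;                           /* e.g. no space around parentheses */
}
```

Under this policy `a(b, c)` gets a space only after the comma. A different policy, for example one that also puts a space before `(`, would change only the body of this one function.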
Computations for plain productions
A generated textual unparser defines the following computation (two
attributes are used to simplify overriding,
see Changing
ATTR IdemPtg, IdemOrigPtg: PTGNode;

CLASS SYMBOL IdemReproduce COMPUTE
  SYNT.IdemOrigPtg=
    RuleFct("PTGIdem_", RHS.IdemPtg, TermFct("PTGIdem_"));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The class symbol
This computation invokes a function specific to the LIDO rule and, if the
rule contains any instances of non-literal terminal symbols, a function
specific to each.
For example, the effect of
RULE StarExp: Expression ::= Expression '*' Expression
COMPUTE
  Expression[1].IdemOrigPtg=
    PTGIdem_StarExp(Expression[2].IdemPtg,Expression[3].IdemPtg);
  Expression[1].IdemPtg=Expression[1].IdemOrigPtg;
END;

RULE CallExp: Expression ::= Identifier '(' Arguments ')'
COMPUTE
  Expression[1].IdemOrigPtg=
    PTGIdem_CallExp(Arguments.IdemPtg,PTGIdem_Identifier(Identifier));
  Expression[1].IdemPtg=Expression[1].IdemOrigPtg;
END;
(For details about
Here are several PTG patterns appearing in a textual unparser generated from the expression language definition that illustrate how the PTG functions are specified:
Idem_StarExp: $1 "*" [Separator] $2
Idem_IdnExp: $1 [Separator]
Idem_CallExp: $2 [Separator] "(" [Separator] $1 ")" [Separator]
Notice how these patterns reconstruct the terminal symbols *, (, and ).
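The way indexed insertion points select arguments can be sketched in C as follows. This is a deliberately simplified stand-in, not PTG: the `expand` function is hypothetical, its template syntax copies every character other than `$1` through `$9` literally, and it ignores function calls such as `[Separator]`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expand a simplified template into out.
 * "$1".."$9" are replaced by args[0]..args[8]; all other characters are
 * copied unchanged. Returns 0 on success, -1 if an index has no argument
 * or the result does not fit in out. */
int expand(const char *tmpl, const char *args[], int nargs,
           char *out, size_t size) {
    size_t used = 0;
    const char *p;
    assert(size > 0);
    out[0] = '\0';
    for (p = tmpl; *p != '\0'; p++) {
        const char *piece = p;   /* default: copy one literal character */
        size_t len = 1;
        if (p[0] == '$' && p[1] >= '1' && p[1] <= '9') {
            int idx = p[1] - '1';
            if (idx >= nargs)
                return -1;
            piece = args[idx];
            len = strlen(piece);
            p++;                 /* skip the digit as well */
        }
        if (used + len >= size)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

In the CallExp computation above, the argument list is passed first and the identifier second, so a template of the form `$2($1)` applied to the arguments `b, c` and `a` puts the identifier in front and produces `a(b, c)`.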
The different orders of the indexed insertion points in the patterns
The separator module provides the following output functions, which must be used instead of the corresponding PTG output functions (see Output Functions of PTG: Pattern-based Text Generator):
PTGNode Sep_Out(PTGNode root);
PTGNode Sep_OutFile(char *filename, PTGNode root);
PTGNode Sep_OutFPtr(FILE *fptr, PTGNode root);
The module library contains two modules that implement different strategies for selecting layout characters:
If none of the available modules is satisfactory,
then you must create your own.
The simplest approach is to modify one from the library.
Here is a sequence of Eli requests that will
extract `C_Separator.fw' as file
-> $elipkg/Output/C_Separator.fw > My_Separator.fw
-> My_Separator.fw !chmod +w
-> My_Separator.fw <
In order to change the decision about what (if any) separator is to be
inserted in a given context, you need to change the function called
Computations for LISTOF productions
A generated textual unparser defines the following computation for a
ATTR IdemPtg, IdemOrigPtg: PTGNode;

CLASS SYMBOL IdemReproduce_X COMPUTE
  SYNT.IdemOrigPtg=
    PTG_r(
      CONSTITUENTS (Y.IdemPtg, Z.IdemPtg) SHIELD (Y, Z)
        WITH (PTGNode, PTGIdem_2r, PTGIdem_1r, PTGNull));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The class symbol
CLASS SYMBOL IdemReproduce_Arguments COMPUTE
  SYNT.IdemOrigPtg=
    PTG_ArgList(
      CONSTITUENTS (Expression.IdemPtg) SHIELD (Expression)
        WITH (PTGNode, PTGIdem_2ArgList, PTGIdem_1ArgList, PTGNull));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The computation for the class symbol invokes three functions
specific to the LIDO rule.
Here are the three PTG patterns specifying those functions for the
Idem_ArgList: $
Idem_2ArgList: $ $
Idem_1ArgList: $
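The WITH clause above names a two-argument function, a one-argument function, and a function for the empty case. How three such functions combine the elements of a list can be sketched in C. Everything here is an assumption made for illustration: `Text`, `text_null`, `text_one`, `text_two` and `unparse_list` are hypothetical stand-ins that work on strings rather than PTG nodes, and the `sep` parameter is added only to show where user-supplied text such as a comma could enter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TEXT_MAX 128
typedef struct { char s[TEXT_MAX]; } Text;

/* Stand-in for the empty-list function. */
static Text text_null(void) {
    Text t;
    t.s[0] = '\0';
    return t;
}

/* Stand-in for the one-element pattern "$". */
static Text text_one(const char *element) {
    Text t;
    snprintf(t.s, sizeof t.s, "%s", element);
    return t;
}

/* Stand-in for the two-element pattern "$ $"; sep goes between the parts. */
static Text text_two(Text left, Text right, const char *sep) {
    Text t;
    snprintf(t.s, sizeof t.s, "%s%s%s", left.s, sep, right.s);
    return t;
}

/* Combine n elements from left to right. */
Text unparse_list(const char *elems[], int n, const char *sep) {
    Text acc;
    int i;
    assert(n >= 0);
    if (n == 0)
        return text_null();
    acc = text_one(elems[0]);
    for (i = 1; i < n; i++)
        acc = text_two(acc, text_one(elems[i]), sep);
    return acc;
}
```

With an empty `sep`, the elements `b` and `c` are simply concatenated, which mirrors the fact that the generated patterns contain no comma. Passing `", "` as `sep` shows the kind of change a user would have to supply to obtain `b, c`.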
PTG patterns for other
Structural unparser
A structural unparser creates a textual description of the tree in terms of rule names and non-literal terminal symbols. For example, the sentence `a(b,c)' in the expression language could be unparsed as the XML file:
<rule_000>
  <CallExp>
    a
    <ArgList>
      <IdnExp>b</IdnExp>
      <IdnExp>c</IdnExp>
    </ArgList>
  </CallExp>
</rule_000>
The entire sentence is output as a
Appropriate layout, with meaningful line breaks and indentation, is important for a human trying to understand the output of a structural unparser. This formatting depends only on structure, however, not on the content of the output. Structural unparser generators producing both simple descriptions of trees and descriptions in several standard languages are available. It is also possible for a user to create an unparser generator that describes the tree in a language of their own choosing.
Computations for plain productions
A generated structural unparser defines the following computation (two
attributes are used to simplify overriding,
see Changing
ATTR IdemPtg, IdemOrigPtg: PTGNode;

CLASS SYMBOL IdemReproduce COMPUTE
  SYNT.IdemOrigPtg=
    RuleFct("PTGIdem_", RHS.IdemPtg, TermFct("PTGIdem_"));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The class symbol
This computation invokes a function specific to the LIDO rule and, if the
rule contains any instances of non-literal terminal symbols, a function
specific to each.
For example, the effect of
RULE StarExp: Expression ::= Expression '*' Expression
COMPUTE
  Expression[1].IdemOrigPtg=
    PTGIdem_StarExp(Expression[2].IdemPtg,Expression[3].IdemPtg);
  Expression[1].IdemPtg=Expression[1].IdemOrigPtg;
END;

RULE CallExp: Expression ::= Identifier '(' Arguments ')'
COMPUTE
  Expression[1].IdemOrigPtg=
    PTGIdem_CallExp(Arguments.IdemPtg,PTGIdem_Identifier(Identifier));
  Expression[1].IdemPtg=Expression[1].IdemOrigPtg;
END;
(For details about
Here are several PTG patterns from a structural unparser generated from the expression language definition that illustrate how those functions are specified:
Idem_StarExp:
  "<StarExp>" [BP_BeginBlockI]
    [BP_BreakLine] $1 [BP_BreakLine] $2
  [BP_BreakLine] [BP_EndBlockI] "</StarExp>"
Idem_IdnExp:
  "<IdnExp>" [BP_BeginBlockI]
    [BP_BreakLine] $1
  [BP_BreakLine] [BP_EndBlockI] "</IdnExp>"
Idem_CallExp:
  "<CallExp>" [BP_BeginBlockI]
    [BP_BreakLine] $2 [BP_BreakLine] $1
  [BP_BreakLine] [BP_EndBlockI] "</CallExp>"
These patterns are the ones generated if the output is to be an XML file (see Languages describing tree structure).
The different orders of the indexed insertion points in the patterns
Generated structural unparsers use the block print module (see Typesetting for Block Structured Output of Tasks related to generating output) to provide layout. The generated PTG patterns invoke functions of this module to mark potential line breaks and the boundaries of logical text blocks. The block print module provides the following output functions, which must be used instead of the corresponding PTG output functions (see Output Functions of PTG: Pattern-based Text Generator):
PTGNode BP_Out(PTGNode root);
PTGNode BP_OutFPtr(FILE *fptr, PTGNode root);
PTGNode BP_OutFile(char *filename, PTGNode root);
Note that the textual representation of the children of every node is considered to be a logical text block. A line break can occur before each child. The effect of this specification is to keep the textual representation of a node on a single line if that is possible. Otherwise, the sequence of children is written one per line, indented from the name of the block's rule.
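The layout rule just described (one line if the node fits, otherwise one child per line, indented) can be sketched for a single node in C. This sketch does not use the block print module: `layout`, its `width` parameter, and the two-space indent are assumptions made for this illustration, and nested blocks are not re-indented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to the NUL-terminated text in out; the buffer must be large enough. */
static void put(char *out, size_t size, const char *s) {
    size_t used = strlen(out);
    assert(used + strlen(s) < size);
    strcpy(out + used, s);
}

/* Write "<name>", the children, and "</name>" into out.
 * If everything fits in width columns it stays on one line; otherwise a
 * line break is taken before every child and before the closing tag. */
void layout(const char *name, const char *children[], int n,
            size_t width, char *out, size_t size) {
    size_t flat = 2 * strlen(name) + 5;   /* "<name>" plus "</name>" */
    const char *before_child, *before_close;
    int i;

    for (i = 0; i < n; i++)
        flat += strlen(children[i]);
    before_child = (flat <= width) ? "" : "\n  ";
    before_close = (flat <= width) ? "" : "\n";

    assert(size > 0);
    out[0] = '\0';
    put(out, size, "<");
    put(out, size, name);
    put(out, size, ">");
    for (i = 0; i < n; i++) {
        put(out, size, before_child);
        put(out, size, children[i]);
    }
    put(out, size, before_close);
    put(out, size, "</");
    put(out, size, name);
    put(out, size, ">");
}
```

For an ArgList node with the children `<IdnExp>b</IdnExp>` and `<IdnExp>c</IdnExp>`, a width of 80 keeps the whole node on one line, while a width of 40 places each child on its own indented line between the opening and closing tags.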
Computations for LISTOF productions
A generated structural unparser defines the following computation for a
ATTR IdemPtg, IdemOrigPtg: PTGNode;

CLASS SYMBOL IdemReproduce_X COMPUTE
  SYNT.IdemOrigPtg=
    PTG_r(
      CONSTITUENTS (Y.IdemPtg, Z.IdemPtg) SHIELD (Y, Z)
        WITH (PTGNode, PTGIdem_2r, PTGIdem_1r, PTGNull));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The symbol
CLASS SYMBOL IdemReproduce_Arguments COMPUTE
  SYNT.IdemOrigPtg=
    PTG_ArgList(
      CONSTITUENTS (Expression.IdemPtg) SHIELD (Expression)
        WITH (PTGNode, PTGIdem_2ArgList, PTGIdem_1ArgList, PTGNull));
  SYNT.IdemPtg=THIS.IdemOrigPtg;
END;
The computation for the class symbol invokes three functions
specific to the LIDO rule.
Here are the three PTG patterns specifying those functions for the
Idem_ArgList:
  "<ArgList>" [BP_BeginBlockI]
    [BP_BreakLine] $
  [BP_BreakLine] [BP_EndBlockI] "</ArgList>"
Idem_2ArgList: $ { [BP_BreakLine] } $
Idem_1ArgList: $
PTG patterns for other
Languages describing tree structure
By default, a structural unparser generator uses a generic functional representation to describe the tree. Here's the default representation of the sentence `a(b,c)' in the expression language:
rule_000(CallExp(a,IdnExp(b),IdnExp(c)))
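A functional description of this shape can be produced by a simple recursive walk, sketched here in C. The `Tree` type and the `describe` function are hypothetical and exist only for this illustration. The sketch models the argument expressions as direct children of the CallExp node, which matches the output shown above but is an assumption about how the list is represented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4

typedef struct Tree {
    const char *label;    /* rule name, or the text of a terminal */
    int nchildren;        /* 0 for a terminal */
    const struct Tree *children[MAX_CHILDREN];
} Tree;

/* Append s to the NUL-terminated text in out; the buffer must be large enough. */
static void put(char *out, size_t size, const char *s) {
    size_t used = strlen(out);
    assert(used + strlen(s) < size);
    strcpy(out + used, s);
}

/* Append label(child,child,...) for t to out; terminals print as their text.
 * The caller must initialise out to the empty string. */
void describe(const Tree *t, char *out, size_t size) {
    int i;
    put(out, size, t->label);
    if (t->nchildren == 0)
        return;
    put(out, size, "(");
    for (i = 0; i < t->nchildren; i++) {
        if (i > 0)
            put(out, size, ",");
        describe(t->children[i], out, size);
    }
    put(out, size, ")");
}
```

Because the output depends only on labels and nesting, the same walk could emit a different notation by changing the three literal strings in `describe`.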
(Recall that the entire sentence is output as a
Four other standard representations are available:
It is also possible to build structural unparser generators for other application languages by modifying existing generator specifications. All unparser generators have the same general organization: they analyze the tree grammar and produce class symbol computations and PTG patterns to output any tree defined by that grammar. Much of the analysis is common, with differences appearing only in the final output of the generated FunnelWeb file. The unparser generator specifications available in the library are:
Suppose that you wanted to create an unparser generator that would
produce Modula-3 code to build the tree, and a separate interface file
defining the tree structure.
Because Modula-3 is quite similar to Java in its structure,
you might start by modifying `$/Unparser/Java.fw' from the library.
Here is a sequence of Eli requests that will
extract `Java.fw' as file
-> $elipkg/Unparser/Java.fw > Modula-3.fw
-> Modula-3.fw !chmod +w
-> Modula-3.fw <
After suitable modification, `Modula-3.fw' could be combined with the library specification `$/Unparser/Analysis.fw' to define the new unparser generator. Thus you might create a file called `M3.specs' with the following content:
Modula-3.fw
$/Unparser/Analysis.fw
The unparser generator could then be derived from `M3.specs' as usual (see exe -- Executable Version of the Processor of Products and Parameters Reference):
-> M3.specs :exe