Pattern-based Text Generator



Pattern Specifications

A pattern is specified by a named sequence of C string literals and $ tokens that denote insertion points, e.g.

   Pair: "(" $ "." $ ")" /* S-expression */

C style comments may be inserted anywhere in a PTG specification.

The pattern describes an output text that consists of the specified sequence of strings with the results of pattern applications being inserted at each insertion point.

A pattern is applied by calling a PTG-generated function whose name is the pattern name prefixed with PTG, here PTGPair. Such a call returns a pointer of type PTGNode, which represents that pattern application.

The pattern function takes as many arguments of type PTGNode as the pattern has insertion points. The arguments are obtained from other calls of pattern functions. Their order corresponds to that of the insertion points in the pattern. (Alternative forms of insertion points are described in Indexed Insertion Points and Typed Insertion Points.)

The pattern function for the example above has the following signature:

   PTGNode PTGPair (PTGNode a, PTGNode b)
Examples for applications of this pattern are:

   x = PTGPair (PTGNil(), PTGNil());
   y = PTGPair (x, x);
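
The two-step nature of pattern application (first build nodes, later produce text) can be illustrated by a small stand-alone model in plain C. This is only a sketch and not the code that PTG generates: the names Node, pair and render are hypothetical, and a NULL pointer is assumed here to stand for an application that produces no text.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one application of the Pair pattern:
   a node that remembers its two arguments. NULL means "no text". */
typedef struct Node {
    const struct Node *a, *b;
} Node;

/* Models applying  Pair: "(" $ "." $ ")".
   It only builds a node; no text is produced yet. */
const Node *pair(const Node *a, const Node *b)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->a = a;
    n->b = b;
    return n;
}

/* Appends the text described by n to buf (capacity cap),
   visiting strings and insertion points from left to right. */
void render(const Node *n, char *buf, size_t cap)
{
    if (n == NULL)
        return;
    strncat(buf, "(", cap - strlen(buf) - 1);
    render(n->a, buf, cap);
    strncat(buf, ".", cap - strlen(buf) - 1);
    render(n->b, buf, cap);
    strncat(buf, ")", cap - strlen(buf) - 1);
}
```

With x = pair(NULL, NULL) and y = pair(x, x), rendering y into an initially empty buffer yields ((.).(.)). The node x is shared by both insertion points of y, which mirrors the call PTGPair (x, x) above.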

Restrictions:

Pattern names must be unique across all .ptg specifications: any two patterns that are not equal must have different names. Additionally, the names of the patterns must not collide with identifiers predefined for PTG.

PTG does not insert any additional white space before or after elements of the pattern sequence. Token separation and new line characters (especially at the end of a file) have to be specified explicitly.

Indexed Insertion Points

The insertion points of a pattern may be identified by numbers, e.g. $1, $2. This facility allows an argument of a pattern function call to be inserted at several positions in the pattern, and it allows patterns to be modified without changing their application calls.

In the following example the first argument (the module name) is inserted at two positions:

   Module: "module " $1 "\nbegin" $2 "end " $1 ";\n" 
The pattern function is called with two arguments.

The correspondence between insertion points and function parameters is specified by the numbers of the insertion points, e.g. the first argument is inserted at every occurrence of $1.

This facility also makes the calls of pattern functions more independent of pattern modifications. For example, the following pattern describing declarations

   Decl: $1 /* type */ " " $2 /* identifiers */ ";\n"
would be applied by a call PTGDecl (tp, ids), with the first argument inserting the type and the second inserting the identifiers. Those calls remain valid when the pattern is changed to a Pascal-like declaration style:

   Decl: $2 ":" $1 ";"
In the same way, one variant of a pattern may omit an argument specified in another variant.

In general a pattern may contain several occurrences of any of the insertion point markers $i. There is practically no upper bound for $i. The generated function has n parameters, where n is the maximal i occurring in a $i of the pattern. The i-th function argument is substituted at each occurrence of $i in the pattern.

If a pattern does not mention all $i between $1 and the maximum $n, e.g. $1 and $3 but not $2, the function has n parameters, but some are not used.
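
The substitution rule for indexed insertion points can be modelled in a few lines of C. The following is a sketch only: expand_indexed is a hypothetical helper that interprets a template string at run time, whereas PTG translates each pattern into its own function. For brevity the model supports only the indices $1 to $9.

```c
#include <string.h>

/* Hypothetical helper: copies tmpl to out (capacity cap) and replaces
   every "$i" (i = 1..9) by args[i-1]. Returns 0 on success, or -1 if
   out is too small or an index exceeds nargs; out is unspecified then. */
int expand_indexed(const char *tmpl, const char *const args[], int nargs,
                   char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    while (*tmpl != '\0') {
        const char *piece;
        size_t len;

        if (tmpl[0] == '$' && tmpl[1] >= '1' && tmpl[1] <= '9') {
            int i = tmpl[1] - '0';
            if (i > nargs)
                return -1;
            piece = args[i - 1];      /* the i-th argument */
            len = strlen(piece);
            tmpl += 2;
        } else {
            piece = tmpl;             /* one literal character */
            len = 1;
            tmpl += 1;
        }
        if (used + len >= cap)        /* keep room for the terminator */
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Expanding "module $1\nbegin$2end $1;\n" with the two arguments "m" and " x := 1 " inserts "m" at both occurrences of $1. Expanding "$1 $2;\n" and "$2:$1;" with the same argument list shows why calls do not have to change when the order of insertion points in a pattern changes.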

Restrictions:

Indexed and non-indexed insertion points must not be mixed in a single pattern.

Typed Insertion Points

Data items can be inserted into the output text by specifying insertion points to have one of the types int, string, long, short, char, float or double, e.g.

   Matrix:  "float " $1 "[" $2 int "][" $3 int "];\n"
The generated pattern function has the following signature:

   PTGNode PTGMatrix (PTGNode a, int b, int c)
Function calls must supply arguments of corresponding types. They are output in a standard output representation.

Another typical application of typed insertion points is generating identifiers:

   Ident: $ string $ int
This pattern may be used to compose identifiers from a string and a number, e.g. a call PTGIdent ("abc", 5) producing abc5. The string item is often taken from the input of the language processor, and the number is used to guarantee uniqueness of identifiers in the output. (This construct also replaces the outdated leaf patterns; see Outdated Constructs.)

A typical example for composition of output text fragments from data of basic types is given by a pattern that produces German car identifications:

   CarId: $ string $ string $ int
which is applied for example by PTGCarId ("PB-", "AB-", 127).
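
The text described by these two patterns can be reproduced with ordinary C formatting. The sketch below is not PTG output code: format_ident and format_car_id are hypothetical stand-ins, and it assumes that the standard output representation of an int is the one produced by printf's %d conversion.

```c
#include <stdio.h>

/* Models  Ident: $ string $ int
   Writes the string immediately followed by the number. */
int format_ident(char *out, size_t cap, const char *name, int number)
{
    return snprintf(out, cap, "%s%d", name, number);
}

/* Models  CarId: $ string $ string $ int
   No separators are added beyond those contained in the arguments. */
int format_car_id(char *out, size_t cap, const char *district,
                  const char *letters, int number)
{
    return snprintf(out, cap, "%s%s%d", district, letters, number);
}
```

Under that assumption, format_ident with "abc" and 5 gives abc5, and format_car_id with "PB-", "AB-" and 127 gives PB-AB-127. The hyphens come from the string arguments, since no white space or punctuation is inserted between pattern elements.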

Restrictions:

If an indexed insertion point occurs more than once in a pattern, its type must be the same for all occurrences.

Function Call Insertion

There are situations where it is inconvenient or impossible to specify an output component by a pattern. In such cases calls of user defined functions can be specified instead of insertion points in a pattern.

Assume as an example that indentation shall be specified for the output of a block structured language:

   Block:   [NewLine] "{" [Indent] $1   /* declarations */
                                   $2   /* statements */
                          [Exdent]
            [NewLine] "}"
This pattern for producing a block has two ordinary insertion points, one for the declarations of the block and one for its statements. The pattern function is called as usual with two corresponding arguments. When the output text is produced the user defined functions NewLine, Indent, and Exdent are called in order to insert text at the specified pattern positions.

In this case the functions must have exactly one parameter that is a PTG_OUTPUT_FILE:

   extern void NewLine (PTG_OUTPUT_FILE f);
   extern void Indent (PTG_OUTPUT_FILE f);
   extern void Exdent (PTG_OUTPUT_FILE f);
The type PTG_OUTPUT_FILE can be supplied by the user. If it is not, a default is supplied by the generated file ptg_gen.h. Each function has to be implemented such that a call writes the desired text to the file pointed to by f, using the output macros that PTG provides; see Influencing PTG Output.

Note: These function calls are executed when the output text is produced. The functions are not yet called when the patterns are applied. PTG guarantees that those calls occur in left-to-right order of the produced output text. Hence, the above triple of functions may use global variables to keep track of the indentation level.
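
One possible shape of such a triple of functions is sketched below. It rests on two assumptions that are not taken from this manual: PTG_OUTPUT_FILE is defined as a stdio stream, and text is written with fputc and fputs directly. A real specification would write through the PTG output macros instead, and the indentation width of two spaces is an arbitrary choice.

```c
#include <stdio.h>

/* Assumption for this sketch: the output file type is a stdio stream. */
typedef FILE *PTG_OUTPUT_FILE;

/* Current nesting depth, shared by the three functions. This is safe
   because the calls happen in left-to-right order of the output text. */
static int indent_level = 0;

/* Writes nothing; only records that following lines are one level deeper. */
void Indent(PTG_OUTPUT_FILE f)
{
    (void)f;
    indent_level++;
}

/* Writes nothing; returns to the enclosing level. */
void Exdent(PTG_OUTPUT_FILE f)
{
    (void)f;
    if (indent_level > 0)
        indent_level--;
}

/* Starts a new line and indents it according to the current depth. */
void NewLine(PTG_OUTPUT_FILE f)
{
    int i;

    fputc('\n', f);
    for (i = 0; i < indent_level; i++)
        fputs("  ", f);
}
```

In this design only NewLine emits characters. Indent and Exdent merely change the state that the next NewLine consults, so text inserted between [Indent] and [Exdent] is indented only where it starts its lines by calling NewLine.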

Ready-to-use functions that support indentation in a PTG specification can be found in the module library; see Indentation of Specification Module Library: Creating Output.

Such function calls may also take arguments which are passed through from the call of the pattern function. They are specified by occurrences of insertion points within the call specification:

   Block:   [NewLine] "{" [Indent $3 int] $1   /* declarations */
                                          $2   /* statements */
                          [Exdent $3 int]
            [NewLine] "}"
In this case the indentation depth is determined individually for each application of the Block pattern which is called for example by PTGBlock (d, s, 3). The last argument is passed through to the calls of Indent and Exdent which now must have the signatures

   extern void Indent (PTG_OUTPUT_FILE f, int i);
   extern void Exdent (PTG_OUTPUT_FILE f, int i);

Note: The arguments supplied with a pattern application are stored until the functions are called when the output is produced.
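
The deferred execution described in this note can be pictured as follows. The sketch is a model, not PTG's internal representation: CallInsertion, capture, perform and current_indent are hypothetical names, and PTG_OUTPUT_FILE is again assumed to be a stdio stream.

```c
#include <stdio.h>

typedef FILE *PTG_OUTPUT_FILE;   /* assumption for this sketch */

/* Hypothetical record of one function call insertion such as
   [Indent $3 int]: the function to call and the stored argument. */
typedef struct CallInsertion {
    void (*fn)(PTG_OUTPUT_FILE f, int i);
    int arg;
} CallInsertion;

static int indent_level = 0;

/* User functions with the two-parameter signatures shown above. */
void Indent(PTG_OUTPUT_FILE f, int i) { (void)f; indent_level += i; }
void Exdent(PTG_OUTPUT_FILE f, int i) { (void)f; indent_level -= i; }

/* Pattern application time: the argument is stored, nothing is called. */
CallInsertion capture(void (*fn)(PTG_OUTPUT_FILE, int), int arg)
{
    CallInsertion c;

    c.fn = fn;
    c.arg = arg;
    return c;
}

/* Output time: the stored call is carried out with the stored argument. */
void perform(const CallInsertion *c, PTG_OUTPUT_FILE f)
{
    c->fn(f, c->arg);
}

/* Lets a caller observe the effect of the calls. */
int current_indent(void)
{
    return indent_level;
}
```

After capture(Indent, 3) the indentation level is still unchanged; it becomes 3 only once perform is called for that record. This corresponds to PTGBlock (d, s, 3) storing the value 3 until the output is produced.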

In general several arguments may be specified to be passed through to a function call. They may be typed by one of the types int, string, long, short, char, float, double or pointer. In case of type pointer the supplied argument of the pattern function call must have a pointer type that is defined for the corresponding parameter of the user function. If no type is specified an argument of type PTGNode is passed through.

Arguments of type pointer are typically used when the translation of certain data structures by user-specified functions is to be inserted into pattern-driven translations.

Optional Parts in Patterns

Parts of a pattern can be marked as optional by surrounding them with braces. An optional part is printed only if every other insertion of the pattern (that is, every insertion not enclosed in braces) produces output. This can considerably simplify the construction of lists.

   CommaSeq:    $1 {", "} $2

A call of the pattern function PTGCommaSeq(a,b) produces the separator only if neither a nor b is empty; otherwise a and b are simply concatenated and the optional part is left out. This facility is especially useful if such separated lists are composed by pattern function calls that occur in loops or in separate contexts. For a more sophisticated example, see A Complete Example.

Note: The result of a pattern call is the unique value PTGNULL if the empty output string is produced. (There is no way to further inspect the intermediate results of pattern applications.) Certain pattern constructs do not yield PTGNULL even if they may represent empty strings. The following are considered not to be empty:

  • typed insertions and function call insertions,
  • empty strings and empty literals.

Another example for optional parts in patterns is the following:

   Paren:       {"("} $ {")"}
The pattern function PTGParen(a) will produce parentheses around a if a is not empty. Otherwise, PTGParen(a) will be empty.
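
The behaviour of the two patterns can be summarised by a small model in C. It is a sketch under stated assumptions: comma_seq and paren are hypothetical functions working on strings rather than PTGNode values, and a NULL pointer plays the role of PTGNULL.

```c
#include <stdio.h>

/* In this model a NULL pointer plays the role of PTGNULL (no output).
   Each function writes into the caller's buffer out, which must not
   overlap its string arguments, and returns out, or NULL when the
   modelled pattern application is empty. */

/* Models  CommaSeq: $1 {", "} $2 */
const char *comma_seq(char *out, size_t cap, const char *a, const char *b)
{
    if (a == NULL && b == NULL)
        return NULL;
    snprintf(out, cap, "%s%s%s",
             a ? a : "",
             (a && b) ? ", " : "",   /* separator only if both are present */
             b ? b : "");
    return out;
}

/* Models  Paren: {"("} $ {")"} */
const char *paren(char *out, size_t cap, const char *a)
{
    if (a == NULL)
        return NULL;                 /* no parentheses around nothing */
    snprintf(out, cap, "(%s)", a);
    return out;
}
```

A list built in a loop can therefore start from the empty value: combining NULL with "x" gives x without a leading separator, and combining x with "y" gives x, y.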

Restrictions:

An optional part is printed only if none of the non-optional insertions in the node is PTGNULL. However, if the pattern has no non-optional insertions, the braces are ignored and a warning is issued.

It is possible to include more than one pattern element in braces, and a single pattern may contain several optional parts. However, optional parts cannot be nested: braces must not be used inside an optional part.

