FunnelWeb


Input Processing

Special Sequences

The scanner scans the input file from top to bottom, left to right, treating the input as ordinary text (to be handed directly to the parser as a text token) unless it encounters the special character, which introduces a special sequence. Thus, the scanner partitions the input file into ordinary text and special sequences. (The special character is often referred to as the escape character or the control character in other systems. However, as there is great potential to confuse these names with the escape character (ASCII 27) and ASCII control characters, the term special has been chosen instead. This results in the terms special character and special sequence.)

input_file ::= {ordinary_text / special_sequence} .

Upon startup, the special character is @, but it can be changed using the <special>=<new_special> special sequence. Rather than writing <special> wherever the special character appears, this document uses the default special character @ to stand for the current special character. More importantly, FunnelWeb's error messages always use the default special character, even if the special character has been changed.

An occurrence of the special character in the input file introduces a special sequence. The kind of special sequence is determined by the character following the special character. Only printable characters can follow the special character.

The following table gives all the possible characters that can follow the special character, and the legality of each sequence. The item headings give the ASCII number of each ASCII character and the special sequence for that character. The descriptions start with one of three characters: - means that the sequence is illegal. S indicates that the sequence is a simple sequence (with no attributes or side effects) that appears exactly as shown and is converted directly into a token and fed to the parser. Finally, C indicates that the special sequence is complex, possibly having a following syntax or producing funny side effects.

000--008
Unprintable characters and hence illegal.
009
Tab. Converted by Eli (not FunnelWeb) into the appropriate number of spaces.
010--031
Unprintable characters and hence illegal.
032 @
- Illegal (space).
033 @!
C Comment.
034 @"
S Parameter delimiter.
035 @#
C Short name sequence.
036 @$
S Start of macro definition.
037 @%
- Illegal.
038 @&
- Illegal.
039 @'
- Illegal.
040 @(
S Open parameter list.
041 @)
S Close parameter list.
042 @*
- Illegal.
043 @+
C Insert newline.
044 @,
S Parameter separator.
045 @-
C Suppress end of line marker.
046 @.
- Illegal.
047 @/
S Open or close emphasised text.
048 @0
- Illegal.
049 @1
S Formal parameter 1.
050 @2
S Formal parameter 2.
051 @3
S Formal parameter 3.
052 @4
S Formal parameter 4.
053 @5
S Formal parameter 5.
054 @6
S Formal parameter 6.
055 @7
S Formal parameter 7.
056 @8
S Formal parameter 8.
057 @9
S Formal parameter 9.
058 @:
- Illegal.
059 @;
- Illegal.
060 @<
S Open macro name.
061 @=
C Set special character.
062 @>
S Close macro name.
063 @?
- Illegal. Reserved for future use.
064 @@
C Insert special character into text.
065 @A
S New section (level 1).
066 @B
S New section (level 2).
067 @C
S New section (level 3).
068 @D
S New section (level 4).
069 @E
S New section (level 5).
070 @F
- Illegal.
071 @G
- Illegal.
072 @H
- Illegal.
073 @I
C Include file.
074 @J
- Illegal.
075 @K
- Illegal.
076 @L
- Illegal.
077 @M
S Tags macro as being allowed to be called many times.
078 @N
- Illegal.
079 @O
S New macro attached to product file. Has to be at start of line.
080 @P
C Pragma.
081 @Q
- Illegal.
082 @R
- Illegal.
083 @S
- Illegal.
084 @T
C Typesetter directive.
085 @U
- Illegal.
086 @V
- Illegal.
087 @W
- Illegal.
088 @X
- Illegal.
089 @Y
- Illegal.
090 @Z
S Tags macro as being allowed to be called zero times.
091 @[
- Illegal. Reserved for future use.
092 @\
- Illegal.
093 @]
- Illegal. Reserved for future use.
094 @^
C Insert control character into text.
095 @_
- Illegal.
096 @`
- Illegal.
097--122 @a--@z
Identical to @A--@Z.
123 @{
S Open macro body/Open literal directive.
124 @|
- Illegal.
125 @}
S Close macro body/Close literal directive.
126 @~
- Illegal.
127--255
Not standard printable ASCII characters and are illegal.
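
The table above can be condensed into a small lookup. The following Python sketch is illustrative only and is not part of FunnelWeb; the names SIMPLE, COMPLEX and classify are hypothetical. It returns the table's S, C or - marker for the character that follows the special character.

```python
# Characters that form simple (S) and complex (C) sequences, copied from the
# table above. Everything else is illegal (-).
SIMPLE = set('"$(),/123456789<>ABCDEMOZ{}')
COMPLEX = set("!#+-=@^IPT")

def classify(ch):
    """Return 'S', 'C' or '-' for the character following the special character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # The table states that @a--@z are identical to @A--@Z.
    if "a" <= ch <= "z":
        ch = ch.upper()
    if ch in SIMPLE:
        return "S"
    if ch in COMPLEX:
        return "C"
    return "-"

for ch in "!i0{ ":
    print(repr(ch), classify(ch))
```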

The most important thing to remember about the scanner is that nothing happens unless the special character is seen. There are no funny sequences that will cause strange things to happen. The best way to view a FunnelWeb document at the scanner level is as a body of text punctuated by special sequences that serve to structure the text at a higher level.

The remaining description of the scanner consists of a detailed description of the effect of each complex special sequence.

Setting the Special Character

The special character can be set using the sequence <special>=<newspecialchar>. For example, @=# would change the special character to a hash (#) character. The special character may be set to any printable ASCII character except the blank character (ie. any character in the ASCII range 33--126). In normal use, it should not be necessary to change the special character of FunnelWeb, and it is probably best to avoid changing the special character so as not to confuse FunnelWeb readers conditioned to the @ character. However, the feature is very useful where the text being prepared contains many @ characters (eg. a list of internet electronic mail addresses).

Inserting the Special Character into the Text

The special sequence <special>@ inserts the special character into the text as if it were not special at all. The @ of this sequence has nothing to do with the current special character. If the current special character is P then the sequence P@ will insert a P into the text. Example: @@#@=#@#@#=@@@ translates to @#@#@.
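
The example can be checked mechanically. The sketch below is a minimal Python model, not FunnelWeb's actual scanner: the function name expand_specials is hypothetical, and it understands only the two sequences described so far (<special>=<c> and <special>@), rejecting everything else.

```python
def expand_specials(text, special="@"):
    """Model only the <special>=<c> and <special>@ sequences."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != special:
            # Ordinary text is copied unchanged.
            out.append(ch)
            i += 1
            continue
        kind = text[i + 1:i + 2]
        if kind == "@":
            # <special>@ inserts the current special character.
            out.append(special)
            i += 2
        elif kind == "=":
            new = text[i + 2:i + 3]
            # Only printable, non-blank ASCII (33--126) may become special.
            if not new or not (33 <= ord(new) <= 126):
                raise ValueError("bad new special character")
            special = new
            i += 3
        else:
            raise ValueError("sequence not modelled in this sketch")
    return "".join(out)

print(expand_specials("@@#@=#@#@#=@@@"))
```

Running the sketch on the example input prints @#@#@, which agrees with the translation given above.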

Inserting Arbitrary Characters into the Text

While FunnelWeb does not tolerate unprintable characters in the input file (except for the end of line character and the tabs that Eli expands into spaces), it does allow the user to specify that unprintable characters appear in the product file. The @^ sequence inserts a single character of the user's choosing into the text. The character can be specified by giving its ASCII number in one of four bases: binary, octal, decimal, and hexadecimal. Here is the syntax:

control_sequence ::= `@^' char_spec .

char_spec ::= binary / octal / decimal / hexadecimal .

binary ::= (`b' / `B') `(' {binary_digit}8 `)' .

octal ::= (`o' / `O' / `q' / `Q') `(' {octal_digit}3 `)' .

decimal ::= (`d' / `D') `(' {decimal_digit}3 `)' .

hexadecimal ::= (`h' / `H' / `x' / `X') `(' {hex_digit}2 `)' .

binary_digit ::= `0' / `1' .

octal_digit ::= binary_digit / `2' / `3' / `4' / `5' / `6' / `7' .

decimal_digit ::= octal_digit / `8' / `9' .

hex_digit ::= decimal_digit / `A' / `B' / `C' / `D' / `E' / `F' /
                       `a' / `b' / `c' / `d' / `e' / `f' .

Example:

@! Unix Make requires that productions commence with tab characters.
@^D(009)prog.o <- prog.c

Note that the decimal 9 is expressed with leading zeros as 009. FunnelWeb requires a fixed number of digits for each base: eight digits for base two, three digits for base eight, three digits for base ten, and two digits for base sixteen.
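
The fixed digit counts make a char_spec easy to validate. The following Python sketch decodes the part of the sequence that follows @^. It is an illustration, not FunnelWeb's code: the name decode_char_spec is hypothetical, and the rejection of values above 255 is an assumption that the grammar above does not state.

```python
import re

# Base letter -> (numeric base, required number of digits), per the grammar.
_BASES = {
    "b": (2, 8),
    "o": (8, 3), "q": (8, 3),
    "d": (10, 3),
    "h": (16, 2), "x": (16, 2),
}

def decode_char_spec(spec):
    """Decode the text following @^, for example 'D(009)' gives a tab."""
    m = re.fullmatch(r"([A-Za-z])\(([0-9A-Fa-f]+)\)", spec)
    if not m or m.group(1).lower() not in _BASES:
        raise ValueError("malformed char_spec: %r" % spec)
    base, width = _BASES[m.group(1).lower()]
    digits = m.group(2)
    if len(digits) != width:
        raise ValueError("base %d needs exactly %d digits" % (base, width))
    value = int(digits, base)  # raises ValueError on digits invalid for the base
    if value > 255:  # assumption: only single-byte character codes are accepted
        raise ValueError("character number out of range")
    return chr(value)

print(repr(decode_char_spec("D(009)")))
```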

FunnelWeb treats the character resulting from a @^ sequence as ordinary text in every sense. If your input file contains many instances of a particular control character, you can package it up in a macro like any other text. In particular, quick names can be used to great effect:

@! Unix "Make" requires that productions commence with tab characters.
@! So we define a macro with a quick name as a tab character.
@$@#T@{@^D(009)@}
@! And use it in our productions.
@#Tprog.o <- prog.c
@#Ta.out <- prog.o

Warning: If you insert a Unix newline character (decimal 10) into the text, FunnelWeb will treat this as an end of line sequence regardless of what the character sequence for end of line is on the machine upon which it is running. Unix EOL is FunnelWeb's internal representation for end of line. Thus, in the current version of FunnelWeb, inserting character 10 into the text is impossible unless this also happens to be the character used by the operating system to mark the end of line.

Comments

When FunnelWeb encounters the @! sequence during its left-to-right scan of the line, it throws away the rest of the line (including the EOL) without analysing it further. Comments can appear in any line except @i, @t, and @p lines.

FunnelWeb comments can be used to insert comments into your input file that will neither appear in the product files nor in the documentation file, but will be solely for the benefit of those reading and editing the input file directly. Example:

@! I have used a quick macro for this definition as it will be used often.
@$@#C@{--@}

Because comments are defined to include the end-of-line marker, care must be taken when they are being added or removed within the text of macro bodies. For example the text fragment

for (i=0;i<MAXVAL;i++)      @! Print out a[0..MAXVAL-1].
   printf("%u\n",a[i]);

will expand to

for (i=0;i<MAXVAL;i++)         printf("%u\n",a[i]);

This problem really has no solution; if FunnelWeb comments were defined to omit the end of line marker, the expanded text would contain trailing blanks! As it is, FunnelWeb comments are designed to support single line comments which can be inserted and removed as a line without causing trouble. For example:

@! Print out a[0..MAXVAL-1].
for (i=0;i<MAXVAL;i++)
   printf("%u\n",a[i]);

If you want a comment construct that does not enclose the end of line marker, combine the insert end of line construct @+ with the comment construct @! as in

for (i=0;i<MAXVAL;i++)      @+@! Print out a[0..MAXVAL-1].
   printf("%u\n",a[i]);

FunnelWeb comments should really only be used to comment the FunnelWeb constructs being used in the input file. Comments on the target code are best placed in comments in the target language or in the documenting text surrounding the macro definitions. In the example above, a C comment would have been more appropriate.
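
The interaction between @! and the end of line marker can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not FunnelWeb's scanner: the name strip_comments is hypothetical, the special character is assumed to be @, and every sequence other than @! and @+ is passed through untouched.

```python
def strip_comments(text):
    """Remove @! comments (including their EOL) and expand @+ to an EOL."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "@" and i + 1 < len(text):
            kind = text[i + 1]
            if kind == "!":
                # Discard the rest of the line, including the EOL.
                eol = text.find("\n", i)
                i = len(text) if eol == -1 else eol + 1
            elif kind == "+":
                out.append("\n")
                i += 2
            else:
                # Other sequences are not interpreted in this sketch.
                out.append(text[i:i + 2])
                i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

joined = 'for (i=0;i<MAXVAL;i++)      @! Print out a[0..MAXVAL-1].\n   printf("%u\\n",a[i]);\n'
kept = 'for (i=0;i<MAXVAL;i++)      @+@! Print out a[0..MAXVAL-1].\n   printf("%u\\n",a[i]);\n'
print(strip_comments(joined))
print(strip_comments(kept))
```

The first call joins the two lines, as described above; the second keeps them separate because @+ supplies the end of line marker that the comment consumes.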

Quick Names

For macros whose name consists of a single character, FunnelWeb provides a quick name syntax as an alternative to the usual angle bracket syntax (eg. @<Sloth@>). A quick name sequence consists of @#x where x, the name of the macro, can be any printable character except space.

quick_name ::= `@#' non_space_printable .

The result is identical to the equivalent ordinary name syntax, but is shorter. For example, @#X is equivalent to @<X@>. This shorter way of writing one-character macro names is more convenient where a macro must be used very often. For example, the macro calls in the following fragment of an Ada program are a little clumsy.

@! Define @<D@> as "" to turn on debug code and "--" to turn it off.
@$@<D@>@{--@}
@<D@>assert(b>3);
@<D@>if x>7 then write("error") end if

The calls can be shortened using the alternative syntax.

@! Define @#| as "" to turn on debug code and "--" to turn it off.
@$@#|@{--@}
@#|assert(b>3);
@#|if x>7 then write("error") end if
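
The equivalence between the two syntaxes can be expressed as a rewrite from @#x to @<x@>. The Python sketch below is illustrative only: the name expand_quick_names is hypothetical, and it assumes that the special character is @ throughout and that @= is not used in the text.

```python
def expand_quick_names(text):
    """Rewrite each quick name @#x as the ordinary name @<x@>."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("@@", i):
            # An inserted special character must not start a quick name.
            out.append("@@")
            i += 2
        elif text.startswith("@#", i):
            name = text[i + 2:i + 3]
            # The name may be any printable character except space (33--126).
            if not name or not (33 <= ord(name) <= 126):
                raise ValueError("@# needs a printable non-space character")
            out.append("@<" + name + "@>")
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(expand_quick_names("@$@#|@{--@}"))
print(expand_quick_names("@#|assert(b>3);"))
```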

Inserting End of Line Markers

An end of line marker/character can be inserted into the text using the @+ sequence. This is exactly equivalent to a real end of line in the text at the point where it occurs. While this feature may sound rather useless, it is very useful for laying out the input file. For example, the following input data for a database program

Animal = Kangaroo
Size   = Medium
Speed  = Fast

Animal = Sloth
Size   = Medium
Speed  = Slow

Animal = Walrus
Size   = Big
Speed  = Medium

can be converted into

Animal = Kangaroo  @+Size = Medium  @+Speed = Fast    @+
Animal = Sloth     @+Size = Medium  @+Speed = Slow    @+
Animal = Walrus    @+Size = Big     @+Speed = Medium  @+

which is easier to read, and more easily allows comparisons between records.
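
The effect of @+ on one of these records can be sketched in Python as follows. The name insert_eols is hypothetical, the special character is assumed to be @, and no other special sequences are interpreted.

```python
def insert_eols(text):
    """Replace each @+ with an end of line marker."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("@@", i):
            # Leave an inserted special character alone.
            out.append("@@")
            i += 2
        elif text.startswith("@+", i):
            out.append("\n")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

record = "Animal = Kangaroo  @+Size = Medium  @+Speed = Fast    @+\n"
for line in insert_eols(record).split("\n"):
    print(repr(line))
```

Note that the blanks used to align the columns before each @+ remain in the text as trailing blanks, and that the final @+ followed by the real end of line yields the blank line that separates records.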

Suppressing End of Line Markers

End of line markers can be suppressed by the @- sequence. A single occurrence of a @- sequence suppresses only the end of line marker following it and must appear immediately before the end of line marker to be suppressed. No trailing spaces, @! comments, or any other characters are permitted between a @- sequence and the end of line that it is supposed to suppress. The @- sequence is useful for constructing long output lines without them having to appear in the input. It can also be used in the same way as the @+ was used in the previous section to assist in exposing the structure of output text without affecting the output text itself. Finally, it is invaluable for suppressing the EOL after the opening macro text @{ construct. For example:

@$@<Walrus@>@{@-
I am the walrus!@}

is equivalent to

@$@<Walrus@>@{I am the walrus!@}

The comment construct (@!) can also be used to suppress end of lines. However, the @- construct should be preferred for this purpose as it makes explicit the programmer's intent to suppress the end of line.
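
The rule that @- must sit immediately before the end of line can be sketched in Python. The name suppress_eols is hypothetical, the special character is assumed to be @, and raising an exception is only a stand-in for whatever diagnostic FunnelWeb actually issues.

```python
def suppress_eols(text):
    """Remove each @- together with the end of line marker that follows it."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("@@", i):
            # Leave an inserted special character alone.
            out.append("@@")
            i += 2
        elif text.startswith("@-", i):
            # Nothing may come between @- and the end of line.
            if text[i + 2:i + 3] != "\n":
                raise ValueError("@- must be immediately followed by end of line")
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(suppress_eols("@$@<Walrus@>@{@-\nI am the walrus!@}"))
```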

Include Files

FunnelWeb provides an include file facility with a maximum depth of 10. When FunnelWeb sees a line of the form @i <filename>, it replaces the entire line (including the EOL) with the contents of the specified include file. FunnelWeb's include file facility is intended to operate at the line level. If the last line of the include file is not terminated by an EOL, FunnelWeb issues a warning and inserts one (in the copy in memory). In Eli, an include file must have type `.fwi'.

The @i construct is illegal if it appears anywhere except at the start of a line. The construct must be followed by a single blank. The file name is defined to be everything between the blank and the end of the line (no comments (@!) please!). Example: If the input file is

"Uh Oh, It's the Fuzz.  We're busted!" said Baby Bear.
@i mr_plod.txt
"Quick! Flush the stash down the dunny and let's split." said Father Bear.

and there is a file called `mr_plod.txt' containing

"'Ello, 'Ello, 'Ello! What's all this 'ere then?" Mr Plod exclaimed.

then the scanner translates the input file into

"Uh Oh, It's the Fuzz.  We're busted!" said Baby Bear.
"'Ello, 'Ello, 'Ello! What's all this 'ere then?" Mr Plod exclaimed.
"Quick! Flush the stash down the dunny and let's split." said Father Bear.

As a point of terminology, FunnelWeb calls the original input file the input file and calls include files and their included files include files.

The include file construct operates at a very low level. An include line can appear anywhere in the input file regardless of the context of the surrounding lines.

FunnelWeb sets the special character to the default (@) at the start of each include file and restores it to its previous value at the end of the include file. This allows macro libraries to be constructed and included that are independent of the prevailing special character at the point of inclusion. The same goes for the input line length limit which is reset to the default value at the start of each include file and restored to its previous value afterwards.
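
The line-level behaviour of @i can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than FunnelWeb's implementation: the name expand_includes is hypothetical, files are held in a dictionary instead of being read from disk, the depth limit is taken to mean ten nested levels of inclusion, and the resetting of the special character and line length limit is not modelled.

```python
MAX_DEPTH = 10  # assumption: ten nested levels of inclusion are allowed

def expand_includes(text, files, depth=0):
    """Replace each '@i <filename>' line with the contents of that file."""
    out = []
    for line in text.splitlines(keepends=True):
        # @i is legal only at the start of a line, followed by a single blank.
        if line[:3].lower() == "@i ":
            if depth >= MAX_DEPTH:
                raise RecursionError("include files nested too deeply")
            # The file name is everything between the blank and the EOL.
            name = line[3:].rstrip("\n")
            body = files[name]
            # Supply the missing EOL on the last line (FunnelWeb warns here).
            if body and not body.endswith("\n"):
                body += "\n"
            out.append(expand_includes(body, files, depth + 1))
        else:
            out.append(line)
    return "".join(out)

files = {"mr_plod.txt": "\"'Ello, 'Ello, 'Ello! What's all this 'ere then?\" Mr Plod exclaimed."}
main = (
    "\"Uh Oh, It's the Fuzz.  We're busted!\" said Baby Bear.\n"
    "@i mr_plod.txt\n"
    "\"Quick! Flush the stash down the dunny and let's split.\" said Father Bear.\n"
)
print(expand_includes(main, files), end="")
```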

Maximum Input Line Length

FunnelWeb generates an error for each input line that exceeds a certain maximum number of characters. At the start of the processing of each input file and each include file, this maximum is set to a default value of 80. However, the maximum can be changed using a maximum input line length pragma.

pragma_mill ::= ps `maximum_input_line_length' s `=' s numorinf .

ps ::= (`@p' / `@P') ` ' .

number ::= { decimal_digit }+ .

numorinf ::= number / `infinity' .

s ::= {` '}+ .

The maximum input line length can be varied dynamically throughout the input file. Each maximum input line length pragma's scope covers the line following the pragma through to and including the next maximum input line length pragma, but not covering any intervening include files. At the start of an include file, FunnelWeb resets the maximum input line length to the default value. It restores it to its previous value at the end of the include file.
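
The scoping rule can be illustrated with a short Python sketch that reports the numbers of over-long lines in a single file. The name overlong_lines is hypothetical, include files are not modelled, and the regular expression is only a transcription of the pragma syntax given above.

```python
import re

# ps `maximum_input_line_length' s `=' s numorinf, where s is one or more blanks.
_PRAGMA = re.compile(r"@[pP] maximum_input_line_length +={1} +([0-9]+|infinity)")

def overlong_lines(text, default=80):
    """Return the 1-based numbers of lines that exceed the limit in force."""
    limit = default  # None stands for 'infinity'
    bad = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # A pragma line is itself checked against the limit that precedes it.
        if limit is not None and len(line) > limit:
            bad.append(lineno)
        m = _PRAGMA.fullmatch(line)
        if m:
            # The new limit applies from the line following the pragma.
            limit = None if m.group(1) == "infinity" else int(m.group(1))
    return bad

sample = "\n".join([
    "short line",
    "x" * 81,
    "@p maximum_input_line_length = 100",
    "x" * 81,
    "@p maximum_input_line_length = infinity",
    "x" * 500,
])
print(overlong_lines(sample))
```

Only the second line of the sample is reported: it is checked against the default of 80, whereas the later long lines fall under the raised limits.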

This pragma is useful for detecting text that has strayed off the right side of the screen when editing. If you use FunnelWeb, and set the maximum input line length to be the width of your editing window, you will never be caught by, for example, off-screen opening comment symbols. You can also be sure that your source text can be printed raw, if necessary, without lines wrapping around.

