Next: Describing a Complete Compiler Up: Beyond LEX and YACC: Previous: Beyond LEX and YACC:

Introduction

LEX and YACC (or FLEX and BISON) are useful because they allow a designer to describe the scanning and parsing problems to be solved, rather than requiring detailed solutions. Such descriptions, which explicitly list properties that a solution must have, are called declarative specifications. A program is an operational specification, which gives a recipe that has the desired properties without listing those properties explicitly.

Declarative specifications are easier than operational specifications to develop and maintain, because they describe the properties of the problem rather than the details of the solution. The cost of this advantage, however, is that designers and maintainers must learn more languages in order to use declarative specifications than to use operational specifications. For example, to specify scanners and parsers declaratively, a designer must learn the languages of regular expressions and context-free grammars; only a programming language must be learned to specify them operationally.

A declarative specification must ultimately be turned into code if the problem it describes is to be solved. The mechanism for this translation may be a program, or it may be a human. Having a program to do the translation greatly increases the usefulness of the specification, but it also requires an investment in software. Such investments only make sense when many instances of the given problem, each with different properties, must be solved. Clearly the scanning and parsing problems fit this profile, and hence a number of software tools are available for translating regular expressions and context-free grammars into scanning and parsing code.

If a technique is to be applicable to a large number of problem instances, it is usually incapable of describing any of those instances completely. Thus a mixture of declarative and operational specifications will be required. For example, a context-free grammar can describe a parsing problem, but that parsing problem is only a part of the compilation problem. The declarative context-free grammar is therefore augmented by operational specifications in the form of code fragments to be executed when the parser recognizes certain phrases.

As our understanding of a class of problems improves, we usually are able to recognize more things that are common to solutions of problems in that class. These common aspects of the problem solutions can then be embodied in additional declarative specification techniques for the problem class. The compilation problem has certainly followed this route: Algorithmic conversion of expressions to a machine-oriented form was first described in 1952 [Rut52], context-free grammars appeared in 1956 [Cho56] and were used as a declarative specification in the ALGOL 60 Report [Nau63]. Context-free grammars were initially translated to code by hand using recursive descent, a technique that remains viable today [Knu68].

In Section 2 I will show how the use of declarative specifications can be extended beyond the scanning and parsing subproblems of the compilation problem. Section 3 explains the concept of a domain-specific programming environment that embodies our understanding of the structure of a compiler, and shows how it is used to integrate the tools that translate the declarative specifications. A complete example of a simple compiler is given in Section 4.

Next: Describing a Complete Compiler Up: Beyond LEX and YACC: Previous: Beyond LEX and YACC:

2007-05-18