Parsifal Software


XIDEK
Extensible Interpreter Development Kit
Reference documentation


Abstract Syntax Tree Parser

Introduction

The baseline Abstract Syntax Tree parsers, cll\base\ast.syn and pll\base\ast.syn, are used to create Abstract Syntax Trees for the CLL and PLL languages respectively. These trees, in turn, are used by the astxi and astci interpreters, which implement the buildTree function defined in support\base\astdefs.h.

The reduction procedures use the public member functions of the AbstractSyntaxTree class. Each of these functions creates a node for the parse tree and returns a pointer to the node. When the entire tree has been completed, a pointer to the root node is stored in the root member field of the tree object.

In what follows, we first describe the features of the CLL version of the ast.syn syntax file and how it differs from the cll.syn file. The PLL version of ast.syn differs from pll.syn in the same ways.


C Prologue

The C prologue in ast.syn contains the declaration of the Tree struct, derived from AbstractSyntaxTree.

The Tree struct provides linkage between the parser and the syntax tree code. It contains one member field:

  parse_pcb_struct *pcbPointer;
a pointer to the parser control block, and overrides context(), defined in AbstractSyntaxTree, to provide the correct context information for each node in the tree. The implementations of context() and the class constructor are found in the embedded C portion of the syntax file.
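
The linkage described above can be pictured with a minimal sketch. Only the names Tree, AbstractSyntaxTree, FileLocation, pcbPointer and context() come from this documentation; the member layouts, the ControlBlock type and the constructor signature are assumptions made for illustration and are not the kit's actual declarations.

```cpp
#include <cassert>

// Hypothetical stand-ins; the kit's own declarations are in its headers.
struct FileLocation {
  int line;
  int column;
};

struct AbstractSyntaxTree {
  virtual ~AbstractSyntaxTree() {}
  // Place-holder version: the base class knows nothing about the parser.
  virtual FileLocation context() const { return FileLocation{0, 0}; }
};

// Assumed, pared-down control block holding only a current position.
struct ControlBlock {
  int line;
  int column;
};

struct Tree : AbstractSyntaxTree {
  ControlBlock *pcbPointer;  // linkage back to the parser

  explicit Tree(ControlBlock *pcb) : pcbPointer(pcb) {}

  // Report where the parser currently is, so new nodes can record it.
  FileLocation context() const override {
    assert(pcbPointer != nullptr);
    return FileLocation{pcbPointer->line, pcbPointer->column};
  }
};
```

Because context() is virtual, code written against AbstractSyntaxTree picks up the parser's position without knowing anything about the control block.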

The macro AST is used simply to make the reduction procedures appear less forbidding.


Configuration Section

Configuration parameters that are common to all the parsers in the kit are discussed here. The following configuration parameters are set in ast.syn:

wrapper {AgStack<AstNode *>}
A wrapper declaration tells AnaGram to ensure that constructors and destructors are properly called when AgStack objects are stored on the parser value stack. If the wrapper declaration were not used, the objects would be stored on the stack by coercing a pointer. This would cause the AgStack class to malfunction. The rule of thumb is that if a class or any of its member fields overrides the assignment operator, it should have a wrapper declared.

parser name = parse
This statement causes the generated parser function to be named parse(). The struct defining the parser control block will be named parse_pcb_struct and a typedef parse_pcb_type equivalent to struct parse_pcb_struct is also defined.

default token type = AstNode *
This declaration causes the nonterminal tokens in the grammar, unless otherwise specified, to have the type AstNode *.
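
The need for the wrapper declaration above can be illustrated in plain C++. The sketch below is not AnaGram's generated code; Slot and Tracked are hypothetical names. It only shows the general idea that an object kept in raw stack storage has to be constructed in place and destroyed explicitly, which is the service the text attributes to a wrapper.

```cpp
#include <cassert>
#include <new>

// Hypothetical value-stack slot: raw bytes plus explicit lifetime control.
template <class T>
class Slot {
  alignas(T) unsigned char storage[sizeof(T)];
  T *object;  // null while the slot is empty

public:
  Slot() : object(nullptr) {}
  Slot(const Slot &) = delete;
  Slot &operator=(const Slot &) = delete;
  ~Slot() { clear(); }

  // Copy-construct the value in place, destroying any previous occupant.
  void store(const T &value) {
    clear();
    object = new (storage) T(value);
  }

  T &get() {
    assert(object != nullptr);
    return *object;
  }

  // Run the destructor explicitly; the raw bytes stay where they are.
  void clear() {
    if (object != nullptr) {
      object->~T();
      object = nullptr;
    }
  }
};

// Demonstration type that counts live instances.
struct Tracked {
  static int live;
  Tracked() { ++live; }
  Tracked(const Tracked &) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;
```

Storing a Tracked object in a Slot and then clearing the slot leaves the live count balanced, which a plain byte copy into the storage would not guarantee.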


Parser Control Block Extensions

In order to support reentrancy, it is convenient to declare local data used by the parser in the parser control block. It is also convenient to declare functions used by the parser as members of the parser control block, though this is not, strictly speaking, necessary.


Added Fields

Tree ast;
This is the abstract syntax tree that the parser builds. The root node is set only as a final step. Until that time, the tree object largely serves to log the nodes that are created in the course of building the tree, so that if there is an error, memory will be correctly housekept.

int loopDepth;
loopDepth keeps track of the nesting of loops while the parser runs. It is initialized to zero by the constructor, incremented on entering a while, do-while, or for loop, and decremented on exit. It is inspected when a break or continue statement is encountered, so that an error can be reported if the statement occurs outside a loop.
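
The logging role of the ast field described above can be sketched as follows. This is an illustration under stated assumptions, not the AbstractSyntaxTree implementation: NodeLog, DemoNode, make() and buildAndFail() are hypothetical names, and the real class offers many node-creating functions rather than one.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical node type that counts live instances for demonstration.
struct DemoNode {
  static int live;
  DemoNode() { ++live; }
  ~DemoNode() { --live; }
};
int DemoNode::live = 0;

// Hypothetical tree object: every node it creates is logged and owned.
class NodeLog {
  std::vector<DemoNode *> nodes;

public:
  DemoNode *root;  // set only as the final step of a successful parse

  NodeLog() : root(nullptr) {}
  NodeLog(const NodeLog &) = delete;
  NodeLog &operator=(const NodeLog &) = delete;
  ~NodeLog() {
    for (DemoNode *n : nodes) delete n;
  }

  DemoNode *make() {
    nodes.push_back(nullptr);  // reserve the slot before allocating
    nodes.back() = new DemoNode;
    return nodes.back();
  }
};

// Creates two nodes, then fails before any root has been set.
void buildAndFail() {
  NodeLog log;
  log.make();
  log.make();
  throw std::runtime_error("syntax error");
}
```

When the exception propagates out of buildAndFail(), the NodeLog destructor releases both nodes even though no root was ever assigned.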


Local Functions

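The parser control block extensions declare three member functions: the constructor parse_pcb_struct(const char *text), reportError(), and checkLoop(). Their implementations are found in the embedded C portion of the syntax file.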


Reduction Procedures

The useful work of any parser is carried out by the reduction procedures, which indicate what is to be done when a grammar rule is matched. In the case of the abstract syntax tree parser, parse(), apart from the lexical rules, reduction procedures simply combine nodes furnished by the tokens in a rule to create a new node for the tree. The functions used are described in astdefs.htm.


Embedded C

The embedded C section of the syntax file contains support code for the parser.

Macro Definitions

There are two macro definitions:
SYNTAX_ERROR
This definition overrides the default definition of SYNTAX_ERROR that AnaGram provides. The default definition simply writes the diagnostic to stderr and causes the parser to return with the exit_flag field of the parser control block set to AG_SYNTAX_ERROR_CODE. The overriding definition calls reportError(), which formats an error message and throws an exception.

GET_CONTEXT
The GET_CONTEXT macro implements the context tracking feature of the AnaGram parser. In particular, it creates a FileLocation object that describes the current location in the input file and stores it on the context stack.
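
The SYNTAX_ERROR override described above might look roughly like the sketch below. ErrorPcb, its fields, the message format, the PCB definition and engineStep() are assumptions chosen for illustration; only the names SYNTAX_ERROR and reportError() are taken from this documentation.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical, pared-down control block.
struct ErrorPcb {
  const char *error_message;
  int line;
  int column;

  // Format the diagnostic and abandon the parse by throwing.
  void reportError() const {
    std::ostringstream msg;
    msg << error_message << ", line " << line << ", column " << column;
    throw std::runtime_error(msg.str());
  }
};

// Assumed: PCB names the control block that is in scope in the engine.
#define PCB (*pcb)
#define SYNTAX_ERROR PCB.reportError()

// Stand-in for the point in a parsing engine where an error is detected.
void engineStep(ErrorPcb *pcb, bool tokenIsValid) {
  if (!tokenIsValid) SYNTAX_ERROR;
}
```

Throwing from the macro means the caller of the parser handles errors in a catch block instead of testing exit_flag after the parser returns.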


Tree Member Functions

FileLocation Tree::context() const;
This function overrides the place-holder function defined in the AbstractSyntaxTree class. The purpose of the function is to provide location information in each node of the tree.

Parser Control Block Member Functions

The following member functions of parse_pcb_struct are declared in the extensions to the parser control block.
parse_pcb_struct(const char *text);
A constructor is a convenient way to initialize the pointer field of the parser control block.

void reportError();
The parsing engine calls this function (as directed by the SYNTAX_ERROR macro) when it encounters a syntax error. The function formats the error information and throws an exception.

void checkLoop();
The checkLoop() function verifies that break and continue statements are inside loops. If no loop is active, an exception is thrown.
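
The interplay between the loopDepth field and checkLoop() can be sketched as follows. LoopTracker, enterLoop() and exitLoop() are hypothetical names; in the syntax file the increments and decrements would be performed by the grammar's reduction procedures, and the exception type shown here is an assumption.

```cpp
#include <stdexcept>

// Hypothetical, pared-down control block: only the loop bookkeeping.
struct LoopTracker {
  int loopDepth;

  LoopTracker() : loopDepth(0) {}

  void enterLoop() { ++loopDepth; }  // on entering while, do-while, or for
  void exitLoop() { --loopDepth; }   // on leaving the loop body

  // Called when a break or continue statement is recognized.
  void checkLoop() const {
    if (loopDepth == 0) {
      throw std::runtime_error("break or continue outside a loop");
    }
  }
};
```

A counter is sufficient here because the check only needs to know whether any loop is active, not which one.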


External Interface Function

AbstractSyntaxTree buildTree(const char *text);
This function implements the external interface to the ast parser as defined in support\base\astdefs.h. It accomplishes this by declaring a parser control block, invoking the parser, and then returning the ast field of the parser control block.
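
The three steps just listed can be sketched as follows. Everything here is a stand-in: BuildTreeDemo, BuildPcb, the trivial parse() body and buildTreeSketch() are hypothetical, and the calling convention of the generated parser is an assumption.

```cpp
// Hypothetical tree type; root stays null until the parse completes.
struct BuildTreeDemo {
  const void *root;
  BuildTreeDemo() : root(nullptr) {}
};

// Hypothetical control block with the added ast field.
struct BuildPcb {
  const char *pointer;  // input text, initialized by the constructor
  BuildTreeDemo ast;
  explicit BuildPcb(const char *text) : pointer(text) {}
};

// Stand-in for the generated parser: it only marks the tree as complete.
void parse(BuildPcb *pcb) { pcb->ast.root = pcb->pointer; }

BuildTreeDemo buildTreeSketch(const char *text) {
  BuildPcb pcb(text);  // 1. declare a parser control block
  parse(&pcb);         // 2. invoke the parser
  return pcb.ast;      // 3. hand the finished tree back to the caller
}
```

Keeping the control block local to the function is what makes the interface reentrant: each call works on its own parser state.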





Copyright © 1997-2002, Parsifal Software.
All Rights Reserved.