The Parser Class

Setting up the Parser Class

To make a C++ class for the parser, you should have a grammar file that looks like this:
    %{ 
    ... // Normal declaration stuff goes here 
    #define YLMM_PARSER_CLASS parser
    #define YLMM_LEX_STATIC 
    #include <ylmm/yaccmm.hh> 
    %}
    ... // The rest is as usual

The file ylmm/yaccmm.hh redefines the usual Bison macros to forward calls to the user-provided class (or its base class ylmm::basic_parser). The macro YYPRINT is defined to forward calls to ylmm::basic_parser::print, YYFPRINTF to ylmm::basic_parser::message, YYFPRINTF to ylmm::basic_parser::trace, and yyerror to ylmm::basic_parser::error. YYLTYPE is defined to be ylmm::location, and YYLLOC_DEFAULT to call ylmm::location::last. To customize the behaviour, override the corresponding member functions in your derived classes.
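
The forwarding idea can be sketched in isolation. The following is a minimal illustration and not the contents of ylmm/yaccmm.hh: the names demo_parser, quiet_parser, g_parser and DEMO_ERROR are hypothetical, chosen only to show how a C-style macro call can end up in a virtual member function that a derived class overrides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser base class (not the ylmm API).
class demo_parser {
public:
  virtual ~demo_parser() {}
  // Default error handler: remember the message.
  virtual void error(const char* msg) { log.push_back(msg); }
  std::vector<std::string> log;
};

// A derived class customizes the behaviour by overriding error().
class quiet_parser : public demo_parser {
public:
  void error(const char* msg) override {
    log.push_back(std::string("[quiet] ") + msg);
  }
};

// Hypothetical file-static pointer that the generated code can see.
static demo_parser* g_parser = nullptr;

// The macro forwards the C-style call to the member function.
#define DEMO_ERROR(msg) (g_parser->error(msg))

// Stand-in for generated C code that only knows about the macro.
static void generated_code_reports_error() {
  assert(g_parser != nullptr);
  DEMO_ERROR("syntax error");
}
```

Because the call goes through a virtual function, the generated code never needs to know which derived class is in use.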

You are free to define macros YYPARSE_PARAM, YYLEX_PARAM, YYERROR_VERBOSE, YYLSP_NEEDED, and so on.

The file ylmm/yaccmm.hh defines the interface to the generated C function via a C++ class, where YLMM_PARSER_CLASS is the name of the user parser class. This class can be either a specific instantiation of ylmm::basic_parser or a subclass of such an instantiation.

The macro YLMM_LEX_STATIC must be defined if the Yacc input file isn't a pure parser. If it's defined, the static function int yylex() will be defined. If your grammar uses location information, then you need to define YLMM_LEX_STATIC_LOCATION instead of YLMM_LEX_STATIC. Location information comes about in a Bison grammar via the use of @$ or @N (where N is a number) in the grammar rules or an explicit definition of the preprocessor constant YYLSP_NEEDED in the declaration part.
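
To make the role of a static yylex more concrete, here is a rough sketch of the pattern. It is an assumption about the general shape only, not the code that YLMM_LEX_STATIC expands to: demo_parser, countdown_parser and current_parser are hypothetical names. The argument-free yylex has nowhere to receive a parser object from, so it relies on a single shared pointer, which is why this arrangement does not suit a reentrant parser.

```cpp
#include <cassert>

// Hypothetical parser interface (not the ylmm API): scan() returns the
// next token number, with 0 meaning end of input.
class demo_parser {
public:
  virtual ~demo_parser() {}
  virtual int scan() = 0;
};

// A toy parser that hands out a fixed sequence of token numbers.
class countdown_parser : public demo_parser {
public:
  explicit countdown_parser(int n) : _left(n) {}
  int scan() override { return _left > 0 ? _left-- : 0; }
private:
  int _left;
};

// Non-reentrant set-up: one file-static pointer shared by every call.
static demo_parser* current_parser = nullptr;

// Argument-free, file-static lexer entry point, in the form a non-pure
// Bison parser calls it.
static int yylex() {
  assert(current_parser != nullptr);
  return current_parser->scan();
}
```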

If the semantic token type isn't defined before inclusion of ylmm/yaccmm.hh, then it is defined to be ylmm::basic_parser::token_type of the specific instantiation.

If the grammar defines a pure (that is, reentrant) parser (via the Bison directive %pure_parser), then YLMM_LEX_STATIC must not be defined.

A derived class must define the member function scan. Hence, a minimal derived class looks like this:

    #ifndef YLMM_basic_parser
    #include <ylmm/basic_parser.hh>
    #endif

    class parser : public ylmm::basic_parser<YYSTYPE> { 
    public:
      int  scan(); 
    };

The scan member function should be defined in the regular way of yylex: it should read a token from the input and return its semantic token number (see also the Bison documentation).
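
As a rough illustration of a yylex-style scan function, the sketch below reads integers and newlines from a std::istream. It is written as a free function so that it stands alone; the token numbers, the value out-parameter and the tokens_of helper are assumptions made for this example (a real parser would take the token numbers from the header generated by Bison).

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token numbers for this sketch.
enum { END_OF_INPUT = 0, NUM = 258, NEWLINE = 259 };

// Read one token from `in`, store its value in `value` (for NUM), and
// return the token number, in the style of yylex.
int scan(std::istream& in, int& value) {
  const int eof = std::istream::traits_type::eof();
  int c = in.get();
  // Skip blanks and tabs, but not newlines: those are tokens.
  while (c == ' ' || c == '\t') c = in.get();
  if (c == eof) return END_OF_INPUT;
  if (c == '\n') return NEWLINE;
  if (std::isdigit(c)) {
    value = 0;
    while (std::isdigit(c)) {
      value = 10 * value + (c - '0');
      c = in.get();
    }
    if (c != eof) in.unget();  // put back the first non-digit
    return NUM;
  }
  // Any other character is returned as its own character code.
  return c;
}

// Convenience helper for experiments: scan a whole string and collect
// the token numbers.
std::vector<int> tokens_of(const std::string& text) {
  std::istringstream in(text);
  std::vector<int> out;
  int value = 0;
  for (int t = scan(in, value); t != END_OF_INPUT; t = scan(in, value))
    out.push_back(t);
  return out;
}
```

Returning 0 at end of input follows the yylex convention that a zero token number tells the parser there is nothing more to read.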

If the application uses the ylmm::basic_parser template directly, the ylmm::basic_parser::scan member function must be defined for the particular instantiation, as the default behaviour is to end parsing immediately.

See also the example simple_parser.yy for an example of simple usage, and toycalc_parser.yy for a more complex usage.

Example Grammar

The grammar is defined in the usual way. Note that a pointer to the parser is passed as the parameter _parser to the Bison-generated C function. Hence, you can use that pointer in the actions of the grammar.

Below follows a simple example of a parser of integers. This is reproduced from the example simple_parser.yy. See also the example toycalc_parser.yy for a more complex example.

First, we set up the usual stuff (see Setting up the Parser Class).

%{
  /* Declarations */
#include "simple_parser.hh"
#define YLMM_PARSER_CLASS simple_parser
#define YLMM_LEX_STATIC
#include <ylmm/yaccmm.hh>
%}

Next we define all the tokens and precedence rules.

%token NUM
%token NEWLINE
%%

Finally, we write the production rules, using the pointer to the parser object (via the _parser static variable) in the actions.

input   : /* empty string */
        | input line
        ;

line    : NEWLINE       { $$ = _parser->result();   }
        | NUM NEWLINE   { $$ = _parser->result($1); }
        | error NEWLINE { yyerrok;                 }
        ;
%%
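
To see what these rules accept, it may help to read them as a plain loop over tokens. The sketch below is a hand-written approximation, not what Bison generates: the token struct, the token numbers and the errors counter are assumptions, and the error branch only imitates the effect of the error NEWLINE rule (skip to the end of the line) rather than Bison's actual recovery algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum { NUM = 258, NEWLINE = 259 };  // hypothetical token numbers

struct token { int type; int value; };

// Hand-written approximation of
//   input : /* empty */ | input line ;
//   line  : NEWLINE | NUM NEWLINE | error NEWLINE ;
// Returns the values that would be passed to result() for each
// well-formed line; `errors` counts lines skipped via the error rule.
std::vector<int> parse(const std::vector<token>& toks, int& errors) {
  std::vector<int> results;
  errors = 0;
  std::size_t i = 0;
  while (i < toks.size()) {
    if (toks[i].type == NEWLINE) {                // line : NEWLINE
      results.push_back(0);                       // result() defaults to 0
      i += 1;
    } else if (toks[i].type == NUM && i + 1 < toks.size()
               && toks[i + 1].type == NEWLINE) {  // line : NUM NEWLINE
      results.push_back(toks[i].value);
      i += 2;
    } else {                                      // line : error NEWLINE
      ++errors;
      while (i < toks.size() && toks[i].type != NEWLINE) ++i;
      if (i < toks.size()) ++i;                   // consume the NEWLINE
    }
  }
  return results;
}
```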

The Parser Class

The user-defined parser class can then be set up to build a parse tree, using its member functions.

Below follows a simple example of a parser of integers. This is reproduced from the example simple_parser.hh. See also toycalc_parser.hh for a more complex example.

Here, we use an object of a ylmm::basic_scanner derived class to do the lexical scanning. That class will be explored in Example Lexical Definition below.

class simple_parser : public ylmm::basic_parser<int> 
{
private:
  ylmm::basic_scanner<int>& _scanner; /** Reference to scanner */
public:
  /** Constructor
      @param s Reference to scanner */
  simple_parser(ylmm::basic_scanner<int>& s) : _scanner(s) 
  {}
  /** Destructor  */
  virtual ~simple_parser() {}

We define the needed member functions (see Setting up the Parser Class), as well as a couple of utility functions. The error member functions use the data member ylmm::basic_parser::_err_stream for output. The point is that the user can set these streams so that all error messages are treated the same when using other libraries, etc.

  /** Scan the input via forwarded call to the scanner 
      @param arg Optional argument. 
      @return the scanner token ID */
  int scan(void* arg=0) 
  { 
    return _scanner.next(token()); 
  }
  /** On errors, advance one line on error stream and show the prompt.  
      @param m The message to print */
  void fatal(const char* m) 
  { 
    ylmm::basic_parser<int>::fatal(m); 
    if (_messenger) _messenger->error_stream() << std::endl; 
  }

And finally, we define the member functions that deal with the semantic tokens generated by the production rules in the parser file. Notice that we again use the data member ylmm::basic_parser::_msg_stream for output so that we can ensure coherent output from the client application.

  /** Process an expression 
      @param val The value of the message 
      @return @a val */
  int result(int val=0) 
  { 
    message("\t'%d'\n",  val); return val; 
  }
};
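
The idea of routing every message through one replaceable stream can be sketched without the library. In the following illustration, demo_messenger and set_stream are hypothetical names and not part of the ylmm API; the free function result mirrors the result member shown above, formatting its argument printf-style and writing it to whatever stream the client has selected.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <iostream>
#include <ostream>
#include <sstream>

// Hypothetical helper class (not the ylmm API): formats printf-style
// messages and writes them to a stream the client can replace.
class demo_messenger {
public:
  demo_messenger() : _out(&std::cout) {}
  // Redirect all subsequent messages to `out`.
  void set_stream(std::ostream& out) { _out = &out; }
  // Format like printf and write the result to the current stream.
  void message(const char* fmt, ...) {
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, ap);  // truncates long output
    va_end(ap);
    *_out << buf;
  }
private:
  std::ostream* _out;
};

// Mirrors the result() member shown above: report the value, return it.
int result(demo_messenger& m, int val = 0) {
  m.message("\t'%d'\n", val);
  return val;
}
```

Since the stream is held by pointer, a client application can send parser output to a log file or a string buffer without touching the parser code.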
Christian Holm
Last update Fri Jul 8 12:58:03 2005