Aachen University of Technology,
Dept. of CS,
Lehrstuhl für Informatik II,
Research,
FP
The Haskell Parser
Purpose
We developed a parser for Haskell for two reasons:
Implementation
The first version of our Haskell parser has been written by the student Philipp van Hüllen with the parser combinators of Graham Hutton and Erik Meijer, including their extension for handling the offside rule.
The current version has the following main limitations:
- syntax errors do not cause any error messages but lead to failure
- it is much too slow
- it is hardly tested
- it is documented in German
However, there exists a lot of literature on parser combinators and I have some ideas of my own on how to attack the first two problems.
You may download the current version from here.
Please tell us if you found it useful.
Some remarks about writing a Haskell parser
I don't consider it a good idea to have the parser directly construct
the algebraic data structure of the abstract syntax tree. You never know
the needs of users of the parser. Hence our parser constructs the
abstract syntax tree by using functions, not data constructors. These
functions are defined in another module. Currently in the only such
module we have, all functions are defined as data constructors of the
algebraic data types which are defined in a third module. For defining
these algebraic data types we did use GHC's HsSyn as model (substituting
our parser for GHC's would be a really good test).
There is still the problem of which information should be passed to the
functions that construct the syntax tree. Source location information is
already collected because of the off-side rule. So every syntactic
construct is told its start location. However, should it also know its
end location?
Some applications may require parsing comments as well.
There is also the problem of how to handle fixities of imported
operators. GHC takes the approach of first parsing all operators as
being left-associative with identical priority. After all imports are
processed (and hence fixities are known), another compiler phase
modifies the syntax tree accordingly. We currently take this approach as well.
Laziness would actually allow the neat trick of calling the (complete)
parser function with a table of fixities, which is calculated from the
syntax tree which is produced by the parser. I just fear that this
method may lead to an obscure order of error messages.
Olaf Chitil,
chitil@i2.informatik.rwth-aachen.de,
Last update: November 14, 1997