Parsley: optimising and improving parser combinators
Author(s)
Willis, Jamie
Type
Thesis or dissertation
Abstract
Parser combinators are a functional abstraction for parsing that abstracts hand-written recursive-descent
parsers behind a high-level set of combinators. While these kinds of parsers are popular in the
functional programming community, they have been historically criticised:
* Parser combinator performance is sub-par compared with handwritten parsers.
* The high-level grammar is obscured by the combinators compared with parser generators.
* The error messages generated by parser combinators are not of fantastic quality.
This dissertation addresses each of these complaints with the work split across two libraries,
both called `parsley`: one in Haskell and the other in Scala. Within these libraries,
different issues are tackled.
Haskell `parsley` addresses the performance concerns of parser combinators by modeling
them as a strongly-typed deep embedding, allowing for optimisations and analysis to be
performed. To eliminate the overheads of interpretation and achieve high performance,
`parsley` makes use of staged metaprogramming to convert the continuation-passing
style automaton into Haskell code resembling hand-written recursive descent parsers; this is
faster than contemporary parser combinator libraries in Haskell.
Writing parsers is often an ad-hoc exercise; this dissertation introduces parser combinator
design patterns that help structure and standardise how these parsers should be written.
These patterns focus on a handful of issues: cleanly handling precedence hierarchies and
expression parsing; organising and distinguishing between low-level tokens and higher-level
parsing; and abstracting away semantic actions and meta-data processing. This helps to make
clean, maintainable, parsers.
Finally, Scala `parsley` has focused on improving the design of parser combinator
error systems. The design of this improved system is explored as well as how it can be
implemented efficiently, minimising impact on the "happy path." This gives rise to more
parsing patterns as well as enriching existing patterns to incorporate errors. The new
patterns help provide the tools to build bespoke, descriptive, errors.
parsers behind a high-level set of combinators. While these kinds of parsers are popular in the
functional programming community, they have been historically criticised:
* Parser combinator performance is sub-par compared with handwritten parsers.
* The high-level grammar is obscured by the combinators compared with parser generators.
* The error messages generated by parser combinators are not of fantastic quality.
This dissertation addresses each of these complaints with the work split across two libraries,
both called `parsley`: one in Haskell and the other in Scala. Within these libraries,
different issues are tackled.
Haskell `parsley` addresses the performance concerns of parser combinators by modeling
them as a strongly-typed deep embedding, allowing for optimisations and analysis to be
performed. To eliminate the overheads of interpretation and achieve high performance,
`parsley` makes use of staged metaprogramming to convert the continuation-passing
style automaton into Haskell code resembling hand-written recursive descent parsers; this is
faster than contemporary parser combinator libraries in Haskell.
Writing parsers is often an ad-hoc exercise; this dissertation introduces parser combinator
design patterns that help structure and standardise how these parsers should be written.
These patterns focus on a handful of issues: cleanly handling precedence hierarchies and
expression parsing; organising and distinguishing between low-level tokens and higher-level
parsing; and abstracting away semantic actions and meta-data processing. This helps to make
clean, maintainable, parsers.
Finally, Scala `parsley` has focused on improving the design of parser combinator
error systems. The design of this improved system is explored as well as how it can be
implemented efficiently, minimising impact on the "happy path." This gives rise to more
parsing patterns as well as enriching existing patterns to incorporate errors. The new
patterns help provide the tools to build bespoke, descriptive, errors.
Version
Open Access
Date Issued
2023-08
Date Awarded
2024-03
Copyright Statement
Creative Commons Attribution Licence
License URL
Advisor
Wu, Nicolas
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)