简体   繁体   中英

Generalized Bottom up Parser Combinators in Haskell

I am wondered why there is no generalized parser combinators for Bottom-up parsing in Haskell like a Parsec combinators for top down parsing.
( I could find some research work went during 2004 but nothing after
https://haskell-functional-parsing.googlecode.com/files/Ljunglof-2002a.pdf http://www.di.ubi.pt/~jpf/Site/Publications_files/technicalReport.pdf )

Is there any specific reason for not achieving it?

This is because of referential transparency. Just as no function can tell the difference between

let x = 1:x
let x = 1:1:1:x
let x = 1:1:1:1:1:1:1:1:1:...  -- if this were writeable

no function can tell the difference between a grammar which is a finite graph and a grammar which is an infinite tree. Bottom-up parsing algorithms need to be able to see the grammar as a graph, in order to enumerate all the possible parsing states.

The fact that top-down parsers see their input as infinite trees allows them to be more powerful, since the tree could be computationally more complex than any graph could be; for example,

numSequence n = string (show n) *> option () (numSequence (n+1))

accepts any finite ascending sequence of numbers starting at n . This has infinitely many different parsing states. (It might be possible to represent this in a context-free way, but it would be tricky and require more understanding of the code than a parsing library is capable of, I think)

A bottom up combinator library could be written, though it is a bit ugly, by requiring all parsers to be "labelled" in such a way that

  • the same label always refers to the same parser, and
  • there is only a finite set of labels

at which point it begins to look a lot more like a traditional specification of a grammar than a combinatory specification. However, it could still be nice; you would only have to label recursive productions, which would rule out any infinitely-large rules such as numSequence .

As luqui's answer indicates a bottom-up parser combinator library is not a realistic. On the chance that someone gets to this page just looking for haskell's bottom up parser generator , what you are looking for is called the Happy parser generator . It is like yacc for haskell.

As luqui said above: Haskell's treatment of recursive parser definitions does not permit the definition of bottom-up parsing libraries. Bottom-up parsing libraries are possible though if you represent recursive grammars differently. With apologies for the self-promotion, one (research) parser library that uses such an approach is grammar-combinators . It implements a grammar transformation called the uniform Paull transformation that can be combined with the top-down parser algorithm to obtain a bottom-up parser for the original grammar.

@luqui essentially says, that there are cases in which sharing is unobservable. However, it's not the case in general: many approaches to observable sharing exist. Eg http://www.ittc.ku.edu/~andygill/papers/reifyGraph.pdf mentions a few different methods to achieve observable sharing and proposes its own new method:

This looping structure can be used for interpretation, but not for further analysis, pretty printing, or general processing. The challenge here, and the subject of this paper, is how to allow trees extracted from Haskell hosted deep DSLs to have observable back-edges, or more generally, observable sharing. This a well-understood problem, with a number of standard solutions.

Note that the "ugly" solution of @liqui is mentioned by the paper under the name of explicit labels . The solution proposed by the paper is still "ugly" as it uses so called "stable names", but other solutions such as http://www.cs.utexas.edu/~wcook/Drafts/2012/graphs.pdf (which relies on PHOAS) may work.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM