I am attempting to parse a boolean expression using the Happy library. The problem is that the result is not as good as I would want it when I introduce parentheses. I have made the following grammar.
Query : Expr { $1 }
Expr : Expr "OR" Term { ActOp Or $1 $3 }
| Expr "AND" Term { ActOp And $1 $3 }
| Term { Term $1 }
Term : '"' string '"' { QuotedWord $2 }
| string { Word $1 }
| '(' Expr ')' { Brack $2}
Below is the string to be parsed and the result.
"(computer AND science) OR cs" -> ActOp Or (Term (Brack (ActOp And (Term (Word "computer")) (Word "science")))) (Word "cs")
I would prefer if it was something like the following since it would be easier to interpret:
ActOp Or (ActOp And (Word "computer") (Word "science")) (Word "cs")
Edit - Full code
{
module BoolAst where
import Data.Char
import Data.List
}
%name translate
%tokentype { Token }
%token
string { TokenString $$ }
'"' { TokenQuote}
"AND" { TokenAnd }
"OR" { TokenOr }
'(' { TokenOb }
')' { TokenCb }
%%
Query : Expr { $1 }
Expr : Expr "OR" Term { ActOp Or $1 $3 }
| Expr "AND" Term { ActOp And $1 $3 }
| Term { Term $1 }
Term : '"' string '"' { QuotedWord $2 }
| string { Word $1 }
| '(' Expr ')' { Brack $2}
{
happyError :: [Token] -> a
happyError _ = error ("Parse error\n")
type Query
= Expr
data Expr
= ActOp Op Expr Term
| Term Term
deriving Show
data Op
= Or
| And
deriving Show
data Term
= QuotedWord String
| Word String
| Brack Expr
deriving Show
data Token
= TokenQuote
| TokenAnd
| TokenOr
| TokenString String
| TokenOb
| TokenCb
deriving Show
lexer :: String -> [Token]
lexer [] = []
lexer cs
| isPrefixOf "AND" cs = TokenAnd : (lexer $ drop 3 cs)
| isPrefixOf "OR" cs = TokenOr : (lexer $ drop 2 cs)
lexer (c:cs)
| isSpace c = lexer cs
| isAlpha c = lexVar (c:cs)
lexer ('"':cs) = TokenQuote : lexer cs
lexer ('(':cs) = TokenOb : lexer cs
lexer (')':cs) = TokenCb : lexer cs
lexVar cs =
case span isAlphaNum cs of
(var,rest) -> TokenString var : lexer rest
main = print $ translate . lexer $ "computer AND science OR cs"
Your data types do not have to correspond directly to the grammar rules. It is perfectly valid to have the Term nonterminal produce an Expr value. For example
data Expr
= ActOp Op Expr Expr
| QuotedWord String
| Word String
deriving Show
...
Expr : Expr "OR" Term { ActOp Or $1 $3 }
| Expr "AND" Term { ActOp And $1 $3 }
| Term { $1 } -- no need to wrap in a constructor
Term : '"' string '"' { QuotedWord $2 }
| string { Word $1 }
| '(' Expr ')' { $2 } -- no need to wrap in a constructor
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.