简体   繁体   中英

Parse String to Datatype in Haskell

I'm taking a Haskell course at school, and I have to define a Logical Proposition datatype in Haskell. Everything so far Works fine (definition and functions), and i've declared it as an instance of Ord, Eq and show. The problem comes when I'm required to define a program which interacts with the user: I have to parse the input from the user into my datatype:

type Var = String
data FProp = V Var
           | No FProp
           | Y FProp FProp
           | O FProp FProp
           | Si FProp FProp
           | Sii FProp FProp

where the formula: ¬q ^ p would be: (Y (No (V "q")) (V "p"))

I've been researching, and found that I can declare my datatype as an instance of Read.

Is this advisable? If it is, can I get some help in order to define the parsing method?

the read part of a REPL interpreter typically looks like this

repl :: ForthState -> IO () -- parser definition
repl state
    = do putStr "> " -- puts a > character to indicate it's waiting for input
         input <- getLine -- this is what you're looking for, to read a line. 
         if input == "quit" -- allows user to quit the interpreter 
            then do putStrLn "Bye!"
                    return ()
            else let (is, cs, d, output) = eval (words input) state -- your grammar definition is somewhere down the chain when eval is called on input
                 in  do mapM_ putStrLn output
                        repl (is, cs, d, [])

main = do putStrLn "Welcome to your very own interpreter!"
          repl initialForthState -- runs the parser, starting with read

your eval method will have various loops, stack manipulations, conditionals, etc to actually figure out what the user inputted. hope this helps you with at least the reading input part.

Not a complete answer, since this is a homework problem, but here are some hints.

The other answer suggested getLine followed by splitting at words . It sounds like you instead want something more like a conventional tokenizer, which would let you write things like:

(Y
   (No (V q))
   (V p))

Here's one implementation that turns a string into tokens that are either a string of alphanumeric characters or a single, non-alphanumeric printable character. You would need to extend it to support quoted strings:

import Data.Char

type Token = String

tokenize :: String -> [Token]
{- Here, a token is either a string of alphanumeric characters, or else one
 - non-spacing printable character, such as "(" or ")".
 -}
tokenize [] = []
tokenize (x:xs) | isSpace x = tokenize xs
                | not (isPrint x) = error $
  "Invalid character " ++ show x ++ " in input."
                | not (isAlphaNum x) = [x]:(tokenize xs)
                | otherwise = let (token, rest) = span isAlphaNum (x:xs)
                              in token:(tokenize rest)

It turns the example into ["(","Y","(","No","(","V","q",")",")","(","V","p",")",")"] . Note that you have access to the entire repertoire of Unicode.

The main function that evaluates this interactively might look like:

main = interact ( unlines . map show . map evaluate . parse . tokenize )

Where parse turns a list of tokens into a list of ASTs and evaluate turns an AST into a printable expression.

As for implementing the parser, your language appears to have similar syntax to LISP, which is one of the simplest languages to parse; you don't even need precedence rules. A recursive-descent parser could do it, and is probably the easiest to implement by hand. You can pattern-match on parse ("(":xs) = , but pattern-matching syntax can also implement lookahead very easily, for example parse ("(":x1:xs) = to look ahead one token.

If you're calling the parser recursively, you would define a helper function that consumes only a single expression, and that has a type signature like :: [Token] -> (AST, [Token]) . This lets you parse the inner expression, check that the next token is ")" , and proceed with the parse. However, externally, you'll want to consume all the tokens and return an AST or a list of them.

The stylish way to write a parser is with monadic parser combinators. (And maybe someone will post an example of one.) The industrial-strength solution would be a library like Parsec, but that's probably overkill here. Still, parsing is (mostly!) a solved problem, and if you just want to get the assignment done on time, using a library off the shelf is a good idea.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM