简体   繁体   中英

AST and Parsing in Haskell

I have an assignment and I can't figure out how to define the answer.

The assignment

Write the function exp:: [String] -> (AST, [String])

AST :

  • If x is a number, it should say Number x .
  • If it is a "+" og a "-" it should say Atom x .
  • If it reads a "(", then all the contents behind the "(" until it comes a ")" should be a List [AST] .

so that the outputs will be:

exp (token "(hi (4) 32)")
> (List [Atom "hi", List [Number 4], Number 32], [])

exp (token "(+ 3 42 654 2)") 
> (List [Atom "+", Number 3, Number 42, Number 654, Number 2], [])

exp (token "(+ 21 444) junk") 
> (List [Atom "+", Number 21, Number 444], ["junk"])

what I have so far

I already have the token-function, token :: String -> [String] which makes a list.

Example:
exp :: [String] -> (AST, [String])
exp [] = error "Empty list"
exp (x:xs)  | x == ")"      = error ""
            | x == "("      = let (e, ss') = exp xs in (List [getAst xs], ss')
            | x == "+"      = let (e, ss') = exp xs in (Atom (read x), ss')
            | x == "-"      = let (e, ss') = exp xs in (Atom (read x), ss')
            | otherwise     = exp xs`

The exp function looks like this:

getAst :: [String] -> AST
getAst [] = error ""
getAst (x:xs)
            | x == ")"  = error ""
            | x == "("  = (List [getAst xs])
            | isAtom x  = (Atom x) 
            | isNum x   = (Number (read x))
            | otherwise = getAst xs`

where the getAst function:

 getAst :: [String] -> AST getAst [] = error "" getAst (x:xs) | x == ")" = error "" | x == "(" = (List [getAst xs]) | isAtom x = (Atom x) | isNum x = (Number (read x)) | otherwise = getAst xs` 

(And yes, I am a beginner in Haskell..)

I think I can try to help you out a bit.

The way the problem is represented you should be able to do this by just looking at the next input/token and decide from there where to go.

some assumptions

The way the data is represented as [String] -> (Ast, [String]) I assume it's a common parser, where a parser tries to read some parts of the input and return the parsed/transformed output together with the rest of the input that it did not transform (so just the two parsts of the tuple - the Ast and the rest of the input).

the AST type

as you did not include it I assumed it to be:

data Ast
  = Number Int
  | Atom String
  | List [Ast]
  deriving Show

some things I gonna need

I need a few things:

import Prelude hiding (exp)

import Control.Applicative ((<$>))
import Data.Maybe (fromJust, isJust)

I have to hide exp as we want to use this as a function-name.

Then I want to fmap over a Maybe so I include the operator from Control.Applicative . This is really just this, in case you did not see it before:

f <$> Nothing = Nothing
f <$> Just a  = Just (f a)

I want some helpers for Maybe :

  • isJust to check if for Just _
  • and fromJust to get the a from Just a

Finally I need this helper-function to read a bit more safe:

tryRead :: (Read a) => String -> Maybe a
tryRead input =
  case readsPrec 0 input of
    (a,_):_ -> Just a
    _       -> Nothing

This will try to read a number here - returing Just n if n is a number and Nothing otherwise.

a first go

Here is a unfinished first go at your problem:

exp :: [String] -> (Ast, [String])
exp (lookat:rest)
  | isJust number = (fromJust number, rest)
  | lookat == "("  = parseList rest []
  where number = Number <$> tryRead lookat

parseList :: [String] -> [Ast] -> (Ast, [String])
parseList inp@(lookat:rest) acc
  | lookat == ")" = (List (reverse acc), rest)
  | otherwise    = let (el, rest') = exp inp
                   in parseList rest' (el:acc)

As you can see I just branch based on lookat but with a slight twist:

In case I see a number I return the number and the rest-token-list. In case I see a ( I start another parser parseList .

parseList will do the same: - it looks at the first token - if the token is a ) it finishes the current list (it uses the accumulator technique for this) and returns. - if not it uses the existing exp parser to get the elements of the list recursively.

Here is an example run:

λ> let input = ["(", "2", "(", "3", "4", ")", "5", ")"]

λ> exp input
(List [Number 2,List [Number 3,Number 4],Number 5],[])

TODO

There are some border cases left you have to decide on (what if there are no input tokens?).

And of course you have to add the case for Atom s - to finish this excecise.

full solution

ok - 3 hours later the OP did not check in again so I guess I can post a complete solution. I hope I did not forget any edge cases and this sureley is not the most efficient implementation ( tokens comes to mind) - but the examples the OP gave all match:

module Ast where

import Prelude hiding (exp)

import Control.Applicative ((<$>))
import Data.Char (isSpace, isControl)
import Data.Maybe (fromJust, isJust)

data Ast
  = Number Int
  | Atom String
  | List [Ast]
  | Empty
  deriving Show

type Token = String

main :: IO ()
main = do
  print $ parse "(hi (4) 32)"
  print $ parse "(+ 3 42 654 2)"
  print $ parseAst . tokens $ "(+ 21 444) junk"

parse :: String -> Ast
parse = fst . parseAst . tokens

parseAst :: [Token] -> (Ast, [Token])
parseAst [] = (Empty, [])
parseAst (lookat:rest)
  | isJust number = (fromJust number, rest)
  | lookat == "("  = parseList rest []
  | otherwise     = (Atom lookat, rest)
  where number = Number <$> tryRead lookat

parseList :: [Token] -> [Ast] -> (Ast, [Token])
parseList [] _ = error "Syntax error: `)` not found"
parseList inp@(lookat:rest) acc
  | lookat == ")" = (List (reverse acc), rest)
  | otherwise    = let (el, rest') = parseAst inp
                   in parseList rest' (el:acc)
tokens :: String -> [Token]
tokens = split ""
  where split tok "" = add tok []
        split tok (c:cs)
          | c == '(' || c == ')' = add tok $ [c] : split "" cs
          | isSpace c || isControl c = add tok $ split "" cs
          | otherwise = split (tok ++ [c]) cs
        add "" tks = tks
        add t tks =  t : tks

tryRead :: (Read a) => Token -> Maybe a
tryRead input =
  case readsPrec 0 input of
    (a,_):_ -> Just a
    _       -> Nothing

example run

λ> :main
List [Atom "hi",List [Number 4],Number 32]
List [Atom "+",Number 3,Number 42,Number 654,Number 2]
(List [Atom "+",Number 21,Number 444],["junk"])

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM