
Haskell Happy parser not going further

I'm implementing a parser for a language similar to Oberon. I've written the lexer using Alex, and it works: the list of tokens it returns is correct.

When I give the token list to the parser, it stops at the first rule.

This is my parser:

...

%name myParse
%error { parseError }

%token
    KW_PROCEDURE        { KW_TokenProcedure }
    KW_END              { KW_TokenEnd }
    ';'                 { KW_TokenSemiColon }
    identifier          { TokenVariableIdentifier $$ }

%%

ProcedureDeclaration    :   ProcedureHeading ';' ProcedureBody identifier     { putStrLn("C") }

ProcedureHeading        :   KW_PROCEDURE identifier { putStrLn("D") }

ProcedureBody           :   KW_END                      { putStrLn("E") }   
                        | DeclarationSequence KW_END    { putStrLn("F") }

DeclarationSequence     :   ProcedureDeclaration        { putStrLn("G") }

{
parseError :: [Token] -> a
parseError _ = error "Parse error"

main = do
  inStr <- getContents
  print (alexScanTokens inStr)
  myParse (alexScanTokens inStr)
  putStrLn("DONE")
}

This is the test code I give to the parser:

PROCEDURE proc;
END proc

This is the token list returned by the lexer:

[KW_TokenProcedure,TokenVariableIdentifier "proc",KW_TokenSemiColon,KW_TokenEnd,TokenVariableIdentifier "proc"]

The parser doesn't give any error, but it never gets past my ProcedureDeclaration rule and prints only C.

This is what the output looks like:

C
DONE

Any idea why?


UPDATE:

I've made a first step forward and can now parse the test input above. I then changed my parser to recognize multiple procedure declarations at the same level. This is what my new parser looks like:

...

%name myParse
%error { parseError }

%token
    KW_PROCEDURE        { KW_TokenProcedure }
    KW_END              { KW_TokenEnd }
    ';'                 { KW_TokenSemiColon }
    identifier          { TokenVariableIdentifier $$ }

%%

ProcedureDeclarationList    :   ProcedureDeclaration                                { $1 }
                            |   ProcedureDeclaration ';' ProcedureDeclarationList   { $3:[$1] }

ProcedureDeclaration        :   ProcedureHeading ';' ProcedureBody identifier       { addProcedureToProcedure $1 $3 }

ProcedureHeading            :   KW_PROCEDURE identifier                             { defaultProcedure { procedureName = $2 } }

ProcedureBody               :   KW_END                                              { Nothing }
                            |   DeclarationSequence KW_END                          { Just $1 }

DeclarationSequence         :    ProcedureDeclarationList                           { $1 }

{
parseError :: [Token] -> a
parseError _ = error "Parse error"

main = do
  inStr <- getContents
  let result = myParse (alexScanTokens inStr)
  putStrLn ("result: " ++ show(result))
}

The thing is, it fails to compile giving me this error:

Occurs check: cannot construct the infinite type: t5 ~ [t5]
    Expected type: HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn t5 t5 t6 t7 t8 t9
      Actual type: HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn t5 t5 t6 t7 t8 t9
                   -> HappyAbsSyn [t5] t5 t6 t7 t8 t9
    ...

I know for sure that it's caused by the second alternative of my ProcedureDeclarationList rule, but I don't understand why.

There are two things to note here.

  1. happy uses the first production rule as the top-level production for myParse.

Your first production rule is ProcedureDeclaration, so that's all it's going to try to parse. You probably want to make DeclarationSequence the first rule.

  2. The values returned by your productions are IO actions, and in Haskell IO actions are values. They are not "executed" until they become part of main. That means you need to write your productions like this:

     DeclarationSequence : ProcedureDeclaration { do $1; putStrLn("G") }
     ProcedureDeclaration : ProcedureHeading ';' ProcedureBody identifier { do $1; $3; putStrLn("C") }

That is, the return value of the DeclarationSequence rule is the IO action returned by ProcedureDeclaration followed by putStrLn "G".

And the return value of the ProcedureDeclaration rule is the action returned by ProcedureHeading followed by the action returned by ProcedureBody followed by putStrLn "C".

You could also write the RHS of the rules using the >> operator:

{ $1 >> putStrLn "G" }
{ $1 >> $3 >> putStrLn "C" }

Note that you have to decide the order in which to sequence the actions, i.e. pre-order, post-order, or in-order.
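The point about ordering can be tried outside of happy. The following is a minimal sketch in plain Haskell, where heading and body are hypothetical stand-ins for the values of $1 and $3 (they are not names from the generated parser):

```haskell
-- Stand-ins for the semantic values of ProcedureHeading ($1) and
-- ProcedureBody ($3). Each is an ordinary value of type IO ().
heading, body :: IO ()
heading = putStrLn "D"
body    = putStrLn "E"

-- Like the original rule: the sub-actions are never mentioned,
-- so only "C" is printed when this action runs.
ignoring :: IO ()
ignoring = putStrLn "C"

-- Post-order: run the children first, then the rule's own action.
postOrder :: IO ()
postOrder = heading >> body >> putStrLn "C"

-- Pre-order: the rule's own action first, then the children.
preOrder :: IO ()
preOrder = putStrLn "C" >> heading >> body

main :: IO ()
main = do
  putStrLn "ignoring:"  >> ignoring   -- C
  putStrLn "postOrder:" >> postOrder  -- D, E, C
  putStrLn "preOrder:"  >> preOrder   -- C, D, E
```

Nothing is printed when heading or body is defined; output only happens once an action is sequenced into main.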

Working example: http://lpaste.net/162432

Your input seems to have been parsed just fine. Check the return type of myParse: I guess it will be IO (), and the actual action will be putStrLn("C"), which is what you wrote in ProcedureDeclaration. Next, you put the call to myParse in the do block, so it is interpreted as print .. >> myParse (..) >> putStrLn .., i.e. it just chains monadic actions. myParse returns an action that prints "C", so the output is exactly what one would expect.

You have other actions defined in ProcedureBody and DeclarationSequence. But you never use these actions in any way; it is as if you had written:

    do
      let a = putStrLn "E"
      putStrLn("C")

This outputs "C"; a is never used. The same goes for your parser. If you want to run these actions, try writing $1 >> putStrLn("C") >> $3 in the code associated with ProcedureDeclaration ($3 is the ProcedureBody; $2 is the ';' token, not an action).
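As for the compile error in the UPDATE, one likely reading is this: the first alternative of ProcedureDeclarationList returns $1, so the list rule gets the same type as a single ProcedureDeclaration, while the second alternative returns $3:[$1], which is a list of that type. A type cannot equal a list of itself, hence the occurs check. Returning a list from both alternatives, for example { [$1] } and { $1 : $3 }, should avoid it. The sketch below mirrors those two actions in plain Haskell; Procedure is a hypothetical stand-in for the record used in the question, and single and cons are illustrative names only:

```haskell
-- Hypothetical stand-in for the question's procedure record.
newtype Procedure = Procedure { procedureName :: String }
  deriving (Eq, Show)

-- Mirrors the action { [$1] }: one declaration becomes a one-element list.
single :: Procedure -> [Procedure]
single p = [p]

-- Mirrors the action { $1 : $3 }: the declaration just parsed goes in
-- front of the list built from the rest of the input.
cons :: Procedure -> [Procedure] -> [Procedure]
cons p rest = p : rest

-- The original action corresponds to `rest : [p]`. That only type-checks
-- if `rest` is a single Procedure, which contradicts it being the list.

main :: IO ()
main = print (map procedureName (cons (Procedure "a") (single (Procedure "b"))))
-- prints ["a","b"]
```

With this change DeclarationSequence also yields a list, so the second argument of addProcedureToProcedure would have to accept a Maybe of a list; that function is not shown in the question, so this is an assumption.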
