I'm implementing a parser for a language similar to Oberon. I've written the lexer with Alex, and it appears to work: the list of tokens it returns is correct.
When I give the token list to the parser, it stops at the first token.
This is my parser:
...
%name myParse
%error { parseError }
%token
KW_PROCEDURE { KW_TokenProcedure }
KW_END { KW_TokenEnd }
';' { KW_TokenSemiColon }
identifier { TokenVariableIdentifier $$ }
%%
ProcedureDeclaration : ProcedureHeading ';' ProcedureBody identifier { putStrLn("C") }
ProcedureHeading : KW_PROCEDURE identifier { putStrLn("D") }
ProcedureBody : KW_END { putStrLn("E") }
              | DeclarationSequence KW_END { putStrLn("F") }
DeclarationSequence : ProcedureDeclaration { putStrLn("G") }
{
parseError :: [Token] -> a
parseError _ = error "Parse error"
main = do
  inStr <- getContents
  print (alexScanTokens inStr)
  myParse (alexScanTokens inStr)
  putStrLn("DONE")
}
This is the test code I give to the parser:
PROCEDURE proc;
END proc
This is the token list returned by the lexer:
[KW_TokenProcedure,TokenVariableIdentifier "proc",KW_TokenSemiColon,KW_TokenEnd,TokenVariableIdentifier "proc"]
The parser doesn't report any error, but it seems to stop at my ProcedureDeclaration rule, printing only C.
This is what the output looks like:
C
DONE
Any idea why?
UPDATE:
I've made a first step forward and was able to parse the test input given above. I have now changed my parser to recognize declarations of multiple procedures on the same level. This is what my new parser looks like:
...
%name myParse
%error { parseError }
%token
KW_PROCEDURE { KW_TokenProcedure }
KW_END { KW_TokenEnd }
';' { KW_TokenSemiColon }
identifier { TokenVariableIdentifier $$ }
%%
ProcedureDeclarationList : ProcedureDeclaration { $1 }
                         | ProcedureDeclaration ';' ProcedureDeclarationList { $3:[$1] }
ProcedureDeclaration : ProcedureHeading ';' ProcedureBody identifier { addProcedureToProcedure $1 $3 }
ProcedureHeading : KW_PROCEDURE identifier { defaultProcedure { procedureName = $2 } }
ProcedureBody : KW_END { Nothing }
              | DeclarationSequence KW_END { Just $1 }
DeclarationSequence : ProcedureDeclarationList { $1 }
{
parseError :: [Token] -> a
parseError _ = error "Parse error"
main = do
  inStr <- getContents
  let result = myParse (alexScanTokens inStr)
  putStrLn ("result: " ++ show(result))
}
The thing is, it fails to compile giving me this error:
Occurs check: cannot construct the infinite type: t5 ~ [t5]
Expected type: HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn t5 t5 t6 t7 t8 t9
Actual type: HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn t5 t5 t6 t7 t8 t9
-> HappyAbsSyn [t5] t5 t6 t7 t8 t9
...
I know for sure that it's caused by the second alternative of my ProcedureDeclarationList rule, but I don't understand why.
There are two things to note here.
First, Happy uses the first production rule in the grammar as the top-level production for myParse. Your first production rule is ProcedureDeclaration, so that's all it's going to try to parse. You probably want to make DeclarationSequence the first rule.
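As an alternative to reordering the rules, Happy's %name directive accepts an optional non-terminal after the parser name, which selects the start symbol explicitly. A minimal sketch, assuming the rest of the grammar from the question stays unchanged:

```happy
%name myParse DeclarationSequence
```

With this directive, the order of the productions no longer decides which rule myParse starts from.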
Second, the return values of your productions are IO actions, and in Haskell IO actions are values. They are not "executed" until they become part of main. That means you need to write your productions like this:
DeclarationSequence : ProcedureDeclaration { do $1; putStrLn("G") }
ProcedureDeclaration : ProcedureHeading ';' ProcedureBody identifier { do $1; $3; putStrLn("C") }
That is, the return value of the DeclarationSequence rule is the IO action returned by ProcedureDeclaration followed by putStrLn "G".
And the return value of the ProcedureDeclaration rule is the action returned by ProcedureHeading, followed by the action returned by ProcedureBody, followed by putStrLn "C".
You could also write the RHS of the rules using the >> operator:
{ $1 >> putStrLn "G" }
{ $1 >> $3 >> putStrLn "C" }
Note that you have to decide the order in which to sequence the actions, i.e. pre-order, post-order or in-order.
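The effect of that ordering choice can be seen without Happy at all. The following is a minimal sketch, not the generated parser: heading and body are hypothetical stand-ins for the values Happy would pass as $1 and $3 in the ProcedureDeclaration rule.

```haskell
module Main where

-- Hypothetical stand-in for the action returned by ProcedureHeading ($1).
heading :: IO ()
heading = putStrLn "D"

-- Hypothetical stand-in for the action returned by ProcedureBody ($3).
body :: IO ()
body = putStrLn "E"

-- Post-order: run the children's actions first, then this rule's own output.
declPostOrder :: IO ()
declPostOrder = heading >> body >> putStrLn "C"

-- Pre-order: this rule's own output first, then the children's actions.
declPreOrder :: IO ()
declPreOrder = putStrLn "C" >> heading >> body

main :: IO ()
main = do
  putStrLn "post-order:"
  declPostOrder
  putStrLn "pre-order:"
  declPreOrder
```

Running it prints D, E, C under "post-order:" and C, D, E under "pre-order:". Defining heading and body prints nothing by itself; they only produce output once they are sequenced into an action that main runs.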
Working example: http://lpaste.net/162432
It seems your input was parsed just fine. Check the return type of myParse: I guess it will be IO (), and the actual action will be putStrLn("C"), which is what you wrote in ProcedureDeclaration. Next, you put the call to myParse in the do block, so it is interpreted as print .. >> myParse (..) >> putStrLn .., i.e. simply chaining monadic actions. myParse returns an action that prints "C", so the output is exactly what one would expect.
You have other actions defined in ProcedureBody and DeclarationSequence, but you never use these actions in any way. It's as if you had written:
do
  let a = putStrLn "E"
  putStrLn("C")
This outputs "C"; a is never used. The same happens in your parser. If you want to invoke these actions, try writing $1 >> putStrLn("C") >> $3 in the code associated with ProcedureDeclaration.