Is it possible to debug the Menhir lexer?

It's possible to debug the parser generated by Menhir, e.g. with menhir --interpret --interpret-show-cst parser.mly. Is it also possible to see which tokens the lexer produces? I didn't find anything in the Menhir manual page or online.

For example, debugging "1+2" would print the token list "INT 1 PLUS INT 2".

Looking at the generated parser.ml module, there's a MenhirBasics module containing the token type, but no string_of_token or similar function. The token type is also exposed in the parser.mli file. It would be nice if Menhir could generate [@@deriving show] or something similar for it.

Related Gitlab issue: https://gitlab.inria.fr/fpottier/menhir/-/issues/6

If you want to see the tokens the lexer produces, you can simply print them from the actions in the lexer.mll file:

{
  open Parser

  exception Error of string
}

(* Each action logs the token to stderr, followed by a space, then returns it. *)
rule token = parse
| [' ' '\t']
    { token lexbuf }  (* skip blanks silently *)
| '\n'
    { Format.eprintf "@\n"; token lexbuf }  (* keep the input's line structure in the log *)
| ';'
    { Format.eprintf "SEMICOLON "; SEMICOLON }
| ['0'-'9']+ as i
    { Format.eprintf "INT %s " i; INT (int_of_string i) }
| '+'
    { Format.eprintf "PLUS "; PLUS }
| '-'
    { Format.eprintf "MINUS "; MINUS }
| '*'
    { Format.eprintf "TIMES "; TIMES }
| '/'
    { Format.eprintf "DIV "; DIV }
| '('
    { Format.eprintf "LPAREN "; LPAREN }
| ')'
    { Format.eprintf "RPAREN "; RPAREN }
| eof
    { Format.eprintf "EOF@."; EOF }  (* "@." ends the line and flushes stderr *)
| _
    { raise (Error (Printf.sprintf "At offset %d: unexpected character.\n" (Lexing.lexeme_start lexbuf))) }

Would that be ok for you?
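
If you would rather leave the lexer rules untouched, another option is to write the missing string_of_token by hand and wrap the lexer function before passing it to the parser. The following is only a sketch: the token type is a local stand-in for Parser.token (assuming the same constructors as in the lexer above), and trace and tokens_of are hypothetical helper names, not something Menhir provides.

```ocaml
(* Stand-in for the token type exposed in parser.mli.
   In a real project, drop this definition and [open Parser] instead. *)
type token =
  | INT of int
  | PLUS | MINUS | TIMES | DIV
  | LPAREN | RPAREN
  | SEMICOLON
  | EOF

(* Hand-written printer, since no string_of_token is generated. *)
let string_of_token = function
  | INT n -> Printf.sprintf "INT %d" n
  | PLUS -> "PLUS"
  | MINUS -> "MINUS"
  | TIMES -> "TIMES"
  | DIV -> "DIV"
  | LPAREN -> "LPAREN"
  | RPAREN -> "RPAREN"
  | SEMICOLON -> "SEMICOLON"
  | EOF -> "EOF"

(* Wrap any lexer function so that every token it returns is first
   logged to stderr, followed by a space. *)
let trace (lexer : Lexing.lexbuf -> token) : Lexing.lexbuf -> token =
  fun lexbuf ->
    let tok = lexer lexbuf in
    prerr_string (string_of_token tok ^ " ");
    tok

(* Run a lexer to the end of input and collect the tokens, EOF included.
   Handy for dumping the token list without running the parser at all. *)
let tokens_of (lexer : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf)
    : token list =
  let rec go acc =
    match lexer lexbuf with
    | EOF -> List.rev (EOF :: acc)
    | tok -> go (tok :: acc)
  in
  go []
```

With this in place, the parser would be called with trace Lexer.token instead of Lexer.token, for example Parser.main (trace Lexer.token) lexbuf, where main is assumed to be the name of your grammar's start symbol. The drawback compared to printing inside lexer.mll is that string_of_token has to be kept in sync with the %token declarations by hand.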
