
Is there a way to parse lex and yacc files in/to PLY (Python Lex-Yacc)?

It's known that PLY tries to do in Python what Lex and Yacc do. I was wondering whether the library provides a way to parse/translate/convert .l (lex) or .y (yacc) files themselves into the grammar definitions PLY uses.

My use case: I have the .l and .y files for a language, and I now want to parse files written in that language using PLY, so that the tokens are processed exactly as the original language definition specifies.

Not that I know of.

The grammar specifications are sufficiently similar that you can usually copy and paste. Note that Ply parser functions p_* correspond to individual productions, not non-terminals; Ply allows you to combine two productions into the same action function, in case the actions are the same, but for mechanical translation it is probably better to start with one function per production and optimize later. Note also that Ply does not implement the default action $$ = $1 ( p[0] = p[1] in Ply terms), so these must be made explicit (and in this case all productions with the default action could be combined into a single parser action function.)
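As a rough illustration of that mechanical translation, here is a minimal sketch for a made-up three-production grammar (the token names NUMBER and PLUS and the grammar itself are assumptions for the example, not taken from any particular .y file). The yacc original is shown in the comment; the productions that relied on the implicit $$ = $1 share one explicit action function.

```python
# Hypothetical yacc input being translated:
#
#   expr : expr PLUS term   { $$ = $1 + $3; }
#        | term                               /* implicit $$ = $1 */
#        ;
#   term : NUMBER ;                           /* implicit $$ = $1 */

tokens = ("NUMBER", "PLUS")  # Ply expects the token names to be listed


def p_expr_plus(p):
    "expr : expr PLUS term"
    # yacc's $$ is p[0]; $1, $2, $3 are p[1], p[2], p[3]
    p[0] = p[1] + p[3]


def p_default_action(p):
    """expr : term
       term : NUMBER"""
    # Ply has no implicit default action, so it is spelled out once
    # for every production that relied on it.
    p[0] = p[1]
```

With a matching lexer in the same module, calling ply.yacc.yacc() should pick these functions up by their p_ prefix and read each production from the docstring.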

Ply does not implement mid-rule actions; if your existing yacc/bison parser relies on them, they will have to be removed. Bison's -v output can be useful here.
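One way to remove a mid-rule action is to move it into an empty marker production, which is roughly what bison itself does internally (its -v report shows the generated marker non-terminals). The sketch below assumes a made-up block rule with a scope stack; the names scopes, push_scope, LBRACE and RBRACE are hypothetical.

```python
# Hypothetical bison input with a mid-rule action:
#
#   block : LBRACE { push_scope(); } stmts RBRACE { pop_scope(); $$ = $3; }
#
# In the bison version the mid-rule action counts as a symbol, so stmts is $3.
# The marker non-terminal below occupies the same position, so stmts is p[3].

scopes = [{}]  # hypothetical scope stack; index 0 is the global scope


def p_block(p):
    "block : LBRACE push_scope stmts RBRACE"
    scopes.pop()   # the end-of-rule action from the original
    p[0] = p[3]


def p_push_scope(p):
    "push_scope :"
    # Empty production: reduced after LBRACE has been seen and before
    # any of stmts is parsed, which is when the mid-rule action ran.
    scopes.append({})
```

As in bison, the extra empty production can introduce conflicts that the original grammar did not have, so it is worth checking the parser's conflict report after a change like this.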

Since Ply relies on Python's regular expression library, regular expressions may need to be changed, particularly if they use (f)lex macro definitions, which have to be expanded by hand. Also, the use of pattern variables (t_NAME = r'...') in Ply alters the pattern acceptance order, because Ply sorts those patterns by decreasing regular expression length rather than keeping them in file order; you might want to avoid these at the beginning. (Even with pattern functions, which are tried in definition order, Ply does not implement maximum munch, but at least acceptance order can be controlled.)
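Both points can be seen with Python's re module alone. The sketch below is not Ply code: it only imitates, as far as I understand it, the way Ply joins token patterns into one alternation of named groups. The macro DIGIT and the token names ASSIGN and EQ are made up for the example.

```python
import re

# A flex macro such as   DIGIT  [0-9]   becomes an ordinary Python string,
# and {DIGIT} in a pattern becomes string interpolation.
DIGIT = r"[0-9]"
NUMBER = rf"{DIGIT}+(?:\.{DIGIT}+)?"   # flex: {DIGIT}+(\.{DIGIT}+)?

# Python alternation accepts the first alternative that matches,
# not the longest one, so the order of the patterns matters.
short_first = re.compile(r"(?P<ASSIGN>=)|(?P<EQ>==)")
long_first = re.compile(r"(?P<EQ>==)|(?P<ASSIGN>=)")


def first_token(pattern, text):
    """Return (token name, matched text) for the match at the start of text."""
    m = pattern.match(text)
    return m.lastgroup, m.group()
```

Here first_token(short_first, "== 1") yields the ASSIGN token for a single "=", whereas flex would have chosen "==" regardless of rule order. Putting the longer pattern first, as in long_first, restores the intended result.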

Unlike (F)lex, Ply cannot optimise large numbers of regular expressions. In (F)lex scanner definitions, it is common to use an individual pattern for each keyword, relying on the scanner generator to produce what is effectively a highly-efficient trie-like state machine. Ply can't do that, and using a large number of patterns can be a significant performance hit (although even so, lexical analysis is rarely a performance bottleneck these days.)
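The usual workaround, which the Ply documentation also suggests, is to match all identifiers with one pattern and look keywords up in a dictionary. A minimal sketch, assuming a made-up keyword set:

```python
# Keywords that had one pattern each in the .l file (hypothetical set).
reserved = {
    "if": "IF",
    "else": "ELSE",
    "while": "WHILE",
}

tokens = ["ID"] + list(reserved.values())


def t_ID(t):
    r"[A-Za-z_][A-Za-z_0-9]*"
    # One regular expression for every identifier-shaped lexeme; the
    # dictionary lookup decides whether it is a keyword or a plain ID.
    t.type = reserved.get(t.value, "ID")
    return t
```

This replaces a long list of keyword patterns with a single pattern plus a hash lookup. It also sidesteps the ordering problem where a keyword pattern would otherwise match the prefix of a longer identifier.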
