Is there a way to parse lex and yacc files in/to PLY (Python Lex-Yacc)

It's known that PLY tries to achieve in Python what Lex and Yacc do. I was wondering whether the library provides a way to parse/translate/convert .l (lex) or .y (yacc) files themselves into the grammar definitions PLY uses.

The use case is that I have the .l and .y files of a language, and I now want to parse files written in that language using PLY, so that I can process the tokens exactly as the original language definition specifies.

Not that I know of.

The grammar specifications are sufficiently similar that you can usually copy and paste. Note that Ply parser functions p_* correspond to individual productions, not to non-terminals; Ply allows you to combine two productions into the same action function if the actions are the same, but for a mechanical translation it is probably better to start with one function per production and optimize later. Note also that Ply does not implement the default action $$ = $1 (p[0] = p[1] in Ply terms), so these must be made explicit (and in this case all productions with the default action could be combined into a single parser action function).
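As a concrete illustration, here is a minimal sketch of how such a translation might look; the grammar (expr, term, factor, atom) and token names are invented for this example and are not taken from any particular .y file:

    # Hypothetical yacc input:
    #   expr : expr '+' term   { $$ = $1 + $3; }
    #        | term            ;   /* relies on the default $$ = $1 */

    def p_expr_plus(p):
        "expr : expr PLUS term"
        p[0] = p[1] + p[3]

    def p_expr_term(p):
        "expr : term"
        p[0] = p[1]        # yacc's implicit default action, written out

    # Pass-through productions elsewhere in the grammar can instead
    # share a single action function:
    def p_passthrough(p):
        """term   : factor
           factor : atom"""
        p[0] = p[1]

These fragments assume the usual surrounding PLY module (a tokens list, a lexer, and a yacc.yacc() call) that a full translation would also need.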

Ply does not implement mid-rule actions; if your existing yacc/bison parser relies on them, they will have to be removed. Bison's -v output can be useful here.
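One common way to rewrite a mid-rule action is with an empty "marker" non-terminal whose action fires at that point in the parse (essentially what bison does internally); the scope helpers below are hypothetical, and be aware that adding marker rules can introduce new grammar conflicts:

    # Bison original (mid-rule action, not expressible in PLY):
    #   block : '{'  { enter_scope(); }  stmts '}'   { $$ = $3; leave_scope(); }

    def p_block(p):
        "block : LBRACE new_scope stmts RBRACE"
        p[0] = p[3]
        leave_scope()          # hypothetical scope helper

    def p_new_scope(p):
        "new_scope :"          # empty production; reduced before stmts is parsed
        enter_scope()          # hypothetical scope helper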

Since Ply relies on Python's regular expression library, regular expressions may need to be changed, particularly if they use (f)lex macro definitions. Also, the use of pattern regex variables in Ply alters the pattern acceptance order; you might want to avoid these at the beginning. (Even with pattern functions, Ply does not implement maximum munch, but at least acceptance order can be controlled.)
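For example, (f)lex macros have no PLY equivalent and are usually inlined or kept as ordinary Python constants; the names and patterns below are illustrative only:

    # (f)lex definitions section:
    #   DIGIT  [0-9]
    #   ID     [a-zA-Z_][a-zA-Z0-9_]*

    ID_PATTERN = r'[a-zA-Z_][a-zA-Z0-9_]*'

    def t_NUMBER(t):           # function rules are tried in the order
        r'[0-9]+'              # they appear in the file...
        t.value = int(t.value)
        return t

    t_ID = ID_PATTERN          # ...while string rules are sorted by
                               # decreasing regex length, not file order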

Unlike (F)lex, Ply cannot optimise large numbers of regular expressions. In (F)lex scanner definitions, it is common to use an individual pattern for each keyword, relying on the scanner generator to produce what is effectively a highly efficient trie-like state machine. Ply can't do that, and using a large number of patterns can be a significant performance hit (although even so, lexical analysis is rarely a performance bottleneck these days).
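The idiom PLY's own documentation suggests instead is a single identifier rule plus a reserved-word lookup table; the keyword set and sample input here are made up:

    import ply.lex as lex

    reserved = {
        'if':     'IF',
        'while':  'WHILE',
        'return': 'RETURN',
    }

    tokens = ['ID', 'NUMBER'] + list(reserved.values())

    def t_ID(t):
        r'[A-Za-z_][A-Za-z0-9_]*'
        t.type = reserved.get(t.value, 'ID')   # reclassify keywords
        return t

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    t_ignore = ' \t\n'

    def t_error(t):
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.input('while x return 42')
    for tok in lexer:
        print(tok.type, tok.value)

This keeps the number of regexes constant no matter how many keywords the original .l file declares.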
