
Lexer that recognizes indented blocks

I want to write a compiler for a language that denotes program blocks with whitespace, as Python does. I would prefer to do this in Python, but C++ is also an option. Is there an open-source lexer that can help me do this easily, for example by generating INDENT and DEDENT tokens properly, as the Python lexer does? A corresponding parser generator would be a plus.

LEPL is pure Python and supports offside-rule (indentation-sensitive) parsing.

If you're using something like lex, you can do it this way:

^[ \t]+              { /* Leading whitespace at the start of a line. */
                       int new_indent = count_indent(yytext);
                       if (new_indent > current_indent) {
                          current_indent = new_indent;
                          return INDENT;
                       } else if (new_indent < current_indent) {
                          /* Note: this emits a single DEDENT, even if
                             several block levels close at once. */
                          current_indent = new_indent;
                          return DEDENT;
                       }
                       /* Otherwise do nothing. This way you can
                          essentially treat INDENT and DEDENT
                          as opening and closing braces. */
                     }

You may need a little additional logic, for example to ignore blank lines, to handle a line that returns to column 0 (which the ^[ \t]+ pattern does not match), and to automatically add a DEDENT at the end of the file if needed.
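Since the question mentions a preference for Python, here is a minimal sketch of that extra logic in Python rather than lex. It is only an illustration, not how any particular lexer does it: the function name `tokenize`, the `LINE` token kind, and the default tab size of 8 are all assumptions. It keeps a stack of open indentation widths so that one line can close several blocks, skips blank lines, and emits the remaining DEDENTs at end of input.

```python
def tokenize(source, tab_size=8):
    """Yield (kind, value) pairs, wrapping indented blocks in INDENT/DEDENT.

    Hypothetical sketch: each non-blank line becomes one LINE token.
    """
    indents = [0]  # stack of indentation widths of the currently open blocks
    for line in source.splitlines():
        expanded = line.expandtabs(tab_size)  # assumed tab size, see lead-in
        text = expanded.strip()
        if not text:
            continue  # blank lines never open or close a block
        width = len(expanded) - len(expanded.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            yield ("INDENT", width)
        while width < indents[-1]:
            indents.pop()  # one DEDENT per block level that closes
            yield ("DEDENT", width)
        if width != indents[-1]:
            raise IndentationError(f"inconsistent dedent: {line!r}")
        yield ("LINE", text)
    # Close any blocks that are still open at end of input.
    while len(indents) > 1:
        indents.pop()
        yield ("DEDENT", 0)


if __name__ == "__main__":
    for token in tokenize("a\n  b\n    c\n\nd\n"):
        print(token)
```

The stack is the main difference from a single `current_indent` variable: it lets the lexer tell how many blocks a dedent closes, and reject a dedent that does not line up with any enclosing block.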

Presumably count_indent would take care of converting tabs to spaces according to a tab-stop value.
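The answer does not show `count_indent`, so the following is only one possible reading of it, written in Python for illustration. The `tab_stop` parameter and its default of 8 are assumptions; a tab advances the column to the next multiple of the tab stop rather than adding a fixed number of spaces.

```python
def count_indent(leading, tab_stop=8):
    """Return the column reached after the leading spaces and tabs.

    Hypothetical helper: counting stops at the first other character.
    """
    column = 0
    for ch in leading:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # Advance to the next multiple of tab_stop.
            column += tab_stop - (column % tab_stop)
        else:
            break
    return column


if __name__ == "__main__":
    print(count_indent("    "))   # four spaces
    print(count_indent("  \t"))   # two spaces, then a tab
```

With this rule, two spaces followed by a tab and a lone tab both land on the same column, which is usually what an editor displays.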

I don't know about lexer/parser generators for Python, but what I posted should work with lex/flex, and you can hook it up to yacc/bison to create a parser. You could use C or C++ with those.
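To illustrate what treating INDENT and DEDENT like braces means on the parser side, here is a small hand-written sketch in Python. It is not yacc/bison output; the function name `parse_block` and the `LINE` token kind are assumptions. It turns a flat token stream into nested lists, recursing on INDENT and returning on DEDENT.

```python
def parse_block(tokens):
    """Build a nested list of statements from (kind, value) tokens.

    Hypothetical sketch: INDENT opens a nested block, DEDENT closes it.
    """
    tokens = iter(tokens)  # share one iterator across recursive calls
    block = []
    for kind, value in tokens:
        if kind == "INDENT":
            block.append(parse_block(tokens))  # plays the role of '{'
        elif kind == "DEDENT":
            return block                       # plays the role of '}'
        else:
            block.append(value)
    return block


if __name__ == "__main__":
    stream = [
        ("LINE", "if x:"),
        ("INDENT", None),
        ("LINE", "y = 1"),
        ("DEDENT", None),
        ("LINE", "z = 2"),
    ]
    print(parse_block(stream))
```

A grammar for a parser generator would express the same idea with a rule of the form block: INDENT statements DEDENT.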


 