
Lex: Breaking up long regular expressions over multiple lines

What is the correct syntax for breaking long lex regular expressions over multiple lines in a .l file?

For example, say I have a regular expression like:

word1|word2|word3|word4  ECHO;

When I attempt to do this:

word1|word2|
word3|word4  ECHO;

I get an error. What is the correct way to break up a regex over multiple lines in lex?

With flex (as an extension to standard lex syntax), you can use the (?x:…) syntax, similar to PCRE/Perl extended syntax. Note that unlike PCRE, the text to which the x flag applies is surrounded by parentheses. [Note 1]

Within the parentheses, comments and whitespace are ignored unless they are escaped or quoted. So you can write:

(?x:
   word1 |
   word2 |
   word3 |
   word4 )    ECHO;
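For context, here is a minimal sketch of a complete flex input file built around that rule. The file name words.l, the catch-all rule, and the main function are assumptions added for illustration and are not part of the original answer; it also assumes a flex version that supports the (?x:…) extension.

```lex
%option noyywrap

%%

(?x:
   word1 |
   /* comments are ignored inside (?x: ... ) */
   word2 |
   word3 |
   word4 )    ECHO;

.|\n       ;  /* discard everything that is not one of the words */

%%

int main(void)
{
    yylex();
    return 0;
}
```

Assuming the file is saved as words.l, it should build with `flex words.l` followed by `cc lex.yy.c -o words`. The resulting program reads standard input and echoes only the four words.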

Note: This syntax cannot be used in the definitions section, only in the rules section. I don't know if that is by design or whether some future enhancement might lift the restriction.

See the flex manual for a few more details. (It's in the section which starts '(?rs:pattern)'.)
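If the scanner also has to work with lex implementations that lack the (?x:…) extension, one alternative (not mentioned in the answer above, so treat it as a suggestion) is to write each alternative as its own rule and use the special `|` action, which in standard lex means "use the same action as the next rule". A minimal sketch, with the catch-all rule, yywrap, and main added as assumptions for illustration:

```lex
%%

word1   |
word2   |
word3   |
word4   ECHO;

.|\n    ;   /* discard everything that is not one of the words */

%%

/* Report end of input once the first input stream is exhausted. */
int yywrap(void)
{
    return 1;
}

int main(void)
{
    yylex();
    return 0;
}
```

Note that whitespace must separate each pattern from the `|` action. The attempt in the question fails because there the trailing `|` is attached to the pattern itself.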


Notes

  1. In PCRE (and, similarly, in Python), you would write (?x) followed by the extended regex, and the extended mode continues until the end of the regex unless you turn it off. I won't even try to explain the rules Perl uses to detect the end of an eXtended regex.


 