Regular expression in lex

I want to print a line only when the whole line matches the pattern. As it stands, the line is printed even if only part of it matches. How should I use regular expressions?

You need to do two things:

  1. Insist that the pattern matches up to the end of the line:

     ((100+1+)|(01))+\n {printf("%s\n",yytext);}

    ( \n matches the end-of-line character.)

  2. Include an alternative pattern to catch the lines not matched by the first pattern:

     .*\n? { /* Maybe do something here */ }

You need to put these two rules in that order, because the second one will match any line at all, correct or not. However, when both patterns match the same line, the first one is the one that will be used.

The ? at the end of the second rule makes the newline character optional. In (f)lex, . matches any character other than a newline, so you might think that .*\n will match any line. And indeed it will. However, it is possible (though not strictly correct) for the last line in a text file to be missing the newline terminator. To cover that case we use .*\n? instead. (F)lex rules never match the empty string, and patterns always match as much as possible, so the only time that rule can match without a newline is when the characters to be matched are exactly at the end of the file, without a newline.
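Putting the two rules together, a complete scanner might look like the following minimal sketch. It assumes flex (for %option noyywrap) and adds a small main function; neither is part of the answer above. Because yytext already contains the newline matched by \n, this version prints it with "%s" rather than "%s\n", which avoids an extra blank line after each match.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
((100+1+)|(01))+\n   { printf("%s", yytext); /* whole line matched; yytext ends in '\n' */ }
.*\n?                { /* any other line, with or without a final newline: ignore */ }
%%

/* Assumed driver: scan standard input until end of file. */
int main(void)
{
    yylex();
    return 0;
}
```

Assuming the file is saved as scanner.l (a hypothetical name), it can be built with "flex scanner.l" followed by "cc lex.yy.c -o scanner", and run with the input on standard input.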

Note the difference between .*\n and .*$ . If a pattern ends with $ , (f)lex will only use the rule if the next character is a newline. But $ does not match the newline, so the newline will still be in the input stream waiting to be matched. If you used $ instead of \n , you would need another rule to match (and discard) newline characters. But that might be what you want after all, because flex always reads one character ahead, even if it doesn't need the next character to know what to do. So if you explicitly match the \n with rules like the ones I suggested above, you will find that your scanner isn't very responsive in interactive use; messages will be delayed until the next line is read.
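As a sketch of the $ variant described above, again assuming flex and an added main function: the newline is no longer part of the match, so yytext holds only the line's contents and a separate rule discards the newline. The catch-all rule is written here as .+ (an assumption, chosen so that empty lines are left to the \n rule). One difference worth noting is that $ only matches in front of a newline, so a final line with no terminator falls through to the catch-all rule in this version.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
((100+1+)|(01))+$   { printf("%s\n", yytext); /* yytext does not include the newline */ }
.+                  { /* line does not match as a whole: ignore it */ }
\n                  { /* discard the newline that $ left in the input */ }
%%

/* Assumed driver: scan standard input until end of file. */
int main(void)
{
    yylex();
    return 0;
}
```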

The default rule in lex is to echo unmatched input to the output; so if you add the line:

. { }

at the end, you will prevent it from echoing the unmatched characters. Next up, if you want your pattern to be limited to a single line, you need to include the newline in your rule:

((100+1+)|(01))+\n {printf("%s\n",yytext);}

But notice I made an assumption about where your newline should go; it could just as well have been:

((100+1+)|(01))\n+ {printf("%s\n",yytext);}

for an entirely different effect.

Lex is a sharp tool.
