
Flex lexical analyzer not behaving as expected

I'm trying to use Flex to match basic patterns and print something.

%%
 
^[^qA-Z]*q[a-pr-z0-9]*4\n           {printf("userid1, userid2  \n"); return 1;}

%% 
int yywrap(void){return 1;}

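int main(int argc, char **argv)
{
    ++argv, --argc;  /* skip over program name */
    if (argc > 0)
        yyin = fopen(argv[0], "r");  /* read from the file named on the command line */
    else
        yyin = stdin;                /* otherwise read from standard input */

    while (yylex())
        ;  /* keep scanning until yylex() returns 0 at end of input */
    return 0;
}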

Resolved dumb question

I don't know what you are trying to do, so I'll focus on the immediate issue, which is your last pattern:

^[^qA-Z]*q[a-pr-z0-9]*4[a-pr-z0-9]*4[a-pr-z0-9]*\n

That pattern starts by matching [^qA-Z]*, which is any number of characters that are neither a q nor a capital letter (A-Z). Then it matches a q.

Here it's worth considering all the things which are neither a q nor a capital letter (A-Z). Obviously, that includes lower-case letters other than q, such as s. It also includes digits. And it includes any other character: punctuation, whitespace, even control characters. In particular, it includes the newline character.

So when you type

10s10<newline>

that certainly could be the start of the last pattern. The scanner hasn't yet seen a q, so it doesn't know whether the pattern will eventually match, but it hasn't yet failed. So it keeps on reading more characters, including more newlines.

When you eventually type a q, the scanner can continue with the rest of the pattern. Depending on what you type next, it might or might not be able to continue. If, as seems likely, your input eventually fails to match the pattern, the lexer will fall back to the longest successful match, which is the first pattern. At that point, it will perform the first action.

Negated character classes need to be used with a bit of caution. It's easy to fall into the trap of thinking that "not ..." only includes "reasonable" input. But it includes everything. Often, as in this case, you'll want to at least exclude newlines, for example by writing [^qA-Z\n]* instead of [^qA-Z]*.
