
Python regex not behaving as I think it should

I am trying to sort through some files using a regular expression.

I have a file which contains the following two lines:

NET "MBC_ADR_I1<1>" LOC = "R2";
NET "GP_O<7>" LOC = "R20";

I am using the following expression to match only one of the lines:

f2MatchLoc = re.search('(LOC)[ ]+=[ ]+["]?({})'.format(f1LocValue), f2Line, re.IGNORECASE)

where f1LocValue = R2. However, I'm getting a match on both lines.

I've tried entering the same expression at regex101.com, which shows that my pattern should be correctly formatted.

f2MatchLoc = re.search(r'(LOC)[ ]+=[ ]+["]?({}\b)'.format(f1LocValue), f2Line, re.IGNORECASE)
                                              ^^

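You need to use \b after R2 so that there are no partial matches: \b only matches at a word boundary, so R2 followed by another digit (as in R20) no longer matches. See demo. Also use a raw string (the r prefix), because in an ordinary Python string literal \b is the backspace character rather than a word boundary.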
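As a minimal sketch of the difference, the snippet below runs the original pattern and the \b version against the two lines from the question. The names loose, strict, loose_hits and strict_hits are illustrative only, and wrapping the value in re.escape is an extra precaution that is not part of the original answer.

```python
import re

# The two lines from the question.
lines = [
    'NET "MBC_ADR_I1<1>" LOC = "R2";',
    'NET "GP_O<7>" LOC = "R20";',
]
f1LocValue = "R2"

# re.escape is an added precaution (assumption): it keeps any regex
# metacharacters in the value from being interpreted as pattern syntax.
value = re.escape(f1LocValue)

# Original pattern: nothing constrains what follows the value.
loose = re.compile(r'(LOC)[ ]+=[ ]+["]?({})'.format(value), re.IGNORECASE)

# Pattern with \b: the value must end at a word boundary.
strict = re.compile(r'(LOC)[ ]+=[ ]+["]?({}\b)'.format(value), re.IGNORECASE)

loose_hits = [line for line in lines if loose.search(line)]
strict_hits = [line for line in lines if strict.search(line)]

print(len(loose_hits))  # both lines match the original pattern
print(strict_hits)      # only the R2 line matches with \b
```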

Because your pattern places no condition on how the matched value should end.

'(LOC)[ ]+=[ ]+["]?({})'
                       ^??

So it matches anything that starts with LOC = "R2. The following are all valid search results:

LOC = "R2 
LOC = "R2asd
LOC = "R2121
LOC = "R2   "

Put simply, you can use the closing double quote or the semicolon to mark the end of the value. You can also use \s to match whitespace, and you can remove the [] around single-character classes:

r'(LOC)\s+=\s+"?({})"?;'
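A short sketch of how this delimiter-based pattern behaves, assuming the value is substituted with .format as in the question. The helper name loc_pattern and the sample strings are hypothetical, and re.escape is again an added precaution rather than something the answer requires.

```python
import re

def loc_pattern(value):
    """Build the delimiter-terminated pattern for a given LOC value.

    Hypothetical helper: the closing quote is optional, but the
    semicolon must follow immediately, so R2 cannot match inside R20.
    """
    return re.compile(
        r'(LOC)\s+=\s+"?({})"?;'.format(re.escape(value)),
        re.IGNORECASE,
    )

# Sample inputs (illustrative, modelled on the lines in the question).
candidates = [
    'LOC = "R2";',
    'LOC = "R20";',
    'LOC = "R2asd";',
    'loc = R2;',
]

pattern = loc_pattern("R2")
matched = [c for c in candidates if pattern.search(c)]
print(matched)
```

Unlike \b, this variant depends on each line ending the value with an optional quote followed directly by a semicolon, so it suits files where that layout is guaranteed.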
