Python regex not behaving as I think it should

Question

I am trying to sort through some files using a regular expression.

I have a file which contains the following two lines:

NET "MBC_ADR_I1<1>" LOC = "R2";
NET "GP_O<7>" LOC = "R20";

I am using the following expression to match only one of the lines:

f2MatchLoc = re.search('(LOC)[ ]+=[ ]+["]?({})'.format(f1LocValue), f2Line, re.IGNORECASE)

where f1LocValue = R2. However, I am getting a match on both lines.

I've tried the same expression on regex101.com, which suggests that my expression is correctly formed.

Answer 1

f2MatchLoc = re.search(r'(LOC)[ ]+=[ ]+["]?({}\b)'.format(f1LocValue), f2Line, re.IGNORECASE)
                                              ^^

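You need to add \b (a word boundary) after R2 so that there are no partial matches, such as R2 matching the start of R20. Also use a raw string (the r prefix), so that \b reaches the regex engine as a word boundary instead of being read by Python as a backspace character.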
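As a quick illustration of the word-boundary fix, here is a minimal sketch run against the two lines from the question. The list and variable names other than f1LocValue are made up for the example, and the re.escape call is an addition that is not in the original answer (it changes nothing for a plain value like R2).

```python
import re

# The two lines from the question.
lines = [
    'NET "MBC_ADR_I1<1>" LOC = "R2";',
    'NET "GP_O<7>" LOC = "R20";',
]
f1LocValue = "R2"

# Original pattern: nothing constrains what follows R2, so R20 also matches.
loose = re.compile(
    r'(LOC)[ ]+=[ ]+["]?({})'.format(re.escape(f1LocValue)), re.IGNORECASE
)

# Pattern with \b: the value must end at a word boundary.
strict = re.compile(
    r'(LOC)[ ]+=[ ]+["]?({}\b)'.format(re.escape(f1LocValue)), re.IGNORECASE
)

loose_hits = [line for line in lines if loose.search(line)]
strict_hits = [line for line in lines if strict.search(line)]

print(len(loose_hits))   # both lines match the original pattern
print(strict_hits)       # only the R2 line matches with \b
```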

Answer 2

Because your pattern puts no condition on how the string should end.

'(LOC)[ ]+=[ ]+["]?({})'
                       ^??

So it matches anything that starts with LOC = "R2. The following are all valid search results:

LOC = "R2 
LOC = "R2asd
LOC = "R2121
LOC = "R2   "
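The list above can be checked with a short sketch that runs the original pattern against those four strings. The variable names are only for illustration.

```python
import re

# Original pattern from the question, with R2 substituted in.
pattern = re.compile(r'(LOC)[ ]+=[ ]+["]?({})'.format("R2"), re.IGNORECASE)

candidates = [
    'LOC = "R2 ',
    'LOC = "R2asd',
    'LOC = "R2121',
    'LOC = "R2   "',
]

# Each entry is True when the pattern finds a match in that string.
matched = [bool(pattern.search(text)) for text in candidates]
print(matched)
```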

Simply put, you can use the closing double quote or the semicolon to mark the end of the search string. You can also use \s to match whitespace, and you can drop the [] around single-element character classes.

r'(LOC)\s+=\s+"?({})"?;'
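A minimal sketch of how this pattern might be used, assuming every LOC constraint ends with a semicolon as in the question's file. The helper name find_loc is hypothetical, and re.escape is an addition that guards against values containing regex metacharacters.

```python
import re

def find_loc(line, loc_value):
    """Return a match object if line assigns exactly loc_value to LOC, else None."""
    # The optional closing quote plus the required semicolon anchor the end
    # of the value, so R2 cannot match the start of R20.
    pattern = r'(LOC)\s+=\s+"?({})"?;'.format(re.escape(loc_value))
    return re.search(pattern, line, re.IGNORECASE)

hit = find_loc('NET "MBC_ADR_I1<1>" LOC = "R2";', "R2")
miss = find_loc('NET "GP_O<7>" LOC = "R20";', "R2")

print(hit.group(2) if hit else None)
print(miss)
```

Unlike the \b approach, this version depends on the line ending in the expected terminator, so a line with a missing semicolon or with spaces before the closing quote would not match.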

Answer 1: score 3, accepted, 2015-06-11 08:46:52
Answer 2: score 1, 2015-06-11 09:08:34
