
Python re doesn't match last capture group

For the following code:

import re

t1 = 'tyler vs ryan'
p1 = re.compile('(.*?) vs (.*?)')
print(p1.findall(t1))

the output is:

[('tyler', '')]

but I would've expected this:

[('tyler', 'ryan')]

I have found that if I add a delimiter I can get it to work:

t2 = 'tyler vs ryan!'               # Notice the exclamation mark
p2 = re.compile('(.*?) vs (.*?)!')  # Notice the exclamation mark
print(p2.findall(t2))

outputs:

[('tyler', 'ryan')]

Is there a way I can get my matches without having a custom delimiter?

(.*?) is non-greedy: it will match the shortest string it can, which here is the empty string (after the vs, at least), because nothing in the pattern follows it.

Try (.*) or ([^ ]*) or something similar.
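
As a rough sketch of the difference, the three variants can be run side by side on the string from the question. The variable names are only illustrative.

```python
import re

text = 'tyler vs ryan'

# Lazy second group: nothing after it forces it to grow, so it matches ''.
lazy = re.findall(r'(.*?) vs (.*?)', text)

# Greedy second group: consumes the rest of the line.
greedy = re.findall(r'(.*?) vs (.*)', text)

# Negated character class: consumes up to the next space or the end of the string.
no_space = re.findall(r'(.*?) vs ([^ ]*)', text)

print(lazy)      # [('tyler', '')]
print(greedy)    # [('tyler', 'ryan')]
print(no_space)  # [('tyler', 'ryan')]
```

Note that ([^ ]*) stops at the first space, so it only suits names without spaces.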

The regex is capturing the shortest string it can; that's what the question mark signifies. So as soon as it has matched the text vs, the second group captures an empty string and the match stops. This is what it looks like:

Direct link: https://regex101.com/r/hO4lM7/2

If you use:

re.compile('(.*?) vs (.*)')

that is, without the second question mark, it will capture the text after vs as well.

No. Try this:

t1 = 'tyler vs ryan'
p1 = re.compile('(.*?) vs (.*?)$')
print(p1.findall(t1))

gives:

[('tyler', 'ryan')]

$ - Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline.
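
To illustrate the MULTILINE remark, here is a small sketch on a hypothetical two-line input (the second pairing is made up for the example). Without the flag, $ only anchors at the end of the whole string, so the first line is skipped.

```python
import re

# Hypothetical two-line input, used only for illustration.
text = 'tyler vs ryan\nalice vs bob'
pattern = r'(.*?) vs (.*?)$'

# Default mode: $ matches only at the very end of the string
# (or just before a trailing newline), so only the last line matches.
default_mode = re.findall(pattern, text)

# MULTILINE: $ also matches before each newline, so every line yields a pairing.
multiline_mode = re.findall(pattern, text, re.MULTILINE)

print(default_mode)    # [('alice', 'bob')]
print(multiline_mode)  # [('tyler', 'ryan'), ('alice', 'bob')]
```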

If you are assured of single-name combatants, you could use a regex like:

r'\s*(\S+)\s*vs\s*(\S+)\s*'

Your use of findall() implies to me that you're expecting to match multiple pairings. If not, you may want to use search() with the ^ and $ regex special characters to bound your search more tightly.
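
A minimal sketch of both options follows. The input with two pairings is hypothetical, and the anchored pattern is one possible tightening rather than the only one.

```python
import re

# findall() with the single-name pattern on a hypothetical multi-pairing input.
many = 'tyler vs ryan alice vs bob'
pairs = re.findall(r'\s*(\S+)\s*vs\s*(\S+)\s*', many)
print(pairs)  # [('tyler', 'ryan'), ('alice', 'bob')]

# search() with ^ and $ when exactly one pairing is expected.
anchored = re.compile(r'^(\S+)\s+vs\s+(\S+)$')
match = anchored.search('tyler vs ryan')
single = match.groups() if match else None
print(single)  # ('tyler', 'ryan')

# Anything beyond a single pairing is rejected: search() returns None.
rejected = anchored.search('tyler vs ryan jones')
print(rejected)  # None
```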

The non-greedy ? is preventing the second word from being captured. It would be better to use:

r'(.*) vs (.*)'
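
One caveat worth sketching: making the first group greedy as well changes where the split happens if vs appears more than once. The second input below is hypothetical.

```python
import re

# On the question's input, the fully greedy pattern gives the expected result.
simple = re.findall(r'(.*) vs (.*)', 'tyler vs ryan')
print(simple)  # [('tyler', 'ryan')]

# Hypothetical input containing ' vs ' twice.
text = 'a vs b vs c'

# Greedy first group: splits at the LAST ' vs '.
greedy_first = re.findall(r'(.*) vs (.*)', text)
print(greedy_first)  # [('a vs b', 'c')]

# Lazy first group: splits at the FIRST ' vs '.
lazy_first = re.findall(r'(.*?) vs (.*)', text)
print(lazy_first)  # [('a', 'b vs c')]
```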
