python regex search findall capturing groups

I just want to get "66664324", the content between ")" and "-". Why does the search method return the ")" and "-" themselves?

import re

a = "(021)66664324-01"
b1 = re.findall(r'\)(.*)-', a)
# b1 == ['66664324']

b2 = re.search(r'\)(.*)-', a).group()
# b2 == ')66664324-'

What is the difference between the two code snippets?

Try printing group(1) from re.search instead of group(). group() returns the whole match, whereas group(1) returns only the text captured by group 1 (the characters matched inside the first pair of parentheses).

>>> import re
>>> a = "(021)66664324-01"
>>> b2 = re.search(r'\)(.*)-', a).group(1)
>>> b2
'66664324'
>>> b2 = re.search(r'\)(.*)-', a).group()
>>> b2
')66664324-'
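
As a small sketch of the same point, the snippet below keeps the match object in a variable (the name `m` is only illustrative, it is not in the question) and reads it in several ways. group() and group(0) both give the whole match, group(1) gives the first captured group, and groups() gives all captured groups as a tuple.

```python
import re

a = "(021)66664324-01"

# Keep the match object so it can be inspected in several ways.
m = re.search(r'\)(.*)-', a)

whole = m.group()        # the entire matched text
also_whole = m.group(0)  # group(0) is the same as group()
first = m.group(1)       # only the text captured by the first (...)
all_groups = m.groups()  # tuple with every captured group

print(whole)       # )66664324-
print(first)       # 66664324
print(all_groups)  # ('66664324',)
```

Note that re.search returns None when nothing matches, so code that handles arbitrary input would normally check `m` before calling group().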

re.findall, on the other hand, gives preference to groups over the whole match, and it returns its results in a list, which search does not. That is why b1=re.findall(r'\)(.*)-',a) gives you the desired output. If the pattern contains a group, re.findall returns only the groups, not the whole match. Only if no groups are present does it return the whole match.

>>> b1 = re.findall(r'\)(.*)-', a)
>>> b1
['66664324']
>>> b1 = re.findall(r'\).*-', a)
>>> b1
[')66664324-']
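
To make the rule a bit more concrete, here is a hedged sketch that runs re.findall on the same string with zero, one, and two capturing groups, plus a non-capturing group. The two-group pattern and the variable names are my own illustration and do not come from the question.

```python
import re

a = "(021)66664324-01"

# No group: each item in the list is the whole match.
no_group = re.findall(r'\).*-', a)

# One group: each item is the text captured by that group.
one_group = re.findall(r'\)(.*)-', a)

# Two groups (illustrative pattern): each item is a tuple
# with one entry per group.
two_groups = re.findall(r'\((\d+)\)(\d+)-', a)

# Non-capturing group (?:...): it groups without capturing,
# so findall returns the whole match again.
non_capturing = re.findall(r'\)(?:.*)-', a)

print(no_group)       # [')66664324-']
print(one_group)      # ['66664324']
print(two_groups)     # [('021', '66664324')]
print(non_capturing)  # [')66664324-']
```

If you need parentheses for grouping but still want findall to return the whole match, the (?:...) form is one way to get that.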

The difference is in the call to group(), which is equivalent to group(0). According to the Python regex manual:

the search() method of patterns scans through the string, so the match may not start at zero in that case

So in your case the result you want is at group index 1. I tried your code with a small modification to the search pattern, and the expected result is at group index 1.

>>> a="(021)66664324-01"
>>> re.search('\\)([0-9]*)',a).group(1)
'66664324'
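
The following sketch separates two things that are easy to mix up: the argument of group() is a group number, while the position of the match in the string comes from start() and span(). It also compares the original `.*` pattern with the `[0-9]*` variant on an input with a second hyphen. That input (`a2`) is a hypothetical example added for illustration, not something from the question.

```python
import re

a = "(021)66664324-01"
# Hypothetical input with a second hyphen, for illustration only.
a2 = "(021)66664324-01-02"

m = re.search(r'\)([0-9]*)', a)
match_start = m.start()  # index in the string where the whole match begins
group1_span = m.span(1)  # (start, end) indices of group 1 in the string

# Greedy .* runs up to the LAST hyphen; [0-9]* stops at the first non-digit.
greedy = re.search(r'\)(.*)-', a2).group(1)
digits = re.search(r'\)([0-9]*)', a2).group(1)

print(match_start)  # 4
print(group1_span)  # (5, 13)
print(greedy)       # 66664324-01
print(digits)       # 66664324
```

On the original string both patterns capture the same text. They only differ once the input contains more than one "-" after the ")".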

 