String pattern Regular Expression python

I am a novice in regular expressions. I have written the following regex to find abababab9 in the given string. The regular expression returns two results, but I was expecting only one.

testing = re.findall(r'((ab)*[0-9])', temp)


**Output**: [('abababab9', 'ab')]

According to my understanding, it should have returned only abababab9. Why has it also returned ab on its own?

You didn't read the findall documentation:

Return a list of all non-overlapping matches in the string.

If one or more capturing groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.

Empty matches are included in the result.

And if you take a look at the re module documentation, capturing groups are subpatterns enclosed in parentheses, like (ab).
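To make the quoted rule concrete, here is a small sketch of how the shape of the findall result changes with the number of capturing groups. The question does not show the value of temp, so the string below is a made-up sample.

```python
import re

# Hypothetical input; the original question does not show the value of temp.
temp = "xx abababab9 yy"

# No capturing groups: findall returns the full matches as strings.
no_groups = re.findall(r'(?:ab)*[0-9]', temp)

# One capturing group: findall returns only what that group captured.
# A repeated group keeps its last iteration, here 'ab'.
one_group = re.findall(r'(ab)*[0-9]', temp)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r'((ab)*[0-9])', temp)

print(no_groups)   # ['abababab9']
print(one_group)   # ['ab']
print(two_groups)  # [('abababab9', 'ab')]
```

The last line reproduces the output from the question: a single match, reported as a tuple with one entry per group.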

If you only want the complete match, you can use one of the following solutions:

re.findall(r'(?:ab)*[0-9]', temp)  # use non-capturing groups

[groups[0] for groups in re.findall(r'((ab)*[0-9])', temp)]  # keep both groups, take the first element of each tuple

[match.group() for match in re.finditer(r'(ab)*[0-9]', temp)] # use finditer
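As a rough check, the three approaches can be run side by side. This is only a sketch: temp is a hypothetical sample string, and the indexing variant assumes the pattern keeps its outer capturing group so that findall yields tuples.

```python
import re

# Hypothetical input; the original question does not show the value of temp.
temp = "xx abababab9 yy"

# 1. Non-capturing group: findall returns the full matches directly.
via_non_capturing = re.findall(r'(?:ab)*[0-9]', temp)

# 2. Two capturing groups: findall returns tuples, and element 0 of each
#    tuple is the outer group, i.e. the complete match.
via_first_group = [groups[0] for groups in re.findall(r'((ab)*[0-9])', temp)]

# 3. finditer yields match objects; group() with no argument is the whole
#    match, regardless of how many capturing groups the pattern has.
via_finditer = [match.group() for match in re.finditer(r'(ab)*[0-9]', temp)]

print(via_non_capturing, via_first_group, via_finditer)
```

All three lists contain the single string 'abababab9' for this input. The finditer variant is the least sensitive to later changes in the pattern's grouping.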

With the parentheses (...) you have defined two capturing groups: the first group is ((ab)*[0-9]) and the second group is (ab). That is why you get these two results. To get only the first group, you can make the second one a non-capturing group. This is done by putting ?: right after its opening parenthesis, so that group's result is no longer returned.

((?:ab)*[0-9])

This one captures only abababab9.
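The group numbering can be inspected on a match object. The following sketch assumes a sample string for temp, since the question does not show one; groups are numbered by the position of their opening parenthesis.

```python
import re

# Hypothetical input; the original question does not show the value of temp.
temp = "xx abababab9 yy"

# Original pattern: two capturing groups.
m = re.search(r'((ab)*[0-9])', temp)
whole = m.group(0)   # the entire match
outer = m.group(1)   # group 1: ((ab)*[0-9])
inner = m.group(2)   # group 2: (ab), last repetition only

# With (?:ab) the inner parentheses still group but no longer capture.
m2 = re.search(r'((?:ab)*[0-9])', temp)
remaining_groups = m2.groups()
group_count = m2.re.groups  # number of capturing groups in the pattern

print(whole, outer, inner)            # abababab9 abababab9 ab
print(remaining_groups, group_count)  # ('abababab9',) 1
```

Because only one capturing group is left, findall with the second pattern returns a flat list of strings instead of tuples.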

Edit 1:

Here is an explanation of the grouping concept of regular expressions: groups and capturing

Use ?: inside the inner parentheses to stop the second group (ab) from capturing:

testing = re.findall(r'((?:ab)*[0-9])', temp)
