
Trouble understanding re.findall() behavior

I'm having some trouble understanding the behavior of re.findall. Quoting from the documentation:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

Based on this, I would expect the following line

re.findall(r"(a)|(b)|(c)", "c")

to produce the result

[(c)]

However, it produces the result

[('', '', 'c')]

I don't understand why the two empty strings are included, since I don't see an empty match anywhere.

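It's because the pattern has three capturing groups. When a pattern contains more than one group, re.findall returns one tuple per match with one entry per group, and any group that did not take part in the match is reported as an empty string. Here only the third group matched, so the first two entries are '':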

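import re

# Three capturing groups, nothing matches: empty list
print(re.findall(r"(a)|(b)|(c)", "d"))
# Three capturing groups: one 3-tuple per match, '' for unused groups
print(re.findall(r"(a)|(b)|(c)", "c"))
# Only non-capturing groups: the whole match is returned as a string
print(re.findall(r"(?:a)|(?:b)|(?:c)", "c"))
# Two capturing groups: one 2-tuple per match
print(re.findall(r"(?:a)|(b)|(c)", "c"))
# A single non-capturing group around the alternation: whole match
print(re.findall(r"(?:a|b|c)", "c"))
# No groups at all: whole match
print(re.findall(r"a|b|c", "c"))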

Output

[]
[('', '', 'c')]
['c']
[('', 'c')]
['c']
['c']
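
If the capturing groups need to stay in the pattern, one possible way to get only the matched text is to iterate over match objects with re.finditer instead of calling re.findall. The following is a minimal sketch, not the only approach; the input string "cab" and the variable names are chosen here purely for illustration.

```python
import re

pattern = re.compile(r"(a)|(b)|(c)")

# group(0) is the whole match, regardless of which alternative matched.
whole_matches = [m.group(0) for m in pattern.finditer("cab")]

# lastindex is the number of the last capturing group that took part
# in the match, so it tells which alternative was used.
matched_groups = [m.lastindex for m in pattern.finditer("cab")]

# For comparison: findall reports every group for every match and
# fills in '' for the groups that did not participate.
tuples = pattern.findall("cab")

print(whole_matches)
print(matched_groups)
print(tuples)
```

Each match object still knows about all three groups, but group(0) sidesteps the tuple-of-groups result that findall builds.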
