
Python regex: search and findall

Below is my code. My understanding is that my pattern says "car" is required and "pet" is optional, i.e. it should match both the word "car" and the word "carpet". re.search matches "carpet", which is fine. But I expected re.findall to output ['carpet', 'car'], and instead it shows ['pet', '']. Where am I going wrong?

import re
string = "carpet and car"
pattern = r'car(pet)?'
print(re.search(pattern, string))
print(re.findall(pattern, string))

Here is the output of the code:

<_sre.SRE_Match object; span=(0, 6), match='carpet'>
['pet', '']

The reason is given in the re documentation for findall():

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
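To make the rule in that passage concrete, here is a small sketch comparing the same search with zero, one, and two capturing groups. The variable names are only illustrative, and the third pattern is a variation added for this comparison, not something from the question.

```python
import re

text = "carpet and car"

# No capturing group: findall returns the full text of each match.
no_group = re.findall(r'car(?:pet)?', text)

# One capturing group: findall returns only what that group captured,
# with '' for a match in which the optional group did not take part.
one_group = re.findall(r'car(pet)?', text)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r'(car(pet)?)', text)

print(no_group)    # ['carpet', 'car']
print(one_group)   # ['pet', '']
print(two_groups)  # [('carpet', 'pet'), ('car', '')]
```

The second call is the one from the question: both matches are found, but only the contents of (pet) are reported for each.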

If you want the result you expect, use finditer().
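A minimal sketch of the finditer() approach, keeping the original pattern unchanged. Each item yielded by finditer() is a match object, and group(0) is the whole matched text regardless of any capturing groups. The variable names here are only illustrative.

```python
import re

text = "carpet and car"
pattern = r'car(pet)?'

# finditer yields match objects; group(0) is the entire match,
# so the capturing group no longer changes what we collect.
matches = [m.group(0) for m in re.finditer(pattern, text)]

print(matches)  # ['carpet', 'car']
```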

Use

pattern = r'car(?:pet)?'

instead. The ?: makes the group non-capturing (see the regex syntax docs), which makes all the difference to findall, because it returns a list of the capturing groups if any are present in your pattern:

>>> re.findall(pattern, "carpet and car")
['carpet', 'car']
