Regular expression to match repeated occurrence of a pattern

I have a few possible input strings like below:

Roll|N/A|300x60|(1x1)|AAA|BBB

Desktop|1x1|(1x1)|AAA|BBB

Desktop|NA|(NA)|AAA|BBB

Roll|N/A|N/A|(1x1)|AAA|BBB

From these, I'm trying to detect patterns of the form \d+x\d+ (e.g., '300x60' and '1x1' from the first line; '1x1' and '1x1' from the second; none from the third; and '1x1' from the last). Could someone show me how to write a Python regular expression search that captures zero, one, or many occurrences of such a pattern in a given string? I already tried the code below, but it captures only a single occurrence (either the first or the second) of the pattern in a given string. Thank you!

r = re.search(r'(\(?\d+x\d+\)?)+', my_str)
r.group()  # only gives me '300x60' for the first input above

You can use re.findall:

import re
my_strs = ["Roll|N/A|300x60|(1x1)|AAA|BBB", "Desktop|1x1|(1x1)|AAA|BBB", "Desktop|NA|(NA)|AAA|BBB", "Roll|N/A|N/A|(1x1)|AAA|BBB"]
print([re.findall(r'\d+x\d+', s) for s in my_strs])
# => [['300x60', '1x1'], ['1x1', '1x1'], [], ['1x1']]

See the IDEONE demo and the regex demo.

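The main point is to use re.findall, which returns every non-overlapping match in the string (or the captured substrings when the pattern contains capturing groups; the pattern I suggest has none). The problem with your attempt is that you tried to match repeated occurrences with a single search operation. A quantified group such as (...)+ can only repeat over adjacent text, and since your substrings are not adjacent (other characters such as | sit between them), the search returned just one of them.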
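To make the difference concrete, here is a small sketch that runs both approaches on the first input string. The variable names and the "glued" sample string are illustrative assumptions, not part of the original question.

```python
import re

s = "Roll|N/A|300x60|(1x1)|AAA|BBB"

# re.search stops at the first match. The quantified group cannot jump
# over the "|" that separates the two occurrences, so it repeats once.
first = re.search(r'(\(?\d+x\d+\)?)+', s)
print(first.group())  # 300x60

# re.findall scans the whole string and returns every non-overlapping match.
all_matches = re.findall(r'\d+x\d+', s)
print(all_matches)  # ['300x60', '1x1']

# With exactly one capturing group, findall returns the group contents
# rather than the whole match, so the optional parentheses are not included.
grouped = re.findall(r'\(?(\d+x\d+)\)?', s)
print(grouped)  # ['300x60', '1x1']

# Hypothetical "glued" input: here the quantified group does repeat,
# but the group itself keeps only its last repetition.
glued = re.search(r'(\d+x\d+\|?)+', "300x60|1x1|AAA")
print(glued.group())   # 300x60|1x1|
print(glued.group(1))  # 1x1|
```

Even in the adjacent case, only the last repetition of the group is retained, so re.findall (or re.finditer, if match positions are needed) is usually the simpler choice for collecting every occurrence.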

You could do it like this, which yields None instead of an empty list when a string has no match:

import re
input_strings = ['Roll|N/A|300x60|(1x1)|AAA|BBB', 'Desktop|1x1|(1x1)|AAA|BBB',
                 'Desktop|NA|(NA)|AAA|BBB', 'Roll|N/A|N/A|(1x1)|AAA|BBB']

# findall returns [] when nothing matches; "or None" turns that into None
print([re.findall(r'\d+x\d+', s) or None for s in input_strings])

Output:

[['300x60', '1x1'], ['1x1', '1x1'], None, ['1x1']]
