
How to use RegEx re.search and group() to get multiple lines of matching data in Python

My text is this:

text =

  [ BP2572 23,
    BP2345 34,
    BP2457 45,
    BP2866 56 ] 

I want to extract just a portion of the text using re.search and group().

What I am hoping for is the following output:

[BP2572, BP2345, BP2457, BP2866]
[23, 34, 45, 56]

I tried this code, but I didn't get quite what I was expecting:

>>> re.findall(r"\s+[A-Z]{2}\d{4} \d{2}",text)

['\nBP2572 23', '\nBP2345 34', '\nBP2457 45', '\nBP2866 56']

>>> re.search(r"\s+([A-Z]{2}\d{4}) (\d{2})",text).group(1)
>>> re.search(r"\s+([A-Z]{2}\d{4}) (\d{2})",text,re.DOTALL).group(1)

'BP2572' #my expected output is [BP2572, BP2345, BP2457, BP2866]

>>> re.search(r"\s+([A-Z]{2}\d{4}) (\d{2})",text).group(2)
>>> re.search(r"\s+([A-Z]{2}\d{4}) (\d{2})",text,re.DOTALL).group(2)

'23' #my expected output is [23, 34, 45, 56]

Here I only got the first match.

How can I use re.search and group() to get all the matching results, not just the first one?

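You cannot use re.search to find multiple matches. re.search returns only the FIRST match found anywhere in the string, and re.match only tries to match at the start of the string, so each gives you at most one match object to call group() on.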

You need to use re.findall or re.finditer for multiple matches of a pattern.
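If you specifically want to keep calling group(), re.finditer is the closest fit: it yields one match object per match, and each of those supports group(1) and group(2) just like the object returned by re.search. A minimal sketch, assuming text holds one "code number" pair per line with no brackets or commas (the sample string and the variable names below are my own, not from the question):

```python
import re

# Assumed sample input: one "code number" pair per line, no brackets or commas.
text = """
BP2572 23
BP2345 34
BP2457 45
BP2866 56
"""

# re.M makes ^ match at the start of every line, not only the start of the string.
pattern = re.compile(r'^[ \t]*([A-Z]{2}\d{4})[ \t]+(\d{2})', flags=re.M)

codes = []
numbers = []
for m in pattern.finditer(text):
    codes.append(m.group(1))    # first capture group: the BPdddd code
    numbers.append(m.group(2))  # second capture group: the two-digit number

print(codes)    # ['BP2572', 'BP2345', 'BP2457', 'BP2866']
print(numbers)  # ['23', '34', '45', '56']
```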

With re.findall, you can capture each whole entry:

 >>> re.findall(r'^[ \t]*([A-Z]{2}\d{4}[ \t]+\d{2})', text, flags=re.M)
 ['BP2572 23', 'BP2345 34', 'BP2457 45', 'BP2866 56']

Or if you only want the first part:

 >>> re.findall(r'^[ \t]*([A-Z]{2}\d{4})[ \t]+\d{2}', text, flags=re.M)
 ['BP2572', 'BP2345', 'BP2457', 'BP2866']

Or only the second part:

 >>> re.findall(r'^[ \t]*[A-Z]{2}\d{4}[ \t]+(\d{2})', text, flags=re.M)
 ['23', '34', '45', '56']

Or both parts:

 >>> re.findall(r'^[ \t]*([A-Z]{2}\d{4})[ \t]+(\d{2})', text, flags=re.M)
 [('BP2572', '23'), ('BP2345', '34'), ('BP2457', '45'), ('BP2866', '56')]
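To get from that list of tuples to the two separate lists the question asks for, you can unpack the pairs afterwards. A short sketch, again assuming text has one entry per line with no brackets or commas (the sample string is an assumption, and converting the numbers with int() is optional, only needed if you want integers rather than strings):

```python
import re

# Assumed sample input: one "code number" pair per line, no brackets or commas.
text = """
BP2572 23
BP2345 34
BP2457 45
BP2866 56
"""

# With two capture groups, findall returns a list of (code, number) tuples.
pairs = re.findall(r'^[ \t]*([A-Z]{2}\d{4})[ \t]+(\d{2})', text, flags=re.M)

codes = [code for code, _ in pairs]
numbers = [int(number) for _, number in pairs]  # int() is optional

print(codes)    # ['BP2572', 'BP2345', 'BP2457', 'BP2866']
print(numbers)  # [23, 34, 45, 56]
```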

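Or you can skip the regex and use simple splits (this assumes every non-empty line contains exactly the two whitespace-separated fields):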

>>> [line.split()[0] for line in text.splitlines() if line]
['BP2572', 'BP2345', 'BP2457', 'BP2866']
>>> [line.split()[1] for line in text.splitlines() if line]
['23', '34', '45', '56']
