
Splitting multiple strings using regular expression

[Delta-1234, United-1345] Testing different airlines
[Delta-1234] Testing different airlines

I want to get Delta-1234 and United-1345 in the first case and just Delta-1234 in the second. Is this possible using re.findall?

Do you really need regular expressions? You can just slice out the text between the brackets [ and ] (note that this slice keeps the brackets themselves):

x = lambda s: s[s.index('['):s.index("]")+1]

string1 = "[Delta-1234, United-1345] Testing different airlines"
string2 = "[Delta-1234] Testing different airlines"

print(x(string1))
print(x(string2))

Output:

[Delta-1234, United-1345]
[Delta-1234]
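The slice above returns the bracketed text as one string, brackets included. If the individual items are wanted, a minimal sketch without regular expressions could look like the following. The helper name bracket_items is hypothetical, and it assumes that only the first [...] pair matters and that items are separated by commas.

```python
def bracket_items(text):
    """Return the comma-separated items inside the first [...] pair."""
    start = text.find("[")
    end = text.find("]", start + 1)
    if start == -1 or end == -1:
        return []  # no complete bracket pair found
    inner = text[start + 1:end]  # text between the brackets, brackets excluded
    # Split on commas and drop surrounding whitespace and empty pieces.
    return [item.strip() for item in inner.split(",") if item.strip()]


print(bracket_items("[Delta-1234, United-1345] Testing different airlines"))
print(bracket_items("[Delta-1234] Testing different airlines"))
```

Unlike s.index, str.find returns -1 instead of raising ValueError, so a string with no brackets yields an empty list here.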

If you want to use a regular expression, just match [, and then (greedily) capture one or more characters that are not ]:

>>> import re
>>> regex = re.compile(r"\[([^\]]+)")
>>> re.findall(regex, "[Delta-1234, United-1345] Testing different airlines")
['Delta-1234, United-1345']
>>> re.findall(regex, "[Delta-1234] Testing different airlines")
['Delta-1234']

Or use a lookbehind, which asserts that a [ precedes the match without including it:

>>> regex = re.compile(r"(?<=\[)[^\]]+")
>>> re.findall(regex, "[Delta-1234, United-1345] Testing different airlines")
['Delta-1234, United-1345']
>>> re.findall(regex, "[Delta-1234] Testing different airlines")
['Delta-1234']
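Both patterns above return the bracket contents as a single string, so the two codes in the first case are still joined by a comma. If findall should return each code separately, one option is to match the codes themselves. This is only a sketch under an assumption that is not stated in the question: every code is letters, a hyphen, then digits, and nothing of that shape appears outside the brackets. The names TOKEN and find_codes are made up for illustration.

```python
import re

# Assumed token shape: letters, a hyphen, then digits (e.g. "Delta-1234").
TOKEN = re.compile(r"[A-Za-z]+-\d+")


def find_codes(text):
    """Return every token in text that matches the assumed shape."""
    # Note: this ignores the brackets entirely and scans the whole string.
    return TOKEN.findall(text)


print(find_codes("[Delta-1234, United-1345] Testing different airlines"))
print(find_codes("[Delta-1234] Testing different airlines"))
```

If matching text can also occur after the closing bracket, extract the bracket contents first and run findall on that part only.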

Another way to achieve this using regex is:

import re

str1 = "[Delta-1234, United-1345] Testing different airlines"
str2 = "[Delta-1234] Testing different airlines"

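# Skip any text before the first "[", then capture everything up to the next "]".
regex_pattern = r"[^[]*\[([^]]*)\]"

# group(1) is the captured bracket contents; re.match returns None when nothing matches.
print(re.match(regex_pattern, str1).group(1))
print(re.match(regex_pattern, str2).group(1))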

It will print

Delta-1234, United-1345
Delta-1234
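One caveat with calling a method directly on the result of re.match: when the string contains no bracket pair, re.match returns None and the call raises AttributeError. A hedged variant that checks for this is sketched below. It uses re.search, so the leading "skip everything before [" part of the pattern is not needed. The names BRACKETED and first_bracket_content are hypothetical.

```python
import re

# Capture whatever sits between the first "[" and the next "]".
BRACKETED = re.compile(r"\[([^\]]*)\]")


def first_bracket_content(text):
    """Return the text inside the first [...] pair, or None if there is none."""
    match = BRACKETED.search(text)  # search scans the whole string, unlike match
    return match.group(1) if match else None


print(first_bracket_content("[Delta-1234, United-1345] Testing different airlines"))
print(first_bracket_content("no brackets here"))
```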

Given:

s='''\
[Delta-1234, United-1345] Testing different airlines
[Delta-1234] Testing different airlines'''

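You can use findall to grab the contents of each bracket pair and then split each result on ', ' (this assumes re has been imported):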

>>> [e.split(', ') for e in re.findall(r'\[([^]]+)\]', s)]
[['Delta-1234', 'United-1345'], ['Delta-1234']]
