
Regex capture between certain characters

I'm quite new to Python and regex. I'm almost there, but I still haven't been able to fix this issue after 6 hours. Hopefully someone can help.

My string is as follows:

str_1 = '& peers & & apples & & lemon juice & & Strawberries & & Mellon & '

I would like a new list that contains ['peers','apples','lemon juice','Strawberries','Mellon'], that is, without the surrounding whitespace and the & signs.

My code is as follows:

list_1 = re.compile(r'(?<=&)(.*?)(?=&)').findall(str_1)

However, I get something like this:

list_1 = [' peers ', ' ', ' apples ', ' ', ' lemon juice ', ' ', ' Strawberries ', ' ', ' Mellon ']

Can someone please help to get:

['peers','apples','lemon juice','Strawberries','Mellon']

You don't need regexes for this:

>>> str_1 =  '& peers & & apples & & lemon juice & & Strawberries & & Mellon &'
>>> ls = [x.strip() for x in str_1.split('&')]
>>> ls = [x for x in ls if x]
>>> ls
['peers', 'apples', 'lemon juice', 'Strawberries', 'Mellon']

If you still want a regex, then:

>>> re.findall(r'[^& ][^&]*[^& ]', str_1)
['peers', 'apples', 'lemon juice', 'Strawberries', 'Mellon']
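
One caveat with this pattern, as far as I can tell: `[^& ]` has to match once at each end, so every item needs at least two characters. A small check follows, using a made-up input `sample` that is not from the question; the variable names `strict` and `relaxed` are only for illustration.

```python
import re

# Hypothetical input containing a one-character item ("a"); not from the question.
sample = '& a & bc & lemon juice &'

# The pattern needs a first AND a last non-space character, so "a" is skipped.
strict = re.findall(r'[^& ][^&]*[^& ]', sample)
print(strict)   # ['bc', 'lemon juice']

# Making the tail optional lets single-character items match as well.
relaxed = re.findall(r'[^& ](?:[^&]*[^& ])?', sample)
print(relaxed)  # ['a', 'bc', 'lemon juice']
```

If single-character items cannot occur in your data, the shorter pattern is fine as it is.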

If you have to use a regex, you can use

re.findall(r'[^&\s]+(?:[^&]*[^&\s])?', str_1)

See the regex demo. Details:

  • [^&\s]+ - one or more chars other than & and whitespace
  • (?:[^&]*[^&\s])? - an optional sequence of any chars other than & and then a char other than a & or whitespace.
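
To make the breakdown easier to follow, here is a sketch of the same pattern written with `re.VERBOSE`, so each part carries its own comment. The name `ITEM` is my own and not part of the answer.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
# In verbose mode, whitespace outside character classes is ignored
# and '#' starts a comment.
ITEM = re.compile(r"""
    [^&\s]+          # one or more chars that are neither '&' nor whitespace
    (?:              # optional group:
        [^&]*        #   any run of chars other than '&' (spaces allowed)
        [^&\s]       #   ending on a char that is neither '&' nor whitespace
    )?
""", re.VERBOSE)

str_1 = "& peers & & apples & & lemon juice & & Strawberries & & Mellon & "
items = ITEM.findall(str_1)
print(items)
# => ['peers', 'apples', 'lemon juice', 'Strawberries', 'Mellon']
```

Because the optional group must end on a non-whitespace char, the trailing space before each & is never included in a match.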

See the Python demo:

import re
str_1 = "& peers & & apples & & lemon juice & & Strawberries & & Mellon & "
print( re.findall(r'[^&\s]+(?:[^&]*[^&\s])?', str_1) )
# => ['peers', 'apples', 'lemon juice', 'Strawberries', 'Mellon']

A non-regex solution can look like

[x.strip() for x in str_1.split('&') if x.strip()]

See this Python demo. Here, the string is split on & chars, each piece has its leading/trailing whitespace stripped, and only the pieces that are non-empty after stripping are kept.
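
If you need this in more than one place, the non-regex approach could be wrapped in a small helper. This is only a sketch: the function name `split_items` and its `sep` parameter are my own choices, not something from the question.

```python
def split_items(text, sep='&'):
    """Split text on sep, trim each piece, and drop pieces that end up empty."""
    # Strip each piece once, then filter, so strip() is not called twice per item.
    stripped = (piece.strip() for piece in text.split(sep))
    return [piece for piece in stripped if piece]


str_1 = '& peers & & apples & & lemon juice & & Strawberries & & Mellon & '
print(split_items(str_1))
# => ['peers', 'apples', 'lemon juice', 'Strawberries', 'Mellon']
```

Note that `strip()` with no arguments also removes tabs and newlines, which matches the `\s` used in the regex version.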

