Regex to remove specific words in Python

I want to do some string manipulation using regex in Python.

The input is +1223,+12_remove_me,+222,+2223_remove_me and the output should be +1223,+222

The output should contain only the comma-separated words that don't contain _remove_me, with exactly one comma between each word.

Note: The regexes I tried, \\+([0-9|+]*)_ and \\+([0-9|+]*), along with some other combinations, did not give the required output.

Note 2: I can't use a loop; I need to do this with regex only.

Your regex seems incomplete, but you were on the right track. Note that a pipe symbol inside a character class is treated as a literal, so your [0-9|+] matches a digit, a |, or a +.
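To make the point about the pipe concrete, here is a small sketch. The sample string is a made-up value chosen only to contain a pipe; it is not from the question.

```python
import re

# Inside [...], "|" is an ordinary character, not alternation.
pipe_class = re.compile(r'[0-9|+]')

sample = "a1|+b"  # hypothetical sample, chosen to contain a pipe
matches = pipe_class.findall(sample)
print(matches)  # ['1', '|', '+']

# Dropping the pipe leaves a class that matches only digits and "+".
clean_class = re.compile(r'[0-9+]')
print(clean_class.findall(sample))  # ['1', '+']
```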

You may use

,?\+\d+_[^,]+

See the regex demo

Explanation:

  • ,? - an optional , (optional because the "word" may be at the beginning of the string, with no comma before it)
  • \+ - a literal +
  • \d+ - 1+ digits
  • _ - a literal underscore
  • [^,]+ - 1+ characters other than ,
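The same breakdown can be written directly into the pattern with re.VERBOSE, which ignores whitespace and allows comments inside the regex. This is only an annotated restatement of the pattern above; the names annotated, text and cleaned are illustrative.

```python
import re

# The pattern from above, one component per line (re.VERBOSE ignores
# the whitespace and treats "#" as the start of a comment).
annotated = re.compile(r"""
    ,?      # optional leading comma
    \+      # a literal plus sign
    \d+     # one or more digits
    _       # a literal underscore
    [^,]+   # one or more characters other than a comma
""", re.VERBOSE)

text = "+1223,+12_remove_me,+222,+2223_remove_me"
cleaned = annotated.sub("", text)
print(cleaned)  # +1223,+222
```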

Python demo:

import re
p = re.compile(r',?\+\d+_[^,]+')
test_str = "+1223,+12_remove_me,+222,+2223_remove_me"
result = p.sub("", test_str)
print(result)
# => +1223,+222
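One edge case worth checking: if the very first item is the one being removed, there is no comma in front of it to consume, so the comma after it survives as a leading comma. The input below is a hypothetical variation on the question's string, and stripping commas afterwards is one possible cleanup, not part of the original answer.

```python
import re

p = re.compile(r',?\+\d+_[^,]+')

# Hypothetical input whose first item is the one being removed.
edge_case = "+12_remove_me,+222,+2223_remove_me"

raw = p.sub("", edge_case)
print(repr(raw))  # ',+222'  (a leading comma is left behind)

# Possible cleanup: strip stray commas from both ends of the result.
tidy = raw.strip(",")
print(tidy)  # +222
```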

In your case you need a regex with negation

[^(_remove_me)]

Demo
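A caveat on the pattern above: [^(_remove_me)] is a negated character class, so it matches any single character that is not one of (, ), _, r, e, m, o, v. It does not negate the substring _remove_me as a whole. The sketch below shows that behaviour and, as one possible alternative that is not from the original answer, a negative lookahead that keeps only the items without _remove_me.

```python
import re

s = "+1223,+12_remove_me,+222,+2223_remove_me"

# Removing everything the character class matches leaves only the
# characters listed inside the brackets.
leftover = re.sub(r'[^(_remove_me)]', '', s)
print(leftover)  # _remove_me_remove_me

# Alternative sketch: start at the beginning of an item, refuse items
# that contain _remove_me before the next comma, then take the item.
kept = re.findall(r'(?:^|(?<=,))(?![^,]*_remove_me)[^,]+', s)
print(",".join(kept))  # +1223,+222
```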

A non-regex approach would involve using str.split() and excluding items ending with _remove_me:

>>> s = "+1223,+12_remove_me,+222,+2223_remove_me"
>>> items = [item for item in s.split(",") if not item.endswith("_remove_me")]
>>> items
['+1223', '+222']

Or, if _remove_me can be present anywhere inside each item, use not in:

>>> items = [item for item in s.split(",") if "_remove_me" not in item]
>>> items
['+1223', '+222']

You can then use str.join() to join the items into a string again:

>>> ",".join(items)
'+1223,+222'
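The two filters only differ when the marker is not at the end of an item. A short sketch with a hypothetical input (the item +12_remove_me_later is invented for illustration) shows the difference:

```python
# Hypothetical input: the marker sits in the middle of the second item.
s2 = "+1223,+12_remove_me_later,+222"

# endswith() only drops items where the marker is the suffix.
by_suffix = [item for item in s2.split(",") if not item.endswith("_remove_me")]

# "not in" drops items that contain the marker anywhere.
by_substring = [item for item in s2.split(",") if "_remove_me" not in item]

print(",".join(by_suffix))     # +1223,+12_remove_me_later,+222
print(",".join(by_substring))  # +1223,+222
```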

You could perform this without a regex, just using string manipulation. The following can be written as a one-liner, but has been expanded for readability.

my_string = '+1223,+12_remove_me,+222,+2223_remove_me'  # define the string
my_list = my_string.split(',')  # create a list of words
my_list = [word for word in my_list if '_remove_me' not in word]  # stop here if you want a list of words
output_string = ','.join(my_list)  # join the remaining words back into one string
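For reference, the one-liner form mentioned above might look like this, using a generator expression inside str.join():

```python
my_string = '+1223,+12_remove_me,+222,+2223_remove_me'

# Split, filter and re-join in a single expression.
output_string = ','.join(word for word in my_string.split(',') if '_remove_me' not in word)
print(output_string)  # +1223,+222
```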


 