
How to remove a word from a string if it contains a specific character?

OK, let's say my string contains the following: "$$$FOOD WATERMELON". How would I remove "$$$FOOD" from the string?

And let's say I have strings in a list:

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store"]

My current code splits each element of the list, then iterates through the resulting inner lists and removes any element that contains "$$$". That would work fine, except that .split() splits on every word, so the list ends up looking like this: [["WATERMELON"], ["I", "walked", "to", "the", "local", "store"]]. That is not optimal, because I would then have to join the elements of each inner list again, which takes more time.

Basically, the only thing I am wondering is: how do I remove a word from a string if it contains "$$$"? So this string: word = "$$$STORY I walked to the store" would become "I walked to the store".

Sorry if this was confusing

You can try this:

>>> word = "$$$STORY I walked to the store"
>>> if word.startswith('$', 0, word.index(' ')):
...     word = word[word.index(' ') + 1:]
...
>>> word
'I walked to the store'

Here is how it looks using the list and the function:


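def check_word(word):
    # Drop the first word when the string starts with '$' and has a space after it.
    if ' ' in word and word.startswith('$'):
        return word[word.index(' ') + 1:]
    # Otherwise return the string unchanged, so the result never contains None.
    return word

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store"]
list2 = [check_word(x) for x in data_list]
print(list2)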
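
If the tag is always the first word, one possible variation is to split only once, which avoids breaking the rest of the sentence into words and rejoining them. This is just a sketch: the helper name strip_leading_tag and the extra "no tag here" test string are made up for illustration, and it assumes that strings without a tag should come back unchanged.

```python
def strip_leading_tag(text, marker="$$$"):
    """Drop the first word of text if it contains marker (hypothetical helper)."""
    # maxsplit=1 splits only at the first run of whitespace,
    # so the remainder of the sentence stays in one piece.
    parts = text.split(None, 1)
    if parts and marker in parts[0]:
        # Return the remainder, or an empty string if the tag was the only word.
        return parts[1] if len(parts) > 1 else ""
    return text

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store", "no tag here"]
cleaned = [strip_leading_tag(s) for s in data_list]
print(cleaned)  # ['WATERMELON', 'I walked to the local store', 'no tag here']
```

Because only one split happens per string, there is no list of words to join afterwards.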

Try this one.

import re

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store"]
output = [re.sub(r'\s*[$]+\w+\s*', '', x) for x in data_list]

This gave me the following output:

['WATERMELON', 'I walked to the local store']

Here's regex definition:

\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
    * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
[$] matches a single character present in the list
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    $ matches the character $ (code 36 in decimal, 24 in hexadecimal, 44 in octal) literally (case sensitive)
\w matches any word character (equivalent to [a-zA-Z0-9_])
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
    * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
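
One thing to watch with this pattern: because it consumes the whitespace on both sides of the tag, a tagged word in the middle of a string would leave its neighbours glued together ("I like $$$FOOD today" becomes "I liketoday"). A possible variation, sketched below, removes only the tagged token and then normalises the spacing. The name remove_tagged_words and the pattern are my own assumptions, not part of the answer above, and note that this version also collapses any other runs of whitespace into single spaces.

```python
import re

# Assumed pattern: any whitespace-delimited token that contains "$$$".
TAGGED_WORD = re.compile(r"\S*\${3}\S*")

def remove_tagged_words(text):
    """Remove every token containing "$$$" (hypothetical helper)."""
    without_tags = TAGGED_WORD.sub("", text)
    # split() with no arguments drops leading, trailing and repeated
    # whitespace, so joining with one space tidies up the gaps left behind.
    return " ".join(without_tags.split())

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store"]
output = [remove_tagged_words(x) for x in data_list]
print(output)  # ['WATERMELON', 'I walked to the local store']
print(remove_tagged_words("I like $$$FOOD today"))  # I like today
```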

This code removes every word that contains "$$$", wherever it appears in the string, not just the first word as in some of the other answers.

Output:

['WATERMELON', 'I walked to the local store']

Code:

data_list = ["$$$FOOD WATERMELON", "$$$STORY I walked to the local store"]

for i, text in enumerate(data_list):
    # Keep only the words that do not contain "$$$", then rejoin them.
    kept = [w for w in text.split(" ") if "$$$" not in w]
    data_list[i] = " ".join(kept)

print(data_list)


 