
In Python, how to delete some words in a string according to a list?

This is what I came up with before getting stuck (NB: the source of the text is The Economist):

import random
import re

text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'

nbofwords = len(text.split())

words = text.split()

randomword = random.choice(words)
randomwordstr = str(randomword)

Step 1 works: delete the random word from the original text

replaced1 = re.sub(randomwordstr, '', text)
replaced2 = re.sub('  ', ' ', replaced1)

Step 2 works: select a defined number of random words

nbofsamples = 3
randomitems = random.choices(population=words, k=nbofsamples)

gives, e.g., ['over', 'consultant', 'One']
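One thing to be aware of at this step: random.choices draws with replacement, so the same word can appear more than once in the result. If distinct picks are wanted, random.sample may be a better fit. A minimal sketch, reusing the names from the question (the fixed seed is an assumption, only there to make the run repeatable):

```python
import random

text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'
words = text.split()
nbofsamples = 3

random.seed(0)  # assumption: fixed seed, only so the example is repeatable

# random.choices samples WITH replacement: the same word may be drawn twice
with_replacement = random.choices(population=words, k=nbofsamples)

# random.sample samples WITHOUT replacement: k distinct positions in the list
without_replacement = random.sample(words, k=nbofsamples)

print(with_replacement)
print(without_replacement)
```

Since every word in this particular sentence is unique, random.sample here always returns three different words.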

Step 3 works: delete from the original text one element of that list of random words, using its index

replaced3 = re.sub(randomitems[1], '', text)
replaced4 = re.sub('  ', ' ', replaced3)

deletes the word 'consultant'

Step 4 fails: delete from the original text all the elements of that list of random words, using their indices.

The best I can figure out is:

replaced5 = re.sub(randomitems[0],'',text)
replaced6 = re.sub(randomitems[1],'',replaced5)
replaced7 = re.sub(randomitems[2],'',replaced6)
replaced8 = re.sub('  ', ' ', replaced7)
print(replaced8)

It works (all 3 words have been deleted), but it is clumsy and inefficient (I would have to rewrite it if I changed the nbofsamples variable).

How can I iterate over my list of random words (step 2) to delete those words from the original text?

Thanks in advance

To delete the words in a list from a string, just use a for-loop. It iterates through the list, assigning each item in turn to a variable of your choice (here I used "i", but it can be any valid variable name), and runs the code in the loop body once per item until the list is exhausted. Here's the bare-bones version of a for-loop:

items = ['first', 'second', 'third']  # avoid naming it "list": that shadows the built-in
for i in items:
    print(i)

In your case you want to remove the words in the list from a string, so just plug the variable "i" into the same method you've been using to remove the words. You also need a variable that is updated on every pass; otherwise each pass would start again from the original string and only the last word in the list would end up removed. After the loop you can print the output. This code will work for a list of any length.

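# assumes re, text and randomitems as defined in the question
r = text  # start from the original text, not from a partially edited copy
for i in randomitems:
    # re.escape keeps characters such as '.' (as in 'money.') literal
    r = re.sub(re.escape(i), '', r)
r = re.sub(' +', ' ', r)  # collapse the runs of spaces left behind
print(r)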
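A possible alternative to calling re.sub once per word is to build one pattern from the whole list. The sketch below is not from the original posts: remove_words is a hypothetical helper name, and the lookarounds (?<!\S) and (?!\S) are one way to make sure only whole whitespace-separated words are deleted, so that a short word such as 'a' is not stripped out of 'calculation'.

```python
import re

text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'
randomitems = ['over', 'consultant', 'One']  # the example list from step 2


def remove_words(text, words):
    """Hypothetical helper: delete every whole-word occurrence of `words`."""
    # re.escape makes regex metacharacters (e.g. the '.' in 'money.') literal
    alternation = '|'.join(re.escape(w) for w in words)
    # (?<!\S) / (?!\S): the match must be bounded by whitespace or the string edges
    pattern = re.compile(r'(?<!\S)(?:' + alternation + r')(?!\S)')
    # split() + join() collapses the leftover spaces and trims the ends
    return ' '.join(pattern.sub('', text).split())


print(remove_words(text, randomitems))
```

Because the pattern is built from the list, nothing has to be rewritten when nbofsamples changes.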

Note that as long as you are not using regular-expression patterns but just replacing simple strings with others (or with nothing), you don't need re:

for r in randomitems:
    text = text.replace(r, '')
print(text)
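One caveat worth keeping in mind: str.replace works on substrings, so a short word is also removed from inside longer words. If only whole words should go, filtering the split tokens is one option. A small sketch, where the picks in randomitems are hypothetical and chosen to show the difference:

```python
text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'
randomitems = ['a', 'over']  # hypothetical picks; 'a' is one of the words in the text

# str.replace matches substrings: every letter 'a' disappears, even inside other words
by_replace = text
for r in randomitems:
    by_replace = by_replace.replace(r, '')

# Filtering the tokens removes whole words only (and every occurrence of them)
to_remove = set(randomitems)
by_tokens = ' '.join(w for w in text.split() if w not in to_remove)

print(by_replace)
print(by_tokens)
```

The token version also leaves no double spaces behind, so no clean-up pass is needed.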

To replace only the first occurrence, you can simply pass the desired number of occurrences to the replace function:

text = text.replace(r, '', 1)
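The effect of that third argument is easier to see on a string with a repeated word. The sentence below is a made-up example, not taken from the question:

```python
sentence = 'one by one'  # hypothetical string with a repeated word

first_only = sentence.replace('one', '', 1)  # count=1: only the first match is removed
all_matches = sentence.replace('one', '')    # no count: every match is removed

print(repr(first_only))
print(repr(all_matches))
```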
