
In Python, how to delete some words in a string according to a list?

This is what I came up with before getting stuck (NB: the source of the text is The Economist):

import random
import re

text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'

nbofwords = len(text.split())

words = text.split()

randomword = random.choice(words)
randomwordstr = str(randomword)

Step 1 works: Delete the random word from the original text

replaced1 = re.sub(randomwordstr, '', text)
replaced2 = re.sub('  ', ' ', replaced1)
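One caveat with Step 1: re.sub treats the word as a regular expression and matches it anywhere, including inside other words. If the random word happens to be 'a', every letter a in the sentence is removed. A minimal sketch of a whole-word deletion, assuming words are separated by whitespace (the helper name delete_token is hypothetical, not part of the question):

```python
import re

text = ('One calculation by a film consultant implies that half of Hollywood '
        'productions with budgets over one hundred million dollars lose money.')

def delete_token(source, token):
    """Delete every whitespace-delimited occurrence of token from source."""
    # re.escape neutralises regex metacharacters such as the '.' in 'money.'.
    # The lookarounds require whitespace (or the string edge) on both sides,
    # so 'a' no longer matches the letter a inside 'calculation'.
    pattern = r'(?<!\S)' + re.escape(token) + r'(?!\S)'
    stripped = re.sub(pattern, '', source)
    # Collapse the run of spaces left behind and trim the ends.
    return re.sub(r'\s+', ' ', stripped).strip()

print(delete_token(text, 'a'))
```

The lookarounds are used instead of \b because \b does not match after a trailing period such as the one in 'money.'.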

Step 2 works: Select a defined number of random words

nbofsamples = 3
randomitems = random.choices(population=words, k=nbofsamples)

gives, e.g., ['over', 'consultant', 'One']
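Note that random.choices draws with replacement, so the same word can be picked more than once. If three different words are wanted, random.sample draws without replacement. A small sketch, reusing the variable names from the question:

```python
import random

text = ('One calculation by a film consultant implies that half of Hollywood '
        'productions with budgets over one hundred million dollars lose money.')
words = text.split()
nbofsamples = 3

# random.sample picks nbofsamples distinct positions from the list,
# so no position is chosen twice (random.choices may repeat one).
randomitems = random.sample(words, k=nbofsamples)
print(randomitems)
```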

Step 3 works: Delete from the original text one element of that list of random words, using its index

replaced3 = re.sub(randomitems[1], '', text)
replaced4 = re.sub('  ', ' ', replaced3)

deletes the word 'consultant'

Step 4 fails: Delete from the original text all the elements of that list of random words, using their indices. The best I can figure out is:

replaced5 = re.sub(randomitems[0],'',text)
replaced6 = re.sub(randomitems[1],'',replaced5)
replaced7 = re.sub(randomitems[2],'',replaced6)
replaced8 = re.sub('  ', ' ', replaced7)
print(replaced8)

It works (all 3 words have been deleted), but it is clumsy and inefficient (I would have to rewrite it if I changed the nbofsamples variable).

How can I iterate over my list of random words (Step 2) to delete those words from the original text?

Thanks in advance

To delete the words in a list from a string, just use a for loop. It iterates through each item in the list, assigning the item's value to a loop variable of your choice (here I used "i", but any valid variable name works), and executes the loop body until the list has no more items. Here's the bare-bones version of a for loop:

items = []  # named "items" to avoid shadowing the built-in list
for i in items:
    print(i)

In your case you want to remove the words in the list from a string, so just plug the variable "i" into the same method you've been using to remove the words. You also need a variable that is updated on every pass; otherwise the loop would only remove the last word in the list from the string. After that you can print the output. This code will work with a list of any length.

r = text  # start from the original text
for i in randomitems:
    replaced4 = re.sub(i, '', r)
    r = replaced4  # carry the result into the next iteration
print(replaced4)
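As a possible variation on the loop, all the words can also be deleted in one pass by joining them into a single alternation pattern. This is only a sketch: it assumes words are separated by whitespace and uses the example list from the question in place of a random draw.

```python
import re

text = ('One calculation by a film consultant implies that half of Hollywood '
        'productions with budgets over one hundred million dollars lose money.')
randomitems = ['over', 'consultant', 'One']  # example list from the question

# Build one alternation such as (?:over|consultant|One); re.escape keeps
# punctuation like the '.' in 'money.' literal.
alternation = '|'.join(re.escape(word) for word in randomitems)
pattern = re.compile(r'(?<!\S)(?:' + alternation + r')(?!\S)')

# Delete all listed words in a single pass, then tidy the leftover spaces.
result = re.sub(r'\s+', ' ', pattern.sub('', text)).strip()
print(result)
```

Because the match is case-sensitive, 'One' is removed while 'one' in 'one hundred' stays.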

Note that as long as you are not using any regular expressions and only replace simple strings with others (or with nothing), you don't need re:

for r in randomitems:
    text = text.replace(r, '')
print(text)
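Bear in mind that str.replace also removes the word where it occurs inside a longer word, and it leaves double spaces behind. One possible alternative, assuming words are separated by whitespace, is to filter the split list and join it again (the list below is the example from the question):

```python
text = ('One calculation by a film consultant implies that half of Hollywood '
        'productions with budgets over one hundred million dollars lose money.')
randomitems = ['over', 'consultant', 'One']  # example list from the question

# A set makes the membership test fast for longer word lists.
unwanted = set(randomitems)

# Keep only the words that are not in the list, then rebuild the sentence
# with single spaces between the remaining words.
kept = [word for word in text.split() if word not in unwanted]
result = ' '.join(kept)
print(result)
```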

To replace only the first occurrence, you can simply pass the desired number of replacements as the third argument of replace:

text = text.replace(r, '', 1)
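A tiny illustration of the difference, using a made-up sentence rather than the text from the question:

```python
sentence = 'one by one'  # made-up example string

all_removed = sentence.replace('one', '')       # every occurrence is removed
first_removed = sentence.replace('one', '', 1)  # only the first occurrence

print(repr(all_removed))
print(repr(first_removed))
```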


