Remove all words from a string that exist in a list
I need to write a function that goes through a string and checks whether each word exists in a list. If the word is in the removal list, the function should remove it; otherwise it should leave the word alone.
I wrote this:
def remove_make(x):
    a = x.split()
    for word in a:
        if word in remove: # True
            a = a.remove(word)
        else:
            pass
        return a
But it returns the string with the words from the removal list still in it. Any idea how I can achieve this?
A terser way of doing this is to build a regex alternation from the list of words to remove, and then do a single regex substitution:
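import re

inp = "one two three four"
remove = ['two', 'four']
# Escape each word so regex metacharacters in the list are treated literally
regex = r'\s*(?:' + '|'.join(re.escape(w) for w in remove) + r')\s*'
out = re.sub(regex, ' ', inp).strip()
print(out) # prints 'one three'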
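The alternation above matches anywhere in the text, so a word from the list is also removed when it occurs inside a longer word (for instance 'two' inside 'twofold'). A minimal sketch of a stricter variant, assuming whole-word matching is what is wanted; the helper name remove_whole_words is made up for illustration and is not part of the original answer:

```python
import re

def remove_whole_words(text, words):
    """Remove whole-word occurrences of `words` from `text` (hypothetical helper)."""
    if not words:
        return text
    # re.escape treats each word literally; \b limits matches to whole words.
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'
    stripped = re.sub(pattern, '', text)
    # Collapse the whitespace left behind by the removed words.
    return ' '.join(stripped.split())

print(remove_whole_words("one two three four", ['two', 'four']))  # one three
print(remove_whole_words("twofold two", ['two']))                 # twofold
```

Rejoining with split() also normalizes runs of whitespace in the input, which may or may not be desirable depending on the text.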
You can try something simpler:
import re
remove_list = ['abc', 'cde', 'edf']
string = 'abc is walking with cde, wishing good luck to edf.'
''.join([x for x in re.split(r'(\W+)', string) if x not in remove_list])
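And the result would be:
' is walking with , wishing good luck to .'
(The separators next to the removed words are kept, which is why a space remains before the comma and the period.)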
The important part is the last line:
''.join([x for x in re.split(r'(\W+)', string) if x not in remove_list])
What it does: re.split(r'(\W+)', string) splits the string on runs of non-word characters and, because the pattern is a capturing group, keeps those separators in the resulting list; the list comprehension drops every piece that appears in remove_list; ''.join(...) glues the remaining pieces back together.
The BNF notation for list comprehensions and a little more information on them may be found in the Python documentation.
PS: Of course, you can make this a little more readable if you break the one-liner into pieces: assign the result of re.split(r'(\W+)', string) to a variable and decouple the join from the comprehension.
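The decomposition suggested in the PS might look like the sketch below. The variable names pieces, kept, and result are illustrative and not from the original answer:

```python
import re

remove_list = ['abc', 'cde', 'edf']
string = 'abc is walking with cde, wishing good luck to edf.'

# Split on runs of non-word characters; the capturing group keeps the separators.
pieces = re.split(r'(\W+)', string)
# Keep every piece (word or separator) that is not in the removal list.
kept = [piece for piece in pieces if piece not in remove_list]
# Glue the surviving pieces back together.
result = ''.join(kept)
print(repr(result))  # ' is walking with , wishing good luck to .'
```

Because the separators are preserved, a space is left behind wherever a word was removed, including before punctuation.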
list.remove(x) returns None and modifies the list in place by removing the first occurrence of x (it raises ValueError if x is not in the list). When you do
a = a.remove(word)
you are effectively storing None in a. That would raise an exception on the next iteration, when a.remove(word) is called again (None.remove(word) is invalid), but you never get that far because you return immediately after the conditional. That is also wrong: the return belongs after the loop has finished, outside its body. This is how your function should look (without modifying a list while iterating over it):
remove_words = ["abc", ...] # your list of words to be removed

def remove_make(x):
    a = x.split()
    temp = a[:]
    for word in temp:
        if word in remove_words: # True
            a.remove(word)
    # no need for 'else'; also, 'return' goes outside the loop's scope
    return " ".join(a)
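To see why the function iterates over a copy (temp = a[:]), the sketch below contrasts removing from the list being iterated with removing while iterating over a copy. The function names and sample data are invented for illustration:

```python
def remove_while_iterating(words, to_remove):
    # Buggy on purpose: mutating `words` during iteration makes the loop skip
    # the element that slides into the slot of the one just removed.
    for word in words:
        if word in to_remove:
            words.remove(word)
    return words

def remove_via_copy(words, to_remove):
    # Iterate over a shallow copy so removals do not disturb the iteration.
    for word in words[:]:
        if word in to_remove:
            words.remove(word)
    return words

sample = ["a", "x", "x", "b"]
print(remove_while_iterating(list(sample), {"x"}))  # ['a', 'x', 'b']
print(remove_via_copy(list(sample), {"x"}))         # ['a', 'b']

# list.remove mutates in place and returns None, so its result should not be assigned.
returned = ["a"].remove("a")
print(returned)  # None
```

The first function leaves one "x" behind because the second "x" shifts into the index the loop has already passed.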
You can create a new list without the words you want to remove and then use the join() function to concatenate all the words in that list. Try:
def remove_words(string, rmlist):
    final_list = []
    word_list = string.split()
    for word in word_list:
        if word not in rmlist:
            final_list.append(word)
    return ' '.join(final_list)
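The same filter-and-join idea can be written more compactly. This is only a sketch: the name remove_words_compact is hypothetical, and converting the list to a set is an added assumption that membership tests are worth speeding up for long removal lists:

```python
def remove_words_compact(string, rmlist):
    # A set gives constant-time membership tests on average.
    blocked = set(rmlist)
    # Keep only the whitespace-separated tokens that are not blocked.
    return ' '.join(word for word in string.split() if word not in blocked)

print(remove_words_compact("one two three four", ["two", "four"]))  # one three
```

Like the loop version, this compares whole whitespace-separated tokens, so a token with punctuation attached (such as "cde,") does not match "cde" and is left in place.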