Python - Remove target words from each string in a list
For example, I have a list of strings:
my_list = ['this is a string', 'this is also a string', 'another String']
I also have a list of words that I want to remove from every string in that list:
remove = ['string', 'is']
I want to remove the words in remove from every string in my_list.
I tried looping over each list:
new_list = []
for i in my_list:
    for word in remove:
        x = i.replace(word, "")
        new_list.append(x)
But this just returns each of the original sentences.
l1=""
l2=[]
my_list = ['this is a string', 'this is also a string', 'another String']
remove = ['string', 'is']
for i in my_list:
    l1=""
    for j in i.split():
        if j not in remove:
            l1=l1+" " +j
    l2+=[l1]
print(l2)
Your code gives this output: ['this is a ', 'th  a string', 'this is also a ', 'th  also a string', 'another String', 'another String']
(The 'is' inside 'this' is removed as well, which is not desirable.) You can use .split() instead, as shown above.
The output will be:
[' this a', ' this also a', ' another String']
Edit:
To get rid of the leading space in each element of the list, you can run a for loop and call .lstrip() on each element.
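As a minimal sketch of that clean-up step, assuming l2 holds the output shown above (the name cleaned is only for illustration, and a list comprehension stands in for the explicit for loop):

```python
l2 = [' this a', ' this also a', ' another String']

# Strip the leading space left over from building each string as " " + word.
cleaned = [s.lstrip() for s in l2]
print(cleaned)  # ['this a', 'this also a', 'another String']
```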
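Running your code, I get this: ['this is a ', 'th  a string', 'this is also a ', 'th  also a string', 'another String', 'another String']
The words are removed, but because you append inside the loop over remove, which runs twice for each string, you end up with two results per original string.
If you really want to remove whole words only, i.e. the 'is' in 'is' but not the one in 'this', I suggest using a regular expression.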
import re
my_list = ['this is a string', 'this is also a string', 'another String']
pattern = re.compile(r'\s*\b(string|is)\b|\b(string|is)\b\s*')
new_list = [pattern.sub("", s) for s in my_list]
print(new_list)
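If the words to remove live in a list rather than being typed into the pattern by hand, the same pattern can probably be assembled from that list. This is only a sketch: the variable alternation is a name introduced here, and re.escape is used on the assumption that the words should be matched literally.

```python
import re

my_list = ['this is a string', 'this is also a string', 'another String']
remove = ['string', 'is']

# Escape each word so any regex metacharacters in it are matched literally.
alternation = "|".join(re.escape(word) for word in remove)

# Same shape as the hand-written pattern: a whole word with the whitespace
# before it, or a whole word with the whitespace after it.
pattern = re.compile(rf"\s*\b(?:{alternation})\b|\b(?:{alternation})\b\s*")

new_list = [pattern.sub("", s) for s in my_list]
print(new_list)  # ['this a', 'this also a', 'another String']
```

The match is case-sensitive, so 'String' is left alone, just as with the literal pattern.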
This should do the job:
input_string = 'this is a string, this is also a string, another String'
my_list = list(input_string.split())
remove = ['string', 'is']
remove_with_comma = [str(i) + ',' for i in remove]
correct_words = my_list[:]
for word in my_list:
    if word in remove:
        correct_words.remove(word)
    elif word in remove_with_comma:
        correct_words[correct_words.index(word)] = ','
print(' '.join(correct_words))
Output: this a , this also a , another String
Reimplementing your solution along the same lines, only a few tweaks are needed to get the correct result:
Example:
my_list = ['this is a string', 'this is also a string', 'another String']
remove = ['string', 'is']
annotated_list = []
for phrase in my_list:
    annotated_phrase = phrase.casefold()
    for pattern in remove:
        annotated_phrase = annotated_phrase.replace(" " + pattern, "")
    annotated_list.append(annotated_phrase)
print(annotated_list)
Output:
['this a', 'this also a', 'another']
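Note that casefold() lowercases the whole phrase, so any capitals in the words that are kept would be lost too. A possible variation, not the approach above but a sketch that compares whole words case-insensitively and leaves the surviving words untouched (remove_folded and result are names made up for this example):

```python
my_list = ['this is a string', 'this is also a string', 'another String']
remove = ['string', 'is']

# Fold the target words once so that each lookup is case-insensitive.
remove_folded = {word.casefold() for word in remove}

# Keep only the words whose folded form is not a target word.
result = [
    " ".join(w for w in phrase.split() if w.casefold() not in remove_folded)
    for phrase in my_list
]
print(result)  # ['this a', 'this also a', 'another']
```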
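You are taking each string in the list, removing each word from it separately, and appending every one of those results to new_list. What you need to do instead is remove all of the target words from the string and only then add it to new_list. This can be done with a little restructuring: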
new_list = []
for i in my_list:
    x = i
    for word in remove:
        x = x.replace(word, "")
    new_list.append(x)
However, this removes occurrences inside other words, not just whole words. Removing only whole words takes a little more logic, for example:
new_list = []
for i in my_list:
    x = i.split()
    new_list.append(" ".join(a if a not in remove else '' for a in x))
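This one is a little more involved: it splits each string into a list of words, uses a comprehension to blank out every word that should be removed, and then joins the words back together with spaces. The same thing can be done with map. Note that this leaves double spaces where words were removed, which can be fixed by adding something like .replace("  ", " ") to the joined string:
new_list.append(" ".join(a if a not in remove else '' for a in x).replace("  ", " "))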
To save the output in the same list:
for rem in remove:
    # "s" is used instead of "str" to avoid shadowing the built-in type
    for i, s in enumerate(my_list):
        if rem in s:
            my_list[i] = s.replace(rem, '')
print(my_list)
Output: ['th  a ', 'th  also a ', 'another String']
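As that output shows, the in-place version also cuts the 'is' out of 'this'. If only whole words should go, one possible sketch is to split each string, filter out the target words, and write the result back into the same list object with slice assignment (the alias same_list exists only to show that no new list is created):

```python
my_list = ['this is a string', 'this is also a string', 'another String']
remove = ['string', 'is']

same_list = my_list  # second reference to the same list object

# Slice assignment replaces the contents of the existing list in place.
# Filtering words out, rather than replacing them with '', leaves no doubled spaces.
my_list[:] = [
    " ".join(w for w in s.split() if w not in remove)
    for s in my_list
]
print(my_list)               # ['this a', 'this also a', 'another String']
print(same_list is my_list)  # True
```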