Removing specific items from a list of strings
I have a list of strings, and I want to remove specific elements from each string in it. This is what I have so far:
s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]
result = []
for item in s:
    words = item.split()
    for item in words:
        result.append(item)
print(result,'\n')
for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')
print(result)
The output is:
['Four', 'score', 'and', 'seven', 'years', 'ago,', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation,', 'conceived', 'in', 'liberty', 'and', 'dedicated']
In this case, I want the new list to contain all the words, but without any punctuation other than quotation marks and apostrophes:
['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']
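Even though I am using the find function, the result seems to be the same. How can I fix it so that it prints without the punctuation? How can I improve the code?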
You can use re.split to specify a regular expression to split on, in this case anything that is not a letter or a digit.
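import re
result = []
for item in s:
    # Split the current string (item, not the whole list s) on any
    # character that is not an ASCII letter or digit
    words = re.split("[^A-Za-z0-9]", item)
    result.extend(x for x in words if x)  # Include nonempty elements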
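Since the question asks to keep quotation marks and apostrophes, here is a minimal sketch of the same re.split idea with those two characters added to the set that is kept. The function name split_words is hypothetical, and treating only ' and " as the quote characters is an assumption.

```python
import re

s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]

def split_words(lines):
    """Split on runs of characters other than letters, digits, apostrophes and double quotes."""
    result = []
    for line in lines:
        # The + treats a run such as ", " as a single separator.
        words = re.split(r"[^A-Za-z0-9'\"]+", line)
        result.extend(w for w in words if w)  # drop empty strings
    return result

print(split_words(s))
print(split_words(["don't stop; \"now\""]))
```

Note that this pattern only treats ASCII letters and digits as word characters, so accented letters would also act as separators.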
After splitting the string, you can strip all the characters you want to remove:
result = []
for item in s:
    words = item.split()
    for word in words:
        result.append(word.strip(",."))  # note the addition of .strip(...)
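You can add any characters you want removed to the string argument of .strip(), all in one string. The example above strips commas and periods. Note that .strip() only removes characters from the start and end of each word, not from the middle.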
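As a rough illustration of how the .strip() argument can be extended, the sketch below builds it from string.punctuation with the two quote characters left out. The helper name strip_punct is hypothetical, and the choice of which characters count as quotes is an assumption.

```python
import string

# All ASCII punctuation except ' and " (assumed to be the characters to keep).
STRIP_CHARS = "".join(c for c in string.punctuation if c not in "'\"")

def strip_punct(word):
    """Remove the listed punctuation from both ends of a word only."""
    return word.strip(STRIP_CHARS)

for word in ["ago,", "nation,", "(liberty);", "it's", "a,b"]:
    print(repr(word), "->", repr(strip_punct(word)))
```

The last two inputs come back unchanged: the apostrophe is deliberately kept, and the comma in "a,b" is not at either end of the word.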
s = ["Four score and seven years ago, our fathers brought forth on", "this continent a new nation, conceived in liberty and dedicated"]
# Remove the punctuation characters and split into words
# (Python 3 form; in Python 2 this was x.translate(None, ',.:;'))
table = str.maketrans('', '', ',.:;')
result = [x.translate(table).split() for x in s]
# Make a list of words instead of a list of lists of words (see http://stackoverflow.com/a/716761/1477364)
result = [inner for outer in result for inner in outer]
print(result)
Output:
['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']
Alternatively, you can just add a loop inside
for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')
and split ,.:; into separate characters.
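Just add an array of punctuation characters
punc = [',','.',':',';']
and then iterate over it inside for item in result:
for p in punc:
    item = item.replace(p, '')
So the complete loop is
punc = [',','.',':',';']
cleaned = []
for item in result:
    for p in punc:
        # str.replace returns a new string, so keep the returned value
        item = item.replace(p, '')
    cleaned.append(item)
result = cleaned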
I have tested it, and it works.
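For anyone comparing this with the loop in the question, here is a small sketch of the string behaviours involved: str.find searches for its whole argument as one substring and returns -1 on a miss, and str.replace returns a new string rather than changing the original. The variable names are only illustrative.

```python
word = "ago,"

# find() looks for the whole argument as one substring, so this is -1.
whole = word.find(",.:;")
# Searching for a single character returns its index instead.
single = word.find(",")
print(whole, single)

# A -1 index silently selects the last character, which is risky:
# for a word with no punctuation, replace() would then drop real letters.
no_punct = "this"
damaged = no_punct.replace(no_punct[no_punct.find(",")], "")
print(damaged)

# replace() returns a new string; the original is unchanged until reassigned.
word.replace(",", "")
still_same = word
word = word.replace(",", "")
print(still_same, word)
```

This is why passing the character itself to replace and storing the returned value is the safer pattern.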