Python removing the strings in a list from a string
I have a big string and a big list of stop words. I created a little example below.
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
As you can imagine, I want the members of stop removed from s. I tried this:
for word in stop:
    s = s.replace(word, "")
I get this error:
AttributeError: 'list' object has no attribute 'replace'
You need to do the following. Split s into a list of words. Then create a hash (a dict) from the list of stop words. Then iterate through the list of words and keep each one that is not in the hash.
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
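arr = s.split(' ')  # split the string into words
h = {i: 1 for i in stop}  # dict of stop words for fast lookup
result = []
for i in arr:
    if i not in h:  # keep only words that are not stop words
        result.append(i)
print(' '.join(result))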
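One thing to watch with the split-based approach: tokens keep their punctuation, so "old." does not match the stop word "old" and stays in the output. Here is a minimal sketch of one way to handle that, assuming it is acceptable to compare each token with its leading and trailing punctuation stripped. The helper name remove_stop_words is hypothetical, not something from the answer above.

```python
import string

def remove_stop_words(text, stop_words):
    """Drop tokens whose punctuation-stripped form is a stop word."""
    stop_set = set(stop_words)  # a set gives fast membership tests
    kept = []
    for token in text.split():
        # Compare without surrounding punctuation, e.g. "old." -> "old"
        if token.strip(string.punctuation) not in stop_set:
            kept.append(token)
    return ' '.join(kept)

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
print(remove_stop_words(s, stop))
```

Note that the whole token is dropped, so the period attached to "old." disappears with it. Whether that is what you want depends on your data.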
s is a list at the point where you call s.replace(), so you probably changed s somewhere, and it is now a list instead of a string.
This code works well:
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
for word in stop:
    s = s.replace(word, "")
Try to find where you modify s: search your code for an assignment to s.
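Keep in mind that str.replace removes substrings, not whole words, so with other input a stop word such as "in" would also be cut out of a word like "inside". Here is a hedged sketch of a whole-word alternative using the standard re module, assuming that regex word boundaries (\b) are a good enough definition of a word for your text:

```python
import re

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

# Build one pattern like r'\b(?:am|old|in|of)\b'; re.escape guards
# against stop words that contain regex metacharacters.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, stop)) + r')\b')

cleaned = pattern.sub('', s)
# Collapse the runs of spaces left behind by the removed words.
cleaned = ' '.join(cleaned.split())
print(cleaned)
```

Punctuation that was attached to a removed word (the period after "old") is left behind as its own token, so you may want an extra cleanup step for that.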
The most elegant way would be to use set difference. Note that the result is an unordered list of unique words rather than a string.
z = list(set(s.split()) - set(stop))
Printing z gives something like the following (the order can vary, since sets are unordered):
['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']
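If the original word order and repeated words matter, for example because you want to join the result back into a sentence, a minimal sketch (my own variation, not part of the answer above) keeps the set only for the lookup:

```python
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

stop_set = set(stop)  # built once; membership tests are O(1) on average

# Walk the words in their original order, keeping duplicates such as "I".
kept = [word for word in s.split() if word not in stop_set]

print(kept)
print(len(kept), "words kept;", len(set(kept)), "unique")
```

With a big stop list, the set lookup should also be noticeably faster than testing membership in the list directly.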
Unit Test
import unittest

def so_26944574(string):
    stop = ["am", "old", "in", "of"]
    z = list(set(string.split()) - set(stop))
    return sorted(z)  # sort so the result is deterministic

# Unit Test
class Test(unittest.TestCase):
    def testcase(self):
        self.assertEqual(so_26944574("I am 20 years old. I live in New York in United States of America."), sorted(['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']))
        self.assertEqual(so_26944574("I am very old but still strong, kind of"), sorted(['I', 'very', 'but', 'still', 'strong,', 'kind']))

unittest.main()
Test Pass
Ran 1 test in 0.000s
OK
Another way would be to do this:
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
s_list = s.split() # turn string into list
s = ' '.join([word for word in s_list if word not in stop]) # Make new string
>>> s
'I 20 years old. I live New York United States America.'