Python removing the strings in a list from a string

I have a big string and a big list of stop words. I created a small example below.

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

As you can imagine, I want to remove the members of stop from s. I tried this:

for word in stop:
    s = s.replace(word,"")

I get this error:

AttributeError: 'list' object has no attribute 'replace'

You need to do the following. Split s into a list of words. Then create a hash (a dict) from the list of stop words. Then iterate through the list of words and keep each word that is not in the hash.

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
arr = s.split(' ')
h = {i: 1 for i in stop}

result = []
for i in arr:
    if i not in h:
        result.append(i)

print(' '.join(result))
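The loop above keeps "old." because the trailing period makes it differ from the stop word "old". A minimal sketch of the same lookup idea that uses a set and strips surrounding punctuation before the membership test; the function name remove_stop_words and the choice to strip string.punctuation are assumptions for illustration, not part of the answer above:

```python
import string

def remove_stop_words(text, stop_words):
    """Drop whitespace-separated tokens whose punctuation-stripped form is a stop word."""
    stop_set = set(stop_words)  # set membership is O(1) on average
    kept = []
    for token in text.split():
        # Compare without surrounding punctuation, so "old." matches "old".
        if token.strip(string.punctuation) not in stop_set:
            kept.append(token)
    return " ".join(kept)

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
result = remove_stop_words(s, stop)
print(result)  # I 20 years I live New York United States America.
```

Dropping the token "old." also drops its period, so this variant trades punctuation fidelity for matching.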

s is a list at the point where you call s.replace(), so you probably reassigned s somewhere, and it is now a list instead of a string.
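The question does not show the line that changes s, so the following is only one plausible way to reproduce the error, assuming s was reassigned to the result of split():

```python
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

s = s.split()  # hypothetical earlier line: s is now a list, not a str

error_message = None
try:
    for word in stop:
        s = s.replace(word, "")  # list has no replace() method
except AttributeError as exc:
    error_message = str(exc)

print(type(s).__name__)  # list
print(error_message)
```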

This code runs without that error:

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
for word in stop:
    s = s.replace(word,"")

Try to find where you modify s: search your code for an assignment to s.

Demo here

The most elegant way would be to use set difference.

z = list(set(s.split()) - set(stop))

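Printing z gives something like the following. A set is unordered, so the order may vary, and repeated words such as the second "I" appear only once: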

['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']
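A short check of what the set difference keeps and what it loses, using the example data from the question. The variable names words and z_sorted are introduced here for illustration only:

```python
s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

words = s.split()
z = set(words) - set(stop)

# Sorting makes the display deterministic; the set itself has no order.
z_sorted = sorted(z)
print(z_sorted)

print(len(words))        # 15 tokens in the original string
print(len(z))            # 10 unique tokens remain
print(words.count("I"))  # 2, but "I" appears only once in z
```

The result is a collection of unique words rather than a cleaned sentence, so it suits cases where word order and repetition do not matter.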

Unit Test

import unittest

def so_26944574(string):
    stop = ["am", "old", "in", "of"]
    z = list(set(string.split()) - set(stop))
    return sorted(z)

# Unit Test
class Test(unittest.TestCase):
    def testcase(self):
        self.assertEqual(so_26944574("I am 20 years old. I live in New York in United States of America."), sorted(['United', '20', 'I', 'live', 'years', 'States', 'America.', 'York', 'New', 'old.']))
        self.assertEqual(so_26944574("I am very old but still strong, kind of"), sorted(['I', 'very', 'but', 'still', 'strong,', 'kind']))

if __name__ == "__main__":
    unittest.main()

Test Pass

Ran 1 test in 0.000s

OK

Another way would be to do this:

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]
s_list = s.split() # turn string into list
s = ' '.join([word for word in s_list if word not in stop]) # Make new string
print(s)  # I 20 years old. I live New York United States America.
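Both the split-based filters and str.replace have blind spots: splitting leaves "old." untouched because of the period, and str.replace removes substrings inside longer words. A hedged sketch of a different technique, a regular expression with word boundaries, that works on the string directly; the names pattern and cleaned are assumptions for illustration:

```python
import re

s = "I am 20 years old. I live in New York in United States of America."
stop = ["am", "old", "in", "of"]

# \b anchors each alternative to word boundaries, so only whole words match;
# re.escape guards against stop words that contain regex metacharacters.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, stop)) + r")\b")
cleaned = pattern.sub("", s)

# Collapse the runs of spaces left behind by the removals.
cleaned = re.sub(r"\s+", " ", cleaned).strip()
print(cleaned)  # I 20 years . I live New York United States America.
```

Removing "old" leaves a stray space before the period, so further cleanup may be needed depending on how the text is used afterwards.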
