
Faster way to remove a dictionary of phrases from a list of strings using Python

I have to remove a dictionary of phrases from a list of strings using Python.

A list of strings L1. Example: L1 = ['Programmer New York', 'Programmer San Francisco']

A dictionary of phrases L2 (all of them are more than one word). Example: L2={'New York', 'San Francisco'}

The expected output is, for each string in L1, that string with every substring that exists in L2 removed. So the output will be res=['Programmer', 'Programmer'].

def foo(L1, L2):
    res = []
    print(len(L1))  # debug: number of input strings
    for i in L1:
        # Try every phrase against the current string.
        for j in L2:
            if j in i:
                i = i.replace(j, "")
        res.append(i)
    return res

My current program is a brute-force double for loop. Is it possible to improve the performance, especially when L1 is very large?

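Try using map() and re:

import re
# One pattern that matches any phrase in L2 together with the space before it.
pattern = re.compile("|".join(" " + re.escape(i) for i in L2))
res = map(lambda j: pattern.sub("", j), L1)

The " " before each phrase i is there to eliminate the trailing space after "Programmer". The pattern is compiled once from all the phrases in L2, and map() applies it to each string in L1.

Then, at the end of the function:

return list(res)

PS: returning a list explicitly is only necessary if you are using Python 3, where map() returns an iterator. Let me know if this improves your speed at all.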
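As a self-contained illustration of the single-pattern idea, here is a minimal sketch. The function name remove_phrases is hypothetical, and two details are assumptions that go beyond the question: phrases are sorted longest first so that a longer phrase wins over a shorter one it contains, and leftover whitespace is collapsed rather than matched as part of the pattern.

```python
import re

def remove_phrases(strings, phrases):
    """Remove every phrase from every string using one precompiled regex."""
    if not phrases:
        # An empty alternation would match the empty string; nothing to remove.
        return list(strings)
    # Assumption: longest phrases first, so "New York City" is tried
    # before "New York" when both are present.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape keeps characters such as "." or "+" in a phrase literal.
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    # Collapse the whitespace left behind by the removed phrases.
    return [" ".join(pattern.sub("", s).split()) for s in strings]

L1 = ['Programmer New York', 'Programmer San Francisco']
L2 = {'New York', 'San Francisco'}
res = remove_phrases(L1, L2)
print(res)  # ['Programmer', 'Programmer']
```

Each string is scanned once by the regex engine instead of once per phrase. Whether that is actually faster depends on the number of phrases and the length of the strings, so it is worth measuring on real data.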

You can use a list comprehension to do so:

from functools import reduce

l1 = ['Programmer New York', 'Programmer San Francisco']
l2 = ['New York', 'San Francisco']
# For each string, remove every phrase in l2, then trim leftover whitespace.
res = [reduce(lambda s, y: s.replace(y, ""), l2, x).strip() for x in l1]
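To check whether any of these approaches helps on large inputs, a rough timing harness along the following lines may be useful. This is only a sketch: the function names remove_loop and remove_regex are hypothetical, and the synthetic phrases and strings are assumptions standing in for real data, so the numbers it prints say nothing definitive about other workloads.

```python
import re
import timeit

def remove_loop(strings, phrases):
    # Baseline: nested loops with str.replace, as in the question.
    out = []
    for s in strings:
        for p in phrases:
            if p in s:
                s = s.replace(p, "")
        out.append(s)
    return out

def remove_regex(strings, phrases):
    # One compiled alternation (longest phrase first), applied once per string.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return [pattern.sub("", s) for s in strings]

# Synthetic data (an assumption, not taken from the question).
phrases = {"City %d Town" % i for i in range(100)}
strings = ["Programmer City %d Town" % (i % 100) for i in range(2000)]

# Both functions should agree before their timings are compared.
same = remove_loop(strings, phrases) == remove_regex(strings, phrases)
print("same result:", same)
print("loop :", timeit.timeit(lambda: remove_loop(strings, phrases), number=3))
print("regex:", timeit.timeit(lambda: remove_regex(strings, phrases), number=3))
```

Comparing the outputs first guards against timing a variant that is fast only because it does something different.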
