
Python: compare two lists and replace items in the original list if there is a match

I have two lists containing strings. For each item in base_list, I want to check whether its first 3 characters match the first 3 characters of any value in custom_list. If there is a match, replace the original value in base_list with the one from custom_list. If there is no match, keep the original value.

base_list = ["abc123", "cde123", "efg456", "ghi123"]

custom_list = ["abc321", "efg654"]

Desired output:

modified_base_list = ["abc321", "cde123", "efg654", "ghi123"]

Eventually I also want to write this new modified_base_list to a file containing the items, one per line.

I've tried:

modified_base_list = []

for custom in custom_list:
    for base in base_list:
        if custom[:3] == base[:3]:
            modified_base_list.append(custom)
        else:
            modified_base_list.append(base)


print(modified_base_list)

with open('newfile.txt', 'w') as f:
    for s in modified_base_list:
        f.write(s)

***EDIT: the lists can have 15k+ lines, so I am also looking for a faster way to do this.***

This is a solution that mutates the original list, replacing only the items for which a match exists:

>>> base_list = ["abc123", "cde123", "efg456", "ghi123"]
>>> custom_list = ["abc321", "efg654"]
>>> for i, x in enumerate(base_list):
...     for test in custom_list:
...         if test[:3] == x[:3]:
...             base_list[i] = test
...             break
...
>>> base_list
['abc321', 'cde123', 'efg654', 'ghi123']

Of course, if you don't want to modify the original list, you can create a copy of it first using modified_base_list = base_list[:] and run the loop on the copy.
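A minimal sketch of that non-mutating variant, wrapping the same loop in a function. The name replace_by_prefix and the n parameter are made up for illustration and are not part of the answer above:

```python
def replace_by_prefix(base_list, custom_list, n=3):
    """Return a new list; base_list itself is left untouched."""
    result = base_list[:]  # shallow copy, enough for a list of strings
    for i, x in enumerate(result):
        for test in custom_list:
            if test[:n] == x[:n]:
                result[i] = test
                break  # stop at the first matching custom item
    return result


base_list = ["abc123", "cde123", "efg456", "ghi123"]
custom_list = ["abc321", "efg654"]

modified_base_list = replace_by_prefix(base_list, custom_list)
print(modified_base_list)  # ['abc321', 'cde123', 'efg654', 'ghi123']
print(base_list)           # ['abc123', 'cde123', 'efg456', 'ghi123']
```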


You can also follow your own idea, but in that case you have to make sure that the outer loop iterates over base_list and that each item is added only once:

modified_base_list = []
for base in base_list:
    found = False
    for custom in custom_list:
        if custom[:3] == base[:3]:
            modified_base_list.append(custom)
            found = True
            break

    if not found:
        modified_base_list.append(base)

You can also use for…else here instead of the helper variable found. The else branch runs only when the inner loop finishes without hitting break:

modified_base_list = []
for base in base_list:
    for custom in custom_list:
        if custom[:3] == base[:3]:
            modified_base_list.append(custom)
            break
    else:
        modified_base_list.append(base)

You could use a list comprehension containing a generator expression:

base_list = ["abc123", "cde123", "efg456", "ghi123"]
custom_list = ["abc321", "efg654"] 
modified_base_list = [next((y for y in custom_list if y[:3] == x[:3]), x) for x in base_list]
# ['abc321', 'cde123', 'efg654', 'ghi123']

Note that I'm assuming that if the same 3-character prefix occurs multiple times in custom_list, you only wish to take the first instance.
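For the 15k+ line case mentioned in the question, rescanning custom_list for every base item can get slow. One possible alternative, not taken from the answers here, is to build a dict keyed by prefix once and then do a single lookup per base item. The sketch below assumes the same first-instance rule; build_prefix_lookup is a hypothetical helper name, and the extra "abc999" entry is added only to show the duplicate-prefix case:

```python
def build_prefix_lookup(custom_list, n=3):
    """Map each n-character prefix to the first custom item that has it."""
    lookup = {}
    for item in custom_list:
        # setdefault only stores the item if the prefix is not present yet,
        # so the first occurrence wins
        lookup.setdefault(item[:n], item)
    return lookup


base_list = ["abc123", "cde123", "efg456", "ghi123"]
custom_list = ["abc321", "efg654", "abc999"]  # "abc999" is an illustrative duplicate prefix

lookup = build_prefix_lookup(custom_list)
# dict.get falls back to the original item when no prefix matches
modified_base_list = [lookup.get(x[:3], x) for x in base_list]
print(modified_base_list)  # ['abc321', 'cde123', 'efg654', 'ghi123']
```

Building the dict is one pass over custom_list and each lookup is a hash access, so the total work grows roughly with the combined length of the two lists rather than with their product.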

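Try the following with filter(). The '**{}**' formatting only marks the replaced items (drop it to get the plain values), and temp.pop() picks the last match if several custom items share a prefix: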

res = []

for i in base_list:
    temp = list(filter(lambda j: j[:3] == i[:3], custom_list))
    if temp:
        res.append('**{}**'.format(temp.pop()))
    else:
        res.append(i)

Output:

>>> res
['**abc321**', 'cde123', '**efg654**', 'ghi123']

You can use list comprehensions for this, first building a list of the 3-character prefixes and then looking each base item up in it:

base_list = ["abc123", "cde123", "efg456", "ghi123"]

custom_list = ["abc321", "efg654"]

smaller_custom = [y[:3] for y in custom_list]

modified_base_list = ["**{}**".format(custom_list[smaller_custom.index(x[:3])]) if x[:3] in smaller_custom else x for x in base_list]
# ['**abc321**', 'cde123', '**efg654**', 'ghi123']

with open('output_data.txt','w') as outfile:
    outfile.write("\n".join(modified_base_list))
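Note that "\n".join(...) puts a newline between items but not after the last one. If a trailing newline is wanted so that every item ends its own line, a sketch along these lines may help. The temporary directory and file name are placeholders chosen for illustration, and the list is hard-coded so the snippet runs on its own:

```python
import os
import tempfile

modified_base_list = ["abc321", "cde123", "efg654", "ghi123"]

# Hypothetical output location; a temp directory keeps the sketch self-contained
path = os.path.join(tempfile.mkdtemp(), "newfile.txt")

with open(path, "w") as f:
    # Append "\n" to every item, including the last one
    f.writelines(s + "\n" for s in modified_base_list)

# Read the file back to check that there is one item per line
with open(path) as f:
    lines_read = f.read().splitlines()

print(lines_read)  # ['abc321', 'cde123', 'efg654', 'ghi123']
```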

I hope this helps.
