How to merge two lists of arrays, (x,y,z,m), excluding duplicates based on only (x,y,z)

I have two lists of the form:

list1 = list(zip(SGXm, SGYm, SGZm, Lm))
list2 = list(zip(SGXmm, SGYmm, SGZmm, Lmm))

I want to merge them while excluding duplicate (x,y,z) entries and ignoring differences in L. My current attempt is:

list1.extend(x for x in list2 if x not in list1)

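This does the job when the tuples contain only (x,y,z), but because it compares whole tuples, it treats entries that differ only in L as distinct. I want to retain the Ls, taking the one from the first list when both lists contain the same (x,y,z).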

You'll have to extract the triples you need for comparison:

# Collect the (x, y, z) triples already present in list1
seen = {item[:3] for item in list1}
list1.extend(item for item in list2 if item[:3] not in seen)
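
As a minimal sketch of the same idea on made-up data: the sample tuples and the helper name `merge_unique` below are assumptions for illustration, not part of the question. Unlike the two-liner above, this variant also adds each new triple to `seen` as it goes, so repeated (x,y,z) entries inside the second list are collapsed as well. Duplicates already inside the first list are left alone.

```python
# Hypothetical sample data: (x, y, z, L) tuples
list1 = [(0, 0, 0, "a"), (1, 0, 0, "b")]
list2 = [(0, 0, 0, "c"), (2, 0, 0, "d"), (2, 0, 0, "e")]


def merge_unique(first, second):
    """Return the entries of first, plus entries of second whose (x, y, z) is new."""
    merged = list(first)                   # copy, so the caller's list is not mutated
    seen = {item[:3] for item in merged}   # triples already present
    for item in second:
        key = item[:3]
        if key not in seen:
            seen.add(key)                  # also guards against repeats within second
            merged.append(item)
    return merged


merged = merge_unique(list1, list2)
print(merged)
```

Because set membership tests are constant time on average, this does one pass over each list rather than rescanning `list1` for every element of `list2`.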

If you want sorted output (particularly if you already have sorted input), itertools.groupby and heapq.merge combine nicely for this. If the inputs aren't already sorted, you'll need to sort them. Either concatenate and sort everything at once:

from operator import itemgetter

commonkey = itemgetter(0, 1, 2)
combined = sorted(list1 + list2, key=commonkey)

Or, if they're already sorted or you want to sort each one independently, use heapq.merge, which avoids making shallow copies of the inputs:

import heapq

# Explicit sort calls only necessary if inputs not already sorted
list1.sort(key=commonkey)
list2.sort(key=commonkey)

# Combine already sorted inputs with heapq.merge, avoiding intermediate lists
# (returns a lazy iterator, not a list)
combined = heapq.merge(list1, list2, key=commonkey)

Whichever approach you choose, follow it up with a simple comprehension over groupby that keeps only one copy of each unique key by taking the first entry in each group:

import itertools

# Groups neighboring entries with the same key, and we keep only the first one
uniq = [next(g) for _, g in itertools.groupby(combined, key=commonkey)]
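
Putting the pieces together, here is a small end-to-end sketch on made-up data (the sample tuples and the names `sorted1` and `sorted2` are assumptions for illustration). It relies on `heapq.merge` yielding items from the earlier iterable first when keys are equal, so for a shared (x,y,z) the entry from the first list is the one `next(g)` picks up. The `key` argument to `heapq.merge` needs Python 3.5 or later.

```python
import heapq
import itertools
from operator import itemgetter

# Hypothetical sample data: (x, y, z, L) tuples, deliberately unsorted
list1 = [(1, 0, 0, "b"), (0, 0, 0, "a")]
list2 = [(2, 0, 0, "d"), (0, 0, 0, "c")]

# Compare entries on (x, y, z) only, ignoring L
commonkey = itemgetter(0, 1, 2)

# Sort copies so the original lists are left untouched
sorted1 = sorted(list1, key=commonkey)
sorted2 = sorted(list2, key=commonkey)

# Lazily merge the two sorted inputs; on equal keys sorted1's entry comes first
combined = heapq.merge(sorted1, sorted2, key=commonkey)

# Keep the first entry of each run of equal (x, y, z) keys
uniq = [next(g) for _, g in itertools.groupby(combined, key=commonkey)]
print(uniq)
```

The concatenate-and-sort variant behaves the same way here, since `sorted` is stable and `list1 + list2` places the first list's entries ahead of the second's.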
