Merge/append lists that share a common item
The title may be misleading, so feel free to reword it once the proper terminology for the real problem becomes clear. =)
In this case, I am aware that the lists can probably be interchanged with tuples, for the most part. The end result can be any iterable as far as I'm concerned.
I have two lists-of-lists. Suppose they are:
list_a = [[1, 'f00d'], [2, 'dead'], [3, 'beef']]
list_b = [[1, 'frankenbeans'], [2, 'chickensoup'], [3, 'spaceballs']]
The two lists are not necessarily the same length, nor is it guaranteed that they share a common first element.
What I'm trying to do is create a new list-of-lists/list-of-tuples/list-of-dicts/whatever, as such:
list_c = [[1, 'f00d', 'frankenbeans'], [2, 'dead', 'chickensoup'], [3, 'beef', 'spaceballs']]
Updated: Basically, I know the position of the common "ID" within each sub-list. The ID is an integer, though not necessarily sequential, and the lists-of-lists are not necessarily in the same order. I'm looking for an efficient way to create a new set of sub-lists based on that common ID.
The naive way:
new_list = []
for list_a_list in list_a:
    for list_b_list in list_b:
        # compare the IDs with == (a single = is a syntax error here)
        if list_a_list[0] == list_b_list[0]:
            new_list.append([list_a_list[0], list_a_list[1], list_b_list[1]])
... or some such. That gives me the feeling that there's a much "smarter" way to do this, but I kinda suck at that.
Update: Does it have any bearing if I mention that the lists-of-lists each carry thousands to a million items at a time?
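from collections import defaultdict
from itertools import chain
# collect every value under its ID in a single pass over both lists
final = defaultdict(list)
for idx, value in chain(list_a, list_b):
    final[idx].append(value)
# and if you have to have a list of lists at the end
# (.items() works on Python 2 and 3; .iteritems() is Python 2 only)
finalList = [[k] + v for k, v in final.items()]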
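For the sample data in the question, the idea above could be wrapped up roughly as follows. This is only a sketch, assuming Python 3: the function name `merge_by_id` and its `id_pos` parameter are made up for illustration, and it assumes every sub-list holds the ID at a known position, with everything else treated as payload.

```python
from collections import defaultdict
from itertools import chain


def merge_by_id(*lists, id_pos=0):
    """Group the non-ID fields of every row under that row's ID.

    Hypothetical helper, not part of the original answer; id_pos is
    the index of the common ID inside each row.
    """
    merged = defaultdict(list)
    for row in chain(*lists):
        # Everything except the ID is appended to that ID's bucket.
        merged[row[id_pos]].extend(row[:id_pos] + row[id_pos + 1:])
    # Each row is visited once, so the work grows linearly with the input.
    return [[key] + values for key, values in merged.items()]


list_a = [[1, 'f00d'], [2, 'dead'], [3, 'beef']]
# Deliberately in a different order than list_a.
list_b = [[3, 'spaceballs'], [1, 'frankenbeans'], [2, 'chickensoup']]

list_c = merge_by_id(list_a, list_b)
print(list_c)
```

An ID that appears in only one of the lists simply ends up with a shorter row, so nothing is dropped. Since each row is touched once, this should cope with the thousands-to-a-million sizes mentioned in the question far better than the nested loops.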
Your input lists should be dictionaries in the first place:
dict_a = dict(list_a)
dict_b = dict(list_b)
dict_c = dict((k, [v, dict_b[k]]) for k,v in dict_a.items())
If keys are not guaranteed to occur in both lists, you'll have to be a little more careful:
all_keys = set(dict_a.keys()) | set(dict_b.keys())
dict_c = dict((k, (dict_a.get(k), dict_b.get(k))) for k in all_keys)
For example, for list_a = [(1, 'a')] and list_b = [(1, 'b'), (2, 'c')], the above would set dict_c to {1: ('a', 'b'), 2: (None, 'c')}.
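If only the IDs present in both lists are wanted, rather than padding the gaps with None, the key views can be intersected instead. A minimal sketch, assuming Python 3 and that each ID occurs at most once per list (dict() silently keeps the last value for a repeated key); the name `common_keys` and the sample data are my own:

```python
list_a = [(1, 'a'), (3, 'x')]
list_b = [(1, 'b'), (2, 'c')]

dict_a = dict(list_a)
dict_b = dict(list_b)

# Dictionary key views support set operations, so & yields the shared IDs.
common_keys = dict_a.keys() & dict_b.keys()

# Both lookups are safe here because every key exists in both dicts.
dict_c = {k: (dict_a[k], dict_b[k]) for k in common_keys}
print(dict_c)  # {1: ('a', 'b')}
```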
itertools.groupby() is helpful for this kind of task:
from itertools import groupby, chain
from operator import itemgetter
list_a = [[1, 'f00d'], [2, 'dead'], [3, 'beef']]
list_b = [[1, 'frankenbeans'], [2, 'chickensoup'], [3, 'spaceballs']]
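# sort on the ID only; the stable sort keeps list_a's value ahead of list_b's
combined = [(k, [v[1] for v in g]) for k, g in
            groupby(sorted(list_a + list_b, key=itemgetter(0)), key=itemgetter(0))]
print(combined)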
Note that list_a and list_b have to be combined into a new sorted list before groupby can be used, since groupby only groups consecutive items and therefore assumes the input is already sorted by the key.
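To see why the sort matters, here is a small sketch with made-up data (the variable names are mine). groupby only merges adjacent items with equal keys, so on unsorted input the same ID shows up in several separate groups:

```python
from itertools import groupby
from operator import itemgetter

rows = [[2, 'dead'], [1, 'f00d'], [2, 'chickensoup'], [1, 'frankenbeans']]

# Without sorting, every run of equal keys becomes its own group.
unsorted_groups = [(k, [v[1] for v in g])
                   for k, g in groupby(rows, key=itemgetter(0))]

# Sorting by the key first brings equal IDs together. sorted() is stable,
# so values keep their original relative order within each group.
sorted_groups = [(k, [v[1] for v in g])
                 for k, g in groupby(sorted(rows, key=itemgetter(0)),
                                     key=itemgetter(0))]

print(unsorted_groups)
print(sorted_groups)
```

The sort costs O(n log n), so for very large inputs a dictionary-based grouping, which needs only one pass, may be the cheaper option.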