
Python: join list of list of lists using key

I have this list structure:

lst = [[['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]]]

'lst' can contain an arbitrary number of sublists (len(lst) can be bigger than 2).

As output I want:

output = [['a',100,50],['b',200,250],['c',0,75],['d',325,0]]

Here is another example:

lst = [[['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]], [['a', 22], ['b', 10]]]

output = [['a', 100, 50, 22],['b', 200, 250, 10], ['c', 0, 75, 0], ['d', 325, 0, 0]]

How would you do that?

You can use a defaultdict, pre-filling every key with one zero per sublist so that each value lands in the column of the sublist it came from:

from collections import defaultdict

lst = [[['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]]]
# Every new key starts as [0, 0, ...], one slot per sublist
d = defaultdict(lambda: [0] * len(lst))
for i, sub in enumerate(lst):
    for a, b in sub:
        d[a][i] = b  # store the value in the column of sublist i

new_lst = sorted([a] + b for a, b in d.items())
print(new_lst)

Output:

[['a', 100, 50], ['b', 200, 250], ['c', 0, 75], ['d', 325, 0]]

This task would be a little simpler if we had a list of all the letter keys used in lst, but it's easy enough to extract them.

My strategy is to convert the sublists into dictionaries. That makes it easy and efficient to grab the value associated with each key, and the dict.get method allows us to supply a default value for missing keys.

lst = [[['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]]]

# Convert outer sublists to dictionaries
dicts = [*map(dict, lst)]

# Get all the keys
keys = set()
for d in dicts:
    keys.update(d.keys())

# Get data for each key from each dict, using 0 if a key is missing
final = [[k] + [d.get(k, 0) for d in dicts] for k in sorted(keys)]
print(final)

Output:

[['a', 100, 50], ['b', 200, 250], ['c', 0, 75], ['d', 325, 0]]

If we use

lst = [[['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]], [['a', 22], ['b', 10]]]

then the output is

[['a', 100, 50, 22], ['b', 200, 250, 10], ['c', 0, 75, 0], ['d', 325, 0, 0]]
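For reuse, the same dictionary-based steps can be wrapped in a function. This is only a sketch: the name merge_by_key and the fill parameter are made up for illustration and are not part of the answer above, and it assumes each key appears at most once per sublist.

```python
def merge_by_key(lst, fill=0):
    """Merge sublists of [key, value] pairs into rows of [key, v1, v2, ...].

    merge_by_key and fill are hypothetical names used for illustration.
    """
    # One dictionary per outer sublist
    dicts = [dict(sub) for sub in lst]
    # Union of the keys of all dictionaries
    keys = set().union(*dicts)
    # One row per key, with `fill` wherever a sublist lacks that key
    return [[k] + [d.get(k, fill) for d in dicts] for k in sorted(keys)]


two = [[['a', 100], ['b', 200], ['d', 325]], [['a', 50], ['b', 250], ['c', 75]]]
three = two + [[['a', 22], ['b', 10]]]
print(merge_by_key(two))
print(merge_by_key(three))
```

Called on the two example inputs, it prints the same rows as shown above.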

If you want to run this on Python 2, you need to make a minor change to the line that converts the outer sublists to dictionaries (dicts = [*map(dict, lst)]). Change it to

dicts = list(map(dict, lst))

That will work correctly on both Python 2 and 3. And if you only need to run it on Python 2, you could simply do

dicts = map(dict, lst)

since map in Python 2 returns a list, not an iterator.
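The list matters on Python 3 because the final comprehension walks over dicts once per key. A small sketch of what happens with a bare map object there (the variable names are illustrative only):

```python
lst = [[['a', 100], ['b', 200]], [['a', 50], ['c', 75]]]

# On Python 3, map returns a one-shot iterator
lazy = map(dict, lst)
first_pass = [d.get('a', 0) for d in lazy]
second_pass = [d.get('a', 0) for d in lazy]  # the iterator is already exhausted

print(first_pass)   # [100, 50]
print(second_pass)  # []

# A real list can be traversed as many times as needed
dicts = list(map(dict, lst))
col_a = [d.get('a', 0) for d in dicts]
col_c = [d.get('c', 0) for d in dicts]
print(col_a, col_c)  # [100, 50] [0, 75]
```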

With the itertools.chain.from_iterable() and itertools.groupby() functions and the built-in next() function:

import itertools

lst = [ [['a', 100],['b', 200],['d', 325]],[['a', 50],['b', 250],['c', 75]], [['a', 22], ['b', 10]] ]
lst_len = len(lst)
sub_keys = [{k[0] for k in _} for _ in lst]
result = [[k] + [next(g)[1] if k in sub_keys[i] else 0 for i in range(lst_len)]
          for k,g in itertools.groupby(sorted(itertools.chain.from_iterable(lst), key=lambda x:x[0]), key=lambda x: x[0])]

print(result)

The output:

[['a', 100, 50, 22], ['b', 200, 250, 10], ['c', 0, 75, 0], ['d', 325, 0, 0]]
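To see why next(g) lines up with the right sublist, it may help to look at what groupby yields on its own. The sketch below assumes each key appears at most once per sublist; the names pairs and groups are illustrative. Because sorted() is stable, the values inside each group stay in sublist order, but missing entries are simply absent, which is why the sub_keys membership test decides whether to consume the next value or emit 0.

```python
import itertools

lst = [[['a', 100], ['b', 200], ['d', 325]],
       [['a', 50], ['b', 250], ['c', 75]],
       [['a', 22], ['b', 10]]]

# Flatten to [key, value] pairs and sort by key only (stable sort)
pairs = sorted(itertools.chain.from_iterable(lst), key=lambda x: x[0])

# Collect the values of each group in the order groupby yields them
groups = {k: [v for _, v in g]
          for k, g in itertools.groupby(pairs, key=lambda x: x[0])}
print(groups)
# {'a': [100, 50, 22], 'b': [200, 250, 10], 'c': [75], 'd': [325]}
```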

This is my "long-hand" method; I just had to work out what was going on:

lst = [[['a', 100], ['b', 200], ['d', 325]],
       [['a', 50], ['b', 250], ['c', 75]],
       [['a', 22], ['b', 10]],
       [['c', 110], ['f', 200], ['g', 425]],
       [['a', 50], ['f', 250], ['h', 75]],
       [['a', 32], ['b', 10]]]
nlist = []  # one 0 for every sublist already processed
store = {}
for n, j in enumerate(lst):
    for i in j:
        if i[0] in store:
            store[i[0]].append(i[1])
        else:
            # New key: pad with a 0 for each earlier sublist
            store[i[0]] = nlist + [i[1]]
    nlist += [0]
    # Pad the keys that were missing from this sublist
    for k, v in store.items():
        if len(v) < n + 1:
            store[k] = v + [0]
print(store)
result = []
for k, v in store.items():
    result += [[k] + v]
print(sorted(result))
