Combining list of lists with dictionaries
Thanks to everyone who helps out here.
I have a list of lists. The inner lists contain dictionaries like this:
combined_lists = [
    [
        {'COMPANY': 'company1', 'NUMBER': '111', 'SHIPMENTS': ['1', '2', '3', '4']},
        {'COMPANY': 'company2', 'NUMBER': '222', 'SHIPMENTS': ['1']},
        {'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['1', '4']},
        {'COMPANY': 'company4', 'NUMBER': '444', 'SHIPMENTS': ['2', '5']},
        {'COMPANY': 'company5', 'NUMBER': '555', 'SHIPMENTS': ['1', '3', '5', '9']}
    ],
    [
        {'COMPANY': 'company1', 'NUMBER': '111', 'SHIPMENTS': ['5', '6', '7', '8']},
        {'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['3', '5']},
        {'COMPANY': 'company5', 'NUMBER': '555', 'SHIPMENTS': ['3', '5', '7']},
        {'COMPANY': 'company7', 'NUMBER': '777', 'SHIPMENTS': ['2', '4']},
        {'COMPANY': 'company9', 'NUMBER': '999', 'SHIPMENTS': ['1', '2', '5', '6', '7']}
    ],
]
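I want to combine these lists by COMPANY, merging each company's SHIPMENTS so that no SHIPMENTS value is duplicated. The NUMBER key/value doesn't matter.
The final output would ideally be a list of dictionaries like this, with each company's shipments merged: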
final_list = [
    {'COMPANY': 'company1', 'SHIPMENTS': ['1', '2', '3', '4', '5', '6', '7', '8']},
    {'COMPANY': 'company2', 'SHIPMENTS': ['1']},
    {'COMPANY': 'company3', 'SHIPMENTS': ['1', '4', '3', '5']},
    {'COMPANY': 'company4', 'SHIPMENTS': ['2', '5']},
    {'COMPANY': 'company5', 'SHIPMENTS': ['1', '3', '5', '7', '9']},
    {'COMPANY': 'company7', 'SHIPMENTS': ['2', '4']},
    {'COMPANY': 'company9', 'SHIPMENTS': ['1', '2', '5', '6', '7']}
]
I know I haven't shown anything I've tried; I'm mainly looking for how to approach the final output. In case it matters, I'm using Python 3.6.
Here is a solution that uses sets to ensure there are no duplicates, but it loses the order of the shipments.
from itertools import chain

combined_lists = [
    [
        {'COMPANY': 'company1', 'NUMBER': '111', 'SHIPMENTS': ['1', '2', '3', '4']},
        {'COMPANY': 'company2', 'NUMBER': '222', 'SHIPMENTS': ['1']},
        {'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['1', '4']},
        {'COMPANY': 'company4', 'NUMBER': '444', 'SHIPMENTS': ['2', '5']},
        {'COMPANY': 'company5', 'NUMBER': '555', 'SHIPMENTS': ['1', '3', '5', '9']}
    ],
    [
        {'COMPANY': 'company1', 'NUMBER': '111', 'SHIPMENTS': ['5', '6', '7', '8']},
        {'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['3', '5']},
        {'COMPANY': 'company5', 'NUMBER': '555', 'SHIPMENTS': ['3', '5', '7']},
        {'COMPANY': 'company7', 'NUMBER': '777', 'SHIPMENTS': ['2', '4']},
        {'COMPANY': 'company9', 'NUMBER': '999', 'SHIPMENTS': ['1', '2', '5', '6', '7']}
    ]
]
COMPANY_KEY = 'COMPANY'
SHIPMENTS_KEY = 'SHIPMENTS'

# you're looking to:
# - combine the lists
# - drop the number
# - combine the shipments, removing duplicates
final_dict = {}
for d in chain.from_iterable(combined_lists):
    key = d[COMPANY_KEY]
    if key in final_dict:
        # pass the list itself; unpacking it with * would split multi-character IDs
        final_dict[key][SHIPMENTS_KEY].update(d[SHIPMENTS_KEY])
    else:
        final_dict[key] = {SHIPMENTS_KEY: set(d[SHIPMENTS_KEY])}
print(final_dict)

# if you need a list, not a dict
final_list = [{COMPANY_KEY: key, SHIPMENTS_KEY: value} for key, value in final_dict.items()]
print(final_list)
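A side note on `set.update`: it accepts one or more iterables and adds every element of each. A string is itself iterable, so unpacking a list of strings with `*` adds individual characters rather than whole strings. With single-character IDs like the sample data the result is the same either way, but longer IDs would be split. A small illustration (the IDs `'10'` and `'11'` are made up for the example):

```python
# Passing the list as a single iterable adds each string as one element.
shipments = set()
shipments.update(['10', '11'])
print(sorted(shipments))  # ['10', '11']

# Unpacking is the same as chars.update('10', '11'):
# each string is treated as an iterable of characters.
chars = set()
chars.update(*['10', '11'])
print(sorted(chars))  # ['0', '1']
```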
Note that if you only need the shipments per company, and that really is the only thing in your dictionaries, a simpler solution is:
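from collections import defaultdict

# maps each company name to the set of its shipments
better_dict = defaultdict(set)
for d in chain.from_iterable(combined_lists):
    # pass the list itself; unpacking it with * would split multi-character IDs
    better_dict[d[COMPANY_KEY]].update(d[SHIPMENTS_KEY])
print(better_dict)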
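The dict of sets is not yet in the list-of-dicts shape the question asks for, and sets have no defined order. If a stable order is enough (rather than the original order), one option is to sort each set when building the final list. This is only a sketch: it assumes every shipment is a numeric string, as in the sample data, and `to_sorted_list` is a hypothetical helper name.

```python
def to_sorted_list(shipments):
    """Return the shipments as a list sorted by numeric value."""
    # Assumption: every shipment ID is a string of digits.
    return sorted(shipments, key=int)

# A small stand-in for the dict of sets built above.
better_dict = {
    'company1': {'1', '2', '3', '4', '5', '6', '7', '8'},
    'company3': {'1', '4', '3', '5'},
}

final_list = [
    {'COMPANY': company, 'SHIPMENTS': to_sorted_list(shipments)}
    for company, shipments in better_dict.items()
]
print(final_list)
```

Sorting with `key=int` avoids the string ordering in which `'10'` would come before `'9'`.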
I think this should solve your problem:
import collections

merged = collections.defaultdict(list)
for x in combined_lists:
    for y in x:
        # the key in the data is "SHIPMENTS" (plural)
        # note: += concatenates, so duplicate shipments are kept
        merged[y["COMPANY"]] += y["SHIPMENTS"]

final_list = []
for x in merged:
    final_list.append({"COMPANY": x, "SHIPMENTS": merged[x]})
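Concatenating with `+=` keeps each company's shipments in order but also keeps the duplicates. A minimal sketch of removing them while keeping first-seen order is below. It uses `collections.OrderedDict.fromkeys`, which also works on Python 3.6; on 3.7+ a plain `dict.fromkeys` does the same thing. `dedupe_keep_order` is a hypothetical helper name, and `merged` here is a small stand-in for the concatenated data.

```python
from collections import OrderedDict

def dedupe_keep_order(items):
    """Drop repeated items, keeping the first occurrence of each."""
    return list(OrderedDict.fromkeys(items))

# Stand-in for the concatenated shipments per company.
merged = {
    'company3': ['1', '4', '3', '5'],
    'company5': ['1', '3', '5', '9', '3', '5', '7'],
}

final_list = [
    {'COMPANY': company, 'SHIPMENTS': dedupe_keep_order(shipments)}
    for company, shipments in merged.items()
]
print(final_list)
```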
This solves the problem; give it a try. You can optimize the code for better performance.
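def company_exists(company, resulting_list):
    # return (index, True) if the company is already in the result, else (None, False)
    for i, dict_ in enumerate(resulting_list):
        if company == dict_['COMPANY']:
            return i, True
    return None, False

def merge_lists(combined_lists):
    res = []
    for list_ in combined_lists:
        for dict_ in list_:
            idx, check = company_exists(dict_['COMPANY'], res)
            if not check:
                # copy only the needed keys, so NUMBER is dropped and the input is not modified
                res.append({'COMPANY': dict_['COMPANY'], 'SHIPMENTS': list(dict_['SHIPMENTS'])})
            else:
                res[idx]['SHIPMENTS'].extend(dict_['SHIPMENTS'])
                # set() removes duplicates but does not keep the original order
                res[idx]['SHIPMENTS'] = list(set(res[idx]['SHIPMENTS']))
    return res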
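On the performance remark: looking up each company by scanning the result list is a linear search per record. One possible optimization, sketched here under my own naming (`merge_lists_indexed` is hypothetical), is to keep a dict from company name to its result entry so the lookup is a single hash access. This version also keeps first-seen shipment order and leaves the input untouched.

```python
def merge_lists_indexed(combined_lists):
    index = {}  # company name -> that company's entry in res
    res = []
    for list_ in combined_lists:
        for dict_ in list_:
            company = dict_['COMPANY']
            entry = index.get(company)
            if entry is None:
                # first time this company appears: start a fresh entry without NUMBER
                entry = {'COMPANY': company, 'SHIPMENTS': []}
                index[company] = entry
                res.append(entry)
            for shipment in dict_['SHIPMENTS']:
                # the membership test is linear in the shipment list,
                # which is fine for short lists like the sample data
                if shipment not in entry['SHIPMENTS']:
                    entry['SHIPMENTS'].append(shipment)
    return res

sample = [
    [{'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['1', '4']}],
    [{'COMPANY': 'company3', 'NUMBER': '333', 'SHIPMENTS': ['3', '5', '1']},
     {'COMPANY': 'company7', 'NUMBER': '777', 'SHIPMENTS': ['2', '4']}],
]
print(merge_lists_indexed(sample))
```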
Hope it helps.