Python: reduce a list of dicts based on matching key/value pairs

Question

I have a list of dicts that specify flows (source to hop to destination, each with its volume). Now I want to split these flows into links (e.g. source to hop with volume, hop to destination with volume) and merge all duplicate links by summing up their volumes.

Since I'm new to Python, I'm wondering what a good approach would be. My first approach would be to loop through all flows and, inside that loop, loop through all links to check whether the link already exists.

But if I have millions of flows, I guess that might become quite inefficient and slow.

My starting data looks like this:

flows = [
    {
        'source': 1,
        'hop': 2,
        'destination': 3,
        'volume': 100,
    },{
        'source': 1,
        'hop': 2,
        'destination': 4,
        'volume': 50,
    },{
        'source': 2,
        'hop': 2,
        'destination': 4,
        'volume': 200,
    },
]

What my result should be:

links = [
    {
        'source': 1,
        'hop': 2,
        'volume': 150,
    },{
        'hop': 2,
        'destination': 3,
        'volume': 100,
    },{
        'hop': 2,
        'destination': 4,
        'volume': 250,
    },{
        'source': 2,
        'hop': 2,
        'volume': 200,
    },
]

Thanks a lot for your help!

Answer 1

You can collect the links into two different dictionaries, one for the links between source and hop and another for the links between hop and destination. Then you can easily build the result list from the two dicts. Counter is used below; it is a dict-like object with 0 as the default value:

import pprint
from collections import Counter

flows = [
    {
        'source': 1,
        'hop': 2,
        'destination': 3,
        'volume': 100.5,
    },{
        'source': 1,
        'hop': 2,
        'destination': 4,
        'volume': 50,
    },{
        'source': 2,
        'hop': 2,
        'destination': 4,
        'volume': 200.7,
    },
]

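# Summed volume per (source, hop) link and per (hop, destination) link.
sources = Counter()
hops = Counter()

for f in flows:
    # A tuple key identifies each link; missing keys start at 0.
    sources[f['source'], f['hop']] += f['volume']
    hops[f['hop'], f['destination']] += f['volume']

# Convert the aggregated counters back into a list of link dicts.
res = [{'source': source, 'hop': hop, 'volume': vol} for (source, hop), vol in sources.items()]
res.extend([{'hop': hop, 'destination': dest, 'volume': vol} for (hop, dest), vol in hops.items()])
pprint.pprint(res)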

Output:

[{'hop': 2, 'source': 1, 'volume': 150.5},
 {'hop': 2, 'source': 2, 'volume': 200.7},
 {'destination': 3, 'hop': 2, 'volume': 100.5},
 {'destination': 4, 'hop': 2, 'volume': 250.7}]

The above runs in O(n) time, so it should work with millions of flows provided you have enough memory.
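On the memory caveat: the two Counters hold only one entry per distinct link, so if the flows can be produced lazily (for example, read record by record from a file) instead of being held in a list, memory use depends on the number of distinct links and not on the number of flows. Here is a minimal sketch of that idea. The generator `iter_flows` and the function `aggregate` are hypothetical names introduced for illustration, and the generator just stands in for whatever the real data source is:

```python
from collections import Counter


def iter_flows():
    """Hypothetical stand-in for a lazy data source (file, database cursor, ...)."""
    yield {'source': 1, 'hop': 2, 'destination': 3, 'volume': 100}
    yield {'source': 1, 'hop': 2, 'destination': 4, 'volume': 50}
    yield {'source': 2, 'hop': 2, 'destination': 4, 'volume': 200}


def aggregate(flows):
    """Sum volumes per (source, hop) and per (hop, destination) in one pass."""
    sources = Counter()
    hops = Counter()
    for f in flows:  # works for any iterable, including generators
        sources[f['source'], f['hop']] += f['volume']
        hops[f['hop'], f['destination']] += f['volume']
    return sources, hops


sources, hops = aggregate(iter_flows())
```

The result list can then be built from `sources` and `hops` exactly as in the code above.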

Answer 2

Pseudo algorithm:

Create an empty result list/set/dictionary.
Loop over the flows list.
Split each single flow into 2 links.
For each of these 2 links, test whether it is already in the result (based on its 2 nodes).
If not, add it. If yes, update the volume of the one already in the result by adding the flow's volume.
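The steps above could be sketched in Python roughly as follows. This is one possible reading of the pseudo algorithm, not the answerer's own code: the function name `merge_links` and the choice of keying a dict by a pair of (field, node) tuples are assumptions made for illustration. Using a dict for the intermediate result, instead of a list, makes the "already in the result" test a constant-time lookup and so avoids the nested loop from the question.

```python
def merge_links(flows):
    """Split each flow into two links and sum the volumes of duplicate links."""
    # Hypothetical layout: ((field_a, node_a), (field_b, node_b)) -> summed volume.
    totals = {}
    for flow in flows:
        # Split a single flow into its two links.
        first = (('source', flow['source']), ('hop', flow['hop']))
        second = (('hop', flow['hop']), ('destination', flow['destination']))
        for link in (first, second):
            if link not in totals:
                # Not seen yet: add it.
                totals[link] = flow['volume']
            else:
                # Already present: add this flow's volume to the existing link.
                totals[link] += flow['volume']
    # Turn the keys back into dicts shaped like the question's expected result.
    return [dict(link, volume=volume) for link, volume in totals.items()]
```

With the `flows` list from the question, `merge_links(flows)` should give the four links listed there. On Python 3.7 and later they also come out in the same order, since dicts keep insertion order.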

Answer 1: score 2, accepted, 2017-02-14 10:17:48
Answer 2: score 0, 2017-02-14 10:01:41
