Sorting and summing a dictionary in Python

Question

I want to normalize the in-degree distribution of a directed graph. Currently I am doing this:

from collections import OrderedDict

def normalize_distribution(in_degree_dist):
    '''
    normalize the distribution so all values sum up to 1
    '''
    # calculate normalization factor
    factor = 1.0 / sum(in_degree_dist.itervalues())

    # sort the dictionary by key
    sorted_in_degree_dist = OrderedDict(sorted(in_degree_dist.iteritems(), key=lambda srt: srt[0]))

    # apply the factor to every value
    for key in sorted_in_degree_dist:
        sorted_in_degree_dist[key] *= factor
    return sorted_in_degree_dist

I found out that I am doing one iteration too many. While iterating over the dictionary keys I could also sum up the values, so I would do only one iteration instead of two, which is a huge saving if the graph gets big.

So I replaced the lambda with my own function. But the sorting seems to be smart: the key function doesn't appear to be called on every element, which is fine for sorting but is exactly what summing would need.

def sort_sum(*args, **kwargs):
    '''
    Sorting and summing up
    '''
    print args
    return args[0][0]

For a graph with n = 20 nodes and p = 0.5 the output is:

Random graph with 20 nodes and probability 0.5 created.
Degree distribution calculated.
((4, 1),)
((8, 4),)
((9, 1),)
((10, 5),)
((11, 4),)
((12, 3),)
((13, 1),)
((14, 1),)
Degree distribution normalized.

Only eight calls for 20 elements, which is good for sorting but bad for summing.
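A note on the call count, offered as a hedged reading of the output above: `sorted()` computes the key once for each item it sorts, so eight calls suggest the dictionary has eight entries (one per distinct in-degree), while the 20 nodes show up as the sum of the counts. The sketch below is Python 3 (`items()` and `values()` rather than `iteritems()` and `itervalues()`); `recording_key` is a hypothetical name, and the dictionary is reconstructed from the printed pairs.

```python
# Python 3 sketch; the dictionary mirrors the (degree, count) pairs printed above.
in_degree_dist = {4: 1, 8: 4, 9: 1, 10: 5, 11: 4, 12: 3, 13: 1, 14: 1}

calls = []  # records every item the key function receives


def recording_key(item):
    """Hypothetical key function: remember the item, then sort by its degree."""
    calls.append(item)
    return item[0]


sorted_items = sorted(in_degree_dist.items(), key=recording_key)

key_calls = len(calls)                     # one key call per dictionary entry
node_total = sum(in_degree_dist.values())  # the counts add up to the node count
print(key_calls, node_total)
```

Under that reading, the key function did see every entry, so it could in principle accumulate the sum as a side effect, though a plain `sum()` is usually clearer.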

I thought of using a list comprehension to do this

[(key, val) for key, val in in_degree_dist.iteritems()]

but I can't figure out how to sum up the values at the same time.

Do I have to write my own sorting and summing algorithm to do this in one step?

Answer 1

Instead of doing the sum while sorting, you could apply the multiplication factor while sorting (just after it, really).

For example:

# calculate normalization factor
factor = 1.0 / sum(in_degree_dist.itervalues())

# sort the keys and scale each value while building the ordered dictionary
sorted_in_degree_dist = OrderedDict((key, in_degree_dist[key] * factor)
    for key in sorted(in_degree_dist))

# or, equivalently, iterate over the sorted (key, value) pairs
sorted_in_degree_dist = OrderedDict((key, value * factor)
    for key, value in sorted(in_degree_dist.iteritems()))
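For readers on Python 3, the same idea might look like the sketch below. It assumes `items()` and `values()` in place of `iteritems()` and `itervalues()`; the `example` dictionary is hypothetical data, not taken from the question.

```python
from collections import OrderedDict


def normalize_distribution(in_degree_dist):
    """Return the distribution sorted by key, with values scaled to sum to 1."""
    # One linear pass to compute the normalization factor.
    factor = 1.0 / sum(in_degree_dist.values())
    # Sort by key and scale each value while building the result.
    return OrderedDict(
        (key, value * factor) for key, value in sorted(in_degree_dist.items())
    )


example = {10: 5, 4: 1, 8: 4}  # hypothetical degree -> count data
normalized = normalize_distribution(example)
print(list(normalized.keys()))
print(sum(normalized.values()))
```

The separate `sum()` is a single linear pass, while sorting is O(n log n), so the summation is unlikely to be the dominant cost even for large graphs.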

Answer 1
0 accepted 2014-09-06 11:59:27