Averaging a list of dictionaries in Python

Question

I have an ordered list of daily values, each stored as a dictionary, like this:

vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
    {'date': '3-1-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
    ...]

What I want is the average of a, b and c for the month.

Is there a better way than doing something like this:

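val_points = {}
val_len = len(vals)

# Accumulate a running total for each key
for day in vals:
    for p in ['a', 'b', 'c']:
        if p in val_points:
            val_points[p] += day[p]
        else:
            val_points[p] = day[p]

# Divide each total by the number of days; float() avoids integer division
val_avg = dict((p, float(val_points[p]) / val_len) for p in val_points)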

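I haven't run the code above, so it may have bugs, but I hope it gets the idea across. I know there is probably a better way using some combination of operator, itertools and collections.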

Answer 1

{p:sum(map(lambda x:x[p],vals))/len(vals) for p in ['a','b','c']}

Output:

{'a': 5, 'c': 88, 'b': 15.143333333333333}
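The 88 for c above comes from integer division, which is what / does on two ints in Python 2; the true mean is 266/3, about 88.67. A minimal Python 3 sketch of the same per-key average follows. It assumes every row carries all three keys, uses statistics.fmean (Python 3.8+) in place of sum/len, and copies the three rows from the question with the notes field dropped:

```python
from statistics import fmean

# Rows from the question, trimmed to the fields used here
vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182},
    {'date': '3-1-2014', 'a': 0, 'b': 0.5, 'c': 2},
]

KEYS = ['a', 'b', 'c']

# fmean always returns a float, so there is no integer truncation
averages = {key: fmean([day[key] for day in vals]) for key in KEYS}
print(averages)
```

Plain sum(...) / len(vals) gives the same floats in Python 3, where / is always true division.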

Answer 2

This may be a bit longer than Elisha's answer, but it builds fewer intermediate data structures, so it may be faster:

KEYS = ['a', 'b', 'c']

def sum_and_count(sums_and_counts, item, key):
    prev_sum, prev_count = sums_and_counts.get(key, (0,0)) # using get to have a fall-back if there is nothing in our sums_and_counts
    return (prev_sum+item.get(key, 0), prev_count+1) # using get to have a 0 default for a non-existing key in item

sums_and_counts = reduce(lambda sc, item: {key: sum_and_count(sc, item, key) for key in KEYS}, vals, {})

averages = {k:float(total)/no for (k,(total,no)) in sums_and_counts.iteritems()}
print averages

Output:

{'a': 5.0, 'c': 88.66666666666667, 'b': 15.143333333333333}
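The same single-pass idea can be written in Python 3 without reduce, using collections.defaultdict as the question hints. This is only a sketch: average_by_key is a hypothetical helper name, and unlike the code above it skips a row that lacks a key instead of counting it as 0.

```python
from collections import defaultdict

def average_by_key(rows, keys):
    """Average each key over the rows that actually contain it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        for key in keys:
            if key in row:  # assumption: a missing key is ignored, not treated as 0
                sums[key] += row[key]
                counts[key] += 1
    # Leave out keys that never appeared, to avoid dividing by zero
    return {key: sums[key] / counts[key] for key in keys if counts[key]}

# Rows from the question, trimmed to the fields used here
vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182},
    {'date': '3-1-2014', 'a': 0, 'b': 0.5, 'c': 2},
]

averages = average_by_key(vals, ['a', 'b', 'c'])
print(averages)
```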

Answer 3

To compute the averages per month (assuming the dates are in "dd-mm-yyyy" format):

vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
    {'date': '3-1-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
    {'date': '3-2-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
    {'date': '4-2-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'}
    ]

month = {}

for x in vals:
    newKey =  x['date'].split('-')[1]
    if newKey not in month:
        month[newKey] = {}   

    for k in 'abc':

        if k in month[newKey]:
             month[newKey][k].append(x[k])
        else:
             month[newKey][k] = [x[k]]


output = {}
for y in month:
    if y not in output:
        output[y] = {}
    for z in month[y]:
        output[y][z] = sum(month[y][z])/float(len(month[y][z]))

print output

Output:

{'1': {'a': 11.666666666666666, 'c': 88.66666666666667, 'b': 15.143333333333333}, 
 '2': {'a': 10.0, 'c': 2.0, 'b': 0.5}}
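Grouping on the month field alone would merge, say, January 2014 with January 2015. A hedged Python 3 sketch of the same grouping keyed on (year, month) follows; monthly_averages is a hypothetical helper name, the "dd-mm-yyyy" format is assumed as above, and the rows are the ones from this answer with the notes field dropped.

```python
from collections import defaultdict

def monthly_averages(rows, keys=('a', 'b', 'c')):
    """Group rows by (year, month) and average each key within the group."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        _day, month, year = row['date'].split('-')  # assumes dd-mm-yyyy
        for key in keys:
            grouped[(int(year), int(month))][key].append(row[key])
    return {
        period: {key: sum(values) / len(values) for key, values in per_key.items()}
        for period, per_key in grouped.items()
    }

vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182},
    {'date': '3-1-2014', 'a': 20, 'b': 0.5, 'c': 2},
    {'date': '3-2-2014', 'a': 0, 'b': 0.5, 'c': 2},
    {'date': '4-2-2014', 'a': 20, 'b': 0.5, 'c': 2},
]

output = monthly_averages(vals)
print(output)
```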

Answer 4

If you have data for several months, pandas will make your life easier:

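import pandas
df = pandas.DataFrame(vals)
# pandas.datetools was removed in later releases; pandas.to_datetime(..., dayfirst=True) replaces it.
df.date = [pandas.datetools.parse(d, dayfirst=True) for d in df.date]
df.set_index('date', inplace=True)
# how='mean' was also removed later; newer code calls .resample(...).mean() instead.
means = df.resample('m', how='mean')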

The result is:

            a          b          c
date                               
2014-01-31  5  15.143333  88.666667

4 answers

Answer 1: 3, accepted, 2014-09-04 09:33:29

Answer 2: 1, 2014-09-04 10:02:40

Answer 3: 1, 2014-09-04 10:57:30

Answer 4: 0, 2014-09-04 10:35:17
