
Summing keys and values in a list of dictionaries in Python

I have a list of dictionaries named "timebucket":

[{0.9711533363722904: 0.008296776727415599}, 
 {0.97163564816067838: 0.008153794130319884}, 
 {0.99212783984967068: 0.0022392112909864364}, 
 {0.98955473263127025: 0.0029843621053514003}]

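I want to take the two largest keys (.99 and .98) and average them, and also get their corresponding values and average those.

The expected output would look something like: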

 { (avg. two largest keys) : (avg. values of two largest keys) }

I tried:

import numpy as np
import heapq
[np.mean(heapq.nlargest(2, i.keys())) for i in timebucket]

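But heapq doesn't work in this case, because it is applied to each one-item dictionary separately, and I'm not sure how to keep the keys and values linked.

numpy can do this: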

In []:
a = np.array([e for i in timebucket for e in i.items()])
# sort the rows by key (column 0) and average the two rows with the largest keys
a[a[:, 0].argsort()][-2:].mean(axis=0)

Out[]
array([ 0.99084129,  0.00261179])

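That said, I suspect that building a better data structure up front would be the better approach.

This gives you the average of the two largest keys and the average of the two corresponding values. The resulting key and value are put into a dictionary named newdict.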
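One possible reading of "a better data structure" is a flat list of (key, value) tuples. The sketch below assumes that layout (the names `pairs` and `top_two` are illustrative, not from the original post) and shows that `heapq.nlargest` then keeps each key attached to its value, because tuples compare by their first element first.

```python
import heapq

timebucket = [
    {0.9711533363722904: 0.008296776727415599},
    {0.97163564816067838: 0.008153794130319884},
    {0.99212783984967068: 0.0022392112909864364},
    {0.98955473263127025: 0.0029843621053514003},
]

# Flatten into (key, value) tuples so each key stays attached to its value.
pairs = [(k, v) for d in timebucket for k, v in d.items()]

# Tuples compare element by element, so nlargest ranks by key first.
top_two = heapq.nlargest(2, pairs)

avg_key = sum(k for k, _ in top_two) / len(top_two)
avg_value = sum(v for _, v in top_two) / len(top_two)
result = {avg_key: avg_value}
print(result)
```

Unlike merging everything into a single dictionary, a list of tuples also keeps entries whose keys happen to repeat across dictionaries.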

timebucket = [{0.9711533363722904: 0.008296776727415599}, 
 {0.97163564816067838: 0.008153794130319884}, 
 {0.99212783984967068: 0.0022392112909864364}, 
 {0.98955473263127025: 0.0029843621053514003}]
# collect every key from every dictionary
keys = []
for time in timebucket:
    for x in time:
        keys.append(x)
# merge all dictionaries into one so values can be looked up by key
result = {}
for d in timebucket:
    result.update(d)

sortedkeys = sorted(keys)
largestkey = sortedkeys[-1]
ndlargestkey = sortedkeys[-2]
keyave = float(largestkey + ndlargestkey) / 2

largestvalue = result[largestkey]
ndlargestvalue = result[ndlargestkey]
valave = float(largestvalue + ndlargestvalue) / 2
newdict = {}
newdict[keyave] = valave

print(newdict)
#print(keyave)
#print(valave)

Output: {0.9908412862404705: 0.002611786698168918}

Here is a solution to your problem:

def dothisthing(mydict): # define the function with a dictionary as the only parameter
    keylist = [] # create an empty list
    for key in mydict: # iterate the input dictionary
        keylist.append(key) # add the key from the dictionary to a list
    keylist.sort(reverse = True) # sort the list  from highest to lowest numbers
    toptwokeys = 0 # create a variable
    toptwovals = 0 # create a variable
    count = 0 # create an integer variable
    for item in keylist: # iterate the list we created above
        if count <2: # this limits the iterations to the first 2 
            toptwokeys += item # add the key 
            toptwovals += (mydict[item]) # add the value
        count += 1
    finaldict = {(toptwokeys/2):(toptwovals/2)} # create a dictionary where the key and val are the average of the 2 from the input dict with the greatest keys
    return finaldict # return the output dictionary

dothisthing({0.9711533363722904: 0.008296776727415599, 0.97163564816067838: 0.008153794130319884, 0.99212783984967068: 0.0022392112909864364, 0.98955473263127025: 0.0029843621053514003})
#call the function with your dictionary as the parameter

Hope this helps.

You can do this in just four lines, without importing numpy:

One-line solution

For the average of the two largest keys:

max_keys_average=sorted([keys for item in timebucket for keys,values in item.items()])[::-1][:2]


print(sum(max_keys_average)/len(max_keys_average))

Output:

0.9908412862404705

For the average of the corresponding values:

max_values_average=[values for item in max_keys_average for item_1 in timebucket for keys,values in item_1.items() if item==keys]

print(sum(max_values_average)/len(max_values_average))

Output:

0.002611786698168918

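If you have trouble understanding list comprehensions, here is a detailed solution:

Detailed solution

Step one:

Get all the keys of the dictionaries into one list: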

Here is your timebucket list:

timebucket=[{0.9711533363722904: 0.008296776727415599},
 {0.97163564816067838: 0.008153794130319884},
 {0.99212783984967068: 0.0022392112909864364},
 {0.98955473263127025: 0.0029843621053514003}]

Now, store all the keys in one list:

keys_list=[]

for d in timebucket:  # named d to avoid shadowing the built-in dict
    for key, value in d.items():
        keys_list.append(key)

The next step is to sort this list and take its two largest entries:

max_keys=sorted(keys_list)[::-1][:2]

Next, just divide the sum of this new list by its length:

print(sum(max_keys)/len(max_keys))

Output:

0.9908412862404705

Now iterate over max_keys and over the keys in timebucket, check whether the two match, and if they do, collect that item's value in a list.

max_values=[]

for item in max_keys:
    for d in timebucket:  # named d to avoid shadowing the built-in dict
        for key, value in d.items():
            if item==key:
                max_values.append(value)

print(max_values)

Now the last part: divide the sum of max_values by its length:

print(sum(max_values)/len(max_values))

This gives the output:

0.002611786698168918
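The steps above can also be collected into one reusable function. This is only a sketch: the name `average_top_n` and its parameter `n` are hypothetical and not part of the original answer, and it assumes Python 3.8+ for `statistics.fmean`.

```python
from statistics import fmean

def average_top_n(buckets, n=2):
    """Average the n largest keys and their values across a list of dicts."""
    # Step 1: gather every (key, value) pair from every dictionary.
    pairs = [(k, v) for d in buckets for k, v in d.items()]
    if not pairs:
        raise ValueError("no entries to average")
    # Step 2: sort by key, largest first, and keep the first n pairs.
    top = sorted(pairs, key=lambda kv: kv[0], reverse=True)[:n]
    # Step 3: average the keys and the values separately.
    return {fmean(k for k, _ in top): fmean(v for _, v in top)}

timebucket = [
    {0.9711533363722904: 0.008296776727415599},
    {0.97163564816067838: 0.008153794130319884},
    {0.99212783984967068: 0.0022392112909864364},
    {0.98955473263127025: 0.0029843621053514003},
]
averaged = average_top_n(timebucket)
print(averaged)
```

Carrying the pairs through the sort avoids the nested loop that looks each value up again, and if fewer than `n` entries exist the function simply averages whatever is available.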

Here is an alternative solution to this problem:

In []:
import numpy as np
import time

def AverageTB(time_bucket):
    # list(...) is needed on Python 3, where items() returns a view that cannot be indexed
    tuples = [list(tb.items()) for tb in time_bucket]
    largest_keys = []
    largest_keys.append(max(tuples))
    tuples.remove(max(tuples))
    largest_keys.append(max(tuples))
    keys = [i[0][0] for i in largest_keys]
    values = [i[0][1] for i in largest_keys]
    return np.average(keys), np.average(values)

time_bucket = [{0.9711533363722904: 0.008296776727415599},
           {0.97163564816067838: 0.008153794130319884},
           {0.99212783984967068: 0.0022392112909864364},
           {0.98955473263127025: 0.0029843621053514003}]
time_exe = time.time()
print('avg. (keys, values): {}'.format(AverageTB(time_bucket)))
print('time: {}'.format(time.time() - time_exe))


Out[]:
avg. (keys, values): (0.99084128624047052, 0.0026117866981689181)
time: 0.00037789344787 
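A single `time.time()` difference on a call this short is mostly noise, so the figure above should be read loosely. As a rough sketch, the standard-library `timeit` module can average many runs instead. The helper `average_top_two` here is a hypothetical plain-Python stand-in for `AverageTB` so that the block runs without numpy, and it assumes the input holds at least two entries.

```python
import timeit

time_bucket = [
    {0.9711533363722904: 0.008296776727415599},
    {0.97163564816067838: 0.008153794130319884},
    {0.99212783984967068: 0.0022392112909864364},
    {0.98955473263127025: 0.0029843621053514003},
]

def average_top_two(buckets):
    # Sort the (key, value) pairs by key and keep the two largest.
    pairs = sorted((k, v) for d in buckets for k, v in d.items())[-2:]
    return sum(k for k, _ in pairs) / 2, sum(v for _, v in pairs) / 2

runs = 10_000
# timeit calls the function `runs` times and returns the total elapsed seconds.
total = timeit.timeit(lambda: average_top_two(time_bucket), number=runs)
print('avg. (keys, values): {}'.format(average_top_two(time_bucket)))
print('avg. time per call: {:.3e} s'.format(total / runs))
```

The measured time will vary from machine to machine, so no expected figure is given here.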

