Python: summing up list of dictionaries with different keys and same values
Summing keys and values in a list of dictionaries python
I have a list of dictionaries called "timebucket":
[{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
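I want to take the two largest keys (.99 and .98) and average them, and also get their corresponding values and average those.
The expected output would look something like: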
{ (avg. two largest keys) : (avg. values of two largest keys) }
I tried:
import numpy as np
import heapq
[np.mean(heapq.nlargest(2, i.keys())) for i in timebucket]
But heapq doesn't work in this case, and I'm not sure how to keep the keys and values linked.
You can do this with numpy:
In []:
a = np.array([e for i in timebucket for e in i.items()])
# Sort rows by key (column 0), largest first, then average the top two rows.
a[a[:, 0].argsort()[::-1]][:2].mean(axis=0)
Out[]
array([ 0.99084129, 0.00261179])
Though I suspect that creating a better data structure up front would be a better approach.
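One possible reading of "a better data structure" is to flatten the list into (key, value) pairs first, so every key stays attached to its value. The sketch below is only an illustration under that assumption, not the answerer's code: `top_n_average` is a made-up name, and it uses `heapq.nlargest` with an explicit `key` so the pairs are ranked by their key.

```python
import heapq

timebucket = [{0.9711533363722904: 0.008296776727415599},
              {0.97163564816067838: 0.008153794130319884},
              {0.99212783984967068: 0.0022392112909864364},
              {0.98955473263127025: 0.0029843621053514003}]

def top_n_average(buckets, n=2):
    """Average the n largest keys and their values (hypothetical helper)."""
    # Flatten to (key, value) tuples so key and value stay linked.
    pairs = [pair for d in buckets for pair in d.items()]
    # Rank the pairs by their key and keep the n largest.
    largest = heapq.nlargest(n, pairs, key=lambda pair: pair[0])
    # Split the pairs back into a tuple of keys and a tuple of values.
    keys, values = zip(*largest)
    return {sum(keys) / len(keys): sum(values) / len(values)}

result = top_n_average(timebucket)
print(result)
```

This assumes the list holds at least one pair; with an empty list the `zip(*largest)` unpacking would fail.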
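This gives you the average of the two largest keys and the average of their two corresponding values. The averaged key and value are put into a dictionary named newdict.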
timebucket = [{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
keys = []
for time in timebucket:
    for x in time:
        keys.append(x)
result = {}
for d in timebucket:
    result.update(d)
largestkey = (sorted(keys)[-1])
ndlargestkey = (sorted(keys)[-2])
keyave = (float((largestkey)+(ndlargestkey))/2)
largestvalue = (result[(largestkey)])
ndlargestvalue = (result[(ndlargestkey)])
valave = (float((largestvalue)+(ndlargestvalue))/2)
newdict = {}
newdict[keyave] = valave
print(newdict)
#print(keyave)
#print(valave)
Output: {0.9908412862404705: 0.002611786698168918}
Here is a solution to your problem:
def dothisthing(mydict):  # define the function with a dictionary as the only parameter
    keylist = []  # create an empty list
    for key in mydict:  # iterate the input dictionary
        keylist.append(key)  # add the key from the dictionary to a list
    keylist.sort(reverse=True)  # sort the list from highest to lowest numbers
    toptwokeys = 0  # running sum of the two largest keys
    toptwovals = 0  # running sum of their values
    count = 0  # number of keys added so far
    for item in keylist:  # iterate the list we created above
        if count < 2:  # this limits the additions to the first 2 keys
            toptwokeys += item  # add the key
            toptwovals += mydict[item]  # add the value
            count += 1
    finaldict = {(toptwokeys / 2): (toptwovals / 2)}  # key and value are the averages of the 2 entries with the greatest keys
    return finaldict  # return the output dictionary
dothisthing({0.9711533363722904: 0.008296776727415599, 0.97163564816067838: 0.008153794130319884, 0.99212783984967068: 0.0022392112909864364, 0.98955473263127025: 0.0029843621053514003})
# call the function with your dictionary as the parameter
Hope this helps.
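Note that the call above passes one dictionary, while the question has a list of one-item dictionaries. A minimal sketch of bridging the two, assuming the keys are unique across the list; `merge_buckets` is a hypothetical helper, not part of the answer, and it raises an error rather than silently overwriting a repeated key.

```python
def merge_buckets(buckets):
    """Merge a list of dicts into one dict, refusing duplicate keys."""
    merged = {}
    for d in buckets:
        for key, value in d.items():
            if key in merged:
                # A plain dict.update() would overwrite the earlier value here.
                raise ValueError(f"duplicate key: {key!r}")
            merged[key] = value
    return merged

timebucket = [{0.9711533363722904: 0.008296776727415599},
              {0.97163564816067838: 0.008153794130319884},
              {0.99212783984967068: 0.0022392112909864364},
              {0.98955473263127025: 0.0029843621053514003}]

merged = merge_buckets(timebucket)
print(merged)
```

The merged dictionary can then be passed to a function that expects a single dictionary, such as dothisthing.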
You can do this in just four lines, without importing numpy:
One-line solution
For the average of the two largest keys:
max_keys_average=sorted([keys for item in timebucket for keys,values in item.items()])[::-1][:2]
print(sum(max_keys_average)/len(max_keys_average))
Output:
0.9908412862404705
For the average of the corresponding values:
max_values_average=[values for item in max_keys_average for item_1 in timebucket for keys,values in item_1.items() if item==keys]
print(sum(max_values_average)/len(max_values_average))
Output:
0.002611786698168918
If you have trouble understanding list comprehensions, here is a detailed solution:
Detailed solution
Step one:
Collect all the keys of the dicts in one list:
Here is your timebucket list:
timebucket=[{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
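Now store all the keys in one list:
keys_list=[]
for d in timebucket:  # "d" avoids shadowing the built-in dict
    for key, value in d.items():
        keys_list.append(key)
Next, sort this list and take its two largest values (the last two in sorted order, reversed here so they come first):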
max_keys=sorted(keys_list)[::-1][:2]
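In case the slicing is unclear: `[::-1]` reverses the sorted list and `[:2]` keeps the first two items. As a small illustration (the variable names below exist only for this sketch), the same two keys can also be obtained with `reverse=True` or by slicing from the end:

```python
keys_list = [0.9711533363722904, 0.97163564816067838,
             0.99212783984967068, 0.98955473263127025]

# Sort ascending, reverse, keep the first two (the form used above).
reversed_then_first_two = sorted(keys_list)[::-1][:2]
# Sort descending directly, keep the first two.
sorted_descending = sorted(keys_list, reverse=True)[:2]
# Sort ascending, keep the last two, then reverse them.
last_two_reversed = sorted(keys_list)[-2:][::-1]

print(reversed_then_first_two == sorted_descending == last_two_reversed)
```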
The next step is simply to divide the sum of this new list by its length:
print(sum(max_keys)/len(max_keys))
Output:
0.9908412862404705
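Now just iterate over max_keys and over the keys in timebucket, check whether the two match, and collect the matching item's value in a list.
max_values=[]
for item in max_keys:
    for d in timebucket:  # "d" avoids shadowing the built-in dict
        for key, value in d.items():
            if item==key:
                max_values.append(value)
print(max_values)
Now the final part: divide the sum of max_values by its length:
print(sum(max_values)/len(max_values))
This gives the output: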
0.002611786698168918
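The nested loops above rescan the whole list for every key. A possible alternative, sketched here on the assumption that no key appears in more than one dictionary, is to build a single key-to-value lookup first; `lookup` is a name introduced only for this example.

```python
timebucket = [{0.9711533363722904: 0.008296776727415599},
              {0.97163564816067838: 0.008153794130319884},
              {0.99212783984967068: 0.0022392112909864364},
              {0.98955473263127025: 0.0029843621053514003}]

# The two largest keys across all dictionaries, largest first.
max_keys = sorted((key for d in timebucket for key in d), reverse=True)[:2]

# One dictionary mapping every key to its value (assumes unique keys).
lookup = {key: value for d in timebucket for key, value in d.items()}

# Fetch each value directly instead of scanning the list again.
max_values = [lookup[key] for key in max_keys]

print(sum(max_keys) / len(max_keys))
print(sum(max_values) / len(max_values))
```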
Here is an alternative solution to this problem:
In []:
import numpy as np
import time
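def AverageTB(time_bucket):
    # list(...) is needed in Python 3, where items() returns a view that cannot be indexed
    tuples = [list(tb.items()) for tb in time_bucket]
    largest_keys = []
    largest_keys.append(max(tuples))  # entry with the largest key
    tuples.remove(max(tuples))
    largest_keys.append(max(tuples))  # entry with the second largest key
    keys = [i[0][0] for i in largest_keys]
    values = [i[0][1] for i in largest_keys]
    return np.average(keys), np.average(values)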
time_bucket = [{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
time_exe = time.time()
print('avg. (keys, values): {}'.format(AverageTB(time_bucket)))
print('time: {}'.format(time.time() - time_exe))
Out[]:
avg. (keys, values): (0.99084128624047052, 0.0026117866981689181)
time: 0.00037789344787
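A single `time.time()` measurement of such a short call is quite noisy. If timing matters, one option is the standard library's `timeit` module, which repeats the call many times. The sketch below is only an illustration: `average_top_two` is a hypothetical stand-in for whichever function is being timed, and the run count is an arbitrary choice.

```python
import timeit

time_bucket = [{0.9711533363722904: 0.008296776727415599},
               {0.97163564816067838: 0.008153794130319884},
               {0.99212783984967068: 0.0022392112909864364},
               {0.98955473263127025: 0.0029843621053514003}]

def average_top_two(buckets):
    """Stand-in for the function being timed (hypothetical)."""
    # Sort (key, value) pairs by key, largest first, and keep two.
    pairs = sorted((p for d in buckets for p in d.items()), reverse=True)[:2]
    keys, values = zip(*pairs)
    return sum(keys) / len(keys), sum(values) / len(values)

# Run the call many times and report the average time per call.
runs = 10000
total = timeit.timeit(lambda: average_top_two(time_bucket), number=runs)
print('avg. time per call: {:.3e} s'.format(total / runs))
```

The absolute numbers depend on the machine, so they are mainly useful for comparing the approaches in this thread against each other.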