
List of Keys, how to find max values in Dictionary

I have been working on an assignment that gathers data and counts how many times each item appears in a large dataset of about 500 MB. I have a couple of dictionaries that read CSV files and combine the data, and my final dict looks like this after all of the data has been gathered and processed.

I am almost done with the assignment but am stuck on this section: I need to find the top 5 maximum values across all keys and values.

I have the following dictionary:

printed using: print key, task1[key]

KEY KEYVALUE

WA [[('1082225', 29), ('845195', 21), ('265021', 17)]]
DE [[('922397', 44), ('627084', 40), ('627297', 14)]]
DC [[('774648', 17), ('911624', 17), ('771241', 16)]]
WI [[('12618', 25), ('242582', 23), ('508727', 22)]]
WV [[('476050', 4), ('1016620', 3), ('769611', 3)]]
HI [[('466263', 5), ('226000', 5), ('13694', 4)]]

I need to go through all of it and find the top 5 values along with their ID numbers, for example:

  1. DE 922397 44
  2. DE 627084 40
  3. WA 1082225 29

What would be the best way to do this?

**EDIT: how I am putting together my task dictionary**

import operator

task1 = {}
for key, val in courses.items():
    task1[key] = [sorted(val.iteritems(), key=operator.itemgetter(1), reverse=True)[:5]]

Assuming your dict looks something like:

mydict = {'WA': [('1082225', 29), ('845195', 21), ('265021', 17)], 'DE': [('922397', 44), ('627084', 40), ('627297', 14)], ...}

This is not the ideal representation. You can flatten it into a more convenient list of tuples by running:

data = [(k, idnum, v) for k, kvlist in mydict.items() for idnum, v in kvlist]

Now the data will look like:

[('WA', '1082225', 29), ('WA', '845195', 21), ('WA', '265021', 17), ('DE', '922397', 44), ...]

In this format the data is clearly readable, and it is obvious what we need to search. This line sorts the new tuples in descending order by their third element (index [2]):

sorted(data, key=lambda x: x[2], reverse=True)
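Since the question asks only for the top 5, the sorted list can simply be sliced, or `heapq.nlargest` can pick the five largest entries without sorting everything. A minimal sketch, assuming the dictionary has the flat shape shown above and Python 3 is used; the sample data is a subset of the question's data, and `top5` and `top5_heap` are illustrative names:

```python
import heapq

# Sample data copied from the question (three states only).
mydict = {
    'WA': [('1082225', 29), ('845195', 21), ('265021', 17)],
    'DE': [('922397', 44), ('627084', 40), ('627297', 14)],
    'WI': [('12618', 25), ('242582', 23), ('508727', 22)],
}

# Flatten to (state, id, count) tuples, as in the answer.
data = [(k, idnum, v) for k, kvlist in mydict.items() for idnum, v in kvlist]

# Option 1: sort everything in descending order, then keep the first five.
top5 = sorted(data, key=lambda x: x[2], reverse=True)[:5]

# Option 2: let heapq select the five largest entries by count.
top5_heap = heapq.nlargest(5, data, key=lambda x: x[2])

for state, idnum, count in top5:
    print(state, idnum, count)
```

Both options should give the same five entries; `heapq.nlargest` tends to be the better fit when the flattened list is large and only a few items are needed.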

Note: the dictionary you provided wraps each value in an unnecessary extra [], so I removed it from the answer for clarity.
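If the extra list is kept, as in the `task1` dictionary from the question, the flattening step needs one more loop level. A sketch, assuming every value is a list of lists of `(id, count)` tuples; `flatten_nested` is a hypothetical helper name, and the sample data is taken from the question:

```python
def flatten_nested(task):
    """Flatten {state: [[(id, count), ...]]} into (state, id, count) tuples."""
    return [
        (state, idnum, count)
        for state, outer in task.items()
        for inner in outer            # step through the extra [] layer
        for idnum, count in inner
    ]

# Two entries from the question, with the extra nesting left in place.
task1 = {
    'WV': [[('476050', 4), ('1016620', 3), ('769611', 3)]],
    'HI': [[('466263', 5), ('226000', 5), ('13694', 4)]],
}

data = flatten_nested(task1)

# Same sort as before, keeping only the two largest counts here.
top2 = sorted(data, key=lambda x: x[2], reverse=True)[:2]
```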

Edited after clarification.
