
Nested dictionary: find the key with the highest or most frequent value

I have many lists of dictionaries that look like this:

a = [{'health': {'medical_emergency': 1.0}}, {'scitech': {'technology': 1.0, 'computer': 1.0, 'programming': 1.0}}]
b = [{'politics': {'government': 1.0}}, {'travel': {'vacation': 1.0, 'traveling': 1.0, 'tourism': 1.0}}, {'finance': {'business': 1.0}}]
c = [{'sports': {'sports': 2.0}}, {'health': {'exercise': 1.0}}]

The structure is {class: {keyword: number_of_times_the_keyword_occurs}}.

The lists have different lengths. How can I get the class with the highest score, or the class with the most keywords?

For example:

In a, it should return scitech, because scitech has three keywords (technology, computer, programming) and health has only one.

In b, it should return travel, for the same reason as in a.

In c, it should return sports, because the keyword 'sports' occurs twice in the sports class, while the keyword 'exercise' occurs only once in the health class.

Here is what I've tried:

import operator
for i in range(len(a)):
    print(max(a[i].items(), key=operator.itemgetter(1))[0])

But this just prints every key, because each dictionary in the list holds only one class.

Here's one way to do it:

a = [{'health': {'medical_emergency': 1.0}}, {'scitech': {'technology': 1.0, 'computer': 1.0, 'programming': 1.0}}]
b = [{'politics': {'government': 1.0}}, {'travel': {'vacation': 1.0, 'traveling': 1.0, 'tourism': 1.0}}, {'finance': {'business': 1.0}}]
c = [{'sports': {'sports': 2.0}}, {'health': {'exercise': 1.0}}]

def get_max(l):
    # Sum the keyword counts of each class, then sort the classes by total, highest first.
    cnt = []
    for d in l:
        for k, v in d.items():
            cnt.append([k, sum(v.values())])
    return sorted(cnt, key=lambda x: x[1], reverse=True)

print(get_max(a))
print(get_max(b))
print(get_max(c))

Output:

[['scitech', 3.0], ['health', 1.0]]
[['travel', 3.0], ['politics', 1.0], ['finance', 1.0]]
[['sports', 2.0], ['health', 1.0]]

The class you want is the first element of each result; for example, get_max(a)[0][0] is 'scitech'.
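
If only the winning class name is needed, the full sort can be skipped. The following is a minimal sketch rather than part of the original answer: top_class is a hypothetical helper name, and it assumes that totals should be added together when the same class appears in more than one dictionary of a list (the sample data never exercises that case).

def top_class(dicts):
    """Return the class whose keyword counts sum to the largest total."""
    totals = {}
    for d in dicts:
        for cls, keywords in d.items():
            # Assumption: accumulate when a class shows up in several dicts.
            totals[cls] = totals.get(cls, 0) + sum(keywords.values())
    # max() keeps the first key with the highest total, so ties go to the class seen first.
    return max(totals, key=totals.get)

a = [{'health': {'medical_emergency': 1.0}}, {'scitech': {'technology': 1.0, 'computer': 1.0, 'programming': 1.0}}]
c = [{'sports': {'sports': 2.0}}, {'health': {'exercise': 1.0}}]

print(top_class(a))  # scitech
print(top_class(c))  # sports

Note that max() raises ValueError on an empty list, so an empty input would need to be handled separately.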


 