Count occurrences in a list of sets
I have a variable containing a list of lists, each containing 2 sets. It looks like this:
[[{'angular', 'java', 'sql', 'xml-schema'},
{'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
[{'java'}, {'maven'}],
[{'java'}, {'oracle'}],
[{'c++', 'cobol', 'java', 'javascript'}, set()],
[{'angular', 'java'}, set()],
[{'java'}, set()]]
Now what I would like to do is count the occurrences of every single item across all of the sets; I'm just not sure how to go about this. Should I flatten the whole list, or is there some function for sets that can do this?
Thanks!
You can use a collections.Counter and pass it a flattened version of your data:
from collections import Counter
values: list[list[set[str]]] = [
[{'angular', 'java', 'sql', 'xml-schema'}, {'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
[{'java'}, {'maven'}],
[{'java'}, {'oracle'}],
[{'c++', 'cobol', 'java', 'javascript'}, set()],
[{'angular', 'java'}, set()],
[{'java'}, set()]
]
language = 'java'
ocurrences = Counter([word for sublist in values for subset in sublist for word in subset])
print(ocurrences.most_common(3)) # [('java', 6), ('angular', 2), ('maven', 2)]
print(ocurrences[language]) # 6
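As an alternative to the nested comprehension, the flattening step could also be written with itertools.chain.from_iterable. This is only a sketch of the same idea: the variable names `flat` and `occurrences` are mine, and the data is the list from the question.

```python
from collections import Counter
from itertools import chain

values = [
    [{'angular', 'java', 'sql', 'xml-schema'},
     {'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
    [{'java'}, {'maven'}],
    [{'java'}, {'oracle'}],
    [{'c++', 'cobol', 'java', 'javascript'}, set()],
    [{'angular', 'java'}, set()],
    [{'java'}, set()],
]

# Inner chain: list of lists -> stream of sets.
# Outer chain: stream of sets -> stream of strings.
flat = chain.from_iterable(chain.from_iterable(values))

# Counter consumes the lazy stream directly; no intermediate list is built.
occurrences = Counter(flat)

print(occurrences['java'])   # 6
print(occurrences['maven'])  # 2
```

Both versions give the same counts; chain just avoids building the intermediate list.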
If you want to count the 2 sets separately, as languages / other, do it this way (items with equal counts may come out in a different order on each run, because set iteration order is not fixed):
ocurrences_languages = Counter([word for sublist in values for word in sublist[0]])
print(ocurrences_languages.most_common(3)) # e.g. [('java', 6), ('angular', 2), ('sql', 1)]
ocurrences_other = Counter([word for sublist in values for word in sublist[1]])
print(ocurrences_other.most_common(3)) # e.g. [('maven', 2), ('docker', 1), ('rest', 1)]
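If the inner lists ever hold more than 2 sets, the per-position counting could be generalized with zip(*values), which groups the sets by their position. A minimal sketch, assuming every inner list has the same length; the names `per_position`, `languages` and `other` are just illustrative.

```python
from collections import Counter
from itertools import chain

values = [
    [{'angular', 'java', 'sql', 'xml-schema'},
     {'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
    [{'java'}, {'maven'}],
    [{'java'}, {'oracle'}],
    [{'c++', 'cobol', 'java', 'javascript'}, set()],
    [{'angular', 'java'}, set()],
    [{'java'}, set()],
]

# zip(*values) yields one tuple per position: all first sets, then all second sets.
# Each tuple of sets is flattened and counted on its own.
per_position = [Counter(chain.from_iterable(column)) for column in zip(*values)]

languages, other = per_position  # works here because each inner list has 2 sets

print(languages['java'])  # 6
print(other['maven'])     # 2
```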
If you want it without any imports, you can try this function I wrote, which works with any depth of nested lists and sets as long as the innermost items are strings (you can change the type check):
def count(list_, count_dict=None):
    # Default to None: a dict() default would be shared between calls,
    # so counts from an earlier call would leak into later ones.
    if count_dict is None:
        count_dict = {}
    for i in list_:
        if isinstance(i, str):
            # Innermost item: increment its tally.
            count_dict[i] = count_dict.get(i, 0) + 1
        else:
            # Nested list or set: recurse, accumulating into the same dict.
            count(i, count_dict)
    return count_dict
li = [[{'angular', 'java', 'sql', 'xml-schema'},
{'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
[{'java'}, {'maven'}],
[{'java'}, {'oracle'}],
[{'c++', 'cobol', 'java', 'javascript'}, set()],
[{'angular', 'java'}, set()],
[{'java'}, set()]]
print(count(li))
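For very deeply nested data, the same traversal could also be done without recursion by keeping an explicit stack. This is only a sketch: `count_nested` is a hypothetical name, and it assumes every non-string item is iterable (a number, for example, would raise a TypeError).

```python
def count_nested(data):
    """Count string items in arbitrarily nested lists, tuples, and sets."""
    counts = {}      # fresh dict on every call, so calls do not affect each other
    stack = [data]   # containers and items still waiting to be visited
    while stack:
        item = stack.pop()
        if isinstance(item, str):
            # Innermost item: increment its tally.
            counts[item] = counts.get(item, 0) + 1
        else:
            # Container: queue its contents for later visits.
            stack.extend(item)
    return counts


li = [[{'angular', 'java', 'sql', 'xml-schema'},
       {'db2', 'docker', 'git', 'hibernate', 'jenkins', 'maven', 'rest'}],
      [{'java'}, {'maven'}],
      [{'java'}, {'oracle'}],
      [{'c++', 'cobol', 'java', 'javascript'}, set()],
      [{'angular', 'java'}, set()],
      [{'java'}, set()]]

result = count_nested(li)
print(result['java'])              # 6
print(count_nested(li) == result)  # True: a second call gives the same counts
```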