Python - count and group items in list stored in dictionary
I have seen examples of how to count items in a dictionary or a list. My dictionary stores multiple lists. Each list stores multiple items.
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
1. I want to count the frequency of each letter, i.e. the results should be
A - 4
B - 1
C - 2
D - 1
E - 1
F - 1
2. I want to group by each letter, i.e. the results should be
A - text1, text2, text4, text5
B - text4
C - text1, text3
D - text3
E - text1
F - text1
How can I achieve both by using existing Python libraries, without writing many for loops?
To get to (2), you would first have to invert the keys and values of the dictionary and store the resulting pairs in a list. Once you are there, use groupby with a key to get to the structure of (2). The count for (1) is then just the length of each group.
from itertools import groupby
arr = [(x,t) for t, a in d.items() for x in a]
# [('A', 'text2'), ('C', 'text3'), ('D', 'text3'), ('A', 'text1'), ('C', 'text1'), ('E', 'text1'), ('F', 'text1'), ('A', 'text4'), ('B', 'text4'), ('A', 'text5')]
res = {g: [x[1] for x in items] for g, items in groupby(sorted(arr), key=lambda x: x[0])}
#{'A': ['text1', 'text2', 'text4', 'text5'], 'C': ['text1', 'text3'], 'B': ['text4'], 'E': ['text1'], 'D': ['text3'], 'F': ['text1']}
res2 = {x: len(y) for x, y in res.items()}
#{'A': 4, 'C': 2, 'B': 1, 'E': 1, 'D': 1, 'F': 1}
PS: I hope you'd use meaningful variable names in your real code.
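One detail of the groupby approach is worth spelling out: itertools.groupby only merges consecutive items with the same key, which is why the pairs are passed through sorted() first. The sketch below uses a small hypothetical list of pairs and a hypothetical helper, group_pairs, to show what happens with and without sorting.

```python
from itertools import groupby

# Hypothetical sample pairs (letter, text), deliberately not sorted by letter.
pairs = [('A', 'text1'), ('C', 'text1'), ('A', 'text2'), ('C', 'text3')]


def group_pairs(pairs):
    """Collect the second element of each pair under its first element."""
    return {key: [text for _, text in items]
            for key, items in groupby(pairs, key=lambda pair: pair[0])}


# Without sorting, 'A' and 'C' each form two separate runs, and the later
# run overwrites the earlier one in the resulting dict.
unsorted_result = group_pairs(pairs)

# With sorting, equal letters are adjacent, so each letter forms one group.
sorted_result = group_pairs(sorted(pairs))

print(unsorted_result)  # {'A': ['text2'], 'C': ['text3']}
print(sorted_result)    # {'A': ['text1', 'text2'], 'C': ['text1', 'text3']}
```

If sorting the whole list is undesirable, a dict of lists built in a single pass avoids the requirement altogether.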
There are a few ways to accomplish this, but if you'd like to handle things without importing additional modules or installing external ones, this method will work cleanly 'out of the box.'
With d as your starting dictionary:
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
Create a new dict, called letters, for your results to live in, and populate it with the letters taken from the lists in d. If a letter isn't present yet, create its key with a list holding the count and a list of keys from d as its value. If it's already there, increment the count and append the current key from d to the key list in its value.
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            letters[letter] = [1,[item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
This leaves you with a dict called letters whose values hold the count and the keys from d that contain the letter, like this:
{'E': [1, ['text1']], 'C': [2, ['text3', 'text1']], 'F': [1, ['text1']], 'A': [4, ['text2', 'text4', 'text1', 'text5']], 'B': [1, ['text4']], 'D': [1, ['text3']]}
Now, to print your first list, do:
for letter in sorted(letters):
    print(letter, letters[letter][0])
This prints each letter together with the first ('count') element of the list in its value, using the built-in sorted() function to put things in order.
To print the second list, likewise sorted(), do the same, but with the second ('key') element of the list in its value, joined into a string with ', ':
for letter in sorted(letters):
    print(letter, ', '.join(letters[letter][1]))
To ease copy/paste, here's the code unbroken by my ramblings:
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            letters[letter] = [1,[item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
print(letters)
# sorted() keeps the output in alphabetical order, as in the steps above
for letter in sorted(letters):
    print(letter, letters[letter][0])
print()
for letter in sorted(letters):
    print(letter, ', '.join(letters[letter][1]))
Hope this helps!
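A possible simplification of the import-free approach above, offered only as a sketch: since the count of a letter always equals the number of keys collected for it, the dict can store just the list of keys and derive the count with len(). The built-in dict.setdefault method handles the "create if missing" step.

```python
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}

letters = {}
for item, values in d.items():
    for letter in values:
        # setdefault returns the existing list for this letter, or stores
        # and returns a new empty list if the letter has not been seen yet.
        letters.setdefault(letter, []).append(item)

# The count is simply the length of each list of keys.
counts = {letter: len(items) for letter, items in letters.items()}

for letter in sorted(letters):
    print(letter, counts[letter], ', '.join(sorted(letters[letter])))
```

This keeps a single source of truth per letter, so the count cannot drift out of step with the list.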
from collections import Counter, defaultdict
from itertools import chain
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
counter = Counter(chain.from_iterable(d.values()))
group = defaultdict(list)
for k, v in d.items():
    for i in v:
        group[i].append(k)
out:
Counter({'A': 4, 'B': 1, 'C': 2, 'D': 1, 'E': 1, 'F': 1})
defaultdict(list,
{'A': ['text2', 'text4', 'text1', 'text5'],
'B': ['text4'],
'C': ['text1', 'text3'],
'D': ['text3'],
'E': ['text1'],
'F': ['text1']})
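To turn these two structures into the exact "A - 4" and "A - text1, text2, ..." layout from the question, something like the following sketch could be used. The names count_lines and group_lines are illustrative only, and the texts within each group are sorted explicitly so the output does not depend on dict ordering.

```python
from collections import Counter, defaultdict
from itertools import chain

d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}

# Same two structures as above: a Counter of letters and a letter -> texts map.
counter = Counter(chain.from_iterable(d.values()))
group = defaultdict(list)
for text, letters in d.items():
    for letter in letters:
        group[letter].append(text)

# Format each entry as "<letter> - <value>", in alphabetical order.
count_lines = ['{} - {}'.format(letter, counter[letter])
               for letter in sorted(counter)]
group_lines = ['{} - {}'.format(letter, ', '.join(sorted(group[letter])))
               for letter in sorted(group)]

print('\n'.join(count_lines))
print('\n'.join(group_lines))
```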
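This is a way to achieve this:
from collections import defaultdict
alphabets = defaultdict(list)
# d is the dictionary from the question
for text, letters in d.items():
    for letter in letters:
        alphabets[letter].append(text)
# grouping: each letter with the texts that contain it
for letter, texts in sorted(alphabets.items()):
    print(letter, texts)
# frequency: each letter with the number of texts that contain it
for letter, texts in sorted(alphabets.items()):
    print(letter, len(texts))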
Note that once you have A - text1, text2, text4, text5, getting to A - 4 is just a matter of counting the texts.
For your first task:
from collections import Counter
d = {
'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']
}
occurrences = Counter(''.join(''.join(values) for values in d.values()))
print(sorted(occurrences.items(), key=lambda l: l[0]))
Now let me explain it:
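As a rough step-by-step sketch of what the one-liner does (the intermediate names per_list and all_letters are introduced here for illustration only): each list is joined into a string, those strings are joined into one long string, and Counter then counts the characters. Note that this relies on every item being a single character; the last lines show, on a hypothetical list with a two-character item, how chain.from_iterable counts whole items instead.

```python
from collections import Counter
from itertools import chain

d = {
    'text1': ['A', 'C', 'E', 'F'],
    'text2': ['A'],
    'text3': ['C', 'D'],
    'text4': ['A', 'B'],
    'text5': ['A']
}

# Step 1: join each list into one string, e.g. ['A', 'C', 'E', 'F'] -> 'ACEF'.
per_list = [''.join(values) for values in d.values()]

# Step 2: join those strings into a single string of all letters.
all_letters = ''.join(per_list)

# Step 3: Counter iterates over the string and counts each character.
occurrences = Counter(all_letters)
print(sorted(occurrences.items(), key=lambda l: l[0]))

# Caveat: joining splits multi-character items into single characters,
# whereas chain.from_iterable keeps each list item whole.
sample = [['AB', 'C'], ['AB']]
by_character = Counter(''.join(''.join(values) for values in sample))
by_item = Counter(chain.from_iterable(sample))
print(by_character)  # counts 'A', 'B' and 'C' separately
print(by_item)       # counts 'AB' and 'C'
```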
As far as I can see, you already have the solution for your second problem.