Count the number of sublists in which each unique item occurs

Question

I have a list of lists like the following:

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

I want to find the number of sublists that each unique value in listoflist occurs in. For example, "A" shows up in two sublists, and "D" also shows up in two sublists, even though it occurs twice in listoflist[2].

How can I get a dataframe with each unique element in one column and its frequency (the number of sublists it appears in) in another?

Answer 1

You can use itertools.chain together with collections.Counter:

In [94]: import itertools as it

In [95]: from collections import Counter

In [96]: Counter(it.chain(*map(set, listoflist)))
Out[96]: Counter({'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2})

As mentioned in the comment by @Jean-François Fabre, you can also use chain.from_iterable, which takes the iterable of sets directly instead of unpacking it with *:

In [97]: Counter(it.chain.from_iterable(map(set, listoflist)))
Out[97]: Counter({'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2})
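
Since the question asks for a dataframe, here is a minimal sketch of turning the Counter into one. It assumes pandas is installed; the column names "element" and "frequency" are illustrative choices, not something the question specifies.

```python
from collections import Counter
from itertools import chain

import pandas as pd

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

# Deduplicate each sublist, then count how many sublists contain each value.
counts = Counter(chain.from_iterable(map(set, listoflist)))

# "element" and "frequency" are assumed column names chosen for illustration.
# Sorting the (value, count) pairs gives a stable, alphabetical row order.
df = pd.DataFrame(sorted(counts.items()), columns=["element", "frequency"])
print(df)
```

Sorting is optional; it only makes the row order reproducible, because iterating over a set has no guaranteed order.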

Answer 2

Essentially, it seems that you want something like

Counter(x for xs in listoflist for x in set(xs))

Each list is converted into a set first to exclude duplicates. The resulting sequence of sets is then flattened and fed into the Counter.

Full code:

from collections import Counter

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

c = Counter(x for xs in listoflist for x in set(xs))

print(c)

Results in (the order of entries with equal counts may differ between runs, since sets are unordered):

# output:
# Counter({'B': 2, 'C': 2, 'Z': 2, 'D': 2, 'A': 2, 'Y': 1, 'X': 1})
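
To illustrate why the set() conversion matters, the following sketch compares the count with and without it. The variable names total_occurrences and sublist_counts are made up for this example.

```python
from collections import Counter

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

# Without set(): every occurrence is counted, including repeats within a sublist.
total_occurrences = Counter(x for xs in listoflist for x in xs)

# With set(): each sublist contributes at most one count per value.
sublist_counts = Counter(x for xs in listoflist for x in set(xs))

print(total_occurrences["D"], sublist_counts["D"])  # 3 2
print(total_occurrences["A"], sublist_counts["A"])  # 3 2
```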

Answer 3

Another way to do this is to use pandas:

import pandas as pd

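# Each sublist becomes a row; shorter sublists are padded with missing values.
df = pd.DataFrame(listoflist)
# stack() yields one entry per cell; count the distinct rows each value appears in.
df.stack().reset_index().groupby(0)['level_0'].nunique().to_dict()

Output: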

{'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2}
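
If the result should be a dataframe rather than a dict, one possible pandas-only variant is sketched below. It assumes pandas 0.25 or later (for Series.explode), and the column names "sublist", "element" and "frequency" are illustrative.

```python
import pandas as pd

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

# explode() gives one row per item, keeping the index of the sublist it came from.
pairs = pd.Series(listoflist).explode().reset_index()
pairs.columns = ["sublist", "element"]  # assumed names, chosen for readability

# Drop repeated (sublist, element) pairs, then count the sublists per element.
result = (
    pairs.drop_duplicates()
         .groupby("element")
         .size()
         .reset_index(name="frequency")
)
print(result)
```

Unlike the DataFrame(listoflist) approach, this does not pad shorter sublists with missing values, so there is nothing to drop afterwards.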

3 answers

Answer 1
3 2018-05-20 19:17:38

Answer 2
2 2018-05-20 19:21:47

Answer 3
1 2018-05-21 17:03:08
