Counting occurrences from a Python list into an np.array

Question

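Suppose we have a vector size N=1000 and are given the list [1,1,2,2,2,100].

I want to generate an np.array (or pd.Series) of size 1000 in which v[n] is the number of times n occurs in the list. In our example, v[1] = 2, v[2] = 3, v[100] = 1, v[42] = 0.

How can I do this elegantly with numpy/pandas?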

Answer 1

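If you have a list mylist, you can get an array of counts, mycount, like this:

import numpy as np

N = 1000
x = np.array(mylist)                   # mylist must hold non-negative integers
mycount = np.bincount(x, minlength=N)  # mycount[n] = number of occurrences of n

This counts the occurrences of each non-negative integer value in the array and stores the count for value n at index n; minlength=N pads the result with zeros so that it has at least N entries. More information on bincount is available in the NumPy documentation.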
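As a quick check of the behaviour described above, here is a minimal sketch using the list from the question. The variable names `v` and `longer` are only illustrative. It also shows one caveat worth knowing: minlength is a lower bound, so a value greater than or equal to N makes the result longer than N.

```python
import numpy as np

mylist = [1, 1, 2, 2, 2, 100]
N = 1000

v = np.bincount(np.array(mylist), minlength=N)
print(v.shape)                     # (1000,)
print(v[1], v[2], v[100], v[42])   # 2 3 1 0

# minlength only pads; it never truncates.
# A value >= N therefore yields an array longer than N.
longer = np.bincount(np.array([1, 1500]), minlength=N)
print(longer.shape)                # (1501,)
```

If the list may contain values outside 0..N-1, filtering them out first (or slicing the result with `[:N]`) keeps the output at exactly N entries.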

Answer 2

Python has a built-in way to count occurrences, Counter from the collections module, which you can use without calling numpy or pandas if you prefer:

from collections import Counter
a = [1,1,2,2,2,100]
cnts = Counter(a)
print(cnts)
# Counter({2: 3, 1: 2, 100: 1})

You can convert it to a list with a list comprehension:

N = 100
cnts_list = [cnts.get(i, 0) for i in range(N+1)]
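If an np.array of size 1000 is wanted, as in the question, the Counter can be turned into one directly. The following is a sketch under that assumption; the use of `np.fromiter` and the name `v` are my choices, not part of the original answer.

```python
from collections import Counter

import numpy as np

a = [1, 1, 2, 2, 2, 100]
N = 1000  # vector size from the question

cnts = Counter(a)

# Counter returns 0 for missing keys (and does not insert them),
# so cnts[i] is safe for every i in range(N).
v = np.fromiter((cnts[i] for i in range(N)), dtype=np.int64, count=N)

print(v[1], v[2], v[100], v[42])
```

This loops in Python over all N indices, so for large N the bincount approach is likely to be faster.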

Answer 3

Use Series.value_counts together with Series.reindex to add the values that do not occur:

import pandas as pd

a = [1,1,2,2,2,100]

N = 100
a = pd.Series(a).value_counts().reindex(range(N+1), fill_value=0)
print(a)
0      0
1      2
2      3
3      0
4      0
      ..
96     0
97     0
98     0
99     0
100    1
Length: 101, dtype: int64
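The same idea can be adapted to the size-1000 vector from the question. This is a sketch, assuming the indices should run from 0 to N-1; the names `counts`, `v` and `arr` are illustrative.

```python
import pandas as pd

a = [1, 1, 2, 2, 2, 100]
N = 1000  # vector size from the question

counts = pd.Series(a).value_counts()

# reindex keeps only the labels 0..N-1 and fills absent ones with 0.
# Any value >= N in the list would be dropped silently here.
v = counts.reindex(range(N), fill_value=0)

arr = v.to_numpy()  # plain np.array, if a Series is not wanted
print(len(v), v.loc[1], v.loc[2], v.loc[100], v.loc[42])
```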

Answer 4

You can also use np.unique.

import numpy as np

N = 1000
result = np.zeros(N, dtype=int)  # integer dtype, since these are counts
idx, val = np.unique([1,1,2,2,2,100], return_counts=True)
result[idx] = val
print(result[:5])
# [0 2 3 0 0]

More information: https://numpy.org/doc/stable/reference/generated/numpy.unique.html
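To make explicit what np.unique returns in this approach, here is a small sketch. The names `values` and `counts` are mine: with return_counts=True the first array holds the sorted distinct values (used here as indices) and the second holds how often each one occurs.

```python
import numpy as np

data = [1, 1, 2, 2, 2, 100]
N = 1000

values, counts = np.unique(data, return_counts=True)
print(values)  # [  1   2 100]
print(counts)  # [2 3 1]

# Scatter the counts into a zero vector, using the values as positions.
# This assumes every value lies in 0..N-1; a larger one raises IndexError.
result = np.zeros(N, dtype=counts.dtype)
result[values] = counts
```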

Answer 5

You can use a Series and groupby:

In[1]:

import pandas as pd
my_list = [1,1,1,2,2,2,2,3,4,8,1000,8,8,5,5,6]

my_Serie = pd.Series(my_list)
v = my_Serie.groupby(my_list).count().to_dict()
print(v)

{1: 3, 2: 4, 3: 1, 4: 1, 5: 2, 6: 1, 8: 3, 1000: 1}
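The groupby result is a mapping from value to count rather than the dense vector the question asks for. One way to get from one to the other is sketched below. The array size is an assumption: this list contains 1000, so the vector needs at least 1001 entries for index 1000 to exist.

```python
import numpy as np
import pandas as pd

my_list = [1, 1, 1, 2, 2, 2, 2, 3, 4, 8, 1000, 8, 8, 5, 5, 6]

counts = pd.Series(my_list).groupby(my_list).count()

size = max(my_list) + 1            # 1001 here, so that v[1000] is valid
v = np.zeros(size, dtype=np.int64)

# The group labels are the distinct values; use them as positions.
v[counts.index.to_numpy()] = counts.to_numpy()

print(v[1], v[2], v[8], v[1000], v[7])
```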


Solution 1
3 Accepted 2020-12-08 13:56:07

Solution 2
2 2020-12-08 14:00:29

Solution 3
1 2020-12-08 13:54:17

Solution 4
1 2020-12-08 14:17:13

Solution 5
0 2020-12-08 14:24:06
