Python list to np.array of counts
Suppose we have a vector size N = 1000, and let's say we get the list [1,1,2,2,2,100].
I'd like to generate an np.array (or pd.Series) of size 1000 where v[n] is the number of times n appears in the list. In our example, v[1] = 2, v[2] = 3, v[100] = 1, v[42] = 0.
How can I do that with numpy/pandas elegantly?
If you have a list mylist, you can get an array of counts mycount:
import numpy as np

N = 1000
x = np.array(mylist)  # mylist must hold non-negative integers
mycount = np.bincount(x, minlength=N)  # mycount[n] = occurrences of n
np.bincount counts how many times each non-negative integer value occurs in the array, so mycount[n] is the number of occurrences of n. Passing minlength=N guarantees that the result has at least N entries. You can find more information on bincount in the NumPy documentation.
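As a quick check against the question's example, here is a minimal sketch that runs bincount on the sample list. The second call is an added illustration (not from the original answer) of the fact that minlength is only a lower bound on the output length.

```python
import numpy as np

N = 1000
mylist = [1, 1, 2, 2, 2, 100]

# Count occurrences of each value and pad the result to at least N entries.
mycount = np.bincount(np.array(mylist), minlength=N)

print(mycount.shape)  # (1000,)
print(mycount[1], mycount[2], mycount[100], mycount[42])  # 2 3 1 0

# minlength is only a lower bound: a value >= N makes the result longer.
longer = np.bincount(np.array([1, 1500]), minlength=N)
print(longer.shape)  # (1501,)
```

If the list may contain values of N or more and a result of exactly N entries is required, the input has to be filtered first (or the result sliced with mycount[:N]).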
Python's standard library has a class for counting occurrences, collections.Counter, which can be used without invoking numpy or pandas if desired:
from collections import Counter
a = [1,1,2,2,2,100]
cnts = Counter(a)
print(cnts)
# Counter({2: 3, 1: 2, 100: 1})
You can convert this to a list with a list comprehension:
N = 100
cnts_list = [cnts.get(i, 0) for i in range(N+1)]
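If an np.array of exactly the size asked for in the question is wanted, the Counter approach could be wrapped roughly as follows. This is only a sketch: counts_vector is a hypothetical helper name, not something from the answer or from any library.

```python
from collections import Counter

import numpy as np


def counts_vector(values, size):
    """Hypothetical helper: v[n] = number of times n appears, for 0 <= n < size."""
    cnts = Counter(values)
    # Values outside 0..size-1 are simply never looked up, so they are ignored.
    return np.array([cnts.get(n, 0) for n in range(size)])


v = counts_vector([1, 1, 2, 2, 2, 100], 1000)
print(len(v), v[1], v[2], v[100], v[42])  # 1000 2 3 1 0
```

Note that this uses range(size), so the result has indices 0 to size-1, whereas the comprehension above uses range(N+1) and yields N+1 entries.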
Use Series.value_counts with Series.reindex to add the values that do not occur:
import pandas as pd

a = [1,1,2,2,2,100]
N = 100
a = pd.Series(a).value_counts().reindex(range(N+1), fill_value=0)
print(a)
0 0
1 2
2 3
3 0
4 0
..
96 0
97 0
98 0
99 0
100 1
Length: 101, dtype: int64
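For the size in the question (N = 1000, indices 0 to 999), the same idea might look like the sketch below. The variable names s and v are chosen here for illustration; the conversion with to_numpy is an optional extra step for when a plain array is preferred over a Series.

```python
import pandas as pd

N = 1000
a = [1, 1, 2, 2, 2, 100]

# Index 0..N-1, with 0 filled in for values that never occur in the list.
s = pd.Series(a).value_counts().reindex(range(N), fill_value=0)

print(len(s))  # 1000
print(s.loc[1], s.loc[2], s.loc[100], s.loc[42])  # 2 3 1 0

# Plain NumPy array of the same counts, if a Series is not needed.
v = s.to_numpy()
```

Because reindex keeps only the labels it is given, any list values outside 0..N-1 are dropped from the result.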
You can use np.unique as well.
import numpy as np

N = 1000
result = np.zeros(N)  # float64 by default; pass dtype=int for integer counts
idx, val = np.unique([1,1,2,2,2,100], return_counts=True)
result[idx] = val
print(result[:5])
# [0. 2. 3. 0. 0.]
More information: https://numpy.org/doc/stable/reference/generated/numpy.unique.html
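One possible variation on the np.unique approach, sketched under the assumption that the input may contain values outside 0..N-1: use an integer result array and mask out-of-range values before indexing. The value 5000 is made-up sample data added to show the masking.

```python
import numpy as np

N = 1000
data = np.array([1, 1, 2, 2, 2, 100, 5000])  # 5000 is out of range on purpose

result = np.zeros(N, dtype=int)  # integer counts instead of the float default
idx, val = np.unique(data, return_counts=True)

# Keep only values that are valid indices into result.
in_range = (idx >= 0) & (idx < N)
result[idx[in_range]] = val[in_range]

print(result[:5])  # [0 2 3 0 0]
print(result[100], result.sum())  # 1 6
```

Without the mask, the assignment result[idx] = val would raise an IndexError for the out-of-range value.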
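With pandas, you can also group a Series by its values and count each group, which gives a dict mapping each value to its count: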
import pandas as pd
my_list = [1,1,1,2,2,2,2,3,4,8,1000,8,8,5,5,6]
my_Serie = pd.Series(my_list)
v = my_Serie.groupby(my_list).count().to_dict()
print(v)
{1: 3, 2: 4, 3: 1, 4: 1, 5: 2, 6: 1, 8: 3, 1000: 1}