How to count occurrences of each unique value in pandas
I have a large pandas DataFrame and would like to count the occurrences of each unique value in it. I tried the following, but it takes too much time and memory. How can I do it in a Pythonic way?
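import numpy as np
import pandas as pd

# packets is the DataFrame from the question
pack = []
for index, row in packets.iterrows():
    # collect the non-NaN values of each row
    pack.extend(pd.Series(row).dropna().values.tolist())
unique, count = np.unique(pack, return_counts=True)
counts = np.asarray((unique, count))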
It seems like you want to compute value counts across all columns. You can flatten the DataFrame to a Series, drop NaNs, and call value_counts. Here's a sample:
df
     a    b
0  1.0  NaN
1  1.0  NaN
2  3.0  3.0
3  NaN  4.0
4  5.0  NaN
5  NaN  4.0
6  NaN  5.0
pd.Series(df.values.ravel()).dropna().value_counts()
5.0 2
4.0 2
3.0 2
1.0 2
dtype: int64
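For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame shown above by hand and uses df.to_numpy(), which should behave like df.values here. The names flat and counts are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the example above
df = pd.DataFrame({
    "a": [1.0, 1.0, 3.0, np.nan, 5.0, np.nan, np.nan],
    "b": [np.nan, np.nan, 3.0, 4.0, np.nan, 4.0, 5.0],
})

# Flatten every cell into one 1-D Series, drop NaNs, then count
flat = pd.Series(df.to_numpy().ravel()).dropna()
counts = flat.value_counts()
print(counts)
```

This works on the whole underlying array at once instead of building a Python list row by row, which is the part of the original loop that is likely to be slow on a large frame.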
Another method is with np.unique:
u, c = np.unique(pd.Series(df.values.ravel()).dropna().values, return_counts=True)
pd.Series(c, index=u)
1.0 2
3.0 2
4.0 2
5.0 2
dtype: int64
Note that the first method sorts results in descending order of counts, while the latter returns them sorted by value.
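The ordering difference is easier to see when the counts are not all tied. The sketch below uses a small made-up array (not data from the question) and shows how either result can be re-sorted to match the other. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up values with unequal counts, purely for illustration
values = np.array([3.0, 1.0, 3.0, 2.0, 3.0, 1.0])

# value_counts: most frequent value first
by_count = pd.Series(values).value_counts()

# np.unique: unique values come back sorted ascending
u, c = np.unique(values, return_counts=True)
by_value = pd.Series(c, index=u)

# Re-sort either result to get the other ordering
value_sorted = by_count.sort_index()
count_sorted = by_value.sort_values(ascending=False)

print(by_count)
print(by_value)
```

So if a particular order matters, sort_index() or sort_values() on the resulting Series should give it regardless of which method produced the counts.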