How do I count the number of times a specific name appears in a pandas data frame column?
I have a column of names. I need to get a count of how many times a specific name appears in that column.
Column:
Dave
John
John
Thanos
Bob
I need something like:
[in] df['Column'].count_name('John')
[out] 2
Using value_counts() doesn't work because there are thousands of names in the column, and many of them only appear once. I'm sorry if this question has been asked/answered before, but I haven't been able to figure out a way to search for it that doesn't just give me an answer telling me to use value_counts().
Thanks!
For a speed-up, use numpy.count_nonzero:
import numpy as np
np.count_nonzero(df['Column']=='John')
Out[186]: 2
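For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample column from the question. The frame construction and the variable name john_count are my own assumptions for illustration; the counting line is the same idea as above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (the frame itself is an assumption).
df = pd.DataFrame({"Column": ["Dave", "John", "John", "Thanos", "Bob"]})

# The comparison yields a boolean Series; count_nonzero counts the True entries.
john_count = int(np.count_nonzero(df["Column"] == "John"))
print(john_count)  # 2
```

Note that this is an exact, case-sensitive match, so "john" or " John" would not be counted.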
Try using this:
df['Column'].value_counts()['John']
In [6]: %timeit df[0].value_counts()['John']
1000 loops, best of 3: 548 µs per loop
In [7]: %timeit df[0].eq('John').sum()
The slowest run took 8.19 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 311 µs per loop
In [8]: %timeit np.count_nonzero(df[0]=='John')
10000 loops, best of 3: 162 µs per loop
EDIT: The fastest option is np.count_nonzero.
Using eq() turns out to be faster than value_counts(), which makes sense: value_counts() computes counts for every distinct value, whereas eq() only checks for the given value.
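To tie this back to the count_name('John') call the question asks for, here is a rough sketch of a small wrapper around eq().sum(). The function name count_name and the test name "Alice" are hypothetical, not part of pandas. It also shows one difference worth knowing: indexing value_counts() with a name that is absent raises a KeyError, while eq().sum() and value_counts().get(name, 0) both return 0.

```python
import pandas as pd

def count_name(series: pd.Series, name: str) -> int:
    """Hypothetical helper: count exact matches of `name` in `series`."""
    # eq() builds a boolean Series; summing it counts the True entries.
    return int(series.eq(name).sum())

# Sample data copied from the question.
df = pd.DataFrame({"Column": ["Dave", "John", "John", "Thanos", "Bob"]})

john = count_name(df["Column"], "John")      # name present in the column
missing = count_name(df["Column"], "Alice")  # name absent from the column

# value_counts()["Alice"] would raise KeyError; .get() with a default avoids that.
vc_missing = int(df["Column"].value_counts().get("Alice", 0))

print(john, missing, vc_missing)  # 2 0 0
```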