pandas value_counts (show values and ratio)
As a newbie to pandas, I want to get the count of each value in a specific column, together with its percentage of the total, in a single frame. I can get one or the other, but I can't figure out how to add or merge them into a single frame. Thoughts?
The frame/table should look like this:
some_value, count, count(as %)
Here is what I have so far:
import numpy as np
import pandas as pd
np.random.seed(1)
values = np.random.randint(30, 35, 20)
df1 = pd.DataFrame(values, columns=['some_value'])
df1.sort_values(by=['some_value'], inplace = True)
df2 = df1.value_counts()
df3 = df1.value_counts(normalize=True)
print(df2)
print("------")
print(df3)
Just use
pd.DataFrame({"count": df2, "%": df3 * 100})
to put the two Series into one DataFrame.
Output:
count %
some_value
34 7 35.0
32 4 20.0
33 3 15.0
31 3 15.0
30 3 15.0
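The output above keeps some_value as the index. If the three-column layout from the question (some_value, count, count(as %)) is wanted as ordinary columns, the same idea can be sketched as follows. This is a minimal sketch, not the answerer's code: it assumes the counts are taken from the column Series rather than from the whole frame, and the names counts and summary are only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
np.random.seed(1)
values = np.random.randint(30, 35, 20)
df1 = pd.DataFrame(values, columns=["some_value"])

# Count once, then derive the percentage from the counts.
counts = df1["some_value"].value_counts()
summary = pd.DataFrame({
    "count": counts,
    "count(as %)": counts / counts.sum() * 100,
})

# Name the index explicitly (older pandas versions leave it unnamed),
# then turn it into a regular column.
summary = summary.rename_axis("some_value").reset_index()
print(summary)
```

Deriving the percentage from counts avoids a second value_counts call, and rename_axis is there only so that reset_index produces a some_value column regardless of the pandas version.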
Calling value_counts once and then normalizing the result with a lambda function might be more efficient, but you can get the result you are after with:
df1_counts = df1.value_counts().to_frame(name="count").merge(
df1.value_counts(normalize=True).to_frame(name="count(as %)"),
left_index=True,
right_index=True,
)
Resulting in:
| some_value | count | count(as %) |
|------------|-------|-------------|
| 34 | 7 | 0.35 |
| 32 | 4 | 0.20 |
| 33 | 3 | 0.15 |
| 31 | 3 | 0.15 |
| 30 | 3 | 0.15 |
Best!
Compute, rename and join. Let's try:
df1.some_value.value_counts().to_frame('count').join(df1.some_value.value_counts(normalize=True).to_frame('%'))
count %
34 7 0.35
32 4 0.20
33 3 0.15
31 3 0.15
30 3 0.15
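Because both value_counts results share the same index, the index-aligned merge and join shown above can also be written as a column-wise concatenation. The following is only a sketch of that alternative, assuming the same sample data as the question; the names col and combined are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
np.random.seed(1)
df1 = pd.DataFrame(np.random.randint(30, 35, 20), columns=["some_value"])

col = df1["some_value"]

# Place the two Series side by side; keys supplies the column names,
# so no to_frame/rename step is needed.
combined = pd.concat(
    [col.value_counts(), col.value_counts(normalize=True)],
    axis=1,
    keys=["count", "%"],
)
print(combined)
```

As in the join version, the % column holds fractions between 0 and 1; multiply it by 100 if true percentages are wanted.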
Try this, using partial from functools with agg and a list of functions:
from functools import partial
vc_norm = partial(pd.Series.value_counts, normalize=True)
df1['some_value'].agg([pd.Series.value_counts, vc_norm])
Output:
value_counts value_counts
34 7 0.35
32 4 0.20
31 3 0.15
30 3 0.15
33 3 0.15
Or you can use a lambda function like this:
df1['some_value'].agg([pd.Series.value_counts, lambda x: x.value_counts(normalize=True)])