
pandas value_counts (show values and ratio)

As a newbie to pandas, I'm looking to get the value counts for a specific column, together with each count as a percentage, in a single frame. I can get one or the other, but can't figure out how to add or merge them into a single frame. Thoughts?

The frame/table should look like this:

some_value, count, count(as %)

Here is what I have so far:

import numpy as np
import pandas as pd 

np.random.seed(1)
values = np.random.randint(30, 35, 20)

df1 = pd.DataFrame(values, columns=['some_value'])
df1.sort_values(by=['some_value'], inplace=True)
df2 = df1.value_counts()                # absolute counts
df3 = df1.value_counts(normalize=True)  # relative frequencies (fractions of 1)

print(df2)
print("------")
print(df3) 

Just use

pd.DataFrame({"count":df2,"%":df3*100})

to put the two Series into one DataFrame.

Output:

            count     %
some_value             
34              7  35.0
32              4  20.0
33              3  15.0
31              3  15.0
30              3  15.0
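
To get the layout the question asks for, with some_value, count and count(as %) as ordinary columns, the index can be turned into a column afterwards. A minimal sketch, assuming a hand-written sample with the same counts as the output above instead of the seeded random data; the names summary and table are illustrative only:

```python
import pandas as pd

# Hand-written sample (an assumption): same counts as the output above,
# so the example does not depend on the random seed.
values = [34] * 7 + [32] * 4 + [33] * 3 + [31] * 3 + [30] * 3
df1 = pd.DataFrame(values, columns=["some_value"])

counts = df1["some_value"].value_counts()                # absolute counts
ratios = df1["some_value"].value_counts(normalize=True)  # fractions of 1

# The dict constructor aligns both Series on their shared index.
summary = pd.DataFrame({"count": counts, "count(as %)": ratios * 100})

# Name the index explicitly, then move it into a regular column.
table = summary.rename_axis("some_value").reset_index()
print(table)
```

Naming the index with rename_axis before reset_index keeps the column label predictable regardless of how the installed pandas version names the value_counts index.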

I guess calling value_counts once and then normalizing the result with a lambda function could be more efficient, but you can get the result you are looking for with:

df1_counts = df1.value_counts().to_frame(name="count").merge(
    df1.value_counts(normalize=True).to_frame(name="count(as %)"),
    left_index=True,
    right_index=True,
)

Resulting in:

| some_value | count | count(as %) |
|------------|-------|-------------|
| 34         | 7     | 0.35        |
| 32         | 4     | 0.20        |
| 33         | 3     | 0.15        |
| 31         | 3     | 0.15        |
| 30         | 3     | 0.15        |

Best!
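
The fractions above can also be derived from a single value_counts call, which is roughly what the remark about efficiency hints at. A sketch under the assumption of a hand-written sample with the same counts as in the table; here the share is scaled to a percentage, and no claim is made about how much faster this is in practice:

```python
import pandas as pd

# Hand-written sample (an assumption) with the same counts as the table above.
values = [34] * 7 + [32] * 4 + [33] * 3 + [31] * 3 + [30] * 3
df1 = pd.DataFrame(values, columns=["some_value"])

# Count once, then derive the share from the counts instead of counting twice.
summary = df1["some_value"].value_counts().to_frame(name="count")
summary["count(as %)"] = summary["count"] / summary["count"].sum() * 100
print(summary)
```

Dropping the `* 100` gives the same fractions as the merge-based version.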

Compute, rename and join. Let's try:

df1.some_value.value_counts().to_frame('count').join(df1.some_value.value_counts(normalize=True).to_frame('%'))



    count     %
34      7  0.35
32      4  0.20
33      3  0.15
31      3  0.15
30      3  0.15

Try using partial from functools with pd.Series.agg, passing a list of functions:

from functools import partial
vc_norm = partial(pd.Series.value_counts, normalize=True)
df1['some_value'].agg([pd.Series.value_counts, vc_norm])

Output:

    value_counts  value_counts
34             7          0.35
32             4          0.20
31             3          0.15
30             3          0.15
33             3          0.15

Or you can use a lambda function like this:

df1['some_value'].agg([pd.Series.value_counts, lambda x: x.value_counts(normalize=True)])
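
In the output shown earlier both columns are labelled value_counts, which makes them awkward to select by name. One possible alternative (not the answerer's method) is pd.concat with keys, which assigns a distinct label to each column. A minimal sketch, assuming a hand-written sample with the same counts; the labels "count" and "ratio" are arbitrary choices:

```python
import pandas as pd

# Hand-written sample (an assumption) with the same counts as the output above.
values = [34] * 7 + [32] * 4 + [33] * 3 + [31] * 3 + [30] * 3
col = pd.Series(values, name="some_value")

# keys= supplies the column labels, so the two results stay distinguishable.
summary = pd.concat(
    [col.value_counts(), col.value_counts(normalize=True)],
    axis=1,
    keys=["count", "ratio"],
)
print(summary)
```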

