Pandas groupby multiple columns exclusively
I have the DataFrame below and want to find the count of `y` and `n` for each column:
| ID | var1 | var2 |
|---|---|---|
| 1 | y | |
| 2 | n | y |
| 3 | y | n |
| 4 | y | n |
| 5 | | y |
The result would look like this:
| | var1_N | var2_N |
|---|---|---|
| y | 3 | 2 |
| n | 1 | 2 |
I used the `transform` function but was wondering whether there is a better way to get the results. Thanks!
You can call `value_counts` on every column you need to count by using the `apply` method. The per-column results are automatically aligned on the index (the var values, in your case):
    df.filter(like='var').apply(lambda s: s.value_counts())

       var1  var2
    y     3     2
    n     1     2
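Or pass `pd.value_counts` directly (note that the top-level `pd.value_counts` function is deprecated as of pandas 2.1, so the `lambda` form above is the safer choice on recent versions):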
    df.filter(like='var').apply(pd.value_counts)

       var1  var2
    y     3     2
    n     1     2
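For anyone who wants to run this end to end, here is a minimal sketch of the `apply` approach. The frame is my reconstruction of the table in the question, with the blank cells assumed to be `NaN`. The `add_suffix("_N")` step and the `fillna(0)` guard are my own additions to match the requested headers and to cover a value that never appears in some column. They are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question (assumption: blanks are NaN).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "var1": ["y", "n", "y", "y", np.nan],
    "var2": [np.nan, "y", "n", "n", "y"],
})

# Count each distinct value per column; value_counts skips NaN by default.
counts = df.filter(like="var").apply(lambda s: s.value_counts())

# A value missing from one column would show up as NaN after alignment,
# so fill with 0 and restore an integer dtype.
counts = counts.fillna(0).astype(int)

# Rename the columns to the var1_N / var2_N headers asked for in the question.
counts = counts.add_suffix("_N")

print(counts)
```

The row order of the result depends on how pandas aligns the per-column counts, so select rows by label (`counts.loc["y"]`) rather than by position.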
You can use `melt` to flatten your dataframe, then use `value_counts` and `unstack` the `variable` column:
    >>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable')
    variable  var1  var2
    value
    n            1     2
    y            3     2
You can remove the index and column names by appending `.rename_axis(index=None, columns=None)`:
    >>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable') \
    ...     .rename_axis(index=None, columns=None)
       var1  var2
    n     1     2
    y     3     2
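The `melt` route can be sketched as a self-contained script as well. As before, the frame is a reconstruction of the question's table with blanks assumed to be `NaN`, and the intermediate names `long_df` and `wide` are only illustrative. `DataFrame.value_counts` with a column subset needs a reasonably recent pandas (1.1 or later, as far as I know).

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question (assumption: blanks are NaN).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "var1": ["y", "n", "y", "y", np.nan],
    "var2": [np.nan, "y", "n", "n", "y"],
})

# Long format: one row per (ID, variable) pair with its value.
long_df = df.melt("ID")

# Count each (variable, value) pair, then pivot the variable level into columns.
wide = (
    long_df.value_counts(["variable", "value"])
    .unstack("variable")
    .rename_axis(index=None, columns=None)  # drop the 'value'/'variable' labels
)

print(wide)
```

Compared with the `apply` version, this keeps everything in one method chain and goes through a long-format intermediate, which is convenient if you also need the counts in long form.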