
Count number of values of multiple columns per each column and additional category column

I have a dataframe containing multiple columns of 0s and 1s (A, B), as well as one column (C) indicating the category of each row. Now I would like to count the 0 and 1 values per column and category.

import pandas as pd

test_data = {'A': [0,0,1,1,1,0],
             'B': [0,1,0,1,0,1],
             'C': ['a','a','b','b', 'c', 'c']}

df = pd.DataFrame(test_data)

I tried to figure out how to rearrange the dataframe using pd.pivot_table, but I did not manage to get the right transformation. I tried the following:

table = pd.pivot_table(df, columns = ['C'], index=['A'], aggfunc='count')
print('0', table)

which results in the following output:

0      B          
C    a    b    c
A               
0  2.0  NaN  1.0
1  NaN  2.0  1.0

My goal is to get the following output:

0      B           |   A            # columns A and B
C    a a  b b  c c | a a  b b  c c  # row category based on C
     0 1  0 1  0 1 | 0 1  0 1  0 1  # 0 and 1 values of the columns A and B

     1 1  1 1  1 1 | 2 0  0 2  1 1  # counts

[Edit] or the following output:

0      B     |   A      # columns A and B
C    a  b  c | a  b  c  # row category based on C
  0| 1  1  1 | 2  0  1
  1| 1  1  1 | 0  2  1

Could anyone help me with this? Thank you!

I think you need DataFrame.melt first.
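To see why melt helps, here is a minimal sketch of the intermediate long-format frame, using the test data from the question. The name long_df is only illustrative and does not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 1, 1, 1, 0],
                   'B': [0, 1, 0, 1, 0, 1],
                   'C': ['a', 'a', 'b', 'b', 'c', 'c']})

# Keep C as the identifier column; A and B are stacked into two new columns:
# 'variable' holds the original column name, 'value' holds the 0 or 1.
long_df = df.melt('C')
print(long_df)
```

Each row of long_df is one (C, variable, value) triple, so counting 0s and 1s per column and category becomes a plain group-and-count.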

First case: this is the second case reshaped into a single row (stack(), then to_frame().T):

new_df = (df.melt('C')
            .groupby(['variable','C'])['value']
            .value_counts().unstack(fill_value=0)
            .stack()
            .to_frame().T
            .rename_axis(index=None,columns=[0,'C',None])
            .sort_index(axis=1, ascending=[False,True,True]))
print(new_df)
0  B                 A               
C  a     b     c     a     b     c   
   0  1  0  1  0  1  0  1  0  1  0  1
0  1  1  1  1  1  1  2  0  0  2  1  1

Second case: the same idea without the final stack(), so the 0 and 1 values stay in the index:

new_df = (df.melt('C').groupby(['C','variable'])['value']
            .value_counts().unstack(['variable','C'],fill_value=0)
            .sort_index(axis=1, ascending=[False, True])
            .rename_axis(columns=[0,'C'],index=None))
print(new_df)

or

new_df = (df.melt('C')
            .pivot_table(columns=['variable','C'],
                         index='value',
                         aggfunc='size',
                         fill_value=0)
            .rename_axis(index=None, columns=[0,'C'])
            .sort_index(axis=1, ascending=[False, True]))

Output

0  B        A      
C  a  b  c  a  b  c
0  1  1  1  2  0  1
1  1  1  1  0  2  1
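As a cross-check of the second-case counts, the sketch below builds the same table a different way, with one pd.crosstab per column concatenated side by side. This is an alternative for comparison, not the approach from the answer above, and the name check is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 1, 1, 1, 0],
                   'B': [0, 1, 0, 1, 0, 1],
                   'C': ['a', 'a', 'b', 'b', 'c', 'c']})

# One crosstab per value column: rows are the values 0/1, columns are the categories in C.
# The dict keys (B, A) become the outer column level after concatenation.
check = pd.concat({col: pd.crosstab(df[col], df['C']) for col in ['B', 'A']}, axis=1)
print(check)
```

Here column ('A', 'a') should hold 2 and 0, because both rows of category a have A equal to 0.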
