
Transpose and Groupby pandas Columns

I'm looking to transpose pandas columns and apply a groupby.

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID' : ['ID1', 'ID2', 'ID3', 'ID4'], 
                   'Code1' : ['X60', np.nan, 'X66', np.nan], 
                   'Code2' : [np.nan, 'X64', 'X78', np.nan],
                   'Code3' : [np.nan, 'X66', 'X81', 'X59'],
                   'Code4' : [np.nan, np.nan, 'X38', 'X60']})
df

    ID      Code1   Code2   Code3   Code4
0   ID1     X60     NaN     NaN     NaN
1   ID2     NaN     X64     X66     NaN
2   ID3     X66     X78     X81     X38
3   ID4     NaN     NaN     X59     X60

How can I achieve this expected output?

Code NB ID
X38  1  ID3
X59  1  ID4
X60  2  ID1, ID4 
X64  1  ID2
X66  2  ID2, ID3
X78  1  ID3
X81  1  ID3

Use DataFrame.set_index to move ID into the index, then DataFrame.stack to reshape the Code columns into a single Series. Count the values with Series.value_counts (missing values are excluded from the counts), and finally use Series.sort_index, Series.rename_axis and Series.reset_index to get a 2-column DataFrame:

out = df.set_index('ID').stack().value_counts().sort_index().rename_axis('Code').reset_index(name='NB')
print (out)
  Code  NB
0  X38   1
1  X59   1
2  X60   2
3  X64   1
4  X66   2
5  X78   1
6  X81   1
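To see why the counts come out this way, it can help to look at the intermediate stacked Series. The following is a minimal sketch, assuming ID is moved into the index first so that only the Code columns are stacked; the names `stacked` and `counts` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3', 'ID4'],
                   'Code1': ['X60', np.nan, 'X66', np.nan],
                   'Code2': [np.nan, 'X64', 'X78', np.nan],
                   'Code3': [np.nan, 'X66', 'X81', 'X59'],
                   'Code4': [np.nan, np.nan, 'X38', 'X60']})

# Move ID into the index so only the Code* columns are stacked.
# dropna() is called explicitly so the result does not depend on
# how the installed pandas version's stack() treats NaN.
stacked = df.set_index('ID').stack().dropna()
print(stacked)

# One entry per (ID, column) pair remains, so counting values
# gives the number of occurrences of each code.
counts = stacked.value_counts().sort_index()
print(counts)
```

Each remaining entry is indexed by (ID, original column), which is why the ID information is still available at this point even though value_counts discards it.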

EDIT: Use DataFrame.melt, then aggregate with size and join in GroupBy.agg:

out = (df.melt('ID', value_name='Code')
         .groupby('Code', as_index=False)
         .agg(NB=('Code','size'), ID=('ID',', '.join)))
print (out)
  Code  NB        ID
0  X38   1       ID3
1  X59   1       ID4
2  X60   2  ID1, ID4
3  X64   1       ID2
4  X66   2  ID3, ID2
5  X78   1       ID3
6  X81   1       ID3
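Note that X66 comes out as "ID3, ID2" here, while the expected output in the question lists "ID2, ID3". The join follows the row order after melting (Code1 rows first, then Code2, and so on). A possible variation, sketched under the assumption that the IDs should be listed in sorted order, is to sort the long-form rows before grouping; the names `long_df` and `out` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3', 'ID4'],
                   'Code1': ['X60', np.nan, 'X66', np.nan],
                   'Code2': [np.nan, 'X64', 'X78', np.nan],
                   'Code3': [np.nan, 'X66', 'X81', 'X59'],
                   'Code4': [np.nan, np.nan, 'X38', 'X60']})

# Long form: one row per (ID, Code) pair, with missing codes removed.
long_df = df.melt('ID', value_name='Code').dropna(subset=['Code'])

# Sort by Code and ID first; groupby keeps the row order within each
# group, so the joined IDs come out sorted.
out = (long_df.sort_values(['Code', 'ID'])
              .groupby('Code', as_index=False)
              .agg(NB=('ID', 'size'), ID=('ID', ', '.join)))
print(out)
```

The sort is a plain string sort, so an ID such as ID10 would be placed before ID2; a custom key would be needed if numeric ordering matters.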

One option is to transform the data into long form with pivot_longer from pyjanitor before grouping:

# pip install pyjanitor
import pandas as pd
import janitor
(df
.pivot_longer(
    index = 'ID', 
    names_to = 'Code', 
    names_pattern = ['Code'])
.groupby('Code')
.agg(NB = ('ID', 'size'), ID = ('ID', ','.join))
)
      NB       ID
Code
X38    1      ID3
X59    1      ID4
X60    2  ID1,ID4
X64    1      ID2
X66    2  ID3,ID2
X78    1      ID3
X81    1      ID3

