[英]Transpose and Groupby pandas Columns
I'm looking to transpose pandas columns and apply a Groupby我正在寻找转置熊猫列并应用 Groupby
df = pd.DataFrame({'ID' : ['ID1', 'ID2', 'ID3', 'ID4'],
'Code1' : ['X60', np.nan, 'X66', np.nan],
'Code2' : [np.nan, 'X64', 'X78', np.nan],
'Code3' : [np.nan, 'X66', 'X81', 'X59'],
'Code4' : [np.nan, np.nan, 'X38', 'X60']})
df
ID Code1 Code2 Code3 Code4
0 ID1 X60 NaN NaN NaN
1 ID2 NaN X64 X66 NaN
2 ID3 X66 X78 X81 X38
3 ID4 NaN NaN X59 X60
How can I achieve this expected output ?我怎样才能达到这个预期的输出?
Code NB ID
X38 1 ID3
X59 1 ID4
X60 2 ID1, ID4
X64 1 ID2
X66 2 ID2, ID3
X78 1 ID3
X81 1 ID3
Use DataFrame.stack
for reshape with remove missing values and count values by Series.value_counts
, last Series.sort_index
with Series.rename_axis
and Series.reset_index
for 2 columns DataFrame:使用
DataFrame.stack
进行重塑,通过Series.value_counts
删除缺失值和计数值,最后Series.sort_index
使用Series.rename_axis
和Series.reset_index
用于 2 列 DataFrame:
df = df.stack().value_counts().sort_index().rename_axis('Code').reset_index(name='NB')
print (df)
Code NB
0 X38 1
1 X59 1
2 X60 2
3 X64 1
4 X66 2
5 X78 1
6 X81 1
EDIT: Use DataFrame.melt
and then aggregate by size
and join
in GroupBy.agg
:编辑:使用
DataFrame.melt
然后按size
聚合并join
GroupBy.agg
:
df = (df.melt('ID', value_name='Code')
.groupby('Code', as_index=False)
.agg(NB=('Code','size'), ID=('ID',', '.join)))
print (df)
Code NB ID
0 X38 1 ID3
1 X59 1 ID4
2 X60 2 ID1, ID4
3 X64 1 ID2
4 X66 2 ID3, ID2
5 X78 1 ID3
6 X81 1 ID3
one option is to transform the data into long form, with pivot_longer from pyjanitor before grouping:一种选择是将数据转换为长格式,在分组之前使用来自pyjanitor的pivot_longer :
# pip install pyjanitor
import pandas as pd
import janitor
(df
.pivot_longer(
index = 'ID',
names_to = 'Code',
names_pattern = ['Code'])
.groupby('Code')
.agg(NB = ('ID', 'size'), ID = ('ID', ','.join))
)
NB ID
Code
X38 1 ID3
X59 1 ID4
X60 2 ID1,ID4
X64 1 ID2
X66 2 ID3,ID2
X78 1 ID3
X81 1 ID3
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.