Pandas: Collapse rows in a Multiindex dataframe
Here is my df:
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})
print(df)
   A  B  C         D       E      F  measure_m
0  1  2  3  Cancer A  Ecog 9  val 6        100
1  1  2  3  Cancer B  Ecog 1  val 1        200
2  1  2  3  Cancer A  Ecog 0  val 0        500
3  2  3  4  Cancer B  Ecog 1  val 1        300
When I pivot this df without passing an index, I get this:
In [1280]: df.pivot(index=None, columns = ['A', 'B', 'C', 'D', 'E', 'F'])
Out[1280]:
  measure_m
A         1                          2
B         2                          3
C         3                          4
D  Cancer A Cancer B Cancer A Cancer B
E    Ecog 9   Ecog 1   Ecog 0   Ecog 1
F     val 6    val 1    val 0    val 1
0     100.0      NaN      NaN      NaN
1       NaN    200.0      NaN      NaN
2       NaN      NaN    500.0      NaN
3       NaN      NaN      NaN    300.0
Instead of 4 rows, I want 1 row that contains all the values of the measure_m column, like this:
  measure_m
A         1                          2
B         2                          3
C         3                          4
D  Cancer A Cancer B Cancer A Cancer B
E    Ecog 9   Ecog 1   Ecog 0   Ecog 1
F     val 6    val 1    val 0    val 1
0     100.0    200.0    500.0    300.0
How can I solve this?
Do you mean:
df.set_index(list(df.columns[:-1])).T
Output:
A                1                          2
B                2                          3
C                3                          4
D         Cancer A Cancer B Cancer A Cancer B
E           Ecog 9   Ecog 1   Ecog 0   Ecog 1
F            val 6    val 1    val 0    val 1
measure_m      100      200      500      300
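To see why this collapses the frame, here is a minimal sketch that rebuilds the df from the question and inspects the result. set_index moves the six key columns into a row MultiIndex, which leaves measure_m as the only column, and .T then turns that single column into a single row. The names key_cols and wide are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

# Every column except the last one becomes a level of the row index.
key_cols = list(df.columns[:-1])   # illustrative name
wide = df.set_index(key_cols).T    # illustrative name

print(wide.shape)                  # one row, one column per original row
print(list(wide.columns.names))    # the six key columns, now column levels
print(wide.index.tolist())         # the single row is labelled 'measure_m'
```

Note that the single row is labelled measure_m here rather than 0, and no NaN is introduced, so the values stay integers.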
Update: with a few modifications to match your output:
cols = ['A', 'B', 'C', 'D', 'E', 'F']
(df.set_index(cols)
   [['measure_m']]  # only need this if you have more columns
   .unstack(level=cols)
   .to_frame().T
)
Output:
  measure_m
A         1                          2
B         2                          3
C         3                          4
D  Cancer A Cancer B Cancer A Cancer B
E    Ecog 9   Ecog 1   Ecog 0   Ecog 1
F     val 6    val 1    val 0    val 1
0       100      200      500      300
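As a cross-check, the same one-row layout can also be assembled without unstack. This is a different technique from the one in the answer, not a restatement of it: a sketch that builds the column MultiIndex directly with pd.MultiIndex.from_arrays and wraps the measure_m values in a single-row frame. The names columns and one_row are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

cols = ['A', 'B', 'C', 'D', 'E', 'F']

# Top level is the constant label 'measure_m'; below it, one level per key column.
columns = pd.MultiIndex.from_arrays(
    [['measure_m'] * len(df)] + [df[c] for c in cols],
    names=[None] + cols,
)

# A list holding one array yields a frame with exactly one row (index 0).
one_row = pd.DataFrame([df['measure_m'].to_numpy()], columns=columns)
print(one_row)
```

This keeps the original row order and does not depend on how unstack orders the resulting columns.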