Pandas/Python: How to switch Index/Columns in dataframe while retaining df structure?
I have a Pandas dataframe that looks like this:
       X1    X1    X1    X2    X2    X2
ABC  12.4  34.3  25.4  29.3  53.2  38.9
DEF  22.3  28.6  32.8  24.6  29.4  25.3
The left column is the index, and the top values are column labels. I am trying to swap the column names and index so that it looks like this:
     ABC   ABC   ABC   DEF   DEF   DEF
X1  12.4  34.3  25.4  22.3  28.6  32.8
X2  29.3  53.2  38.9  24.6  29.4  25.3
I can get the axes switched using stack and unstack if I add a numbered index, but the replicates are then listed vertically instead of horizontally. I can't figure out how to keep the individual replicates side by side, which is necessary for what I am trying to do with the table. The replicates need to stay separate; I do not want the average, sum, etc.
Any help/suggestions would be greatly appreciated.
Thanks!
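For reference, a plain transpose reproduces the vertical layout described above. This is only an illustrative sketch: the frame is rebuilt from the sample table, and `df.T` is an assumption about a comparable result, not necessarily the exact stack/unstack steps that were tried.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame(
    [[12.4, 34.3, 25.4, 29.3, 53.2, 38.9],
     [22.3, 28.6, 32.8, 24.6, 29.4, 25.3]],
    index=["ABC", "DEF"],
    columns=["X1", "X1", "X1", "X2", "X2", "X2"],
)

# A plain transpose swaps the axes, but each replicate becomes its own row,
# so the three X1 values for ABC end up stacked vertically.
transposed = df.T
print(transposed)
```

The result has six rows (X1, X1, X1, X2, X2, X2) and two columns (ABC, DEF), which is the layout the question is trying to avoid.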
Edit:
This code gives a dataframe that is similar in structure to my actual data but with fewer columns:
import numpy as np
import pandas as pd

names = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
nan = np.nan  # a real missing value instead of the string "NaN"
df = pd.DataFrame([(7.345, nan, nan, 239.947, 295.893, 349.834), (13.872, nan, nan, 20.485, 14.852, 29.598), (764.298, nan, nan, 492.854, 432.943, 539.950), (0.00385, nan, nan, 0.184, 0.384, 0.285), (285.836, nan, nan, 495.284, 395.486, 368.952), (7.385, nan, nan, 5.293, 4.295, 4.692), (21.693, nan, nan, 25.843, 15.843, 15.386), (8.583, nan, nan, 4.397, 6.295, 6.39)], index=names, columns=["S1", "S1", "S1", "482.1", "482.1", "482.1"])
Giving this dataframe:
           S1   S1   S1    482.1    482.1    482.1
G1    7.34500  NaN  NaN  239.947  295.893  349.834
G2   13.87200  NaN  NaN   20.485   14.852   29.598
G3  764.29800  NaN  NaN  492.854  432.943  539.950
G4    0.00385  NaN  NaN    0.184    0.384    0.285
G5  285.83600  NaN  NaN  495.284  395.486  368.952
G6    7.38500  NaN  NaN    5.293    4.295    4.692
G7   21.69300  NaN  NaN   25.843   15.843   15.386
G8    8.58300  NaN  NaN    4.397    6.295    6.390
Running:
df2 = df.copy()
m = dict(zip(df2.index.unique(), df2.columns.unique()))
df2.index = df2.index.map(m.get)
df2.columns = df2.columns.map({v : k for k, v in m.items()}.get)
gives:
              G1   G1   G1       G2       G2       G2
S1       7.34500  NaN  NaN  239.947  295.893  349.834
482.1   13.87200  NaN  NaN   20.485   14.852   29.598
NaN    764.29800  NaN  NaN  492.854  432.943  539.950
NaN      0.00385  NaN  NaN    0.184    0.384    0.285
NaN    285.83600  NaN  NaN  495.284  395.486  368.952
NaN      7.38500  NaN  NaN    5.293    4.295    4.692
NaN     21.69300  NaN  NaN   25.843   15.843   15.386
NaN      8.58300  NaN  NaN    4.397    6.295    6.390
The column and index labels have moved, but the data associated with them have not, and several columns are missing.
Running:
df2 = df.copy()
m = dict(zip(df2.index.unique(), df2.columns.unique()))
df2 = df2.rename(index=m, columns={v : k for k, v in m.items()})
gives:
              G1   G1   G1       G2       G2       G2
S1       7.34500  NaN  NaN  239.947  295.893  349.834
482.1   13.87200  NaN  NaN   20.485   14.852   29.598
G3     764.29800  NaN  NaN  492.854  432.943  539.950
G4       0.00385  NaN  NaN    0.184    0.384    0.285
G5     285.83600  NaN  NaN  495.284  395.486  368.952
G6       7.38500  NaN  NaN    5.293    4.295    4.692
G7      21.69300  NaN  NaN   25.843   15.843   15.386
G8       8.58300  NaN  NaN    4.397    6.295    6.390
This is also wrong, for similar reasons.
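One way to see why both attempts only relabel the axes is to inspect the mapping `m` itself. The sketch below rebuilds just the labels of the example frame; the zero values are placeholders (an assumption made to keep it short), since only the labels matter here.

```python
import numpy as np
import pandas as pd

index = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
columns = ["S1", "S1", "S1", "482.1", "482.1", "482.1"]
# Placeholder values: only the axis labels are of interest in this check.
df = pd.DataFrame(np.zeros((8, 6)), index=index, columns=columns)

# zip() stops at the shorter input, so only the first two index labels
# are paired with the two unique column labels.
m = dict(zip(df.index.unique(), df.columns.unique()))
print(m)

# rename() changes labels only; the shape and the cell values stay put.
renamed = df.rename(index=m, columns={v: k for k, v in m.items()})
print(renamed.index.tolist())
print(renamed.columns.tolist())
print(renamed.shape)
```

Because `m` holds only `G1 -> S1` and `G2 -> 482.1`, labels G3 through G8 have no counterpart, and the frame stays 8 x 6 instead of becoming 2 x 24. Swapping the axes needs a reshape of the data, not a relabeling.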
New_df=df.T.groupby(level=0).agg(lambda x : x.values.tolist()).stack().apply(pd.Series).unstack().sort_index(level=1,axis=1)
New_df.columns=New_df.columns.droplevel(level=0)
New_df
Out[229]:
     ABC   ABC   ABC   DEF   DEF   DEF
X1  12.4  34.3  25.4  22.3  28.6  32.8
X2  29.3  53.2  38.9  24.6  29.4  25.3
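An alternative that avoids the stack/unstack chain could look like the sketch below. It is not the method used above: `swap_axes_keep_replicates` is a hypothetical helper name, and it assumes every column label has the same number of replicates. For each unique column label it takes that label's block of columns and flattens it row by row, so the replicates of each original index label stay side by side.

```python
import numpy as np
import pandas as pd


def swap_axes_keep_replicates(df: pd.DataFrame) -> pd.DataFrame:
    """Swap index and column labels, keeping replicates as adjacent columns.

    Hypothetical helper; assumes each column label occurs equally often.
    """
    labels = df.columns.unique()
    counts = df.columns.value_counts()
    if counts.nunique() != 1:
        raise ValueError("every column label needs the same number of replicates")
    n_rep = int(counts.iloc[0])

    # One output row per unique column label: take that label's block
    # (rows x replicates) and flatten it in row-major order.
    data = [df.loc[:, df.columns == label].to_numpy().ravel() for label in labels]

    # Each original index label is repeated once per replicate.
    new_columns = np.repeat(df.index.to_numpy(), n_rep)
    return pd.DataFrame(data, index=labels, columns=new_columns)


df = pd.DataFrame(
    [[12.4, 34.3, 25.4, 29.3, 53.2, 38.9],
     [22.3, 28.6, 32.8, 24.6, 29.4, 25.3]],
    index=["ABC", "DEF"],
    columns=["X1", "X1", "X1", "X2", "X2", "X2"],
)
result = swap_axes_keep_replicates(df)
print(result)
```

On the sample frame this gives rows X1 and X2 with columns ABC, ABC, ABC, DEF, DEF, DEF. Missing values are carried through unchanged, since nothing is aggregated.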