Moving index values into column names in a Pandas Data Frame

Question

I'm trying to reshape a multi-indexed data frame so that the values from the second level of the index are incorporated into the column names of the new data frame. In the data frame below, I want to move A and B from "source" into the columns so that I have s1_A, s1_B, s2_A, ..., s3_B.

I've tried creating the structure of the new data frame explicitly and populating it with a nested for loop to reassign the values, but it is excruciatingly slow. I've tried a number of functions from the pandas API, but without much luck. Any help would be much appreciated.

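import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product([[1, 2, 3], ['A', 'B']], names=["sample", "source"])
# Note: np.ndarray(shape=...) allocates uninitialized memory, so the actual values are arbitrary;
# the 1.2 / 3.4 / 5.6 shown below are illustrative only.
df = pd.DataFrame(index=midx, columns=['s1', 's2', 's3'], data=np.ndarray(shape=(6, 3)))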

>>> df
                s1   s2   s3
sample source               
1      A       1.2  3.4  5.6
       B       1.2  3.4  5.6
2      A       1.2  3.4  5.6
       B       1.2  3.4  5.6
3      A       1.2  3.4  5.6
       B       1.2  3.4  5.6


# Want to build a new data frame that looks like this:
>>> df_new
       s1_A   s1_B   s2_A   s2_B   s3_A   s3_B
sample                
1      1.2    1.2    3.4    3.4    5.6    5.6
2      1.2    1.2    3.4    3.4    5.6    5.6
3      1.2    1.2    3.4    3.4    5.6    5.6

Here's how I'm currently doing it. It's extremely slow, and I know there must be a more idiomatic way to do this with pandas, but I'm still new to its API:

substances = df.columns.values
sources = ['A','B']
subst_and_src = sorted([ subst + "_" + src for src in sources for subst in substances ])

df_new = pd.DataFrame(index=df.index.unique(0), columns=subst_and_src)

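# Runs forever: one scalar assignment per cell
for (sample, source) in df.index:
    for subst in df.columns:
        # .loc[row, column] sets a single cell; df_new[a, b] would create a column keyed by the tuple
        df_new.loc[sample, subst + "_" + source] = df.loc[(sample, source), subst]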

Answer 1

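# Pivot the second index level ("source") into the columns
df = df.unstack(level=1)
# Flatten each (column, source) tuple into a single name such as "s1_A"
df.columns = ['_'.join(col).strip() for col in df.columns.values]
print(df)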

Prints:

                 s1_A           s1_B  s2_A  s2_B           s3_A           s3_B
sample                                                                        
1       4.665045e-310  6.904071e-310   0.0   0.0  6.903913e-310  2.121996e-314
2       6.904071e-310   0.000000e+00   0.0   0.0  3.458460e-323   0.000000e+00
3        0.000000e+00   0.000000e+00   0.0   0.0   0.000000e+00   0.000000e+00
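The values printed above are arbitrary because the question's frame was built from uninitialized memory. For a reproducible run of the same two steps, here is a minimal sketch. It assumes the frame is filled with the illustrative values 1.2, 3.4, 5.6 from the question (using np.tile, which is my substitution, not the asker's code); the names `values` and `flat` are hypothetical.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame with deterministic values (assumption: every
# row holds 1.2, 3.4, 5.6, as in the question's illustrative printout).
midx = pd.MultiIndex.from_product([[1, 2, 3], ["A", "B"]], names=["sample", "source"])
values = np.tile([1.2, 3.4, 5.6], (6, 1))
df = pd.DataFrame(values, index=midx, columns=["s1", "s2", "s3"])

# Move the "source" index level into the columns; the columns become a
# MultiIndex of tuples such as ("s1", "A").
flat = df.unstack(level="source")

# Join each tuple into a single flat column name.
flat.columns = ["_".join(col) for col in flat.columns]
print(flat)
```

Selecting the level by name ("source") rather than by position does the same thing as level=1 here, and it keeps working if the index levels are later reordered.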

Answer 2

Unstack into a new data frame, then collapse the multi-level column index of the resulting frame using a format string:

df1 = df.unstack()
df1.columns = df1.columns.map('{0[0]}_{0[1]}'.format)
print(df1)

        s1_A  s1_B  s2_A  s2_B  s3_A  s3_B
sample                                    
1        1.2   1.2   3.4   3.4   5.6   5.6
2        1.2   1.2   3.4   3.4   5.6   5.6
3        1.2   1.2   3.4   3.4   5.6   5.6

Answer 1
0 Accepted 2020-10-11 00:07:11

Answer 2
0 2020-10-11 00:19:55
