Pandas: merging rows / DataFrame transformation

Question

I have this example DataFrame:

e   col1    col2    col3
1   238.4   238.7   238.2
2   238.45  238.75  238.2
3   238.2   238.25  237.95
4   238.1   238.15  238.05
5   238.1   238.1   238
6   229.1   229.05  229.05
7   229.35  229.35  229.1
8   229.1   229.15  229
9   229.05  229.05  229

How can I convert it to this:

                1                       2                       3
    col1    col2    col3    col1    col2    col3    col1    col2    col3
1   238.4   238.7   238.2   238.45  238.75  238.2   238.2   238.25  237.95
2   238.1   238.15  238.05  238.1   238.1   238     229.1   229.05  229.05
3   229.35  229.35  229.1   229.1   229.15  229     229.05  229.05  229

I am thinking maybe I should pivot by counting with len or by assigning an index that is a multiple of 3, but I am really not sure what the most efficient way would be.

Answer 1

Create a grouping series g that assigns every third row (a step size of 3) to the same group, and use np.unique to get the unique grouping keys k. Then use DataFrame.groupby to group the DataFrame on g, and use set_index to set the index of every grouped frame to k. Finally, use pd.concat to concatenate the grouped frames along axis=1, passing the optional parameter keys=k to create MultiIndex columns:

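g = df.pop('e').sub(1) % 3 + 1  # group label per row: 1, 2, 3, 1, 2, 3, ...
k = np.unique(g)                # unique group keys; g must be assigned before this line
df1 = pd.concat([grp.set_index(k) for _, grp in df.groupby(g)], keys=k, axis=1)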

Details:

print(g.tolist())
[1, 2, 3, 1, 2, 3, 1, 2, 3]

print(k)
[1 2 3]

Result:

print(df1)

        1                       2                      3                
     col1    col2    col3    col1    col2   col3    col1    col2    col3
1  238.40  238.70  238.20  238.45  238.75  238.2  238.20  238.25  237.95
2  238.10  238.15  238.05  238.10  238.10  238.0  229.10  229.05  229.05
3  229.35  229.35  229.10  229.10  229.15  229.0  229.05  229.05  229.00
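
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data and applies the same groupby/concat steps. The variable name step is my own addition, not part of the answer, and the final NumPy reshape is a different technique that I use only as a sanity check on the values. It relies on the row count being an exact multiple of step.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "e": range(1, 10),
    "col1": [238.4, 238.45, 238.2, 238.1, 238.1, 229.1, 229.35, 229.1, 229.05],
    "col2": [238.7, 238.75, 238.25, 238.15, 238.1, 229.05, 229.35, 229.15, 229.05],
    "col3": [238.2, 238.2, 237.95, 238.05, 238.0, 229.05, 229.1, 229.0, 229.0],
})

step = 3  # hypothetical name: how many source rows make up one output row

g = df.pop("e").sub(1) % step + 1  # 1, 2, 3, 1, 2, 3, ...
k = np.unique(g)                   # unique group keys
df1 = pd.concat([grp.set_index(k) for _, grp in df.groupby(g)], keys=k, axis=1)
print(df1)

# Sanity check (not the answer's method): a row-major reshape of the raw
# values puts each run of `step` consecutive rows side by side.
expected = df.to_numpy().reshape(-1, step * df.shape[1])
assert np.allclose(df1.to_numpy(), expected)
```

Note that set_index(k) only works here because every group has exactly as many rows as there are keys (3 and 3 in this data).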

Answer 2

The data is laid out in steps of three, so we iterate through it in steps of 3 and finally concatenate along the columns axis:

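import pandas as pd

# assumes column 'e' is the index or has already been dropped
step = 3  # every third row goes into the same column block
pd.concat([df.iloc[n::step]
             .reset_index(drop=True)
             # tag the slice with its block number (1, 2, 3) as an extra index level
             .set_index(pd.Index([n + 1] * len(df.iloc[n::step])), append=True)
             .unstack()                # move the block number into the columns
             .swaplevel(1, 0, axis=1)  # block number on top, column name below
           for n in range(step)],
          axis=1)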
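
To make the slicing visible, the sketch below prints which rows each df.iloc[n::3] picks out. It then builds the wide frame by passing keys to pd.concat directly, which is a simplification of my own rather than the exact set_index/unstack/swaplevel chain above. The sample frame uses only two of the question's columns and assumes 'e' is the index.

```python
import pandas as pd

# Two columns from the question's data, with 'e' as the index (an assumption).
df = pd.DataFrame(
    {
        "col1": [238.4, 238.45, 238.2, 238.1, 238.1, 229.1, 229.35, 229.1, 229.05],
        "col2": [238.7, 238.75, 238.25, 238.15, 238.1, 229.05, 229.35, 229.15, 229.05],
    },
    index=pd.Index(range(1, 10), name="e"),
)

step = 3  # hypothetical name for the stride

# Which original rows end up in each column block.
slices = {n + 1: df.iloc[n::step].index.tolist() for n in range(step)}
print(slices)  # {1: [1, 4, 7], 2: [2, 5, 8], 3: [3, 6, 9]}

# Simplified variant: let pd.concat add the top column level via `keys`.
wide = pd.concat(
    [df.iloc[n::step].reset_index(drop=True) for n in range(step)],
    keys=[n + 1 for n in range(step)],
    axis=1,
)
print(wide)
```

The resulting index runs from 0 to 2. Add 1 to wide.index if the output rows should be numbered from 1 as in the question.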

Answer 3

Using pandas methods and a step-by-step approach:

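df['id1'] = (df.e+2) % 3 + 1  # target column block: 1, 2, 3, 1, 2, 3, ...
# keep a 1 only where a new output row starts (id1 == 1), NaN elsewhere
df['id2'] = df['id1'].where(df['id1'] == 1)
df['id2'] = df['id2'].cumsum().ffill()  # output row number: 1, 1, 1, 2, 2, 2, ...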
df2 = df.drop(columns='e').melt(id_vars = ['id1','id2'])

df3 = pd.pivot_table(df2, index = 'id2', columns = ['id1','variable'], values = 'value').reset_index(drop=True)
df3.index += 1
df3.columns.names = ['','']

Result:

        1                       2                      3                
     col1    col2    col3    col1    col2   col3    col1    col2    col3
1  238.40  238.70  238.20  238.45  238.75  238.2  238.20  238.25  237.95
2  238.10  238.15  238.05  238.10  238.10  238.0  229.10  229.05  229.05
3  229.35  229.35  229.10  229.10  229.15  229.0  229.05  229.05  229.00
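
The two helper columns can also be computed with plain integer arithmetic, which may be easier to follow than the cumsum/ffill step. This is a sketch under the assumption that e counts up from 1 with no gaps. The names long and wide are mine, and only two of the question's columns are used to keep it short.

```python
import pandas as pd

df = pd.DataFrame({
    "e": range(1, 10),
    "col1": [238.4, 238.45, 238.2, 238.1, 238.1, 229.1, 229.35, 229.1, 229.05],
    "col2": [238.7, 238.75, 238.25, 238.15, 238.1, 229.05, 229.35, 229.15, 229.05],
})

id1 = (df["e"] - 1) % 3 + 1   # column block; equivalent to (e + 2) % 3 + 1
id2 = (df["e"] - 1) // 3 + 1  # output row number
print(id1.tolist())  # [1, 2, 3, 1, 2, 3, 1, 2, 3]
print(id2.tolist())  # [1, 1, 1, 2, 2, 2, 3, 3, 3]

# Long format: one row per (id1, id2, original column name, value).
long = df.drop(columns="e").assign(id1=id1, id2=id2).melt(id_vars=["id1", "id2"])

# Each (id2, id1, variable) combination occurs once, so pivot_table's
# default mean aggregation leaves the values unchanged.
wide = long.pivot_table(index="id2", columns=["id1", "variable"], values="value")
print(wide)
```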
