简体   繁体   English

使用 Pandas 分组、转置或什至 pivot_table

[英]Groupby, transpose, or even pivot_table with pandas

I have a DataFrame like this:我有一个像这样的数据帧:

    Model               R2       RMSE       Average_CV   Destiny
0   Ada Boost         0.5563    125.2569    0.426166        REC
1   Bagging Regressor 0.8363    76.0865     0.582675        REC
2   Random Forest     0.8378    75.7304     0.590683        REC
3   Decision Tree     0.8366    76.0053     0.460394        REC

How can I get the output below?我怎样才能得到下面的输出?

Model               Metrica         REC
Ada Boost           Average_CV      0.426166
                    R2              0.5563
                    RMSE            125.2569
Bagging Regressor   Average_CV      0.582675
                    R2              0.8363
                    RMSE            76.0865
Decision Tree       Average_CV      0.590683
                    R2              0.8366
                    RMSE            76.0053
Random Forest       Average_CV      0.460394
                    R2              0.8378
                    RMSE            75.7304 

I've been trying to groupby, transpose, and even crosstab.我一直在尝试分组、转置甚至交叉表。 But I have no idea how to get the wished output.但我不知道如何获得所需的输出。

It doesn't matter the order of the rows in the column Metrica in the output.输出中 Metrica 列中的行顺序无关紧要。

Thanks谢谢

If Destiny is always Rec, you can do a simple melt .如果命运总是 Rec,你可以做一个简单的melt Then you can set and sort the index to get the output you want.然后你可以设置和排序索引以获得你想要的输出。

df.melt(id_vars="Model", value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")


                Model     Metrica       value
0           Ada_Boost          R2    0.556300
1   Bagging_Regressor          R2    0.836300
2       Random_Forest          R2    0.837800
3       Decision_Tree          R2    0.836600
4           Ada_Boost        RMSE  125.256900
5   Bagging_Regressor        RMSE   76.086500
6       Random_Forest        RMSE   75.730400
7       Decision_Tree        RMSE   76.005300
8           Ada_Boost  Average_CV    0.426166
9   Bagging_Regressor  Average_CV    0.582675
10      Random_Forest  Average_CV    0.590683
11      Decision_Tree  Average_CV    0.460394

With setting/sorting the index:通过设置/排序索引:

(df.melt(id_vars="Model", value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")
    .set_index(["Model", "Metrica"])
    .sort_index())

                                   value
Model             Metrica               
Ada_Boost         Average_CV    0.426166
                  R2            0.556300
                  RMSE        125.256900
Bagging_Regressor Average_CV    0.582675
                  R2            0.836300
                  RMSE         76.086500
Decision_Tree     Average_CV    0.460394
                  R2            0.836600
                  RMSE         76.005300
Random_Forest     Average_CV    0.590683
                  R2            0.837800
                  RMSE         75.730400

If you Destiny has multiple values and you want 1 column for each of those values, then you'll have to get a little fancier如果您的命运有多个值,并且您希望每个值都有 1 列,那么您将不得不变得更有趣

(df.melt(id_vars="Model", value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")
 .merge(df[["Model", "Destiny"]], on="Model")
 .pivot_table(index=["Model", "Metrica"], columns="Destiny", values="value")
 .rename_axis(None, axis=1)
)

                                     REC
Model             Metrica               
Ada_Boost         Average_CV    0.426166
                  R2            0.556300
                  RMSE        125.256900
Bagging_Regressor Average_CV    0.582675
                  R2            0.836300
                  RMSE         76.086500
Decision_Tree     Average_CV    0.460394
                  R2            0.836600
                  RMSE         76.005300
Random_Forest     Average_CV    0.590683
                  R2            0.837800
                  RMSE         75.730400

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM