Groupby, transpose, or even pivot_table with pandas

I have a DataFrame like this:

    Model               R2       RMSE       Average_CV   Destiny
0   Ada Boost         0.5563    125.2569    0.426166        REC
1   Bagging Regressor 0.8363    76.0865     0.582675        REC
2   Random Forest     0.8378    75.7304     0.590683        REC
3   Decision Tree     0.8366    76.0053     0.460394        REC

How can I get the output below?

Model               Metrica         REC
Ada Boost           Average_CV      0.426166
                    R2              0.5563
                    RMSE            125.2569
Bagging Regressor   Average_CV      0.582675
                    R2              0.8363
                    RMSE            76.0865
Decision Tree       Average_CV      0.460394
                    R2              0.8366
                    RMSE            76.0053
Random Forest       Average_CV      0.590683
                    R2              0.8378
                    RMSE            75.7304 

I've tried groupby, transpose, and even crosstab, but I can't figure out how to get the desired output.

The order of the rows within the Metrica column doesn't matter.

Thanks

If Destiny is always REC, a simple melt is enough. You can then set and sort the index to get the output you want.

df.melt(id_vars="Model", value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")


                Model     Metrica       value
0           Ada_Boost          R2    0.556300
1   Bagging_Regressor          R2    0.836300
2       Random_Forest          R2    0.837800
3       Decision_Tree          R2    0.836600
4           Ada_Boost        RMSE  125.256900
5   Bagging_Regressor        RMSE   76.086500
6       Random_Forest        RMSE   75.730400
7       Decision_Tree        RMSE   76.005300
8           Ada_Boost  Average_CV    0.426166
9   Bagging_Regressor  Average_CV    0.582675
10      Random_Forest  Average_CV    0.590683
11      Decision_Tree  Average_CV    0.460394

With setting/sorting the index:

(df.melt(id_vars="Model", value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")
    .set_index(["Model", "Metrica"])
    .sort_index())

                                   value
Model             Metrica               
Ada_Boost         Average_CV    0.426166
                  R2            0.556300
                  RMSE        125.256900
Bagging_Regressor Average_CV    0.582675
                  R2            0.836300
                  RMSE         76.086500
Decision_Tree     Average_CV    0.460394
                  R2            0.836600
                  RMSE         76.005300
Random_Forest     Average_CV    0.590683
                  R2            0.837800
                  RMSE         75.730400
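
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's DataFrame by hand (values copied from the table in the question) and applies the same melt / set_index / sort_index chain. The value_name="REC" argument is an addition that is not in the snippet above; it only makes sense under the assumption that Destiny holds nothing but REC.

```python
import pandas as pd

# DataFrame rebuilt from the table in the question
df = pd.DataFrame({
    "Model": ["Ada Boost", "Bagging Regressor", "Random Forest", "Decision Tree"],
    "R2": [0.5563, 0.8363, 0.8378, 0.8366],
    "RMSE": [125.2569, 76.0865, 75.7304, 76.0053],
    "Average_CV": [0.426166, 0.582675, 0.590683, 0.460394],
    "Destiny": ["REC"] * 4,
})

out = (
    df.melt(
        id_vars="Model",
        value_vars=["R2", "RMSE", "Average_CV"],
        var_name="Metrica",
        value_name="REC",  # assumption: Destiny is always REC
    )
    .set_index(["Model", "Metrica"])  # two-level index: Model, then Metrica
    .sort_index()                     # groups the three metrics under each model
)
print(out)
```

Individual values can then be read with a tuple key, for example out.loc[("Ada Boost", "RMSE"), "REC"].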

If Destiny has multiple values and you want one column for each of them, you'll have to get a little fancier:

# Keep Destiny as an id column so every melted row carries its own Destiny value
(df.melt(id_vars=["Model", "Destiny"], value_vars=["R2", "RMSE", "Average_CV"], var_name="Metrica")
 .pivot_table(index=["Model", "Metrica"], columns="Destiny", values="value")
 .rename_axis(None, axis=1)
)

                                     REC
Model             Metrica               
Ada_Boost         Average_CV    0.426166
                  R2            0.556300
                  RMSE        125.256900
Bagging_Regressor Average_CV    0.582675
                  R2            0.836300
                  RMSE         76.086500
Decision_Tree     Average_CV    0.460394
                  R2            0.836600
                  RMSE         76.005300
Random_Forest     Average_CV    0.590683
                  R2            0.837800
                  RMSE         75.730400
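
To see what happens when Destiny really does take more than one value, here is a small sketch. The "TRN" rows and their numbers are invented purely for illustration and are not from the question. Destiny is kept as an id column in melt so that each melted row keeps its own Destiny before pivoting.

```python
import pandas as pd

# Hypothetical data: the "TRN" rows and their values are made up for illustration
df = pd.DataFrame({
    "Model": ["Ada Boost", "Random Forest", "Ada Boost", "Random Forest"],
    "R2": [0.5563, 0.8378, 0.60, 0.90],
    "RMSE": [125.2569, 75.7304, 110.0, 60.0],
    "Average_CV": [0.426166, 0.590683, 0.45, 0.62],
    "Destiny": ["REC", "REC", "TRN", "TRN"],
})

wide = (
    df.melt(id_vars=["Model", "Destiny"], var_name="Metrica")  # long format, one row per metric
    .pivot_table(index=["Model", "Metrica"], columns="Destiny", values="value")
    .rename_axis(None, axis=1)  # drop the "Destiny" label above the columns
)
print(wide)
```

Note that pivot_table aggregates with the mean by default, so if a (Model, Metrica, Destiny) combination appeared more than once the values would be averaged rather than raising an error.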


 