How do I manipulate a DataFrame with pivot_table in Python?

I have spent a lot of time on this, but I am no closer to a solution.

I have a DataFrame that prints as:

   RegionID  AreaID    Year     Jan     Feb     Mar     Apr     May     Jun
0       20.0     1.0  2020.0  1174.0  1056.0  1051.0  1107.0  1097.0  1118.0   
1       19.0     2.0  2020.0   460.0   451.0   421.0   421.0   420.0   457.0   
2       20.0     3.0  2020.0  2723.0  2594.0  2590.0  2399.0  2377.0  2331.0   
3       21.0     4.0  2020.0   863.0   859.0   813.0   785.0   757.0   765.0   
4       19.0     5.0  2020.0  4037.0  3942.0  4069.0  3844.0  3567.0  3721.0   
5       19.0     6.0  2020.0  1695.0  1577.0  1531.0  1614.0  1671.0  1693.0   
6       18.0     7.0  2020.0  1757.0  1505.0  1445.0  1514.0  1406.0  1444.0   
7       18.0     8.0  2020.0   832.0   721.0   747.0   852.0   885.0   872.0   
8       18.0     9.0  2020.0  2538.0  2000.0  2026.0  1981.0  1987.0  1949.0   
9       21.0    10.0  2020.0  1145.0  1235.0  1114.0  1161.0  1150.0  1189.0   
10      20.0    11.0  2020.0   551.0   497.0   503.0   472.0   505.0   532.0   
11      19.0    12.0  2020.0  1664.0  1526.0  1389.0  1373.0  1384.0  1404.0   
12      21.0    13.0  2020.0   381.0   351.0   299.0   286.0   297.0   319.0   
13      21.0    14.0  2020.0  1733.0  1627.0  1567.0  1561.0  1498.0  1511.0   
14      18.0    15.0  2020.0  1257.0  1257.0  1160.0  1172.0  1124.0  1113.0 

I want to reshape this data so that the month columns are combined into a single Month field, with the values in an Amount column, like below:

RegionID      AreaID    Year    Month   Amount
20.0            1.0     2020    Jan     1174
20.0            1.0     2020    Feb     1056
20.0            1.0     2020    Mar     1051

Can this be done using pandas? I have been trying with pivot_table, but I can't get it to work.

I hope I've understood your question correctly. You can use .set_index() and then .stack():

print(
    df.set_index(["RegionID", "AreaID", "Year"])
    .stack()
    .reset_index()
    .rename(columns={"level_3": "Month", 0: "Amount"})
)

Prints:

    RegionID  AreaID    Year Month  Amount
0       20.0     1.0  2020.0   Jan  1174.0
1       20.0     1.0  2020.0   Feb  1056.0
2       20.0     1.0  2020.0   Mar  1051.0
3       20.0     1.0  2020.0   Apr  1107.0
4       20.0     1.0  2020.0   May  1097.0
5       20.0     1.0  2020.0   Jun  1118.0
6       19.0     2.0  2020.0   Jan   460.0
7       19.0     2.0  2020.0   Feb   451.0
8       19.0     2.0  2020.0   Mar   421.0
9       19.0     2.0  2020.0   Apr   421.0
10      19.0     2.0  2020.0   May   420.0
11      19.0     2.0  2020.0   Jun   457.0

...

Or, using .melt() (the rows come out grouped by month rather than by area):

print(
    df.melt(
        ["RegionID", "AreaID", "Year"], var_name="Month", value_name="Amount"
    )
)
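
To get closer to the layout asked for in the question (rows grouped by area, months in calendar order, whole numbers instead of floats), the .melt() result can be sorted and cast afterwards. The following is only a minimal sketch: the two-row df is a hypothetical sample built from the first rows of the question, the names id_cols, months and long_df are made up for illustration, and the cast to int assumes there are no missing values.

```python
import pandas as pd

# Hypothetical sample: the first two rows and three months from the question.
df = pd.DataFrame(
    {
        "RegionID": [20.0, 19.0],
        "AreaID": [1.0, 2.0],
        "Year": [2020.0, 2020.0],
        "Jan": [1174.0, 460.0],
        "Feb": [1056.0, 451.0],
        "Mar": [1051.0, 421.0],
    }
)

id_cols = ["RegionID", "AreaID", "Year"]
months = ["Jan", "Feb", "Mar"]

# Wide -> long: every month column becomes a (Month, Amount) pair.
long_df = df.melt(id_vars=id_cols, var_name="Month", value_name="Amount")

# melt() emits all Jan rows, then all Feb rows, and so on. An ordered
# categorical makes the sort follow calendar order instead of alphabetical.
long_df["Month"] = pd.Categorical(long_df["Month"], categories=months, ordered=True)
long_df = long_df.sort_values(["AreaID", "Month"]).reset_index(drop=True)

# Assumption: no NaN values, so the float columns can be cast to int.
long_df = long_df.astype({"Year": int, "Amount": int})

print(long_df)
```

If the real data contains missing values, the int cast will fail; in that case the cast can be skipped, or the missing rows dropped first.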
