[英]Adding level 2 index as a sum of other indexes with a condition
我有一个df
:
df = pd.DataFrame.from_dict({('group', ''): {0: 'A',
1: 'A',
2: 'A',
3: 'A',
4: 'A',
5: 'A',
6: 'A',
7: 'A',
8: 'A',
9: 'B',
10: 'B',
11: 'B',
12: 'B',
13: 'B',
14: 'B',
15: 'B',
16: 'B',
17: 'B',
18: 'all',
19: 'all'},
('category', ''): {0: 'Amazon',
1: 'Apple',
2: 'Facebook',
3: 'Google',
4: 'Netflix',
5: 'Tesla',
6: 'Total',
7: 'Uber',
8: 'total',
9: 'Amazon',
10: 'Apple',
11: 'Facebook',
12: 'Google',
13: 'Netflix',
14: 'Tesla',
15: 'Total',
16: 'Uber',
17: 'total',
18: 'Total',
19: 'total'},
(pd.Timestamp('2020-06-29 00:00:00'), 'last_sales'): {0: 195.0,
1: 61.0,
2: 106.0,
3: 61.0,
4: 37.0,
5: 13.0,
6: 954.0,
7: 4.0,
8: 477.0,
9: 50.0,
10: 50.0,
11: 75.0,
12: 43.0,
13: 17.0,
14: 14.0,
15: 504.0,
16: 3.0,
17: 252.0,
18: 2916.0,
19: 2916.0},
(pd.Timestamp('2020-06-29 00:00:00'), 'sales'): {0: 1268.85,
1: 18274.385000000002,
2: 19722.65,
3: 55547.255,
4: 15323.800000000001,
5: 1688.6749999999997,
6: 227463.23,
7: 1906.0,
8: 113731.615,
9: 3219.6499999999996,
10: 15852.060000000001,
11: 17743.7,
12: 37795.15,
13: 5918.5,
14: 1708.75,
15: 166349.64,
16: 937.01,
17: 83174.82,
18: 787625.7400000001,
19: 787625.7400000001},
(pd.Timestamp('2020-06-29 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2020-07-06 00:00:00'), 'last_sales'): {0: 26.0,
1: 39.0,
2: 79.0,
3: 49.0,
4: 10.0,
5: 10.0,
6: 436.0,
7: 5.0,
8: 218.0,
9: 89.0,
10: 34.0,
11: 133.0,
12: 66.0,
13: 21.0,
14: 20.0,
15: 732.0,
16: 3.0,
17: 366.0,
18: 2336.0,
19: 2336.0},
(pd.Timestamp('2020-07-06 00:00:00'), 'sales'): {0: 3978.15,
1: 12138.96,
2: 19084.175,
3: 40033.46000000001,
4: 4280.15,
5: 1495.1,
6: 165548.29,
7: 1764.15,
8: 82774.145,
9: 8314.92,
10: 12776.649999999996,
11: 28048.075,
12: 55104.21000000002,
13: 6962.844999999999,
14: 3053.2000000000003,
15: 231049.11000000002,
16: 1264.655,
17: 115524.55500000001,
18: 793194.8000000002,
19: 793194.8000000002},
(pd.Timestamp('2020-07-06 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-06-28 00:00:00'), 'last_sales'): {0: 96.0,
1: 56.0,
2: 106.0,
3: 44.0,
4: 34.0,
5: 13.0,
6: 716.0,
7: 9.0,
8: 358.0,
9: 101.0,
10: 22.0,
11: 120.0,
12: 40.0,
13: 13.0,
14: 8.0,
15: 610.0,
16: 1.0,
17: 305.0,
18: 2652.0,
19: 2652.0},
(pd.Timestamp('2021-06-28 00:00:00'), 'sales'): {0: 5194.95,
1: 19102.219999999994,
2: 22796.420000000002,
3: 30853.115,
4: 11461.25,
5: 992.6,
6: 188143.41,
7: 3671.15,
8: 94071.705,
9: 6022.299999999998,
10: 7373.6,
11: 33514.0,
12: 35943.45,
13: 4749.000000000001,
14: 902.01,
15: 177707.32,
16: 349.3,
17: 88853.66,
18: 731701.46,
19: 731701.46},
(pd.Timestamp('2021-06-28 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-07-07 00:00:00'), 'last_sales'): {0: 45.0,
1: 47.0,
2: 87.0,
3: 45.0,
4: 13.0,
5: 8.0,
6: 494.0,
7: 2.0,
8: 247.0,
9: 81.0,
10: 36.0,
11: 143.0,
12: 56.0,
13: 9.0,
14: 9.0,
15: 670.0,
16: 1.0,
17: 335.0,
18: 2328.0,
19: 2328.0},
(pd.Timestamp('2021-07-07 00:00:00'), 'sales'): {0: 7556.414999999998,
1: 14985.05,
2: 16790.899999999998,
3: 36202.729999999996,
4: 4024.97,
5: 1034.45,
6: 163960.32999999996,
7: 1385.65,
8: 81980.16499999998,
9: 5600.544999999999,
10: 11209.92,
11: 32832.61,
12: 42137.44500000001,
13: 3885.1499999999996,
14: 1191.5,
15: 194912.34000000003,
16: 599.0,
17: 97456.17000000001,
18: 717745.3400000001,
19: 717745.3400000001},
(pd.Timestamp('2021-07-07 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0}}).set_index(['group','category'])
我想创建一个2级index
称为combined
这将是总和sales & last_sales
所有的categories
,除了Facebook
和total
/ Total
。 所以df
看起来像这样:
我尝试用.loc
来做,但没有成功:
s = df_out.stack(0)
s['combined'] = 0
s.loc[(slice(None),[x for x in s.loc[(slice(None),:) if x != 'Facebook']].sum()
level=0
删除all
,同样在level=1
drop
其他不需要的级别值level=0
上的sum
以聚合帧Multindex
以添加聚合帧中combined
的附加级别s = df.drop('all').drop(['Facebook', 'total', 'Total'], level=1).sum(level=0)
s.index = pd.MultiIndex.from_product([s.index, ['combined']])
df_out = df.append(s).sort_index()
2020-06-29 00:00:00 2020-07-06 00:00:00 2021-06-28 00:00:00 2021-07-07 00:00:00
last_sales sales difference last_sales sales difference last_sales sales difference last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0 26.0 3978.150 0.0 96.0 5194.950 0.0 45.0 7556.415 0.0
Apple 61.0 18274.385 0.0 39.0 12138.960 0.0 56.0 19102.220 0.0 47.0 14985.050 0.0
Facebook 106.0 19722.650 0.0 79.0 19084.175 0.0 106.0 22796.420 0.0 87.0 16790.900 0.0
Google 61.0 55547.255 0.0 49.0 40033.460 0.0 44.0 30853.115 0.0 45.0 36202.730 0.0
Netflix 37.0 15323.800 0.0 10.0 4280.150 0.0 34.0 11461.250 0.0 13.0 4024.970 0.0
Tesla 13.0 1688.675 0.0 10.0 1495.100 0.0 13.0 992.600 0.0 8.0 1034.450 0.0
Total 954.0 227463.230 0.0 436.0 165548.290 0.0 716.0 188143.410 0.0 494.0 163960.330 0.0
Uber 4.0 1906.000 0.0 5.0 1764.150 0.0 9.0 3671.150 0.0 2.0 1385.650 0.0
combined 371.0 94008.965 0.0 139.0 63689.970 0.0 252.0 71275.285 0.0 160.0 65189.265 0.0
total 477.0 113731.615 0.0 218.0 82774.145 0.0 358.0 94071.705 0.0 247.0 81980.165 0.0
B Amazon 50.0 3219.650 0.0 89.0 8314.920 0.0 101.0 6022.300 0.0 81.0 5600.545 0.0
Apple 50.0 15852.060 0.0 34.0 12776.650 0.0 22.0 7373.600 0.0 36.0 11209.920 0.0
Facebook 75.0 17743.700 0.0 133.0 28048.075 0.0 120.0 33514.000 0.0 143.0 32832.610 0.0
Google 43.0 37795.150 0.0 66.0 55104.210 0.0 40.0 35943.450 0.0 56.0 42137.445 0.0
Netflix 17.0 5918.500 0.0 21.0 6962.845 0.0 13.0 4749.000 0.0 9.0 3885.150 0.0
Tesla 14.0 1708.750 0.0 20.0 3053.200 0.0 8.0 902.010 0.0 9.0 1191.500 0.0
Total 504.0 166349.640 0.0 732.0 231049.110 0.0 610.0 177707.320 0.0 670.0 194912.340 0.0
Uber 3.0 937.010 0.0 3.0 1264.655 0.0 1.0 349.300 0.0 1.0 599.000 0.0
combined 177.0 65431.120 0.0 233.0 87476.480 0.0 185.0 55339.660 0.0 192.0 64623.560 0.0
total 252.0 83174.820 0.0 366.0 115524.555 0.0 305.0 88853.660 0.0 335.0 97456.170 0.0
all Total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
您可以在0级组,并concat
每组的总和,如果name
不是“所有”,同时加入索引。 然后将此输出与原始数据和sort_index
按level=0
:
pd.concat([df,
pd.concat({(name, 'combined'): d.drop(['Facebook', 'total', 'Total'], level=1).sum()
for name,d in df.groupby(level=0) if name != 'all'}, axis=1).T
]).sort_index(level=0)
输出:
2020-06-29 00:00:00 2020-07-06 00:00:00 2021-06-28 00:00:00 2021-07-07 00:00:00
last_sales sales difference last_sales sales difference last_sales sales difference last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0 26.0 3978.150 0.0 96.0 5194.950 0.0 45.0 7556.415 0.0
Apple 61.0 18274.385 0.0 39.0 12138.960 0.0 56.0 19102.220 0.0 47.0 14985.050 0.0
Facebook 106.0 19722.650 0.0 79.0 19084.175 0.0 106.0 22796.420 0.0 87.0 16790.900 0.0
Google 61.0 55547.255 0.0 49.0 40033.460 0.0 44.0 30853.115 0.0 45.0 36202.730 0.0
Netflix 37.0 15323.800 0.0 10.0 4280.150 0.0 34.0 11461.250 0.0 13.0 4024.970 0.0
Tesla 13.0 1688.675 0.0 10.0 1495.100 0.0 13.0 992.600 0.0 8.0 1034.450 0.0
Total 954.0 227463.230 0.0 436.0 165548.290 0.0 716.0 188143.410 0.0 494.0 163960.330 0.0
Uber 4.0 1906.000 0.0 5.0 1764.150 0.0 9.0 3671.150 0.0 2.0 1385.650 0.0
combined 371.0 94008.965 0.0 139.0 63689.970 0.0 252.0 71275.285 0.0 160.0 65189.265 0.0
total 477.0 113731.615 0.0 218.0 82774.145 0.0 358.0 94071.705 0.0 247.0 81980.165 0.0
B Amazon 50.0 3219.650 0.0 89.0 8314.920 0.0 101.0 6022.300 0.0 81.0 5600.545 0.0
Apple 50.0 15852.060 0.0 34.0 12776.650 0.0 22.0 7373.600 0.0 36.0 11209.920 0.0
Facebook 75.0 17743.700 0.0 133.0 28048.075 0.0 120.0 33514.000 0.0 143.0 32832.610 0.0
Google 43.0 37795.150 0.0 66.0 55104.210 0.0 40.0 35943.450 0.0 56.0 42137.445 0.0
Netflix 17.0 5918.500 0.0 21.0 6962.845 0.0 13.0 4749.000 0.0 9.0 3885.150 0.0
Tesla 14.0 1708.750 0.0 20.0 3053.200 0.0 8.0 902.010 0.0 9.0 1191.500 0.0
Total 504.0 166349.640 0.0 732.0 231049.110 0.0 610.0 177707.320 0.0 670.0 194912.340 0.0
Uber 3.0 937.010 0.0 3.0 1264.655 0.0 1.0 349.300 0.0 1.0 599.000 0.0
combined 177.0 65431.120 0.0 233.0 87476.480 0.0 185.0 55339.660 0.0 192.0 64623.560 0.0
total 252.0 83174.820 0.0 366.0 115524.555 0.0 305.0 88853.660 0.0 335.0 97456.170 0.0
all Total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.