
Adding a total per level-2 index in a multiindex pandas dataframe

I have a dataframe:

import numpy as np
import pandas as pd

df_full = pd.DataFrame.from_dict({('group', ''): {0: 'A',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'A',
  5: 'A',
  6: 'A',
  7: 'B',
  8: 'B',
  9: 'B',
  10: 'B',
  11: 'B',
  12: 'B',
  13: 'B'},
 ('category', ''): {0: 'Books',
  1: 'Candy',
  2: 'Pencil',
  3: 'Table',
  4: 'PC',
  5: 'Printer',
  6: 'Lamp',
  7: 'Books',
  8: 'Candy',
  9: 'Pencil',
  10: 'Table',
  11: 'PC',
  12: 'Printer',
  13: 'Lamp'},
 (pd.Timestamp('2021-06-28 00:00:00'),
  'Sales_1'): {0: 9.937449997200002, 1: 30.71300000639998, 2: 58.81199999639999, 3: 25.661999978399994, 4: 3.657999996, 5: 12.0879999972, 6: 61.16600000040001, 7: 6.319439989199998, 8: 12.333119997600003, 9: 24.0544100028, 10: 24.384659998799997, 11: 1.9992000012000002, 12: 0.324, 13: 40.69122000000001},
 (pd.Timestamp('2021-06-28 00:00:00'),
  'Sales_2'): {0: 21.890370397789923, 1: 28.300470581874837, 2: 53.52039700062155, 3: 52.425508769690694, 4: 6.384936971649232, 5: 6.807138946302334, 6: 52.172, 7: 5.916852561, 8: 5.810764652, 9: 12.1243325, 10: 17.88071596, 11: 0.913782413, 12: 0.869207661, 13: 20.9447844},
 (pd.Timestamp('2021-06-28 00:00:00'), 'last_week_sales'): {0: np.nan,
  1: np.nan,
  2: np.nan,
  3: np.nan,
  4: np.nan,
  5: np.nan,
  6: np.nan,
  7: np.nan,
  8: np.nan,
  9: np.nan,
  10: np.nan,
  11: np.nan,
  12: np.nan,
  13: np.nan},
 (pd.Timestamp('2021-06-28 00:00:00'), 'total_orders'): {0: 86.0,
  1: 66.0,
  2: 188.0,
  3: 556.0,
  4: 12.0,
  5: 4.0,
  6: 56.0,
  7: 90.0,
  8: 26.0,
  9: 49.0,
  10: 250.0,
  11: 7.0,
  12: 2.0,
  13: 44.0},
 (pd.Timestamp('2021-06-28 00:00:00'), 'total_sales'): {0: 4390.11,
  1: 24825.059999999998,
  2: 48592.39999999998,
  3: 60629.77,
  4: 831.22,
  5: 1545.71,
  6: 34584.99,
  7: 5641.54,
  8: 6798.75,
  9: 13290.13,
  10: 42692.68000000001,
  11: 947.65,
  12: 329.0,
  13: 29889.65},
 (pd.Timestamp('2021-07-05 00:00:00'),
  'Sales_1'): {0: 13.690399997999998, 1: 38.723000005199985, 2: 72.4443400032, 3: 36.75802000560001, 4: 5.691999996, 5: 7.206999998399999, 6: 66.55265999039996, 7: 6.4613199911999954, 8: 12.845630001599998, 9: 26.032340003999998, 10: 30.1634600016, 11: 1.0203399996, 12: 1.4089999991999997, 13: 43.67116000320002},
 (pd.Timestamp('2021-07-05 00:00:00'),
  'Sales_2'): {0: 22.874363860953647, 1: 29.5726042895728, 2: 55.926190956481534, 3: 54.7820864335212, 4: 6.671946105284065, 5: 7.113126469779095, 6: 54.517, 7: 6.194107518, 8: 6.083562133, 9: 12.69221484, 10: 18.71872129, 11: 0.956574175, 12: 0.910216433, 13: 21.92632044},
 (pd.Timestamp('2021-07-05 00:00:00'), 'last_week_sales'): {0: 4390.11,
  1: 24825.059999999998,
  2: 48592.39999999998,
  3: 60629.77,
  4: 831.22,
  5: 1545.71,
  6: 34584.99,
  7: 5641.54,
  8: 6798.75,
  9: 13290.13,
  10: 42692.68000000001,
  11: 947.65,
  12: 329.0,
  13: 29889.65},
 (pd.Timestamp('2021-07-05 00:00:00'), 'total_orders'): {0: 109.0,
  1: 48.0,
  2: 174.0,
  3: 587.0,
  4: 13.0,
  5: 5.0,
  6: 43.0,
  7: 62.0,
  8: 13.0,
  9: 37.0,
  10: 196.0,
  11: 8.0,
  12: 1.0,
  13: 33.0},
 (pd.Timestamp('2021-07-05 00:00:00'), 'total_sales'): {0: 3453.02,
  1: 17868.730000000003,
  2: 44707.82999999999,
  3: 60558.97999999999,
  4: 1261.0,
  5: 1914.6000000000001,
  6: 24146.09,
  7: 6201.489999999999,
  8: 5513.960000000001,
  9: 9645.87,
  10: 25086.785,
  11: 663.0,
  12: 448.61,
  13: 26332.7}}).set_index(['group','category'])

I am trying to get a total for each column within each group, placed alongside the categories. In this df example, that means adding 2 rows, one below each Lamp row, holding the totals of each column. Red lines indicate the desired placement of the totals:

[image: the dataframe, with red lines marking where the total rows should go]

What I've tried:

df_out['total'] = df_out.sum(level=1).loc[:, (slice(None), 'total_sales')]

But I get:

ValueError: Wrong number of items passed 4, placement implies 1

I also checked this question but could not apply it to my own case.

Let us try groupby on level=0:

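# Sum every column within each group (index level 0)
s = df_full.groupby(level=0).sum()
# Give each summed row a ('<group>', 'Total') index entry
s.index = pd.MultiIndex.from_product([s.index, ['Total']])

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_out = pd.concat([df_full, s]).sort_index()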

print(df_out)
                           2021-06-28 00:00:00                                                      2021-07-05 00:00:00                                                     
                           Sales_1     Sales_2 last_week_sales total_orders total_sales             Sales_1     Sales_2 last_week_sales total_orders total_sales
group category                                                                                                                                                  
A     Books                9.93745   21.890370             NaN         86.0     4390.11            13.69040   22.874364         4390.11        109.0    3453.020
      Candy               30.71300   28.300471             NaN         66.0    24825.06            38.72300   29.572604        24825.06         48.0   17868.730
      Lamp                61.16600   52.172000             NaN         56.0    34584.99            66.55266   54.517000        34584.99         43.0   24146.090
      PC                   3.65800    6.384937             NaN         12.0      831.22             5.69200    6.671946          831.22         13.0    1261.000
      Pencil              58.81200   53.520397             NaN        188.0    48592.40            72.44434   55.926191        48592.40        174.0   44707.830
      Printer             12.08800    6.807139             NaN          4.0     1545.71             7.20700    7.113126         1545.71          5.0    1914.600
      Table               25.66200   52.425509             NaN        556.0    60629.77            36.75802   54.782086        60629.77        587.0   60558.980
      Total              202.03645  221.500823             0.0        968.0   175399.26           241.06742  231.457318       175399.26        979.0  153910.250
B     Books                6.31944    5.916853             NaN         90.0     5641.54             6.46132    6.194108         5641.54         62.0    6201.490
      Candy               12.33312    5.810765             NaN         26.0     6798.75            12.84563    6.083562         6798.75         13.0    5513.960
      Lamp                40.69122   20.944784             NaN         44.0    29889.65            43.67116   21.926320        29889.65         33.0   26332.700
      PC                   1.99920    0.913782             NaN          7.0      947.65             1.02034    0.956574          947.65          8.0     663.000
      Pencil              24.05441   12.124332             NaN         49.0    13290.13            26.03234   12.692215        13290.13         37.0    9645.870
      Printer              0.32400    0.869208             NaN          2.0      329.00             1.40900    0.910216          329.00          1.0     448.610
      Table               24.38466   17.880716             NaN        250.0    42692.68            30.16346   18.718721        42692.68        196.0   25086.785
      Total              110.10605   64.460440             0.0        468.0    99589.40           121.60325   67.481717        99589.40        350.0   73892.415
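
Note that sort_index() reorders the categories alphabetically, and Total only ends up last because it happens to sort after Table. If the original row order matters (for example, totals directly below Lamp), one possible variation is to build the result group by group. The following is a minimal sketch, not part of the original answer: the helper name add_group_totals and the small stand-in dataframe are assumptions made for illustration, and the approach has only been reasoned through on this reduced data.

```python
import pandas as pd

# Small stand-in for df_full: same (group, category) index, two plain columns.
df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "B", "B", "B"],
        "category": ["Books", "Table", "Lamp", "Books", "Table", "Lamp"],
        "total_orders": [86.0, 556.0, 56.0, 90.0, 250.0, 44.0],
        "total_sales": [4390.11, 60629.77, 34584.99, 5641.54, 42692.68, 29889.65],
    }
).set_index(["group", "category"])


def add_group_totals(frame, label="Total"):
    """Hypothetical helper: append one `label` row after each level-0 group."""
    pieces = []
    # sort=False keeps the groups in order of first appearance.
    for key, block in frame.groupby(level=0, sort=False):
        # Column sums as a one-row frame, indexed as (group, label).
        total = block.sum().to_frame().T
        total.index = pd.MultiIndex.from_tuples(
            [(key, label)], names=frame.index.names
        )
        pieces.extend([block, total])
    # No sort_index() here, so the rows stay in their original order.
    return pd.concat(pieces)


df_out = add_group_totals(df)
print(df_out)
```

Separately, sum() treats an all-NaN column as 0.0, which is why last_week_sales shows 0.0 in the Total rows above; passing min_count=1 to sum() keeps NaN in that case.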
