
Math operations in a MultiIndex DataFrame with .loc

I have a MultiIndex df:

                    created_at          2020-06-29                                       2020-07-06
                                        sales   orders  last_sales      difference       sales  orders  last_sales      difference

group               category            10      10      10              0                10      10      20             50  
A                   a1
                    a2
                    a3
B                   a1                  ...     ...     ...            ...               ...     ...     ...
                    a2
                    a3
all                 Total               100     100     100            0                 150     150     150            0

I am trying to compute difference for the all & Total index, i.e. ((sales / last_sales) - 1) * 100, where created_at runs from 2020-06-29 to 2021-07-19.

I tried:

df.loc[('all','Total'),(slice(None),'difference')] = 
((df.loc[('all','Total'),(slice(None),'sales')] / 
df.loc[('all','Total'),:'2021-07-19','last_sales']) - 1) * 100

But I get an error:

IndexError: list index out of range

I can't seem to select the created_at range I want, although this works:

df.loc[:,:'2021-07-19'] # returning the dates I want, from first to selected

But when I also try to pick a single column, it breaks:

df_out.loc[:,(:'2021-07-19','sales')] # SyntaxError: invalid syntax

and

df_out.loc[:,:'2021-07-19','sales'] # IndexError: list index out of range

How can I select group = all, category = Total, and the dates from 2020-06-29 to 2021-07-19?

I also tried:

df.loc[('all','Total'),(slice(None),'difference')] = 
((df.loc[('all','Total'),(slice(None),'sales')] / 
df.loc[('all','Total'),(slice(None),'last_sales')]) - 1) * 100 # slice(None) instead of date

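However, this raises ZeroDivisionError: float division by zero, because last_sales = 0 for dates after 2021-07-19. Maybe there is another way to handle the zeros. I tried adding 1, like df.loc[('all','Total'),(slice(None),'last_sales')]+1) - 1) * 100, but I still get the same error.

Sample df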
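
A note on the ZeroDivisionError: float64 division by zero in pandas normally yields inf rather than raising, so this error may mean the selected row came back with object dtype. That is an assumption about the real data, not something the sample below shows. A minimal sketch of one way to guard the division, using made-up values and hypothetical names (num, den, pct):

```python
import numpy as np
import pandas as pd

# Made-up values standing in for one row of sales / last_sales.
# dtype=object mimics a row pulled from a frame with mixed column types (assumption).
sales = pd.Series([150.0, 300.0, 90.0], dtype=object)
last_sales = pd.Series([100.0, 200.0, 0.0], dtype=object)

# Convert to float64, then turn zero denominators into NaN before dividing.
num = pd.to_numeric(sales)
den = pd.to_numeric(last_sales).replace(0, np.nan)

# Percentage change; positions with a zero denominator stay NaN.
pct = num.div(den).sub(1).mul(100)
```

Whether NaN is the right placeholder for a zero last_sales depends on how the result is used downstream; it could be filled afterwards with fillna if a number is required.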

import pandas as pd

df = pd.DataFrame.from_dict({('group', ''): {0: 'A',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'A',
  5: 'A',
  6: 'A',
  7: 'A',
  8: 'A',
  9: 'B',
  10: 'B',
  11: 'B',
  12: 'B',
  13: 'B',
  14: 'B',
  15: 'B',
  16: 'B',
  17: 'B',
  18: 'all',
  19: 'all'},
 ('category', ''): {0: 'Amazon',
  1: 'Apple',
  2: 'Facebook',
  3: 'Google',
  4: 'Netflix',
  5: 'Tesla',
  6: 'Total',
  7: 'Uber',
  8: 'total',
  9: 'Amazon',
  10: 'Apple',
  11: 'Facebook',
  12: 'Google',
  13: 'Netflix',
  14: 'Tesla',
  15: 'Total',
  16: 'Uber',
  17: 'total',
  18: 'Total',
  19: 'total'},
 (pd.Timestamp('2020-06-29 00:00:00'), 'last_sales'): {0: 195.0,
  1: 61.0,
  2: 106.0,
  3: 61.0,
  4: 37.0,
  5: 13.0,
  6: 954.0,
  7: 4.0,
  8: 477.0,
  9: 50.0,
  10: 50.0,
  11: 75.0,
  12: 43.0,
  13: 17.0,
  14: 14.0,
  15: 504.0,
  16: 3.0,
  17: 252.0,
  18: 2916.0,
  19: 2916.0},
 (pd.Timestamp('2020-06-29 00:00:00'), 'sales'): {0: 1268.85,
  1: 18274.385000000002,
  2: 19722.65,
  3: 55547.255,
  4: 15323.800000000001,
  5: 1688.6749999999997,
  6: 227463.23,
  7: 1906.0,
  8: 113731.615,
  9: 3219.6499999999996,
  10: 15852.060000000001,
  11: 17743.7,
  12: 37795.15,
  13: 5918.5,
  14: 1708.75,
  15: 166349.64,
  16: 937.01,
  17: 83174.82,
  18: 787625.7400000001,
  19: 787625.7400000001},
 (pd.Timestamp('2020-06-29 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2020-07-06 00:00:00'), 'last_sales'): {0: 26.0,
  1: 39.0,
  2: 79.0,
  3: 49.0,
  4: 10.0,
  5: 10.0,
  6: 436.0,
  7: 5.0,
  8: 218.0,
  9: 89.0,
  10: 34.0,
  11: 133.0,
  12: 66.0,
  13: 21.0,
  14: 20.0,
  15: 732.0,
  16: 3.0,
  17: 366.0,
  18: 2336.0,
  19: 2336.0},
 (pd.Timestamp('2020-07-06 00:00:00'), 'sales'): {0: 3978.15,
  1: 12138.96,
  2: 19084.175,
  3: 40033.46000000001,
  4: 4280.15,
  5: 1495.1,
  6: 165548.29,
  7: 1764.15,
  8: 82774.145,
  9: 8314.92,
  10: 12776.649999999996,
  11: 28048.075,
  12: 55104.21000000002,
  13: 6962.844999999999,
  14: 3053.2000000000003,
  15: 231049.11000000002,
  16: 1264.655,
  17: 115524.55500000001,
  18: 793194.8000000002,
  19: 793194.8000000002},
 (pd.Timestamp('2020-07-06 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2021-06-28 00:00:00'), 'last_sales'): {0: 96.0,
  1: 56.0,
  2: 106.0,
  3: 44.0,
  4: 34.0,
  5: 13.0,
  6: 716.0,
  7: 9.0,
  8: 358.0,
  9: 101.0,
  10: 22.0,
  11: 120.0,
  12: 40.0,
  13: 13.0,
  14: 8.0,
  15: 610.0,
  16: 1.0,
  17: 305.0,
  18: 2652.0,
  19: 2652.0},
 (pd.Timestamp('2021-06-28 00:00:00'), 'sales'): {0: 5194.95,
  1: 19102.219999999994,
  2: 22796.420000000002,
  3: 30853.115,
  4: 11461.25,
  5: 992.6,
  6: 188143.41,
  7: 3671.15,
  8: 94071.705,
  9: 6022.299999999998,
  10: 7373.6,
  11: 33514.0,
  12: 35943.45,
  13: 4749.000000000001,
  14: 902.01,
  15: 177707.32,
  16: 349.3,
  17: 88853.66,
  18: 731701.46,
  19: 731701.46},
 (pd.Timestamp('2021-06-28 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2021-07-07 00:00:00'), 'last_sales'): {0: 45.0,
  1: 47.0,
  2: 87.0,
  3: 45.0,
  4: 13.0,
  5: 8.0,
  6: 494.0,
  7: 2.0,
  8: 247.0,
  9: 81.0,
  10: 36.0,
  11: 143.0,
  12: 56.0,
  13: 9.0,
  14: 9.0,
  15: 670.0,
  16: 1.0,
  17: 335.0,
  18: 2328.0,
  19: 2328.0},
 (pd.Timestamp('2021-07-07 00:00:00'), 'sales'): {0: 7556.414999999998,
  1: 14985.05,
  2: 16790.899999999998,
  3: 36202.729999999996,
  4: 4024.97,
  5: 1034.45,
  6: 163960.32999999996,
  7: 1385.65,
  8: 81980.16499999998,
  9: 5600.544999999999,
  10: 11209.92,
  11: 32832.61,
  12: 42137.44500000001,
  13: 3885.1499999999996,
  14: 1191.5,
  15: 194912.34000000003,
  16: 599.0,
  17: 97456.17000000001,
  18: 717745.3400000001,
  19: 717745.3400000001},
 (pd.Timestamp('2021-07-07 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0}}).set_index(['group','category'])

We can use pd.IndexSlice, and we need to slice with pd.Timestamp because the first level of your columns is of datetime type:

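idx = pd.IndexSlice
# Row ('all', 'Total'), every date, 'sales' only; drop the metric level so the dates align
sales = df.loc[('all','Total'),(slice(None),'sales')].droplevel(level=1)
# Same row, dates up to and including the cutoff, 'last_sales' only
last_sales = df.loc[('all','Total'), :pd.Timestamp("2021-07-07")].loc[idx[:, "last_sales"]]

# to_numpy() assigns by position, so the 'difference' labels need not match
df.loc[('all','Total'),(slice(None),'difference')] = sales.div(last_sales).sub(1).mul(100).to_numpy()

Output:
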
               2020-06-29 00:00:00                      2020-07-06 00:00:00                      2021-06-28 00:00:00                      2021-07-07 00:00:00                     
                        last_sales     sales difference          last_sales     sales difference          last_sales     sales difference          last_sales     sales difference
group category                                                                                                                                                                    
A     Amazon                195.00   1268.85       0.00               26.00   3978.15       0.00               96.00   5194.95       0.00               45.00   7556.41       0.00
      Apple                  61.00  18274.39       0.00               39.00  12138.96       0.00               56.00  19102.22       0.00               47.00  14985.05       0.00
      Facebook              106.00  19722.65       0.00               79.00  19084.17       0.00              106.00  22796.42       0.00               87.00  16790.90       0.00
      Google                 61.00  55547.25       0.00               49.00  40033.46       0.00               44.00  30853.12       0.00               45.00  36202.73       0.00
      Netflix                37.00  15323.80       0.00               10.00   4280.15       0.00               34.00  11461.25       0.00               13.00   4024.97       0.00
      Tesla                  13.00   1688.67       0.00               10.00   1495.10       0.00               13.00    992.60       0.00                8.00   1034.45       0.00
      Total                 954.00 227463.23       0.00              436.00 165548.29       0.00              716.00 188143.41       0.00              494.00 163960.33       0.00
      Uber                    4.00   1906.00       0.00                5.00   1764.15       0.00                9.00   3671.15       0.00                2.00   1385.65       0.00
      total                 477.00 113731.62       0.00              218.00  82774.15       0.00              358.00  94071.71       0.00              247.00  81980.16       0.00
B     Amazon                 50.00   3219.65       0.00               89.00   8314.92       0.00              101.00   6022.30       0.00               81.00   5600.54       0.00
      Apple                  50.00  15852.06       0.00               34.00  12776.65       0.00               22.00   7373.60       0.00               36.00  11209.92       0.00
      Facebook               75.00  17743.70       0.00              133.00  28048.08       0.00              120.00  33514.00       0.00              143.00  32832.61       0.00
      Google                 43.00  37795.15       0.00               66.00  55104.21       0.00               40.00  35943.45       0.00               56.00  42137.45       0.00
      Netflix                17.00   5918.50       0.00               21.00   6962.84       0.00               13.00   4749.00       0.00                9.00   3885.15       0.00
      Tesla                  14.00   1708.75       0.00               20.00   3053.20       0.00                8.00    902.01       0.00                9.00   1191.50       0.00
      Total                 504.00 166349.64       0.00              732.00 231049.11       0.00              610.00 177707.32       0.00              670.00 194912.34       0.00
      Uber                    3.00    937.01       0.00                3.00   1264.65       0.00                1.00    349.30       0.00                1.00    599.00       0.00
      total                 252.00  83174.82       0.00              366.00 115524.56       0.00              305.00  88853.66       0.00              335.00  97456.17       0.00
all   Total                2916.00 787625.74   26910.48             2336.00 793194.80   33855.26             2652.00 731701.46   27490.55             2328.00 717745.34   30730.99
      total                2916.00 787625.74       0.00             2336.00 793194.80       0.00             2652.00 731701.46       0.00             2328.00 717745.34       0.00
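
If only part of the date range should be updated, the same date-bounded slice can be applied to the numerator, the denominator, and the assignment target. The following is a minimal sketch on a hypothetical toy frame (toy, cutoff, and all values are made up for illustration, and the columns are built already sorted so that slicing on the date level is allowed):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame (not the asker's data): three dates, three metrics each.
dates = pd.to_datetime(["2020-06-29", "2020-07-06", "2021-07-26"])
cols = pd.MultiIndex.from_product([dates, ["difference", "last_sales", "sales"]])
rows = pd.MultiIndex.from_tuples(
    [("A", "Total"), ("all", "Total")], names=["group", "category"]
)
data = np.array([
    # diff, last_sales, sales, repeated once per date
    [0.0, 5.0, 10.0, 0.0, 10.0, 20.0, 0.0, 0.0, 30.0],
    [0.0, 100.0, 150.0, 0.0, 200.0, 300.0, 0.0, 0.0, 90.0],
])
toy = pd.DataFrame(data, index=rows, columns=cols)

idx = pd.IndexSlice
row = ("all", "Total")
cutoff = pd.Timestamp("2020-07-06")  # label slices include the end point

# Same date-bounded column slice for numerator and denominator.
sales = toy.loc[row, idx[:cutoff, "sales"]]
last_sales = toy.loc[row, idx[:cutoff, "last_sales"]]

# Work on plain arrays: the second-level labels differ, so label alignment would not match.
pct = (sales.to_numpy() / last_sales.to_numpy() - 1) * 100

# Write back only to the 'difference' columns inside the date range.
toy.loc[row, idx[:cutoff, "difference"]] = pct
```

Dates after the cutoff are left untouched, which also keeps the zero last_sales in the last toy column out of the division.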
