Python Pandas: How to aggregate using a function that returns a pd.Series

Question

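I have a multi-indexed DataFrame and I want to aggregate over some of its index levels. When the aggregator function returns a float, everything works. But I cannot figure out how to use a function with a more complex return value (for example, a pd.Series). Using a function that returns a pd.Series raises this error: Exception: Must produce aggregated value.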

Here is an example DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': {
        (1, 0): 85, (1, 1): 75,
        (2, 0): 12, (2, 1): 15,
        (3, 0): 2,  (3, 1): 26,
    },
    'B': {
        (1, 0): 86, (1, 1): 76,
        (2, 0): 13, (2, 1): 17,
        (3, 0): 19, (3, 1): 18,
    }
}).stack()
df.index.rename(['idx', 'bar', 'label'], inplace=True)

The contents of df are:

idx  bar  label
1    0    A        85
          B        86
     1    A        75
          B        76
2    0    A        12
          B        13
     1    A        15
          B        17
3    0    A         2
          B        19
     1    A        26
          B        18
dtype: int64

Let's define a simple aggregator that returns a pd.Series:

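def my_func(subframe):
    # Pivot the 'label' level into columns so that A and B sit side by side.
    subframe = subframe.unstack('label')
    mean_A_plus_B = np.mean(subframe['B'] + subframe['A'])
    # Note: this is computed as B - A, although it is labelled 'A-B' below.
    mean_A_minus_B = np.mean(subframe['B'] - subframe['A'])
    return pd.Series([mean_A_plus_B, mean_A_minus_B], index=['A+B', 'A-B'])
    # return mean_A_plus_B  ## <- this one works.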

Applying the aggregator raises the following exception:

df.groupby('idx').agg(my_func)
.
.
.
py/pandas/core/groupby/generic.py in _aggregate_named(self, func, *args, **kwargs)
    907             output = func(group, *args, **kwargs)
    908             if isinstance(output, (Series, Index, np.ndarray)):
--> 909                 raise Exception('Must produce aggregated value')
    910             result[name] = self._try_cast(output, group)

Exception: Must produce aggregated value

What I would like to get is:

       A+B    A-B
idx
1    161.0    1.0
2     28.5    1.5
3     32.5    4.5
dtype: float64

What is the correct way to do this?

Answer 1

Just replace .agg() with .apply():

df.groupby('idx').apply(my_func).unstack(level=-1)

Output:

       A+B  A-B
idx            
1    161.0  1.0
2     28.5  1.5
3     32.5  4.5
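
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the .apply() approach, rebuilt from the data in the question. It assumes a reasonably recent pandas. The names s, pairs, via_apply and via_mean are illustrative only and do not appear in the original post. As a cross-check, it also computes the same table without a custom function, by pivoting 'label' into columns first and then taking a plain group mean.

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question. stack() turns the frame into
# a Series with a three-level MultiIndex.
s = pd.DataFrame({
    'A': {(1, 0): 85, (1, 1): 75, (2, 0): 12, (2, 1): 15, (3, 0): 2, (3, 1): 26},
    'B': {(1, 0): 86, (1, 1): 76, (2, 0): 13, (2, 1): 17, (3, 0): 19, (3, 1): 18},
}).stack()
s.index.names = ['idx', 'bar', 'label']


def my_func(subframe):
    # Pivot 'label' into columns so A and B can be combined row by row.
    wide = subframe.unstack('label')
    return pd.Series(
        [np.mean(wide['B'] + wide['A']), np.mean(wide['B'] - wide['A'])],
        # The second label is kept from the question; the value is B - A.
        index=['A+B', 'A-B'],
    )


# .apply() accepts a Series-valued function; the per-group results come back
# stacked in one long Series, so unstack the last level into columns.
via_apply = s.groupby('idx').apply(my_func).unstack(level=-1)

# Cross-check without a custom function (an alternative, not the answer's
# method): build the per-row columns first, then take the mean per idx.
wide = s.unstack('label')
pairs = pd.DataFrame({
    'A+B': wide['B'] + wide['A'],
    'A-B': wide['B'] - wide['A'],
})
via_mean = pairs.groupby(level='idx').mean()

print(via_apply)
print(via_mean)
```

Both tables should agree. The vectorized version may be preferable on larger data because it avoids calling a Python function once per group, but that is a design choice rather than a requirement.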

Solution 1
3, accepted, 2019-08-28 01:25:03
