IndexingError when using a custom function with apply on a resampled object

Question

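I'm trying to run a custom function on a resampled object using apply. The tricky part is that the function iterates over every timestamp of the DataFrame it receives and acts based on the values of the other columns at that timestamp. It should then return a DataFrame with the same number of rows as its input (in my toy example I don't do that and just return a list). The logic in the example is much simpler than in my real use case.

I get IndexingError: Too many indexers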

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randint(0, 100, 10), 'b': np.random.randint(0, 1000, 10), 'c': np.random.uniform(0, 100, 10)},
index = pd.date_range("2021-01-01", "2021-01-10"))

def test_func(df):
    new_ser = []
    for i in range(df.shape[0]):
        if i==0:
            new_ser.append(np.nan)
        if df.iloc[i,:]['a'] < df.iloc[i,:]['b']:
            new_ser.append(1)
        else:
            new_ser.append(0)

    return new_ser

df.resample('2D').apply(test_func)

IndexingError: Too many indexers

Answer 1

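The problem is df.iloc[i,:]['a'] inside the function given to Resampler.apply. The values passed to the function by Resampler.apply are the resampled columns of the original DataFrame, one Series at a time. A Series has only one axis, so the two-part indexer in iloc[i,:] raises "Too many indexers". For example, the function receives: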

2021-01-01    81
2021-01-02    90
Freq: D, Name: a, dtype: int64

2021-01-01    395
2021-01-02    845
Freq: D, Name: b, dtype: int64
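To make the failure concrete, here is a minimal sketch that reproduces the same error without any resampling. The Series `s` and its values are made up for illustration; the point is only that a one-dimensional object rejects a two-part `iloc` indexer.

```python
import pandas as pd

# A Series shaped like one of the resampled columns shown above
# (illustrative values, not the asker's data).
s = pd.Series(
    [81, 90],
    index=pd.date_range("2021-01-01", periods=2),
    name="a",
)

caught = None
try:
    # Two indexers (row, column) on a one-dimensional Series.
    s.iloc[0, :]
except Exception as exc:  # pandas raises IndexingError here
    caught = exc

print(type(caught).__name__, caught)

# A single positional indexer works as expected on a Series.
first_value = s.iloc[0]
```

Indexing the Series with a single position, as in the last line, is the form that works when the function is handed one column at a time.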

You probably want groupby(pd.Grouper).apply() instead:

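def test_func(df):
    # With groupby(pd.Grouper), df is a DataFrame holding one 2-day group,
    # so two-axis indexing such as df.iloc[i, :] is valid here.
    new_ser = []
    for i in range(df.shape[0]):
        if i == 0:
            new_ser.append(np.nan)  # np.NaN was removed in NumPy 2.0
        # This is `if`, not `elif`, so row 0 also gets a 0/1 flag and the
        # list ends up one element longer than the group.
        if df.iloc[i, :]['a'] < df.iloc[i, :]['b']:
            new_ser.append(1)
        else:
            new_ser.append(0)
    return new_ser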

out = df.groupby(pd.Grouper(freq='2D')).apply(test_func)

print(out)


2021-01-01    [nan, 1, 1]
2021-01-03    [nan, 1, 1]
2021-01-05    [nan, 1, 1]
2021-01-07    [nan, 1, 1]
2021-01-09    [nan, 1, 1]
Freq: 2D, dtype: object
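The question also asks for an output with the same number of rows as the input, which the list-returning toy function does not give. One possible way to get that, sketched below under assumptions, is to have a helper return a Series aligned to each group's index and concatenate the pieces. The helper name `flag_rows`, the fixed sample data, and the rule "the first row of each bin is NaN" are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small deterministic frame (made-up values) so the result is reproducible.
df = pd.DataFrame(
    {"a": [5, 20, 7, 1, 9, 3], "b": [10, 2, 8, 0, 11, 4]},
    index=pd.date_range("2021-01-01", periods=6),
)

def flag_rows(group):
    """Return one value per row of `group`, indexed like `group`."""
    # Column-wise comparison replaces the per-row df.iloc[i, :] loop.
    flags = (group["a"] < group["b"]).astype(float)
    # Assumed rule for illustration: the first row of each bin is undefined.
    flags.iloc[0] = np.nan
    return flags

# Iterate over the 2-day bins explicitly and stitch the pieces together;
# empty bins are skipped so flags.iloc[0] is always valid.
pieces = [
    flag_rows(group)
    for _, group in df.groupby(pd.Grouper(freq="2D"))
    if len(group) > 0
]
out = pd.concat(pieces)

print(out)
```

Iterating over the groups and calling pd.concat keeps the original timestamps as the index, so `out` can be assigned straight back as a new column of `df`.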

Solution 1
0 2022-05-26 15:01:47