簡體   English   中英

應用自定義累積 Function 到 Pandas

[英]Apply Customize Cumulative Function to Pandas

我有以下 dataframe 並喜歡將自定義累積公式應用於列。

我應該如何將它們放入 function 並將它們應用於新列? 非常感謝!

Excel 公式:

new column =(previous period value + 1) * (1 + current period of test_A's value) ^ (1/12) -1

# For example: 2021-12-11 as current period 
=(0.01 + 1 ) * (1 + 0.01 ) ^ (1/12) -1 = 0.0016598

在此處輸入圖像描述

我將通過迭代行來實現這一點。

df['Cumulative'] = np.nan  # create a new column
prev = 0
for i, row in df.iterrows():
    prev = (1 + prev) * (1 + row['test_A']) ** (1/12) - 1
    df.loc[i, 'Cumulative'] = prev

如果性能很重要,我認為最好使用 numba:

df = pd.DataFrame({
    "test_A": [0.01] * 1000,
   
})

    
from numba import jit

@jit(nopython=True)
def f(a):
    d = np.empty(a.shape)
    p = 0
    for i in range(a.shape[0]):
        p = (1 + p) * (1 + a[i]) ** (1/12) - 1
        d[i] = p
    return d


df['Cumulative1'] = f(df['test_A'].to_numpy())
print (df)
     test_A  Cumulative  Cumulative1
0      0.01    0.000830     0.000830
1      0.01    0.001660     0.001660
2      0.01    0.002491     0.002491
3      0.01    0.003322     0.003322
4      0.01    0.004155     0.004155
..      ...         ...          ...
995    0.01    1.283884     1.283884
996    0.01    1.285778     1.285778
997    0.01    1.287675     1.287675
998    0.01    1.289572     1.289572
999    0.01    1.291472     1.291472

[1000 rows x 3 columns]

In [46]: %%timeit
    ...: df['Cumulative'] = np.nan  # create a new column
    ...: prev = 0
    ...: for i, row in df.iterrows():
    ...:     prev = (1 + prev) * (1 + row['test_A']) ** (1/12) - 1
    ...:     df.loc[i, 'Cumulative'] = prev
    ...:     
140 ms ± 998 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [47]: %%timeit 
    ...: df['Cumulative1'] = f(df['test_A'].to_numpy())
    ...: 
133 µs ± 570 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM