Is there a better way than using .unique() to write a recursive df.loc[t-1] assignment?

Question

Recursive functions are hard to vectorize because each value at time t depends on the previous value at time t-1.

import pandas
df1 = pandas.DataFrame({'year':range(2020,2024),'a':range(3,7)})
# Set the initial value
t0 = min(df1.year)
df1.loc[df1.year==t0, "x"] = 0

The assignment does not work when the right-hand side is a pandas.core.series.Series:

for t in range(min(df1.year)+1, max(df1.year)+1):
    df1.loc[df1.year==t, "x"] = df1.loc[df1.year==t-1,"x"] + df1.loc[df1.year==t-1,"a"]
print(df1)
#    year  a    x
# 0  2020  3  0.0
# 1  2021  4  NaN
# 2  2022  5  NaN
# 3  2023  6  NaN
print(type(df1.loc[df1.year==t-1,"x"] + df1.loc[df1.year==t-1,"a"]))
# <class 'pandas.core.series.Series'>
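
A likely explanation for the NaN values, sketched here on the assumption that pandas aligns a Series right-hand side on index labels when assigning through .loc: the value computed from the t-1 row carries that row's index label, while the target row has a different label, so no label matches and NaN is written. The frame name `df` and the variables `rhs` and `target` are illustrative only.

```python
import pandas

df = pandas.DataFrame({'year': range(2020, 2024), 'a': range(3, 7)})
df.loc[df.year == 2020, "x"] = 0

# Value computed from the 2020 row: it keeps that row's index label (0)
rhs = df.loc[df.year == 2020, "x"] + df.loc[df.year == 2020, "a"]
print(rhs.index.tolist())     # [0]

# The row being assigned to (2021) has index label 1
target = df.loc[df.year == 2021, "x"]
print(target.index.tolist())  # [1]

# Labels 0 and 1 do not match, so the aligned value is missing
df.loc[df.year == 2021, "x"] = rhs
print(df.loc[1, "x"])         # nan
```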

The assignment works when the right-hand side is a numpy array:

for t in range(min(df1.year)+1, max(df1.year)+1):
    df1.loc[df1.year==t, "x"] = (df1.loc[df1.year==t-1,"x"] + df1.loc[df1.year==t-1,"a"]).unique()
    #break
print(df1)
#    year  a     x
# 0  2020  3   0.0
# 1  2021  4   3.0
# 2  2022  5   7.0
# 3  2023  6  12.0
print(type((df1.loc[df1.year==t-1,"x"] + df1.loc[df1.year==t-1,"a"]).unique()))
# <class 'numpy.ndarray'>

When the .loc[] selection uses year as the index, the assignment works directly:

# Index by year so that rows can be addressed by label
df2 = df1.set_index("year").copy()
# Set the initial value
df2.loc[df2.index.min(), "x"] = 0
for t in range(df2.index.min()+1, df2.index.max()+1):
    df2.loc[t, "x"] = df2.loc[t-1, "x"] + df2.loc[t-1,"a"]
    #break
print(df2)
#       a     x
# year
# 2020  3   0.0
# 2021  4   3.0
# 2022  5   7.0
# 2023  6  12.0
print(type(df2.loc[t-1, "x"] + df2.loc[t-1,"a"]))
# <class 'numpy.float64'>
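
The difference in result types between the two selection styles can be seen in isolation. This is a minimal sketch, assuming a unique year index: a boolean mask returns a Series even when only one row matches, while a single row label plus a single column label returns a scalar. The names `by_mask` and `by_label` are illustrative.

```python
import pandas

df1 = pandas.DataFrame({'year': range(2020, 2024), 'a': range(3, 7)})
df2 = df1.set_index("year")

# Boolean mask: the result is a Series, even with exactly one matching row
by_mask = df1.loc[df1.year == 2020, "a"]
print(type(by_mask))

# Row label and column label on a unique index: the result is a scalar
by_label = df2.loc[2020, "a"]
print(type(by_label))
```

A scalar has no index to align on, which is consistent with the label-based loop assigning without trouble.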

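type(df1.loc[df1.year==t-1,"x"] + df1.loc[df1.year==t-1,"a"]) is a pandas Series, whereas type(df2.loc[t-1, "x"] + df2.loc[t-1,"a"]) is a numpy float. Why do these types differ?
If I don't want to call set_index() before the calculation, is there a better way than using .unique() to write the recursive .loc[] assignment?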
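
One possible alternative to .unique(), offered as a sketch rather than a definitive answer: Series.to_numpy() also drops the index, so the assignment becomes positional, and unlike .unique() it does not collapse repeated values. The example assumes one row per year; the helper variable `prev` is illustrative.

```python
import pandas

df1 = pandas.DataFrame({'year': range(2020, 2024), 'a': range(3, 7)})
# Set the initial value
df1.loc[df1.year == df1.year.min(), "x"] = 0

for t in range(df1.year.min() + 1, df1.year.max() + 1):
    prev = df1.year == t - 1  # mask selecting the previous year's row
    # to_numpy() strips the index, so no label alignment takes place
    df1.loc[df1.year == t, "x"] = (df1.loc[prev, "x"] + df1.loc[prev, "a"]).to_numpy()

print(df1)
```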

See also:

Related Q&A on recursive assignment
Related documentation on [Mutating User Defined Function methods](https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#mutating-with-user-defined-function-udf-methods)

Answer 1

Sorry if I've misunderstood; is this what you want?

df1['x']= df1['a'].cumsum().shift().fillna(0)
print(df1)

Output:

   year  a     x
0  2020  3   0.0
1  2021  4   3.0
2  2022  5   7.0
3  2023  6  12.0
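
Why the one-liner matches the loop: with x at the first year set to 0 and each later x equal to the previous x plus the previous a, x at any year is the sum of all earlier a values, which is a cumulative sum shifted down by one row. The check below is a small sketch that assumes the rows are sorted by year with one row per year; the list `x` and the name `vectorized` are illustrative.

```python
import pandas

df = pandas.DataFrame({'year': range(2020, 2024), 'a': range(3, 7)})

# Explicit recursion: x[t] = x[t-1] + a[t-1], starting from 0
x = [0.0]
for a_prev in df['a'].iloc[:-1]:
    x.append(x[-1] + a_prev)

# Vectorized form: running total of a, shifted one row, first row filled with 0
vectorized = df['a'].cumsum().shift().fillna(0)

print(x)                    # [0.0, 3.0, 7.0, 12.0]
print(vectorized.tolist())  # [0.0, 3.0, 7.0, 12.0]
```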
