
Apply function to dataframe row which references previous row's data

I'm trying to set column 'b' of my dataframe based on its value in the row above. Is there any way to do this without iterating through the rows or using decorators on the function passed to df.apply?

Pseudocode:

if row != 0:
    curr_row['b'] = prev_row['b'] + curr_row['a']
else: 
    curr_row['b'] = curr_row['a']

Here's what I've tried:

import pandas as pd

df = pd.DataFrame({'a': [1,2,3,4,5],
                   'b': [0,0,0,0,0]})

df.b = df.apply(lambda row: row.a if row.name < 1 else (df.iloc[row.name-1].b + row.a), axis=1)

Output:

    a   b
0   1   1
1   2   2
2   3   3
3   4   4
4   5   5

Desired output:

    a   b
0   1   1
1   2   3
2   3   6
3   4   10
4   5   15

If I run the apply function a second time on the new df, one more row value of b is correct:

    a   b
0   1   1
1   2   3
2   3   5
3   4   7
4   5   9

This pattern continues if I keep re-running the apply function until the output is correct.

I'm guessing the issue has something to do with the mechanics of how the apply function works, which makes it break when you use a value from the same column you are 'applying' on. That, or I'm just being an idiot somehow (very plausible). Can someone explain this?

Do I have to use decorators to store the previous row, or is there a cleaner way to do this?
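On the "can someone explain this" part, a likely explanation is that apply computes the whole result before anything is written back to df.b, so every df.iloc lookup inside the lambda still reads the old column. The sketch below reproduces the first two passes from the question; the names step, first_pass and second_pass are only illustrative and not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [0, 0, 0, 0, 0]})

# Same lambda as in the question, given a name so it can be reused.
step = lambda row: row.a if row.name < 1 else df.iloc[row.name - 1].b + row.a

# apply() builds the full result first; df.b is untouched while it runs,
# so every lookup of the previous row's b still sees 0.
first_pass = df.apply(step, axis=1)
print(first_pass.tolist())   # [1, 2, 3, 4, 5]

# Only now is the column replaced. A second pass reads these values,
# which is why exactly one more row becomes correct per run.
df['b'] = first_pass
second_pass = df.apply(step, axis=1)
print(second_pass.tolist())  # [1, 3, 5, 7, 9]
```

Each pass can only use the b values left by the previous pass, which matches the row-by-row improvement described above.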

What you are describing is a cumulative sum, so use cumsum():

df = pd.DataFrame({'a': [1,2,3,4,5],
                   'b': [0,0,0,0,0]})
df.assign(b=df.a.cumsum())
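As a quick sanity check, here is a minimal sketch comparing cumsum() against an explicit loop that follows the pseudocode from the question. The loop and the names expected and result are added here for illustration only. Note that assign returns a new DataFrame rather than modifying df, so the result has to be kept.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [0, 0, 0, 0, 0]})

# Explicit version of the pseudocode's recurrence, for comparison only:
# first row takes a, every later row adds a to the previous b.
expected = []
for i, a in enumerate(df['a'].tolist()):
    expected.append(a if i == 0 else expected[-1] + a)

# assign() leaves df unchanged and returns a new DataFrame.
result = df.assign(b=df['a'].cumsum())

print(result['b'].tolist())               # [1, 3, 6, 10, 15]
print(result['b'].tolist() == expected)   # True
```

If the recurrence were something other than a plain running sum, cumsum would not apply directly and a different approach would be needed.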
