Pandas: apply a function sequentially using the output from the previous value

Question

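I want to compute a running "residual" for a series. For each row, a value is computed and then added to the value that was previously computed for the row before it.

How can I do this in pandas?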

import numpy as np
import pandas as pd

decay = 0.5
test = pd.DataFrame(np.random.randint(1, 10, 12), columns=['val'])
test
    val
0   4
1   5
2   7
3   9
4   1
5   1
6   8
7   7
8   3
9   9
10  7
11  2

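decayed = []
for i, v in test.iterrows():
    if i == 0:
        # The first row has no previous result, so it is kept as-is
        decayed.append(v.val)
        continue
    # Add the decayed current value to the previously computed result
    d = decayed[i-1] + v.val*decay
    decayed.append(d)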

test['loop_decay'] = decayed
test.head()

    val loop_decay
0   4   4.0
1   5   6.5
2   7   10.0
3   9   14.5
4   1   15.0
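
Written out as a recurrence, the loop above appears to compute the following. The symbols v_i (the value in row i), d_i (the result for row i) and λ (the decay factor, 0.5 here) are notation introduced for this sketch and are not from the original post:

```latex
d_0 = v_0, \qquad d_i = d_{i-1} + \lambda\, v_i \quad (i \ge 1)
% Unrolling the recurrence gives a closed form:
d_i = v_0 + \lambda \sum_{k=1}^{i} v_k
% Check against the sample output: d_1 = 4 + 0.5 \cdot 5 = 6.5
```

Because the previous result enters with coefficient 1, each d_i is a plain running sum of the scaled values, apart from the unscaled first row.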

Answer 1

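Consider a vectorized version with cumsum(): take the cumulative sum of (val * decay) and add the first val.

However, because cumsum() also includes the first row's (val * decay), you need to subtract that term: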

test['loop_decay'] = test.loc[0, 'val'] + (test['val'] * decay).cumsum() - test.loc[0, 'val'] * decay
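
As a quick sanity check, here is a minimal sketch that compares the cumsum() expression with the explicit loop from the question. It assumes the twelve sample values printed in the question; the column name cumsum_decay is made up for this comparison and is not part of the original answer:

```python
import numpy as np
import pandas as pd

decay = 0.5
# Sample values copied from the frame printed in the question.
test = pd.DataFrame({"val": [4, 5, 7, 9, 1, 1, 8, 7, 3, 9, 7, 2]})

# Reference result: the explicit loop, where the first row is kept as-is.
decayed = [float(test["val"].iloc[0])]
for v in test["val"].iloc[1:]:
    decayed.append(decayed[-1] + v * decay)
test["loop_decay"] = decayed

# Vectorized version: cumulative sum of the scaled values, corrected so
# that the first row contributes its full value instead of val * decay.
first = test["val"].iloc[0]
test["cumsum_decay"] = first + (test["val"] * decay).cumsum() - first * decay

# Both columns should agree on every row.
assert np.allclose(test["loop_decay"], test["cumsum_decay"])
print(test.head())
```

Using .iloc[0] instead of a label lookup keeps the sketch working even if the index does not start at 0.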

Answer 2

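You can use pd.Series.shift() to build a dataframe that holds both val[i] and val[i-1], then apply a function along a single axis (axis 1 in this case). Note that this combines each value with the previous row's raw val, not with the previously computed result: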

 # Create a series that shifts the rows by 1
 test['val2'] = test.val.shift()
 # Set the first row on the shifted series to 0
 test.loc[0, 'val2'] = 0
 # Apply the decay formula:
 test['loop_decay'] = test.apply(lambda x: x['val'] + x['val2'] * decay, axis=1)
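
The same shift-based formula can probably be written without apply(), since it is just column arithmetic. The sketch below assumes the first five sample values from the question and uses a made-up column name, shift_decay. It also makes visible that this formula pairs each value with the previous raw val, so its output differs from the loop in the question after the first row:

```python
import pandas as pd

decay = 0.5
# First five sample values from the question.
test = pd.DataFrame({"val": [4, 5, 7, 9, 1]})

# Previous row's val, with 0 filled in for the first row.
prev = test["val"].shift(fill_value=0)

# val[i] + val[i-1] * decay, computed column-wise instead of row by row.
test["shift_decay"] = test["val"] + prev * decay
print(test)
# Row 1 is 5 + 4 * 0.5 = 7.0 here, whereas the loop in the question gives 6.5.
```

If the goal is to carry the previously computed output forward, the cumsum() approach or the explicit loop is the closer match.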

Answer 1
Score: 2 (accepted), 2017-09-18 20:23:44

Answer 2
Score: 1, 2017-09-18 20:03:33