
Pandas rolling_apply for List Elements

I have an accumulated counter whose elements are lists. Each list element is cumulative relative to the previous row.

import pandas as pd
d=pd.DataFrame({'counter': {0: [1,0,2], 1:[1,2,3], 2:[4, 4, 5]}})

I can get the maxcounter using the apply function.

def maxf(x): return max(x.counter)
d['maxcounter']=d.apply(lambda row: maxf(row), axis=1)  

Now I would also like a field "max_increment" (shown as increase_max below): take the difference between the current row and the previous row, then compute the maximum. Could we use rolling_apply for this?

The expected output is as below.

     counter  maxcounter  increase_max
0  [1, 0, 2]           2           NaN
1  [1, 2, 3]           3             2
2  [4, 4, 5]           5             3

Note: counter is a list; each element is incremented by a separate sensor. This is not an optimal structure, but it is what we get for now.

     counter  counter_incr  increase_max  max_incr_index
0  [1, 0, 2]
1  [1, 2, 3]     [0, 2, 1]             2               1
2  [4, 4, 5]     [3, 2, 2]             3               0

"get the difference between the current row and the previous row"

Since your column is not numeric (each value is a list), the easiest approach seems to be a pd.Series.shift followed by an element-wise subtraction of the previous row from the current one. Note that shift(1) aligns each row with the row before it, whereas shift(-1) would pair it with the next row and flip the signs:

>>> import numpy as np
>>> [np.array(c) - np.array(p)
...  for c, p in zip(d.counter, d.counter.shift(1))]
[array([nan, nan, nan]), array([0, 2, 1]), array([3, 2, 2])]

The first row has no predecessor, so its difference is all NaN.
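To get from the per-row differences to the columns in the expected output, one option is to stack the lists into a 2-D NumPy array and work on that instead of using a rolling window (pandas rolling operations expect numeric columns, not lists). The following is only a sketch: it assumes every list in counter has the same length, and the intermediate names values and incr are made up for illustration.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'counter': {0: [1, 0, 2], 1: [1, 2, 3], 2: [4, 4, 5]}})

# Stack the lists into a 2-D array: one row per reading, one column per sensor.
# Assumption: every list has the same length.
values = np.array(d['counter'].tolist())

# Difference between each row and the row before it (one row fewer than values).
incr = np.diff(values, axis=0)

# The first row has no predecessor, so align the results on the remaining
# index labels; row 0 is left as NaN by index alignment.
rest = d.index[1:]
d['counter_incr'] = pd.Series(incr.tolist(), index=rest)
d['increase_max'] = pd.Series(incr.max(axis=1), index=rest)
d['max_incr_index'] = pd.Series(incr.argmax(axis=1), index=rest)

print(d)
```

For the sample data this gives increase_max values of 2 and 3 and max_incr_index values of 1 and 0 for rows 1 and 2. Because row 0 is NaN, both columns come out as floats.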
