Group by dataframe: use column values from the current and previous row in a function

Question

I've got a dataframe with this kind of structure:

import pandas as pd
from geopy.distance import vincenty

data = {'id': [1, 2, 3, 1, 2 , 3],
        'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
                  [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
       }
df = pd.DataFrame(data)

This is how it looks:

           coord    id
0   [10.1, 30.3]    1
1   [10.5, 32.3]    2
2   [11.1, 31.3]    3
3   [10.1, 30.3]    1
4   [10.5, 32.3]    2
5   [61, 29.1]      3

Now I want to group by id. Then, within each group, I want to pass the coord values of the current and the previous row to a function that computes the distance between the two coordinates.

This is what I've tried:

df.groupby('id')['coord'].apply(lambda x: vincenty(x, x.shift(1)))

vincenty(x, y) expects x to be a pair like (10, 20), and likewise for y, and returns a float.

Obviously, this does not work: the function receives two Series objects instead of two lists. So using x.values.tolist() should probably be the next step, but my understanding ends here. I'd appreciate any ideas on how to tackle this!
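To make the failure concrete, the following minimal sketch records what the lambda actually receives from groupby(...).apply. It reuses the data from the question; the variable name received is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 1, 2, 3],
    'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
              [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
})

# For each id group, return the type name of the object passed to the lambda.
received = df.groupby('id')['coord'].apply(lambda x: type(x).__name__)
print(received)
```

Each group arrives as a whole Series of coordinate lists, and x.shift(1) is again a Series, so a function that expects two single coordinate pairs cannot be called on them directly.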

Answer 1

I think you need to shift the column per group and then apply the function only to the rows where the shifted value is not NaN:

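# Stand-in for the real distance function: it prints its arguments and
# concatenates the two lists so the pairing is visible in the output.
def vincenty(x, y):
    print(x, y)
    return x + y

# Previous row's coord within each id group (NaN for the first row of a group)
df['new'] = df.groupby('id')['coord'].shift()

# Apply the function only to rows that actually have a previous row
m = df['new'].notnull()
df.loc[m, 'out'] = df.loc[m, :].apply(lambda x: vincenty(x['coord'], x['new']), axis=1)
print(df)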
          coord  id           new                       out
0  [10.1, 30.3]   1           NaN                       NaN
1  [10.5, 32.3]   2           NaN                       NaN
2  [11.1, 31.3]   3           NaN                       NaN
3  [10.1, 30.3]   1  [10.1, 30.3]  [10.1, 30.3, 10.1, 30.3]
4  [10.5, 32.3]   2  [10.5, 32.3]  [10.5, 32.3, 10.5, 32.3]
5    [61, 29.1]   3  [11.1, 31.3]    [61, 29.1, 11.1, 31.3]
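The same pattern can be sketched with an actual distance function in place of the stub. The version below is only an illustration under stated assumptions: haversine_km is a hypothetical pure-Python helper (not the vincenty function from the question), it assumes each coord is a (latitude, longitude) pair in degrees, and it uses a spherical Earth with a radius of 6371 km. The column names prev and dist_km are also made up for this example.

```python
import math
import pandas as pd

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees.

    Hypothetical stand-in for a geodesic distance function; assumes a
    spherical Earth with radius 6371 km.
    """
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

df = pd.DataFrame({
    'id': [1, 2, 3, 1, 2, 3],
    'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
              [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
})

# Previous coord within each id group; the first row of each group gets NaN.
df['prev'] = df.groupby('id')['coord'].shift()

# Compute the distance only for rows that have a previous row.
m = df['prev'].notna()
df.loc[m, 'dist_km'] = df.loc[m].apply(
    lambda row: haversine_km(row['coord'], row['prev']), axis=1)
print(df[['id', 'dist_km']])
```

Rows without a predecessor in their group keep NaN in dist_km. If geopy is installed, a call such as geodesic(row['coord'], row['prev']).km from geopy.distance could presumably be swapped in for the helper, since newer geopy releases no longer ship vincenty.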

Answer 1
2 · Accepted · 2018-02-03 10:19:30