Grouped-By DataFrame: Use column-values in current and previous row in Function

I've got a dataframe with this kind of structure:

import pandas as pd
from geopy.distance import vincenty

data = {'id': [1, 2, 3, 1, 2, 3],
        'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
                  [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
        }
df = pd.DataFrame(data)

This is how it looks:

           coord    id
0   [10.1, 30.3]    1
1   [10.5, 32.3]    2
2   [11.1, 31.3]    3
3   [10.1, 30.3]    1
4   [10.5, 32.3]    2
5   [61, 29.1]      3

Now, I want to group by id. Then, within each group, I want to take the coord values of the current and the previous row and pass them to a function that computes the distance between the two coordinates.

This is what I've tried:

df.groupby('id')['coord'].apply(lambda x: vincenty(x, x.shift(1)))

vincenty(x, y) expects x to be a pair like (10, 20), the same for y, and returns a float.

Obviously, this does not work: the function receives two Series objects instead of two lists. So using x.values.tolist() is probably the next step. However, my understanding ends here, so I'd appreciate any ideas on how to tackle this!
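To make the failure concrete, here is a minimal sketch that reuses the sample data from the question and inspects what each group looks like. It assumes nothing beyond pandas itself; the `seen` list is only an illustrative helper. Iterating over the grouped column yields the same per-group objects that `apply` would hand to the lambda.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 1, 2, 3],
    'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
              [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
})

# Record the type of each group and of its shifted copy.
seen = []
for key, group in df.groupby('id')['coord']:
    shifted = group.shift(1)
    seen.append((type(group).__name__, type(shifted).__name__, len(group)))
    # The first element of the shifted Series has no predecessor, so it is missing.
    assert pd.isna(shifted.iloc[0])

print(seen)
```

Both arguments are whole Series (one element per row of the group), not single coordinate pairs, which is why a function that expects two points cannot be called on them directly.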

I think you need to shift the column per group and then apply the function only to the rows where the shifted value is not NaN:

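# Stand-in for the real distance function, used only to show which
# arguments arrive; x + y simply concatenates the two lists.
def vincenty(x, y):
    print(x, y)
    return x + y

# Previous row's coord within each id group (NaN for the first row of a group)
df['new'] = df.groupby('id')['coord'].shift()

# Call the function only on rows that actually have a previous row
m = df['new'].notnull()
df.loc[m, 'out'] = df.loc[m, :].apply(lambda x: vincenty(x['coord'], x['new']), axis=1)
print(df)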
          coord  id           new                       out
0  [10.1, 30.3]   1           NaN                       NaN
1  [10.5, 32.3]   2           NaN                       NaN
2  [11.1, 31.3]   3           NaN                       NaN
3  [10.1, 30.3]   1  [10.1, 30.3]  [10.1, 30.3, 10.1, 30.3]
4  [10.5, 32.3]   2  [10.5, 32.3]  [10.5, 32.3, 10.5, 32.3]
5    [61, 29.1]   3  [11.1, 31.3]    [61, 29.1, 11.1, 31.3]
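The `out` column above holds concatenated lists only because the stand-in function returns `x + y`. Below is a hedged sketch of the same shift-and-mask pattern with a function that returns a real distance. The helper `haversine_km` and the column names `prev` and `dist_km` are illustrative assumptions, not part of the original answer, and the coordinates are assumed to be (latitude, longitude) in degrees. The haversine formula is swapped in so the example needs only the standard library and pandas; `vincenty` was removed in geopy 2.0, where `geopy.distance.geodesic` is the replacement, and it could be substituted here.

```python
import math

import pandas as pd


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # 6371.0088 km is the mean Earth radius (a spherical approximation).
    return 2 * 6371.0088 * math.asin(math.sqrt(a))


df = pd.DataFrame({
    'id': [1, 2, 3, 1, 2, 3],
    'coord': [[10.1, 30.3], [10.5, 32.3], [11.1, 31.3],
              [10.1, 30.3], [10.5, 32.3], [61, 29.1]],
})

# Previous coordinate within each id group; missing for the first row per group.
df['prev'] = df.groupby('id')['coord'].shift()

# Compute the distance only where a previous coordinate exists.
m = df['prev'].notna()
df.loc[m, 'dist_km'] = df.loc[m].apply(
    lambda row: haversine_km(row['coord'], row['prev']), axis=1)

print(df[['id', 'dist_km']])
```

Rows without a predecessor keep NaN in `dist_km`, so the first observation of each id is easy to filter out later.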
