How to execute a function on a group of rows in a pandas DataFrame?

Question

I am trying to implement an algorithm. Let's say the algorithm is executed as the function "xyz".

The function is specifically designed to operate on trajectory data, i.e. (x,y) coordinates.

The function takes two arguments:

the first argument is a list of tuples of (x,y) points,

and the second is a constant value.

It can be illustrated as follows:

 line = [(0,0),(1,0),(2,0),(2,1),(2,2),(1,2),(0,2),(0,1),(0,0)]
 xyz(line, 5.0) #calling the function

Output:

 [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]

This can be easily implemented when there is only one line. But I have a huge data frame as follows:

     id      x     y    x,y
  0  1       0     0    (0,0)
  1  1       1     0    (1,0)
  2  1       2     0    (2,0)
  3  1       2     1    (2,1)
  4  1       2     2    (2,2)
  5  1       1     2    (1,2)
  6  2       1     3    (1,3)
  7  2       1     4    (1,4)
  8  2       2     3    (2,3)
  9  2       1     2    (1,2)
 10  3       2     5    (2,5)
 11  3       3     3    (3,3)
 12  3       1     9    (1,9)
 13  3       4     6    (4,6)

In the above data frame, rows with the same "id" form the points of one separate trajectory/line. I want to apply the above-mentioned function to each of these lines.

We can observe from the df that there are 3 different trajectories with ids 1, 2, 3. Trajectory 1 has its x, y values in rows 0-5, trajectory 2 has its points in rows 6-9, and so on.

How do I apply the function "xyz" to each of these lines, and since the output of this function is again a list of tuples of x,y coordinates, how do I store this list? Note: the output list can contain an arbitrary number of tuples.

Answer 1

I think you need groupby with apply:

print (df.groupby('id')['x,y'].apply(lambda x: xyz(x, 5.0)))

Or:

print (df.groupby('id')['x,y'].apply(xyz, 5.0))
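
To try the two calls above end to end, here is a minimal sketch. The body of `xyz` is not given in the question, so the version below is a hypothetical stand-in that only drops interior points lying exactly on a straight line between their neighbours; it ignores its second argument. The data frame is rebuilt from the table in the question.

```python
import pandas as pd


def xyz(points, epsilon):
    """Hypothetical stand-in for the question's function (an assumption).

    Drops interior points that are exactly collinear with their two
    neighbours. `epsilon` is accepted only to mirror the call signature
    and is not used.
    """
    points = list(points)  # also accepts a pandas Series of tuples
    if len(points) < 3:
        return points
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        # The cross product is zero when prev, cur, nxt lie on one line.
        cross = (cur[0] - prev[0]) * (nxt[1] - cur[1]) \
            - (cur[1] - prev[1]) * (nxt[0] - cur[0])
        if cross != 0:
            kept.append(cur)
    kept.append(points[-1])
    return kept


# Data frame from the question; the 'x,y' column holds (x, y) tuples.
df = pd.DataFrame({
    "id": [1] * 6 + [2] * 4 + [3] * 4,
    "x": [0, 1, 2, 2, 2, 1, 1, 1, 2, 1, 2, 3, 1, 4],
    "y": [0, 0, 0, 1, 2, 2, 3, 4, 3, 2, 5, 3, 9, 6],
})
df["x,y"] = list(zip(df["x"], df["y"]))

# Extra positional arguments after the function are forwarded to it.
result = df.groupby("id")["x,y"].apply(xyz, 5.0)
print(result)
```

Each group is passed to the function as a Series, which is why the stand-in converts its input to a list before indexing.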

Example with the rdp function. Converting each group to a list with tolist is necessary here, otherwise you get KeyError: -1:

print (df.groupby('id')['x,y'].apply(lambda x: rdp(x.tolist(), 5.0)))
#alternative with list
#print (df.groupby('id')['x,y'].apply(lambda x: rdp(list(x), 5.0)))
id
1    [(0, 0), (1, 2)]
2    [(1, 3), (1, 2)]
3    [(2, 5), (4, 6)]
Name: x,y, dtype: object
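
On the "how to store this list" part: the result above is already a Series holding one list per id, so lists of different lengths are not a problem. As a sketch of two possible follow-ups (the variable names are illustrative, and the Series is rebuilt by hand from the output above so the snippet runs on its own), it can be turned into a plain dict or flattened back to one row per point:

```python
import pandas as pd

# Hand-built copy of the groupby/apply output shown above:
# one list of (x, y) tuples per trajectory id.
simplified = pd.Series(
    [[(0, 0), (1, 2)], [(1, 3), (1, 2)], [(2, 5), (4, 6)]],
    index=pd.Index([1, 2, 3], name="id"),
    name="x,y",
)

# Option 1: a plain dict mapping id -> list of points.
by_id = simplified.to_dict()

# Option 2: long format, one row per remaining point.
long_df = simplified.explode().reset_index()
long_df["x"] = [p[0] for p in long_df["x,y"]]
long_df["y"] = [p[1] for p in long_df["x,y"]]
print(long_df)
```

The long format has the same layout as the original data frame, which is convenient if further groupby steps are needed.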
