Apply function row wise to pandas dataframe

I have to calculate the distance along a Hilbert curve from 2D coordinates. Using the hilbertcurve package, I built my own "hilbert" function to do so. The coordinates are stored in a dataframe (col_1 and col_2). As you can see, my function works when applied to two values (test).

However, it does not work when applied row-wise via the apply function. Why is this? What am I doing wrong here? I need an additional column "hilbert" with the Hilbert distances for the x- and y-coordinates given in columns "col_1" and "col_2".

import pandas as pd
from hilbertcurve.hilbertcurve import HilbertCurve

df = pd.DataFrame({'ID': ['1', '2', '3'],
                   'col_1': [0, 2, 3],
                   'col_2': [1, 4, 5]})


def hilbert(x, y):
    n = 2
    p = 7
    hilcur = HilbertCurve(p, n)
    dist = hilcur.distance_from_coordinates([x, y])
    return dist


test = hilbert(df.col_1[2], df.col_2[2])

df["hilbert"] = df.apply(hilbert(df.col_1, df.col_2), axis=0)

The last command ends in an error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thank you for your help!

Since you have hilbert(df.col_1, df.col_2) in the apply, that immediately tries to call your function with the full pd.Series objects for those two columns, triggering that error. What you should be doing is:

df.apply(lambda x: hilbert(x['col_1'], x['col_2']), axis=1)

so that the given lambda function is applied to each row.
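To see why the original call fails before apply even starts, here is a minimal sketch. It does not use the hilbertcurve package: scalar_only is a hypothetical stand-in whose `if` check is the kind of scalar-only code that cannot handle a whole Series. What hilbertcurve does internally is not shown in this thread, so treat that part as an assumption.

```python
import pandas as pd

df = pd.DataFrame({"col_1": [0, 2, 3], "col_2": [1, 4, 5]})


def scalar_only(x, y):
    # Hypothetical stand-in for hilbert(): the `if` needs one True/False value,
    # which a comparison on a whole Series cannot provide.
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    return x + y


# The argument expression is evaluated first, so scalar_only is called
# with two whole Series before apply() ever runs.
try:
    df.apply(scalar_only(df.col_1, df.col_2), axis=1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing a callable defers the call: apply() hands it one row at a time.
row_wise = df.apply(lambda row: scalar_only(row["col_1"], row["col_2"]), axis=1)
```

Here error_message holds the same "truth value of a Series is ambiguous" text as in the question, while row_wise holds one result per row.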

You have to set axis to 1, because you want to apply your function to the rows, not the columns.
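As a small illustration of what the axis argument changes, the sketch below uses the column names from the question with simple sums in place of the hilbert function:

```python
import pandas as pd

df = pd.DataFrame({"col_1": [0, 2, 3], "col_2": [1, 4, 5]})

# axis=0: the function receives each column as a Series (one call per column).
per_column = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series (one call per row),
# so individual values can be looked up by column name.
per_row = df.apply(lambda row: row["col_1"] + row["col_2"], axis=1)
```

per_column has one entry per column, whereas per_row has one entry per row, which is the shape needed for a new column.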

You can define a lambda function that applies hilbert to only the two coordinate columns, like this:

df['hilbert'] = df.apply(lambda row: hilbert(row['col_1'], row['col_2']), axis=1)
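An alternative that avoids apply is to zip the two columns and call the function once per pair. This is only a sketch: add_pairwise_column is a hypothetical helper, not part of pandas or hilbertcurve, and a placeholder function stands in for hilbert so the example runs without that package.

```python
import pandas as pd


def add_pairwise_column(frame, func, x_col, y_col, out_col):
    # Hypothetical helper: call func(x, y) once per row and store the results
    # in a new column on a copy, leaving the input frame untouched.
    result = frame.copy()
    result[out_col] = [func(x, y) for x, y in zip(frame[x_col], frame[y_col])]
    return result


df = pd.DataFrame({"ID": ["1", "2", "3"],
                   "col_1": [0, 2, 3],
                   "col_2": [1, 4, 5]})

# Placeholder function; with the question's code this would be hilbert.
out = add_pairwise_column(df, lambda x, y: x * 10 + y, "col_1", "col_2", "hilbert")
```

With the question's setup, the call would presumably be add_pairwise_column(df, hilbert, "col_1", "col_2", "hilbert").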
