In Pandas, how do I apply a function to a row of a dataframe, where each item in the row should be passed to the function as an argument?
In other words, say I have a dataframe with some columns and numerical data in the table. For example, I have height, weight, and age. A simple dataframe packed with numbers.
What I want is to make a new series (and add it to the dataframe) that is the result of some calculation using each item from each row. So I have a function f(height, weight, age), and I want the numerical result from that function stored as its own new column.
So on a given row, I'd have the height, weight, age, and the result of f().
I'm sorry, I explored lots of pandas apply examples and can't find anything that quite does what I have in mind here, though it seems like it should be doable!
Thanks in advance!
Let us take an example in which we have a dataframe with height and weight columns.
We can use the apply function to apply a function to each row, using all columns or only selected columns, as follows:
import pandas as pd

df = pd.DataFrame({"height": [180, 178, 190, 166], "weight": [78, 72, 89, 75]})
print(df)

   height  weight
0     180      78
1     178      72
2     190      89
3     166      75

def bmi(x):
    # x is one row (a Series); height is in cm, weight in kg
    return x.weight / ((x.height / 100) ** 2)

# axis=1 calls bmi once per row
df['bmi'] = df.apply(bmi, axis=1)
print(df)

   height  weight        bmi
0     180      78  24.074074
1     178      72  22.724403
2     190      89  24.653740
3     166      75  27.217303
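
To pass each item in the row as a separate argument, as the question asks, one option is to unpack the row inside the lambda. This is only a minimal sketch: the body of f and the age values are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical three-argument function; the formula is only a placeholder.
def f(height, weight, age):
    return height * weight / age

df = pd.DataFrame({
    "height": [180, 178, 190, 166],
    "weight": [78, 72, 89, 75],
    "age": [30, 40, 25, 50],  # made-up ages for illustration
})

# Pass the columns by name so the argument order is explicit.
df["result"] = df.apply(
    lambda row: f(row["height"], row["weight"], row["age"]), axis=1
)

# Equivalent when the selected column order matches f's parameter order.
df["result_unpacked"] = df[["height", "weight", "age"]].apply(
    lambda row: f(*row), axis=1
)
print(df)
```

Selecting the columns explicitly before unpacking keeps the call safe if the dataframe later gains extra columns.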
Now, if I understood correctly, your function calculates a value from the data in each row (height, weight, and age), and that value goes into a new column of the dataframe, am I right?
You want to do that row by row, and I'm not sure why. Do you want to iterate over your dataframe? I don't have enough information about your function to tell whether that's really needed, but generally I'd avoid that approach, as it's much slower than running a vectorized operation like the one below:
# df['result'] = <whatever your function computes from df['height'], df['weight'], df['age']>
For example, let's assume your function multiplies height by weight and then divides by age, so you can do the following:
df['result'] = (df['height'] * df['weight']) / df['age']
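
If the function uses only element-wise arithmetic, it can often be called directly on whole columns instead of being rewritten inline. The sketch below assumes a hypothetical f with the formula described above and made-up ages; it also computes the row-wise version to check that both give the same values.

```python
import pandas as pd

# Hypothetical function: it uses only arithmetic operators, so it works
# on scalars and on whole pandas Series alike.
def f(height, weight, age):
    return (height * weight) / age

df = pd.DataFrame({
    "height": [180, 178, 190, 166],
    "weight": [78, 72, 89, 75],
    "age": [30, 40, 25, 50],  # made-up ages for illustration
})

# Vectorized: one call on whole columns, no Python-level loop over rows.
df["result"] = f(df["height"], df["weight"], df["age"])

# Row-wise version, computed only to compare against the vectorized result.
row_wise = df.apply(
    lambda row: f(row["height"], row["weight"], row["age"]), axis=1
)
print(df)
print((df["result"] - row_wise).abs().max())
```

Functions that branch on their inputs (for example with if statements) will not work on Series this way and would need apply or a rewritten, vectorized form.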