How to apply a custom function to each row of a pandas DataFrame

Question

I want to apply a custom function and create a derived column called pop2050 that is based on two columns already present in my data frame.

import math
import pandas as pd
import sqlite3
conn = sqlite3.connect('factbook.db')
query = "select * from facts where area_land =0;"
facts = pd.read_sql_query(query,conn)
print(list(facts.columns.values))

def final_pop(initial_pop,growth_rate):
    final = initial_pop*math.e**(growth_rate*35)
    return(final)

facts['pop2050'] = facts['population','population_growth'].apply(final_pop,axis=1)

When I run the above code, I get an error. Am I not using the 'apply' function correctly?

Answer 1

With axis=1, apply passes the entire row to your function. Adjust the function like this, assuming your two columns are called initial_pop and growth_rate:

import math

def final_pop(row):
    return row.initial_pop*math.e**(row.growth_rate*35)
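As a minimal sketch of how this fits together, the snippet below builds a small made-up frame (the values are illustrative, not taken from factbook.db) with the question's column names, population and population_growth, and passes a row-based function to apply with axis=1. It assumes the growth rate is stored as a fraction per year.

```python
import math

import pandas as pd

# Toy stand-in for the `facts` table; the values are invented for illustration.
facts = pd.DataFrame({
    "population": [1000, 5000, 0],
    "population_growth": [0.01, 0.0, 0.02],
})

def final_pop(row):
    # With axis=1, `row` is a Series holding one row; columns are reached by name.
    return row["population"] * math.e ** (row["population_growth"] * 35)

# The function receives each row in turn and returns one value per row.
facts["pop2050"] = facts.apply(final_pop, axis=1)
print(facts)
```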

Answer 2

You were almost there:

facts['pop2050'] = facts.apply(lambda row: final_pop(row['population'],row['population_growth']),axis=1)

Using a lambda lets you keep the specific (interesting) parameters listed in your function's signature, rather than bundling them in a 'row'.
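One place where keeping explicit parameters may pay off is when the function takes extra arguments. The sketch below uses invented sample data and adds a hypothetical years parameter (it is not part of the original final_pop) to show the same function reused for two horizons.

```python
import math

import pandas as pd

# Invented sample data using the question's column names.
facts = pd.DataFrame({
    "population": [1000.0, 2000.0],
    "population_growth": [0.02, 0.0],
})

def final_pop(initial_pop, growth_rate, years=35):
    # `years` is a hypothetical extra parameter added for this illustration.
    return initial_pop * math.e ** (growth_rate * years)

# The lambda unpacks the row, so final_pop keeps its explicit scalar parameters.
facts["pop2050"] = facts.apply(
    lambda row: final_pop(row["population"], row["population_growth"]),
    axis=1,
)
facts["pop2030"] = facts.apply(
    lambda row: final_pop(row["population"], row["population_growth"], years=15),
    axis=1,
)
print(facts)
```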

Answer 3

You can achieve the same result without DataFrame.apply(). Pandas Series (that is, DataFrame columns) can be passed directly to NumPy functions and used with built-in Python operators, both of which are applied element-wise. In your case, it is as simple as the following:

import numpy as np

facts['pop2050'] = facts['population'] * np.exp(35 * facts['population_growth'])

This multiplies each element in the population_growth column by 35, applies NumPy's exp() function to the resulting column (35 * population_growth), and then multiplies the result element-wise by population.
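To check that the vectorized expression matches a row-by-row computation, a small comparison along these lines can help. The data is invented for illustration, and the growth rates are assumed to be fractions per year.

```python
import numpy as np
import pandas as pd

# Invented sample data using the question's column names.
facts = pd.DataFrame({
    "population": [1000.0, 2000.0, 3000.0],
    "population_growth": [0.01, 0.0, -0.005],
})

# Vectorized: operates on whole columns at once, no Python-level loop.
vectorized = facts["population"] * np.exp(35 * facts["population_growth"])

# Row-wise equivalent: the lambda is called once per row.
row_wise = facts.apply(
    lambda row: row["population"] * np.exp(35 * row["population_growth"]),
    axis=1,
)

# Both approaches should agree up to floating-point tolerance.
same = np.allclose(vectorized, row_wise)
print(same)
```

On large frames the vectorized form is usually much faster, because the arithmetic runs inside NumPy instead of calling a Python function for every row.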

Answer 4

Define your function:

def function(x):
    # your operation
    return x

Then call it on the column:

df['column'] = df['column'].apply(function)

Note that Series.apply passes each element of a single column to the function, not a whole row.
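A concrete instance of this pattern might look like the following. The frame and the doubling operation are hypothetical placeholders chosen only to make the snippet runnable.

```python
import pandas as pd

# Hypothetical one-column frame for illustration.
df = pd.DataFrame({"column": [1, 2, 3]})

def function(x):
    # Placeholder operation: double each value. `x` is a single element.
    return x * 2

# Series.apply calls `function` once per element of the column.
df["column"] = df["column"].apply(function)
print(df)
```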

4 answers

Answer 1: score 11, accepted, 2016-11-01 03:15:56
Answer 2: score 5, 2016-11-01 03:33:32
Answer 3: score 4, 2018-01-22 21:14:59
Answer 4: score 3, 2019-01-11 11:48:42
