
Calling a Python function/class that takes an entire pandas dataframe or series as input, for all rows in another dataframe


I have a Python class that is initialized with a geopandas Series or DataFrame (I am working with geopandas specifically, but I imagine the solution is the same as for pandas). This class has attributes and methods that use the various columns of the Series/DataFrame. Separately, I have a DataFrame with many rows. I would like to iterate through this DataFrame (ideally in an efficient or parallel manner, since the rows are independent of each other), call a method of the class for each row (i.e. each Series), and append the results to the DataFrame as a new column. But I am having trouble with this. With the standard list comprehension or pandas apply() approaches, I can call a function on a single column like this, e.g.:

gdf1['function_return_col'] = list(map((lambda f: my_function(f)), gdf2['date']))

But if said function (or in my case, class) needs the entire gdf, and I call it like this:

gdf1['function_return_col'] = list(map((lambda f: my_function(f)), gdf2))

It does not work, because my_function() takes a DataFrame or Series, while what is actually passed to it is the column names (strings) of gdf2.
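The behavior described above can be reproduced with a small check. This is a minimal sketch using plain pandas rather than geopandas, and the frame and its column names are made up for illustration: iterating a DataFrame directly yields its column labels, while iterrows() yields each row as a Series.

```python
import pandas as pd

# Hypothetical stand-in for gdf2; the columns are illustrative only.
df = pd.DataFrame({'date': ['2021-01-01', '2021-01-02'], 'value': [1, 2]})

# map() iterates the DataFrame itself, which yields column labels, not rows.
items_from_map = list(map(lambda f: f, df))
print(items_from_map)

# iterrows() yields (index, row) pairs, and each row is a pandas Series.
row_types = [type(row).__name__ for _, row in df.iterrows()]
print(row_types)
```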

How can I apply a function to all rows in a DataFrame if said function takes an entire DataFrame/Series and not just selected column(s)? In my specific case, since it is a method of a class, I would like to do this, or something similar, to call the method on every row of a DataFrame:

gdf1['function_return_col'] = list(map((lambda f: my_class(f).my_method()), gdf2))

Or am I just thinking about this in entirely the wrong way?

Have you tried the pandas DataFrame method apply()? With axis=1, the function is called once per row and receives that row as a Series.
Here is an example of using it along both the row axis and the column axis.

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2], 'B': [10, 20]})

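# axis=0: the function receives each column as a Series
df1 = df.apply(np.sum, axis=0)
print(df1)

# axis=1: the function receives each row as a Series
df1 = df.apply(np.sum, axis=1)
print(df1)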
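Applied to the class-based case from the question, the same idea might look like the sketch below. MyClass and my_method are hypothetical stand-ins for the asker's class (its real constructor and method are not shown in the question), and plain pandas is used in place of geopandas.

```python
import pandas as pd


class MyClass:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, row):
        # row is a single pandas Series: one row of the DataFrame
        self.row = row

    def my_method(self):
        # Illustrative computation that uses several columns of the row
        return self.row['A'] + self.row['B']


df = pd.DataFrame({'A': [1, 2], 'B': [10, 20]})

# axis=1 passes each row as a Series, so the class sees every column of it.
df['function_return_col'] = df.apply(
    lambda row: MyClass(row).my_method(), axis=1
)
print(df)
```

If the result is assigned to a different DataFrame, pandas aligns the values on the index, so the two frames need matching indexes for the rows to line up. Note that apply() with axis=1 still runs row by row in Python; it is convenient rather than parallel.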


 