
Pandas dataframe apply function to entire column

I have some functions that act on a list and return a list. I would like to create a column in a Pandas DataFrame such that the new column is the list returned by one of these functions acting on another column of the dataframe.

In Python-like pseudocode:

def function(parameter, values):
    ...
    return output_list

df['New Column'] = function(parameter, df['Old Column'])

I have tried different options, including something like the code above, the .apply() method, and others, with no success.

Is there a way to do this? Thank you!

EDIT: See Brian Pendleton's answer for the solution. Columns in a dataframe are pandas Series objects, so you just have to create a Series out of the desired list.

df['New_Column'] = pd.Series(data=function(parameter,list))
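
A minimal sketch of the assignment in the edit above. The `scale` function, the column names, and the sample data are hypothetical stand-ins for the asker's `function` and data. One thing to watch for: a Series assigned to a column is aligned on its index, so if the dataframe does not use the default 0..n-1 index, it may help to pass `index=df.index` or to assign the plain list directly.

```python
import pandas as pd

def scale(parameter, values):
    # Hypothetical stand-in: multiply every element by `parameter`, return a list
    return [parameter * v for v in values]

df = pd.DataFrame({"Old Column": [1, 2, 3]}, index=["a", "b", "c"])

# A plain list is assigned by position, so it only needs the right length
df["From List"] = scale(10, df["Old Column"])

# A Series is aligned on its index; reuse df.index so the labels match
df["From Series"] = pd.Series(scale(10, df["Old Column"]), index=df.index)

# Without index=..., the Series gets labels 0, 1, 2, which match none of
# "a", "b", "c", so every value in the new column ends up as NaN
df["Misaligned"] = pd.Series(scale(10, df["Old Column"]))
```

With the default integer index the third form works as well, which is probably why the one-liner in the edit was enough for the asker.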

If you're sending df["old column"] to the function, then you're sending a pandas.Series object. Why not just operate on that series and return a new series of the same shape? Then you can use your assignment to the new column as you have it.
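
As a rough illustration of this suggestion, the sketch below passes the column to a function that returns a Series of the same shape. The function name `add_offset`, the column names, and the data are made up for the example.

```python
import pandas as pd

def add_offset(parameter, series):
    # Hypothetical example: vectorized arithmetic returns a new Series
    # with the same index and length as the input
    return series + parameter

df = pd.DataFrame({"Old Column": [1, 2, 3]})

# The returned Series shares df's index, so the assignment lines up row by row
df["New Column"] = add_offset(5, df["Old Column"])
```

Because the result keeps the index of the input, no extra conversion is needed before assigning it.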

I think that you want to:

  • apply a function that returns a list
  • store the result in a DataFrame
  • split the list into different columns

Here is an example.

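import numpy as np
import pandas as pd

def my_funct(parameter):
    # Add the parameter to each of 1, 2, 3 and return the result as a list
    return [n + parameter for n in (1, 2, 3)]

# One column holding three random integers between 1 and 9
df = pd.DataFrame(np.random.randint(low=1, high=10, size=3), columns=['my_funct'])
#df['New Column'] = function(parameter, df['Old Column'])
df['my_funct'] = df['my_funct'].apply(my_funct)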

# After the function call, the resulting lists are stored in one column (values vary because the input is random)

       my_funct
0     [5, 6, 7]
1  [10, 11, 12]
2     [6, 7, 8]

# Here is how to split the list into several columns

df = df['my_funct'].apply(pd.Series)

    0   1   2
0   5   6   7
1  10  11  12
2   6   7   8
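
An alternative sketch for the split step, not part of the original answer: `apply(pd.Series)` builds one Series per row, which can be slow on large frames, so constructing a DataFrame from the lists directly may be preferable. The sample lists are copied from the output shown above; the names `split` and `combined` are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({"my_funct": [[5, 6, 7], [10, 11, 12], [6, 7, 8]]})

# Build the wide frame from the lists in one step, keeping the original index
split = pd.DataFrame(df["my_funct"].tolist(), index=df.index)

# Optionally keep the original column alongside the new ones
combined = df.join(split)
```

This assumes every list has the same length; shorter lists would be padded with missing values.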


 