Applying a function to a dataframe column?

I have the following function (a one-hot encoding function that takes a column as an input). I basically want to apply it to a column in my dataframe, but can't seem to understand what's going wrong.

def dummies(dataframe, col):
    dataframe[col] = pd.Categorical(dataframe[col])
    pd.concat([dataframe,pd.get_dummies(dataframe[col],prefix = 'c')],axis=1)

df1 = df['X'].apply(dummies)

Guessing something is wrong with how I'm calling it?

You need to make sure you're returning a value from the function; currently you are not. Also, when you apply a function to a column, you are basically passing each value in the column into the function, one at a time, so your function is set up wrong. Typically you'd do it like this:

def function1(value):
    new_value = value*2 #some operation
    return new_value

then:

df['X'].apply(function1)
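
To make the element-wise behaviour concrete, here is a minimal sketch. The sample dataframe and its values are made up for illustration; only `function1` and the column name `X` come from the thread.

```python
import pandas as pd

# Hypothetical sample data, not from the original question
df = pd.DataFrame({"X": [1, 2, 3]})

def function1(value):
    # Receives one cell value per call, not the whole column
    new_value = value * 2
    return new_value

# Series.apply calls function1 once per element and collects the results
doubled = df["X"].apply(function1)
print(doubled.tolist())  # [2, 4, 6]
```

Because the function only ever sees a single value, it cannot build new columns on its own, which is why a one-hot encoder does not fit this calling pattern.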

Currently your function is set up to take the entire df and the name of a column, so it will likely work if you call it like this:

df1 = dummies(df, 'X')

but you still need to add a return statement.
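
Putting both points together, one possible version of `dummies` with the return statement added might look like the sketch below. The sample data is invented, and copying the dataframe before modifying it is an assumption of this sketch rather than something the question asked for.

```python
import pandas as pd

def dummies(dataframe, col):
    # Assumption: work on a copy so the caller's dataframe is left unchanged
    out = dataframe.copy()
    out[col] = pd.Categorical(out[col])
    # One indicator column per category, named c_<category>
    encoded = pd.get_dummies(out[col], prefix="c")
    # Return the combined frame; the original function discarded this result
    return pd.concat([out, encoded], axis=1)

# Hypothetical sample data, not from the original question
df = pd.DataFrame({"X": ["a", "b", "a"], "Y": [1, 2, 3]})

# Call the function directly with the dataframe and column name, not via .apply
df1 = dummies(df, "X")
print(df1.columns.tolist())  # ['X', 'Y', 'c_a', 'c_b']
```

The function is called once on the whole dataframe, so `.apply` is not involved at all.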

If you want to apply it to that one column, you don't need to make a new dataframe; you can assign the result back to the column. This is the syntax for that, assuming dummies is rewritten to take a single value and return the result. Please read the docs.

df['X'] = df['X'].apply(lambda x : dummies(x))
