How to apply my function to DataFrame column?

I've defined the following function:

def clearString(myString):
    forbidden = r'/\:*?"<>|'
    for character in forbidden:
        if character in myString:
            myString = myString.replace(character, '')
    return myString

It is meant to remove unwanted characters from file names. I have a DataFrame with book titles in one column, and I'm trying to apply the function to every string in place to clean them. So far I've been unable to: I keep getting the DataFrame back with the data untouched.

I've already tried the apply function, both on the column alone and on the entire DataFrame, and neither yields the desired result, whether I assign the result back to the DataFrame itself, as in:

df = df.apply(clearString)
#Or even
df = clearString(df)

Or define a new one:

df_new = df.apply(clearString)
#Or even
df_new = clearString(df)

Is there something wrong with my function, such as not handling DataFrames properly?

apply isn't working because, by default, DataFrame.apply passes each column to the given function (not each element). In the given examples, clearString receives a Series argument, not a str. The in operator on a Series tests the index labels rather than the values (and on a DataFrame it tests the column names), so none of the forbidden characters are found and the data comes back unchanged.
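One way to confirm this is to record the type of the argument that the function actually receives. The sketch below assumes a small made-up DataFrame with a Title column; the spy helper and the seen_types list are hypothetical names introduced only for illustration.

```python
import pandas as pd

def clearString(myString):
    forbidden = r'/\:*?"<>|'
    for character in forbidden:
        if character in myString:
            myString = myString.replace(character, '')
    return myString

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"Title": ["War/Peace", "What?", "Plain"]})

seen_types = []

def spy(arg):
    # Record what DataFrame.apply hands to the function, then delegate
    seen_types.append(type(arg).__name__)
    return clearString(arg)

result = df.apply(spy)

print(seen_types)         # every entry is 'Series', never 'str'
print(result.equals(df))  # True: the titles were not modified
```

Since the function only ever sees whole columns, the membership test never matches and no replacement takes place.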

To apply a function to all the elements of a DataFrame, one can use the applymap method (docs). Note that applymap is deprecated as of pandas 2.1 in favor of DataFrame.map, which does the same thing.

Examples:

# if you wanna replace the old dataframe
df = df.applymap(clearString)

# if you wanna keep the old dataframe
new_df = df.applymap(clearString)
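Since the question is about one column of titles, here is a minimal self-contained sketch of the element-wise approach on a single column using Series.map, followed by the whole-frame variant. The sample titles are made up, and the elementwise name is a hypothetical helper that falls back to applymap on pandas versions without DataFrame.map.

```python
import pandas as pd

def clearString(myString):
    forbidden = r'/\:*?"<>|'
    for character in forbidden:
        if character in myString:
            myString = myString.replace(character, '')
    return myString

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"Title": ["War/Peace", "What?", 'A "quoted" title', "Plain"]})

# Series.map calls clearString once per element, so it receives a str
df["Title"] = df["Title"].map(clearString)

# Whole-frame variant: use DataFrame.map where it exists,
# otherwise fall back to applymap (assumption: an older pandas)
elementwise = df.map if hasattr(df, "map") else df.applymap
cleaned = elementwise(clearString)

print(df["Title"].tolist())
# ['WarPeace', 'What', 'A quoted title', 'Plain']
```

The whole-frame variant only works as written when every column holds strings; a numeric column would make the in test inside clearString raise a TypeError.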

You can use map, or even a combination of apply and map.

If you want to modify a single column you can try these approaches:

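import pandas as pd

df = pd.DataFrame({"Title": ["one ", "two", "three", "four"]})

def clean(title):
    return title.upper()

# Overwrite the column with the cleaned values
df["Title"] = df["Title"].apply(clean)
# OR write the result to a new column
df["Modified_Title"] = df["Title"].apply(clean)
# OR apply row-wise and pick the column inside the lambda
df["Modified_Title1"] = df.apply(lambda row: clean(row["Title"]), axis=1)
# OR apply element-wise to every column of the DataFrame
new_df = df.applymap(clean)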
