How to apply a function with argument to a Pandas dataframe

I am trying to remove all accents from my data. I found a function, but I am not able to apply it to the entire dataframe at once.

import unicodedata
import pandas as pd

def remove_accents(input_str):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    only_ascii = nfkd_form.encode('ASCII', 'ignore')
    return only_ascii


data = {'name': ['Guzmán', 'Molly'],
        'year': [2012, 2012]}
df = pd.DataFrame(data)
df

How can I apply the above function?

Is there a parameter in pandas read_csv that I can use to achieve a similar result?

As others have pointed out, this is pretty straightforward:

df['name'] = df['name'].apply(remove_accents)
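
If the function needs extra arguments besides the cell value, Series.apply can forward them, either positionally through args or as keyword arguments. Here is a minimal sketch; the form parameter and the name remove_accents_with_form are hypothetical, added only to illustrate the call, and are not part of the original function:

```python
import unicodedata
import pandas as pd


def remove_accents_with_form(input_str, form='NFKD'):
    # `form` is a hypothetical extra parameter, added for illustration only.
    decomposed = unicodedata.normalize(form, input_str)
    # Drop the non-ASCII combining marks, then turn the bytes back into str.
    return decomposed.encode('ASCII', 'ignore').decode('ascii')


df = pd.DataFrame({'name': ['Guzmán', 'Molly'], 'year': [2012, 2012]})

# Extra positional arguments go in `args` (a tuple) ...
by_position = df['name'].apply(remove_accents_with_form, args=('NFKD',))
# ... and keyword arguments are passed straight through to the function.
by_keyword = df['name'].apply(remove_accents_with_form, form='NFKD')
```

Both calls should give the same Series; which one to use is mostly a matter of taste.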

Also, in case you are using Python 3, I would recommend changing the last line of your remove_accents function. As written, only_ascii holds bytes rather than text, and it is usually best practice to keep Unicode text as a regular (Python 3) str:

def remove_accents(input_str):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    only_ascii = nfkd_form.encode('ASCII', 'ignore')
    return only_ascii.decode('utf-8')
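
For the other two parts of the question (the whole dataframe, and read_csv), one possible approach is sketched below. It assumes the text columns contain no missing values, since remove_accents would fail on NaN. The city column and the inline CSV text are made up for the example. The converters parameter of read_csv takes a dict mapping column names to functions that are applied to each raw string value as the file is parsed.

```python
import io
import unicodedata
import pandas as pd


def remove_accents(input_str):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return nfkd_form.encode('ASCII', 'ignore').decode('ascii')


df = pd.DataFrame({'name': ['Guzmán', 'Molly'],
                   'city': ['Bogotá', 'Zürich'],  # hypothetical extra column
                   'year': [2012, 2012]})

# Apply the function to every text column; numeric columns are left untouched.
# Assumes no missing values in the text columns.
text_cols = [col for col in df.columns
             if pd.api.types.is_string_dtype(df[col])]
for col in text_cols:
    df[col] = df[col].apply(remove_accents)

# Alternatively, clean the values while reading a CSV file.
# The inline CSV text stands in for a real file path.
csv_text = "name,year\nGuzmán,2012\nMolly,2012\n"
df_csv = pd.read_csv(io.StringIO(csv_text),
                     converters={'name': remove_accents})
```

The converters route needs the column names up front, so the loop over text columns is probably simpler when you do not know in advance which columns hold text.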
