
Apply a function with multiple arguments on an entire dataframe in Pandas

I have the following dataframe in pandas:

import numpy as np
import pandas as pd

df = pd.DataFrame({'field_1' : ['a', 'b', np.nan, 'a', 'c'], 'field_2': ['c', 'b', 'a', np.nan, 'c']}, index=[1,2,3,4,5])

I want to apply the following function to the entire dataframe so that each value is replaced with something else.

For example:

def func_replace(value, n):
    if value == 'a':
        return 'This is a'*n
    elif value == 'b':
        return 'This is b'*n
    elif value == 'c':
        return 'This is c'*n
    elif str(value) == 'nan':
        return np.nan
    else:
        return 'The value is not included'

so that the final result would look like this (given n=1):

df = pd.DataFrame({'field_1' : ['This is a', 'This is b', np.nan, 'This is a', 'This is c'], 'field_2': ['This is c', 'This is b', 'This is a', np.nan, 'This is c']}, index=[1,2,3,4,5])

I tried the following:

df.apply(func_replace, args=(1), axis=1)

and a bunch of other options, but it always gives me an error.

I know that I can write a for loop that goes through every column and uses a lambda function to solve this problem, but I feel that there is an easier option.

I feel the solution is easier than I think, but I just can't figure out the correct syntax.

Any help would be really appreciated.

Just modify your function to operate at the level of each individual value and use applymap.

df = pd.DataFrame({'field_1' : ['a', 'b', np.nan, 'a', 'c'], 'field_2': ['c', 'b', 'a', np.nan, 'c']}, index=[1,2,3,4,5])

df
Out[35]: 
  field_1 field_2
1       a       c
2       b       b
3     NaN       a
4       a     NaN
5       c       c

Now, if we define the function as:

def func_replace(value):
    if value == 'a':
        return 'This is a'
    elif value == 'b':
        return 'This is b'
    elif value == 'c':
        return 'This is c'
    elif str(value) == 'nan':
        return np.nan
    else:
        return 'The value is not included'

Calling this function on each value in the DataFrame is very straightforward:

df.applymap(func_replace)
Out[42]: 
     field_1    field_2
1  This is a  This is c
2  This is b  This is b
3        NaN  This is a
4  This is a        NaN
5  This is c  This is c
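
The version above drops the n parameter from the question. Below is a minimal sketch of one way to keep it, assuming it is acceptable to bind n with functools.partial and map the resulting one-argument function over each column with Series.map. This also avoids depending on applymap, which is deprecated since pandas 2.1 in favor of DataFrame.map. The names replace_once and result are illustrative only, and func_replace is condensed compared with the original if/elif chain.

```python
from functools import partial

import numpy as np
import pandas as pd


def func_replace(value, n):
    """Map a single cell to its replacement text, repeated n times."""
    if pd.isna(value):
        # Leave missing values untouched.
        return np.nan
    if value in ('a', 'b', 'c'):
        return f'This is {value}' * n
    return 'The value is not included'


df = pd.DataFrame(
    {'field_1': ['a', 'b', np.nan, 'a', 'c'],
     'field_2': ['c', 'b', 'a', np.nan, 'c']},
    index=[1, 2, 3, 4, 5],
)

# Bind n up front, then map the one-argument function over each column.
replace_once = partial(func_replace, n=1)
result = df.apply(lambda col: col.map(replace_once))
```

Changing n=1 to another integer repeats the replacement text that many times, exactly as the multiplication in the question's function does.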

I think you need:

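def func_replace(df, n):
    # With regex=True, every character other than a, b, or c is substituted,
    # so this assumes each cell holds a single character.
    df_temp = df.replace({r"[^abc]": "The value is not included"}, regex=True)
    # Exact-match replacement of the three known values; NaN is left as-is.
    return df_temp.replace(["a", "b", "c"], ["This is a " * n, "This is b " * n, "This is c " * n])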

df.apply(func_replace, args=(2,))
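
The trailing comma in args=(2,) is significant, and it is probably one of the two reasons the attempt in the question failed. The sketch below illustrates both under simple assumptions: (1) is just the integer 1, not a tuple, and DataFrame.apply passes a whole Series (a column by default, a row with axis=1) to the function, so a comparison such as value == 'a' yields a Series that cannot be used directly in an if test. The helper record_type and the variable names are hypothetical, introduced only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'field_1': ['a', 'b', np.nan, 'a', 'c'],
     'field_2': ['c', 'b', 'a', np.nan, 'c']},
    index=[1, 2, 3, 4, 5],
)

# Parentheses alone do not make a tuple; the trailing comma does.
one_is_int = isinstance((1), int)
one_is_tuple = isinstance((1,), tuple)

# Record what DataFrame.apply actually hands to the function.
seen_types = set()


def record_type(arg, n):
    seen_types.add(type(arg).__name__)
    return arg


df.apply(record_type, args=(1,))

# Using a whole Series in an `if` test raises ValueError,
# because the truth value of a multi-element Series is ambiguous.
try:
    if df['field_1'] == 'a':
        pass
    raised = False
except ValueError:
    raised = True
```

This is why a function written for single values needs an element-wise method, while a function used with df.apply has to work on a whole Series, as the replace-based version above does.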

