
Use a function to return multiple column outputs from specific column inputs using Pandas

I would like to add two new columns to my DataFrame by applying a function that takes its inputs from several specific, pre-existing columns.

Here is my approach, which works for returning one column but not for multiple columns.

Here is my DataFrame:

import pandas as pd

d = {'a': [3,0,2,2],
    'b': [0,1,2,3],
    'c': [1,1,2,3],
    'd': [2,2,1,3]}

df = pd.DataFrame(d)

I'm trying to apply this function:

def myfunc(a,b,c):
    if a > 2 and b > 2:
        print('condition 1',a,b)
        return pd.Series((a,b))
    elif a < 2 and c < 2:
        print('condition 2',a,c)
        return pd.Series((b,c))
    else:
        print('no condition')
        return pd.Series((None,None))

Like this:

df['e'],df['f'] = df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1)

Output:

no condition
no condition
condition 2 0 1
no condition
no condition

DataFrame result:

[Image: screenshot of the resulting DataFrame]

How can I pass in multiple columns and get multiple columns out?

Your function returns a single Series whose values are either NA or a 2-tuple when myfunc matches.

One way to fix it is to return a Series instead, which apply will automatically expand into columns:

def myfunc(col1,col2,col3):
    if col1 == 'x' and col2 == 'y':
        return pd.Series((col1,col2))
    if col2 == 'a' and col3 == 'b':
        return pd.Series(('yes','no'))

Note the double parentheses: the inner pair builds a tuple, which is passed to pd.Series as a single argument. A list would work too.

The issue is with the assignment, not with myfunc.

When you try to unpack a DataFrame as a tuple, it yields the column labels. That's why you get (0, 1) for everything.
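To make the difference concrete, here is a minimal sketch comparing a function that returns a plain tuple with one that returns a pd.Series. The variable names as_tuples and as_columns are made up for illustration, and the DataFrame is the one from the question.

import pandas as pd

df = pd.DataFrame({'a': [3, 0, 2, 2],
                   'b': [0, 1, 2, 3],
                   'c': [1, 1, 2, 3],
                   'd': [2, 2, 1, 3]})

# Returning a plain tuple: apply collects the tuples into one Series.
as_tuples = df.apply(lambda row: (row['a'], row['b']), axis=1)

# Returning a pd.Series: apply expands each element into its own column.
as_columns = df.apply(lambda row: pd.Series((row['a'], row['b'])), axis=1)

print(type(as_tuples).__name__)   # Series
print(type(as_columns).__name__)  # DataFrame
print(list(as_columns.columns))   # [0, 1]

The expanded columns get the default integer labels 0 and 1 because the returned Series has no explicit index.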

df['e'], df['f'] = pd.DataFrame([[8, 9]] * 1000000, columns=['Told', 'You'])
print(df)

   a  b  c  d     e    f
0  3  0  1  2  Told  You
1  0  1  1  2  Told  You
2  2  2  2  1  Told  You
3  2  3  3  3  Told  You
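The same behavior can be seen in isolation. This is a small sketch with a three-row frame; the names result, labels, first and second are hypothetical and only serve the illustration.

import pandas as pd

result = pd.DataFrame([[8, 9]] * 3, columns=['Told', 'You'])

# Iterating over a DataFrame walks its column labels, not its rows or values.
labels = list(result)
print(labels)  # ['Told', 'You']

# Tuple unpacking uses the same iteration, so each name receives one label.
first, second = result
print(first, second)  # Told You

Assigning one of those labels to df['e'] then broadcasts that single scalar down the whole column.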

Use join

df.join(df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1))

Or pd.concat

pd.concat([df, df.apply(lambda x: myfunc(x['a'],x['b'],x['c']),axis=1)], axis=1)

Both give the following (the new columns come out labeled 0 and 1 by default; they are shown here as e and f):

   a  b  c  d    e    f
0  3  0  1  2  NaN  NaN
1  0  1  1  2  1.0  1.0
2  2  2  2  1  NaN  NaN
3  2  3  3  3  NaN  NaN
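If the new columns should be named e and f directly, one possible approach is to relabel the applied result before joining. This is a sketch rather than part of the original answer: new_cols and out are hypothetical names, and the print calls are dropped from myfunc for brevity.

import pandas as pd

df = pd.DataFrame({'a': [3, 0, 2, 2],
                   'b': [0, 1, 2, 3],
                   'c': [1, 1, 2, 3],
                   'd': [2, 2, 1, 3]})

def myfunc(a, b, c):
    # Same branching as in the question, without the print calls.
    if a > 2 and b > 2:
        return pd.Series((a, b))
    elif a < 2 and c < 2:
        return pd.Series((b, c))
    return pd.Series((None, None))

# apply returns a DataFrame whose columns are labeled 0 and 1.
new_cols = df.apply(lambda x: myfunc(x['a'], x['b'], x['c']), axis=1)

# Relabel them, then join on the shared row index.
new_cols.columns = ['e', 'f']
out = df.join(new_cols)

print(list(out.columns))  # ['a', 'b', 'c', 'd', 'e', 'f']

Because join returns a new DataFrame, the result has to be assigned (here to out) for the columns to be kept.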

