简体   繁体   English

如果等于另一列中的值,则熊猫从多列返回值

[英]Pandas return value from multiple columns if equal to value in another column

I have a Pandas dataframe like this:我有一个像这样的 Pandas 数据框:

  A      B        C         D
0 month   month+1 quarter+1 season+1
1 season  month+5 quarter+3 season+2
2 day     month+1 quarter+2 season+1
3 year    month+3 quarter+4 season+2
4 quarter month+2 quarter+1 season+1
5 month   month+4 quarter+1 season+2

I would like to insert a new column called 'E' based on several IF conditions.我想根据几个 IF 条件插入一个名为“E”的新列。 If column 'A' equals 'month' then return values in 'B', if column 'A' equals 'quarter' then return values in 'C', if column 'A' equals 'season' then return values in 'D', and if not then return values in column 'A'如果“A”列等于“月”,则返回“B”中的值,如果“A”列等于“季度”,则返回“C”中的值,如果“A”列等于“季节”,则返回“D”中的值,如果不是,则返回“A”列中的值

  A      B        C         D        E
0 month   month+1 quarter+1 season+1 month+1
1 season  month+5 quarter+3 season+2 season+2
2 day     month+1 quarter+2 season+1 day
3 year    month+3 quarter+4 season+2 year
4 quarter month+2 quarter+1 season+1 quarter+1
5 month   month+4 quarter+1 season+2 month+4

I am having trouble doing this.我在做这件事时遇到了麻烦。 I have tried playing around with a function but it did not work.我试过玩弄一个函数,但它没有用。 See my attempt:看我的尝试:

def f(row):
    if row['A'] == 'month':
        val = ['B']
    elif row['A'] == 'quarter':
        val = ['C']
    elif row['A'] == 'season':
        val = ['D']
    else:
        val = ['A']
    return val

df['E'] = df.apply(f, axis=1)

EDITED: to change the last else to column 'A'编辑:将最后一个else更改为“A”列

Frist, I recommend you see: When should I want to use apply() in my code.首先,我建议你看看: When should I want to use apply() in my code.

I would use Series.replace我会使用Series.replace

df['E'] = df['A'].replace(['month','quarter','season'],
                          [df['B'], df['C'], df['D']]) 

or numpy.selectnumpy.select

cond = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
values= [df['B'], df['C'], df['D']]
df['E']=np.select(cond,values,default=df['A'])

  A      B        C         D        E
0 month   month+1 quarter+1 season+1 month+1
1 season  month+5 quarter+3 season+2 season+2
2 day     month+1 quarter+2 season+1 day
3 year    month+3 quarter+4 season+2 year
4 quarter month+2 quarter+1 season+1 quarter+1
5 month   month+4 quarter+1 season+2 month+4

Just use np.select只需使用np.select

c1 = df['A'] == 'month'
c2 = df['A'] == 'quarter'
c3 = df['A'] == 'season'

df['E'] = np.select([c1, c2, c3], [df['B'], df['C'], df['D']], df['A'])

Out[271]:
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4

You probably need to fix your code like this:您可能需要像这样修复您的代码:

def f(row):
    if row['A'] == 'month':
        val = row['B']
    elif row['A'] == 'quarter':
        val = row['C']
    elif row['A'] == 'season':
        val = row['D']
    else:
        val = row['D']
    return val

df['E'] = df.apply(f, axis=1)

note: you forgot to include row注意:你忘了包括row

val = ['B'] # before
val = row['B'] # after

Edit: This is just to point out the problem in the code, for better approaches check out the other answers related to the usage of numpy.select编辑:这只是为了指出代码中的问题,为了更好的方法,请查看与使用numpy.select相关的其他答案

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM