Pandas return value from multiple columns if equal to value in another column
I have a Pandas DataFrame like this:
         A        B          C         D
0    month  month+1  quarter+1  season+1
1   season  month+5  quarter+3  season+2
2      day  month+1  quarter+2  season+1
3     year  month+3  quarter+4  season+2
4  quarter  month+2  quarter+1  season+1
5    month  month+4  quarter+1  season+2
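I want to insert a new column called "E" based on several IF conditions. If column "A" equals "month", return the value in "B"; if column "A" equals "quarter", return the value in "C"; if column "A" equals "season", return the value in "D"; otherwise, return the value in column "A".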
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
I'm having trouble doing this. I tried playing around with a function, but it didn't work. Here is my attempt:
def f(row):
    if row['A'] == 'month':
        val = ['B']
    elif row['A'] == 'quarter':
        val = ['C']
    elif row['A'] == 'season':
        val = ['D']
    else:
        val = ['A']
    return val

df['E'] = df.apply(f, axis=1)
Edit: changed the last else to column "A".
First, I suggest you take a look at: When should I want to use apply() in my code.
I would use Series.replace:
df['E'] = df['A'].replace(['month','quarter','season'],
[df['B'], df['C'], df['D']])
Or numpy.select:
import numpy as np

cond = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
values = [df['B'], df['C'], df['D']]
df['E'] = np.select(cond, values, default=df['A'])
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
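For anyone who wants to run the numpy.select approach end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; everything else follows the snippet above.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "A": ["month", "season", "day", "year", "quarter", "month"],
    "B": ["month+1", "month+5", "month+1", "month+3", "month+2", "month+4"],
    "C": ["quarter+1", "quarter+3", "quarter+2", "quarter+4", "quarter+1", "quarter+1"],
    "D": ["season+1", "season+2", "season+1", "season+2", "season+1", "season+2"],
})

# One boolean mask per rule; for each row np.select uses the first mask that is True.
cond = [df["A"].eq("month"), df["A"].eq("quarter"), df["A"].eq("season")]
values = [df["B"], df["C"], df["D"]]

# Rows that match none of the masks fall back to the value in column A.
df["E"] = np.select(cond, values, default=df["A"])
print(df)
```

The conditions are evaluated for the whole column at once, so no Python-level function is called per row.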
Just use np.select:
c1 = df['A'] == 'month'
c2 = df['A'] == 'quarter'
c3 = df['A'] == 'season'
df['E'] = np.select([c1, c2, c3], [df['B'], df['C'], df['D']], df['A'])
Out[271]:
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
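If the list of rules is likely to grow, the same np.select call can be driven from a dict. This is only a sketch: pick_by_key and its parameters are hypothetical names made up for illustration, not part of pandas or NumPy, and the frame is a shortened copy of the one in the question.

```python
import numpy as np
import pandas as pd


def pick_by_key(frame, key_col, mapping):
    """Hypothetical helper: for each row, take the column that `mapping`
    assigns to the value in `key_col`; keep `key_col` itself if no key matches."""
    conditions = [frame[key_col].eq(key) for key in mapping]
    choices = [frame[col] for col in mapping.values()]
    return np.select(conditions, choices, default=frame[key_col])


# Shortened version of the frame from the question.
df = pd.DataFrame({
    "A": ["month", "season", "day", "quarter"],
    "B": ["month+1", "month+5", "month+1", "month+2"],
    "C": ["quarter+1", "quarter+3", "quarter+2", "quarter+1"],
    "D": ["season+1", "season+2", "season+1", "season+1"],
})

df["E"] = pick_by_key(df, "A", {"month": "B", "quarter": "C", "season": "D"})
print(df["E"].tolist())
```

Adding a rule then means adding one entry to the dict rather than another c4 variable.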
You may need to fix your code like this:
def f(row):
    if row['A'] == 'month':
        val = row['B']
    elif row['A'] == 'quarter':
        val = row['C']
    elif row['A'] == 'season':
        val = row['D']
    else:
        # fall back to column A, as the question's edit specifies
        val = row['A']
    return val

df['E'] = df.apply(f, axis=1)
Note: you forgot to include row:
val = ['B'] # before
val = row['B'] # after
Edit: this is only meant to point out the problem in the code; for a better approach, see the other answers that use numpy.select.
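To see what the missing row actually does, here is a small sketch on a cut-down two-column frame. The frame and the lambda forms are illustrative and not the asker's exact code, but the difference between ['B'] and row['B'] is the same.

```python
import pandas as pd

# Cut-down frame: only the columns needed to show the difference.
df = pd.DataFrame({"A": ["month", "day"], "B": ["month+1", "month+1"]})

# Before: ['B'] is a list literal containing the string 'B', not a lookup.
buggy = df.apply(lambda row: ["B"] if row["A"] == "month" else ["A"], axis=1)

# After: row['B'] reads the value of column B in the current row.
fixed = df.apply(lambda row: row["B"] if row["A"] == "month" else row["A"], axis=1)

print(buggy.tolist())  # each element is a one-item list holding a column name
print(fixed.tolist())  # each element is the value taken from the row
```

The buggy version runs without raising an error, which is why the problem only shows up when you look at the contents of column E.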