Pandas: add a new column based on conditional logic of many other columns
I have a pandas DataFrame like this:
aa bb cc dd ee
a a b b foo
a b a a foo
b a a a bar
b b b b bar
I want to add a new column that depends on whether the value in any of columns 1 to 4 is a.
The result would look like this:
aa bb cc dd ee ff
a a b b foo a
a b a a foo a
b a a a bar a
b b b b bar b
The logic is: if the value in any of columns 1 to 4 is a, then column ff is a; otherwise it is b.
I can define a function and check each column manually, like this:
def some_function(row):
    # Check each of the four columns by name
    if row['aa'] == 'a' or row['bb'] == 'a' or row['cc'] == 'a' or row['dd'] == 'a':
        return 'a'
    return 'b'
But I'm looking for a solution that scales to n columns.
Appreciate any help!
Use numpy.where with a condition built from eq (==) and any, which checks for at least one True per row:
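import numpy as np

cols = ['aa', 'bb', 'cc', 'dd']
# any(axis=1) is True for a row when at least one of the selected columns equals 'a'
df['ff'] = np.where(df[cols].eq('a').any(axis=1), 'a', 'b')
print(df)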
aa bb cc dd ee ff
0 a a b b foo a
1 a b a a foo a
2 b a a a bar a
3 b b b b bar b
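For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the sample table in the question; the intermediate name mask is introduced only for readability.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (an assumption about how df was created).
df = pd.DataFrame({
    'aa': ['a', 'a', 'b', 'b'],
    'bb': ['a', 'b', 'a', 'b'],
    'cc': ['b', 'a', 'a', 'b'],
    'dd': ['b', 'a', 'a', 'b'],
    'ee': ['foo', 'foo', 'bar', 'bar'],
})

cols = ['aa', 'bb', 'cc', 'dd']

# Boolean Series: True where at least one of the selected columns equals 'a'.
mask = df[cols].eq('a').any(axis=1)

# Map True -> 'a', False -> 'b'.
df['ff'] = np.where(mask, 'a', 'b')
print(df)
```

Passing axis=1 by keyword matters here: recent pandas versions no longer accept the axis as a positional argument to any.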
Detail:
print(df[cols].eq('a'))
      aa     bb     cc     dd
0   True   True  False  False
1   True  False   True   True
2  False   True   True   True
3  False  False  False  False
print(df[cols].eq('a').any(axis=1))
0 True
1 True
2 True
3 False
dtype: bool
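Since the question asks for something that scales to n columns, the column list does not have to be typed out by hand. The sketch below is one possible way to do it, not the only one: flag_any is a hypothetical helper name, and selecting "every column except ee" is an assumption about which columns should be checked.

```python
import numpy as np
import pandas as pd

def flag_any(df, cols, target, if_true, if_false):
    """Hypothetical helper: if_true where any of `cols` equals `target`, else if_false."""
    return np.where(df[cols].eq(target).any(axis=1), if_true, if_false)

# Sample data rebuilt from the question's table (an assumption).
df = pd.DataFrame({
    'aa': ['a', 'a', 'b', 'b'],
    'bb': ['a', 'b', 'a', 'b'],
    'cc': ['b', 'a', 'a', 'b'],
    'dd': ['b', 'a', 'a', 'b'],
    'ee': ['foo', 'foo', 'bar', 'bar'],
})

# Build the column list programmatically instead of naming each column.
cols = df.columns.drop('ee')      # every column except 'ee'
# cols = df.columns[:4]           # alternative: the first four columns by position

df['ff'] = flag_any(df, cols, 'a', 'a', 'b')
print(df)
```

Either selection yields the same four columns for this sample; which one fits depends on whether the columns to check are better identified by name or by position.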
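If you need a custom function:
def some_function(row):
    # row is a Series; select the columns of interest and test them together
    if row[cols].eq('a').any():
        return 'a'
    return 'b'
# axis=1 passes each row (not each column) to the function
df['ff'] = df.apply(some_function, axis=1)
print(df)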
aa bb cc dd ee ff
0 a a b b foo a
1 a b a a foo a
2 b a a a bar a
3 b b b b bar b
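As a quick sanity check, the row-wise function and the numpy.where version can be compared directly. This is a small sketch on the question's sample data (reconstructed here as an assumption); the names vectorized and row_wise are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (an assumption).
df = pd.DataFrame({
    'aa': ['a', 'a', 'b', 'b'],
    'bb': ['a', 'b', 'a', 'b'],
    'cc': ['b', 'a', 'a', 'b'],
    'dd': ['b', 'a', 'a', 'b'],
    'ee': ['foo', 'foo', 'bar', 'bar'],
})
cols = ['aa', 'bb', 'cc', 'dd']

def some_function(row):
    # Row-wise check: called once per row by DataFrame.apply.
    if row[cols].eq('a').any():
        return 'a'
    return 'b'

# Vectorized version: one pass over the whole selection.
vectorized = pd.Series(np.where(df[cols].eq('a').any(axis=1), 'a', 'b'), index=df.index)

# Row-wise version: a Python-level function call for every row.
row_wise = df.apply(some_function, axis=1)

# Both approaches should agree element by element.
print((vectorized == row_wise).all())
```

Because apply with axis=1 runs a Python function for each row, the numpy.where version is generally the better default on large frames; the custom function is mainly useful when the per-row logic cannot be expressed with vectorized operations.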