How to append strings inside dataframe cells based on column values
Given a dataframe:
import pandas as pd
df = pd.DataFrame(data= {'Col1': ['No', 'Yes', 'No', 'Maybe'], 'Col2': ['Yes', 'No', 'No', 'No'], 'Result': ''})
I want to populate Result with a list of messages, appending to it based on the value in each column. The rules are:
If the value is 'Yes', keep the current value of Result; if the value is 'Maybe', append 'Attention needed (insert column name)'; if the value is 'No', append 'Failure (insert column name)'.
Not very pretty, but you could create a dict, then use stack, map and groupby with a join aggregation:
d = {'No': 'Failure', 'Maybe': 'Attention needed'}
s = df[['Col1', 'Col2']].stack().map(d).dropna()
df['Result'] = (s + ' ' + s.index.get_level_values(1)).groupby(level=0).agg(', '.join)
[out]
    Col1  Col2                               Result
0     No   Yes                         Failure Col1
1    Yes    No                         Failure Col2
2     No    No           Failure Col1, Failure Col2
3  Maybe    No  Attention needed Col1, Failure Col2
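The chain above can be unpacked into named steps. The following is a minimal sketch, not the answerer's code: the intermediate names (stacked, messages, labelled) and the extra all-'Yes' row are assumptions added for illustration. It also shows one way to handle a row that produces no message, which the groupby result would otherwise leave as NaN.

```python
import pandas as pd

# Same data as the question, plus a hypothetical all-'Yes' row.
df = pd.DataFrame({'Col1': ['No', 'Yes', 'No', 'Maybe', 'Yes'],
                   'Col2': ['Yes', 'No', 'No', 'No', 'Yes']})

d = {'No': 'Failure', 'Maybe': 'Attention needed'}

# Long format: one entry per (row label, column name) pair.
stacked = df[['Col1', 'Col2']].stack()

# 'Yes' is not a key of d, so it maps to NaN and is dropped.
messages = stacked.map(d).dropna()

# Append the column name, taken from the second index level.
labelled = messages + ' ' + messages.index.get_level_values(1)

# Join the messages per original row. Rows without any message are
# absent from the grouped result, so reindex and fill with ''.
result = labelled.groupby(level=0).agg(', '.join)
df['Result'] = result.reindex(df.index, fill_value='')
```

Whether an empty string or NaN is the better placeholder for an all-'Yes' row depends on how Result is used downstream.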
Try this one-liner using a lambda function:
df['Result'] = df[['Col1','Col2']].apply(lambda x: 'Failure Col1' if (x['Col1']=='No' and x['Col2']=='Yes') else ('Failure Col2' if (x['Col2']=='No' and x['Col1']=='Yes') else ('Failure Col1, Failure Col2' if (x['Col1']=='No' and x['Col2']=='No') else("Attention needed Col1, Failure Col2" if (x['Col1']=='Maybe' and x['Col2']=='No') else None))), axis=1)
Output:
    Col1  Col2                               Result
0     No   Yes                         Failure Col1
1    Yes    No                         Failure Col2
2     No    No           Failure Col1, Failure Col2
3  Maybe    No  Attention needed Col1, Failure Col2
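The lambda above enumerates only the four combinations that occur in the sample data, so any other combination yields None. A more general row-wise variant might look like the sketch below; PREFIXES and describe_row are hypothetical names introduced here, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Col1': ['No', 'Yes', 'No', 'Maybe'],
                   'Col2': ['Yes', 'No', 'No', 'No']})

# Hypothetical lookup: values that produce a message, and its prefix.
PREFIXES = {'No': 'Failure', 'Maybe': 'Attention needed'}

def describe_row(row):
    """Build 'prefix column' for each flagged cell and join them."""
    parts = [f'{PREFIXES[value]} {column}'
             for column, value in row.items()
             if value in PREFIXES]
    return ', '.join(parts)

df['Result'] = df[['Col1', 'Col2']].apply(describe_row, axis=1)
```

This stays row-wise (and so is no faster than the lambda), but it covers every combination of values and any number of checked columns.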
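You can first build the result column as a numpy array while iterating over the dataframe columns and checking their values; then add the new Result column and drop the old one.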
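No code accompanies this suggestion, so the following is only one possible reading of it, sketched under assumptions: the names prefixes, check_columns, parts and new_result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': ['No', 'Yes', 'No', 'Maybe'],
                   'Col2': ['Yes', 'No', 'No', 'No'],
                   'Result': ''})

prefixes = {'No': 'Failure', 'Maybe': 'Attention needed'}
check_columns = ['Col1', 'Col2']

# One list of message parts per row, filled column by column.
parts = [[] for _ in range(len(df))]
for column in check_columns:
    for i, value in enumerate(df[column].to_numpy()):
        if value in prefixes:
            parts[i].append(f'{prefixes[value]} {column}')

# Build the new column as a numpy array of joined strings.
new_result = np.array([', '.join(p) for p in parts], dtype=object)

# Drop the old Result column and attach the new one.
df = df.drop(columns='Result')
df['Result'] = new_result
```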
Construct a dictionary to replace the values in df, use * and + to build the appropriate message strings, then join them and assign the result to df.Result:
d = {'Yes': '', 'No': 'Failure ', 'Maybe': 'Attention needed '}
df1 = df[['Col1', 'Col2']]
df['Result'] = ((df1.replace(d)
                 + df1.ne('Yes').values * df1.columns.values).agg(','.join, axis=1)
                .str.strip(','))
Or
df['Result'] = ((df1.replace(d)
                 + df1.ne('Yes').values * (df1.columns+',').values).sum(1)
                .str.strip(','))
Out[267]:
    Col1  Col2                              Result
0     No   Yes                        Failure Col1
1    Yes    No                        Failure Col2
2     No    No           Failure Col1,Failure Col2
3  Maybe    No  Attention needed Col1,Failure Col2
Here is the detail:
df1.replace(d) + df1.ne('Yes').values * df1.columns.values
Out[268]:
                    Col1          Col2
0           Failure Col1
1                         Failure Col2
2           Failure Col1  Failure Col2
3  Attention needed Col1  Failure Col2
((df1.replace(d) + df1.ne('Yes').values * df1.columns.values).agg(','.join, axis=1)
 .str.strip(','))
Out[269]:
0                          Failure Col1
1                          Failure Col2
2             Failure Col1,Failure Col2
3    Attention needed Col1,Failure Col2
dtype: object
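The * step relies on numpy multiplying a boolean array by an object array of strings element by element, so that True keeps the column name and False yields an empty string. A small standalone sketch of just that step, with made-up mask and names arrays:

```python
import numpy as np

# True where a cell is not 'Yes' (hypothetical 2x2 example).
mask = np.array([[False, True],
                 [True, True]])

# Column names as an object array, so * falls back to Python's
# bool * str (True * 'Col1' == 'Col1', False * 'Col1' == '').
names = np.array(['Col1', 'Col2'], dtype=object)

# Broadcasting applies the names across every row of the mask.
labels = mask * names
```

Here labels holds '' wherever the mask is False and the column name wherever it is True, which is what gets concatenated onto the replaced prefixes.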