
Create a new column based on value in another column

I have a data frame:

 id|concept    |description
 12|           |rewards member
 12|tier one   |
 12|not avail  |rewards member

GOAL: Create a new column final_desc with the content of either the concept or the description column.

There are 4 possible scenarios:

  1. There is a value in the concept column and not in description, in which case final_desc is the value in concept.

  2. There is a value in the description column and not in concept, in which case final_desc is the value in description.

  3. The value in the concept column is 'not avail', in which case final_desc is the value in description.

  4. Both the concept and description columns are empty, in which case final_desc is empty.

I tried using a where statement, but that does not account for scenario 3.

df['final_desc'] = np.where(df['concept'].isnull(), df['description'], df['concept'])

I think I need a custom function, but I am not sure how to write one that works across columns.
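For reference, a row-wise custom function of the kind asked about could look like the sketch below. It is only an illustration: the helper name pick_desc is hypothetical, empty cells are assumed to be NaN (as the isnull() call in the attempt implies) rather than empty strings, and a fourth row has been added to the sample data to cover scenario 4.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus an assumed fourth row for scenario 4.
df = pd.DataFrame({
    "id": [12, 12, 12, 12],
    "concept": [np.nan, "tier one", "not avail", np.nan],
    "description": ["rewards member", np.nan, "rewards member", np.nan],
})

def pick_desc(row):
    """Hypothetical helper: keep concept unless it is missing or 'not avail'."""
    concept = row["concept"]
    if pd.notna(concept) and concept != "not avail":
        return concept
    # Scenarios 2, 3 and 4: fall back to description (NaN if that is empty too).
    return row["description"]

# axis=1 passes each row to the function as a Series.
df["final_desc"] = df.apply(pick_desc, axis=1)
print(df)
```

A row-wise apply is easy to read, but it tends to be slower than vectorized approaches on large frames.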

You can combine replace with ffill/bfill:

df['final_desc'] = (df[['concept', 'description']].replace('not avail', np.nan)
                     .bfill(axis=1)['concept']
                   )

Output:

   id    concept     description      final_desc
0  12        NaN  rewards member  rewards member
1  12   tier one             NaN        tier one
2  12  not avail  rewards member  rewards member
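To see why this works, the chain can be broken into separate steps. The sketch below rebuilds the sample frame, assuming the empty cells are NaN, and adds a hypothetical fourth row where both columns are empty so that scenario 4 is covered as well. The intermediate names cleaned and filled are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus an assumed fourth row for scenario 4.
df = pd.DataFrame({
    "id": [12, 12, 12, 12],
    "concept": [np.nan, "tier one", "not avail", np.nan],
    "description": ["rewards member", np.nan, "rewards member", np.nan],
})

# Step 1: treat the placeholder 'not avail' as missing (df itself is not modified).
cleaned = df[["concept", "description"]].replace("not avail", np.nan)

# Step 2: back-fill along the columns, so a missing concept takes the
# value to its right (description) in the same row.
filled = cleaned.bfill(axis=1)

# Step 3: concept now holds the first non-missing value of each row.
df["final_desc"] = filled["concept"]
print(df)
```

The original concept column keeps its 'not avail' value because the replacement is applied to a copy of the two columns, not to df.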

This might do the trick:

df['final_desc'] = df.concept.replace('not avail', np.nan).fillna(df.description)
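The np.where attempt from the question can also be kept if its condition is widened to cover the 'not avail' placeholder. A minimal sketch, again assuming empty cells are NaN and using a hypothetical fourth row for scenario 4; the name use_description is illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus an assumed fourth row for scenario 4.
df = pd.DataFrame({
    "id": [12, 12, 12, 12],
    "concept": [np.nan, "tier one", "not avail", np.nan],
    "description": ["rewards member", np.nan, "rewards member", np.nan],
})

# Use description when concept is missing or holds the 'not avail' placeholder.
use_description = df["concept"].isna() | df["concept"].eq("not avail")
df["final_desc"] = np.where(use_description, df["description"], df["concept"])
print(df)
```

When both columns are empty, the condition is true and the NaN from description is carried over, so final_desc stays empty.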
