
How to add a new column with values based on conditions of other existing columns?

This is the current df_treatments.

(image: df_treatments)

I want to add a new field "treatment_type" whose values are based on the values in the columns (metformin, glipizide, insulin):

("value of treatment_type": (value of metformin, value of glipizide, value of insulin))

"No Treatment" (NO, NO, NO)
"Metformin" (YES, NO, NO)
"Glipizide" (NO, YES, NO)
"Insulin" (NO, NO, YES)
"Metformin-Glipizide" (YES, YES, NO)
"Metformin-Insulin" (YES, NO, YES)
"Glipizide-Insulin" (NO, YES, YES)
"Metformin-Glipizide-Insulin" (YES, YES, YES)

How can I do this?

Thank you,

There are a few approaches. One is to use a dictionary to store your treatments and their conditions:

d = {"No Treatment": ('NO', 'NO', 'NO'),
     "Metformin": ('YES', 'NO', 'NO'),
     "Glipizide": ('NO', 'YES', 'NO'),
     ...}

Then iterate over your dictionary and update the new column:

arr = df[['metformin', 'glipizide', 'insulin']].values

for treatment, flags in d.items():
    df.loc[(arr == flags).all(1), 'treatment_type'] = treatment
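
For reference, here is a minimal end-to-end sketch of the dictionary approach. The sample DataFrame is made up for illustration (the real df_treatments is only shown as an image), and the dictionary is filled in with the eight combinations listed in the question. Creating the column before the loop is my own addition, so that any unmatched rows are visibly empty.

```python
import pandas as pd

# Hypothetical sample data; the real df_treatments is not reproduced in the post.
df = pd.DataFrame({
    "metformin": ["NO", "YES", "NO", "YES"],
    "glipizide": ["NO", "NO", "YES", "YES"],
    "insulin":   ["NO", "NO", "YES", "YES"],
})

# The eight (metformin, glipizide, insulin) combinations from the question.
d = {
    "No Treatment":                ("NO", "NO", "NO"),
    "Metformin":                   ("YES", "NO", "NO"),
    "Glipizide":                   ("NO", "YES", "NO"),
    "Insulin":                     ("NO", "NO", "YES"),
    "Metformin-Glipizide":         ("YES", "YES", "NO"),
    "Metformin-Insulin":           ("YES", "NO", "YES"),
    "Glipizide-Insulin":           ("NO", "YES", "YES"),
    "Metformin-Glipizide-Insulin": ("YES", "YES", "YES"),
}

# Create the column up front (assumption: unmatched rows should stay empty).
df["treatment_type"] = None

# 2-D array of the three flag columns, one row per record.
arr = df[["metformin", "glipizide", "insulin"]].to_numpy()

for treatment, flags in d.items():
    # True for rows where all three columns match this combination.
    mask = (arr == flags).all(axis=1)
    df.loc[mask, "treatment_type"] = treatment

print(df)
```

Each pass through the loop compares every row against one combination, so the work grows with the number of combinations rather than the number of rows.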

The only improvement I suggest is to convert all 'NO' / 'YES' values to Boolean False / True. This will be considerably more efficient, as Boolean series support vectorised operations.
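
As a rough sketch of what the Boolean conversion could look like, the example below turns the three columns into True / False and then, as one possible follow-up that is not part of the answer above, encodes each row as an integer from 0 to 7 and uses it to index an array of labels. The sample data, the `labels` array and the bit weights are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df_treatments is not reproduced in the post.
df = pd.DataFrame({
    "metformin": ["NO", "YES", "NO", "YES"],
    "glipizide": ["NO", "NO", "YES", "YES"],
    "insulin":   ["NO", "NO", "YES", "YES"],
})
cols = ["metformin", "glipizide", "insulin"]

# Convert 'YES' / 'NO' strings to Boolean True / False.
flags = df[cols].eq("YES")

# Encode each row as an integer: metformin = 4, glipizide = 2, insulin = 1.
code = flags.to_numpy().astype(int) @ np.array([4, 2, 1])

# Labels ordered by that integer code (0 through 7).
labels = np.array([
    "No Treatment",                 # 0: (False, False, False)
    "Insulin",                      # 1: (False, False, True)
    "Glipizide",                    # 2: (False, True,  False)
    "Glipizide-Insulin",            # 3: (False, True,  True)
    "Metformin",                    # 4: (True,  False, False)
    "Metformin-Insulin",            # 5: (True,  False, True)
    "Metformin-Glipizide",          # 6: (True,  True,  False)
    "Metformin-Glipizide-Insulin",  # 7: (True,  True,  True)
])

# A single vectorised lookup assigns every row at once.
df["treatment_type"] = labels[code]

print(df)
```

With Boolean flags the whole column is assigned in one indexing step, with no Python-level loop over rows or combinations.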
