Conditionally give value to one column based on the words that another column contains

I have a dataframe "df":

Patient    Condition
John       Adductor tear left
Mary       Adductor sprain left
Henry      Hamstring sprain
Lucy       Hamstring tear

You do not need to split the Condition column. Given the following assumptions, you can use the code below:

  • every row in df contains either "Adductor" or "Hamstring" in the Condition column (case-sensitive).
  • likewise, every row contains either "tear" or "sprain".

df["Muscle"] = df["Condition"].apply(lambda x: 1 if "Adductor" in x else 2)

The Injury column works the same way. Try it and let me know if you need help.
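For reference, here is a minimal sketch of both columns together, run on the sample data from the question. The coding for Injury (1 = tear, 2 = sprain) is an assumption on my part, chosen to mirror the Muscle coding above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Patient": ["John", "Mary", "Henry", "Lucy"],
    "Condition": ["Adductor tear left", "Adductor sprain left",
                  "Hamstring sprain", "Hamstring tear"],
})

# 1 = Adductor; any row without "Adductor" falls through to 2
df["Muscle"] = df["Condition"].apply(lambda x: 1 if "Adductor" in x else 2)

# Assumed coding: 1 = tear; any row without "tear" falls through to 2
df["Injury"] = df["Condition"].apply(lambda x: 1 if "tear" in x else 2)

print(df)
```

Note that the else branch catches everything that does not match, so a row that violates the assumptions above (say, a different muscle) would silently get 2.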

If you do not want to worry about whether the words are in upper or lower case, you can use:

df["Muscle"] = df["Condition"].apply(lambda x: 1 if "adductor" in x.lower() else 2)

You could also use str.contains to search for specific strings in the Condition column and assign values using np.select.

import numpy as np
df['Muscle'] = np.select([df['Condition'].str.contains('Adductor'), df['Condition'].str.contains('Hamstring')], [1,2], np.nan)
df['Injury'] = np.select([df['Condition'].str.contains('tear'), df['Condition'].str.contains('sprain')], [1,2], np.nan)
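If the data might contain missing values, mixed case, or conditions that match none of the keywords, a variant along these lines may be more forgiving. It is only a sketch: the extra rows ("Calf strain" and None) are made up for illustration, and it relies on the `case` and `na` parameters of str.contains.

```python
import numpy as np
import pandas as pd

# Hypothetical data: mixed case, one unmatched condition, one missing value
df = pd.DataFrame({
    "Condition": ["ADDUCTOR tear left", "hamstring Sprain", "Calf strain", None],
})

def has(word):
    # case=False ignores upper/lower case; na=False treats missing values as "no match"
    return df["Condition"].str.contains(word, case=False, na=False)

# Rows matching neither keyword get the default, np.nan
df["Muscle"] = np.select([has("adductor"), has("hamstring")], [1, 2], np.nan)
df["Injury"] = np.select([has("tear"), has("sprain")], [1, 2], np.nan)

print(df)
```

Unlike the apply/else approach, unmatched rows end up as NaN here instead of being lumped into category 2, which makes bad input easier to spot.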

Here is another way, using enumerate() and assign():

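# Map each keyword to an integer code, starting from 1
muscle_codes = {word: code for code, word in enumerate(['adductor', 'hamstring'], 1)}
injury_codes = {word: code for code, word in enumerate(['tear', 'sprain'], 1)}

# Assumes the first word is the muscle and the second word is the injury type
words = df['Condition'].str.split()
# assign() returns a new DataFrame, so keep the result
df = df.assign(Muscle=words.str[0].str.lower().map(muscle_codes),
               Injury=words.str[1].str.lower().map(injury_codes))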
