Conditionally give value to one column based on the words that another column contains
I have a dataframe "df":
Patient Condition
John Adductor tear left
Mary Adductor sprain left
Henry Hamstring sprain
Lucy Hamstring tear
You do not need to split the Condition column. You can use the following code, given these assumptions:
Every row in df contains either "Adductor" or "Hamstring" in the Condition column, case sensitive.
The same holds for the words "tear" and "sprain".
df["Muscle"] = df["Condition"].apply(lambda x: 1 if "Adductor" in x else 2)
Same thing with the Injury column. You can try it and let me know if you need help.
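For the Injury column, the same pattern could look like the sketch below. The sample frame is rebuilt from the data in the question, and the codes 1 for "tear" and 2 for "sprain" are an assumption that mirrors the Muscle encoding.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Patient": ["John", "Mary", "Henry", "Lucy"],
    "Condition": [
        "Adductor tear left",
        "Adductor sprain left",
        "Hamstring sprain",
        "Hamstring tear",
    ],
})

# 1 if the keyword is present, otherwise 2 (relies on the assumptions above)
df["Muscle"] = df["Condition"].apply(lambda x: 1 if "Adductor" in x else 2)
# Assumed encoding: 1 = tear, 2 = sprain
df["Injury"] = df["Condition"].apply(lambda x: 1 if "tear" in x else 2)

print(df)
```

Note that the else branch assigns 2 to any row that does not contain the first keyword, so this only gives correct codes if every row really contains one of the two words.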
If you do not want to worry about words being in upper or lower case, you can use:
df["Muscle"] = df["Condition"].apply(lambda x: 1 if "adductor" in x.lower() else 2)
You could also use str.contains to search for specific strings in the Condition column and assign values using np.select:
import numpy as np
df['Muscle'] = np.select([df['Condition'].str.contains('Adductor'), df['Condition'].str.contains('Hamstring')], [1,2], np.nan)
df['Injury'] = np.select([df['Condition'].str.contains('tear'), df['Condition'].str.contains('sprain')], [1,2], np.nan)
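As a rough illustration of how the np.nan default behaves, here is a self-contained sketch of the same idea. The fifth row ("Sam", "Calf strain") is hypothetical and not part of the question; it is only there to show what happens when no keyword matches. Passing case=False to str.contains is an optional variation that makes the search case insensitive.

```python
import numpy as np
import pandas as pd

# Data from the question plus one hypothetical non-matching row ("Sam")
df = pd.DataFrame({
    "Patient": ["John", "Mary", "Henry", "Lucy", "Sam"],
    "Condition": [
        "Adductor tear left",
        "Adductor sprain left",
        "Hamstring sprain",
        "Hamstring tear",
        "Calf strain",
    ],
})

cond = df["Condition"]

# The first matching condition wins; rows matching none get the default (NaN)
df["Muscle"] = np.select(
    [cond.str.contains("adductor", case=False),
     cond.str.contains("hamstring", case=False)],
    [1, 2],
    default=np.nan,
)
df["Injury"] = np.select(
    [cond.str.contains("tear", case=False),
     cond.str.contains("sprain", case=False)],
    [1, 2],
    default=np.nan,
)

print(df)
```

Because the default is NaN, the new columns come out as floats rather than integers.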
Here is another way, using enumerate() and assign():
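# Map each lowercase keyword to a 1-based code
m = {j: i for i, j in enumerate(['adductor', 'hamstring'], 1)}
i = {j: i for i, j in enumerate(['tear', 'sprain'], 1)}
# Split on whitespace: the first word is the muscle, the second is the injury
col = df['Condition'].str.split()
# assign() returns a new DataFrame; reassign it to df to keep the new columns
df.assign(Muscle=col.str[0].str.lower().map(m), Injury=col.str[1].str.lower().map(i))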
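A runnable version of this approach on the question's data might look like the following sketch. The names muscle_codes, injury_codes, words, and result are chosen here for readability and are not from the original answer. It assumes the muscle is always the first word of Condition and the injury the second.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Patient": ["John", "Mary", "Henry", "Lucy"],
    "Condition": [
        "Adductor tear left",
        "Adductor sprain left",
        "Hamstring sprain",
        "Hamstring tear",
    ],
})

# Keyword -> 1-based code, in the order the keywords are listed
muscle_codes = {name: code for code, name in enumerate(["adductor", "hamstring"], 1)}
injury_codes = {name: code for code, name in enumerate(["tear", "sprain"], 1)}

# Each entry becomes a list of words, e.g. ["Adductor", "tear", "left"]
words = df["Condition"].str.split()

# assign() leaves df untouched and returns a new frame with the extra columns
result = df.assign(
    Muscle=words.str[0].str.lower().map(muscle_codes),
    Injury=words.str[1].str.lower().map(injury_codes),
)

print(result)
```

Words that are missing from the dictionaries map to NaN, so an unexpected word order shows up as missing values rather than as a wrong code.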