How to conditionally replace NaN values in a column based on values in another column
Say I have a dataframe with the following values:
Course_Code | Department |
---|---|
CS201 | CompSci |
CS202 | NaN |
I would appreciate it if someone could help me replace the NaN values in the "Department" column based on the values in the "Course_Code" column. The logic is to replace NaN with "CompSci" if "CS" is in the "Course_Code" entry for that row.
You could create a mapping that you could use to fill in NaN values. One option is to use mask to replace the values where Course_Code starts with "CS" and Department is NaN:
df['Department'] = df['Department'].mask((df['Course_Code'].str.startswith('CS')) & df['Department'].isna(), 'CompSci')
Output:
Course_Code Department
0 CS201 CompSci
1 CS202 CompSci
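The mapping idea mentioned above can be sketched as follows. This is only an illustration: the prefix_to_dept dictionary, the "MA" entry, and the extra MA101 row are assumptions added for the example, and only the "CS" to "CompSci" rule comes from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Course_Code": ["CS201", "CS202", "MA101"],
        "Department": ["CompSci", np.nan, np.nan],
    }
)

# Hypothetical prefix-to-department mapping; only "CS" comes from the question.
prefix_to_dept = {"CS": "CompSci", "MA": "Math"}

# Leading letters of each course code, e.g. "CS201" -> "CS".
prefix = df["Course_Code"].str.extract(r"^([A-Za-z]+)", expand=False)

# fillna only touches rows where Department is missing,
# so existing department values are left as they are.
df["Department"] = df["Department"].fillna(prefix.map(prefix_to_dept))
```

Because fillna aligns on the index, rows whose prefix is not in the dictionary simply stay NaN.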
You could also write a function and add more conditions in the same way as 'CS' for CompSci.
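import pandas as pd
import numpy as np

df = pd.DataFrame([['CS201', 'CompSci'], ['CS201', np.nan]], columns=['Course_Code', 'Department'])

def condition(x):
    # Fill only missing departments; add more prefix checks here as needed.
    if pd.isna(x['Department']) and x['Course_Code'].startswith('CS'):
        return 'CompSci'
    # Otherwise keep the existing value.
    return x['Department']

df['Department'] = df.apply(condition, axis=1)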
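If there are several such conditions, a vectorized alternative to apply is boolean indexing with .loc. The sketch below assumes a hypothetical rules dictionary and an extra EE110 row; only the "CS" rule appears in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Course_Code": ["CS201", "CS202", "EE110"],
        "Department": ["CompSci", np.nan, np.nan],
    }
)

# Hypothetical rules; only the "CS" -> "CompSci" rule comes from the question.
rules = {"CS": "CompSci", "EE": "ElecEng"}

for prefix, dept in rules.items():
    # Rows that match this prefix and still have no department.
    missing = df["Department"].isna() & df["Course_Code"].str.startswith(prefix)
    df.loc[missing, "Department"] = dept
```

Each pass only writes to rows that are still NaN, so values that were already present are never overwritten.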