How to create dataframe columns based on dictionaries for non-null columns in Python
I have a data frame and a dictionary like this:
df:
ID Science Social
1 12 24
2 NaN 13
3 26 NaN
4 23 35
count_dict = {'Science': 30, 'Social': 40}
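To make the example reproducible, the frame and dictionary above can be built as follows. This is only a sketch of the input shown in the question. The dictionary keys are written as quoted strings so that they match the column names.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; NaN marks a missing mark
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Science": [12, np.nan, 26, 23],
    "Social": [24, 13, np.nan, 35],
})

# Keys are strings matching the course column names
count_dict = {"Science": 30, "Social": 40}

print(df)
```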
For every course column in the data frame, I want to create 2 new columns such that:
Col-1(Course_Count): If the course column is not null, then the new column gets the value from the dictionary, else it will remain Null.
Col-2(Course_%): Course/Course_Count
The output looks like this:
df:
ID  Science  Science_Count  Science_%  Social  Social_Count  Social_%
1   12       30             12/30      24      40            24/40
2   NaN      NaN            NaN        13      40            13/40
3   26       30             26/30      NaN     NaN           NaN
4   23       30             23/30      35      40            35/40
Can anyone help me with this?
If not every column in your dataframe is a course column, you can list only the course column names in the courses list. Here I simply skip the first column ('ID'):
import numpy as np

courses = df.columns[1:]  # every column except 'ID'
order = ['ID'] + [col for course in courses for col in (course, course+'_Count', course+'_%')]
for course in courses:
    # dictionary value where the course has a mark, NaN where it is missing
    df[course + '_Count'] = np.where(df[course].notna(), count_dict[course], np.nan)
    df[course + '_%'] = df[course] / df[course + '_Count']
df = df[order] # reorder the columns
Result:
   ID  Science  Science_Count  Science_%  Social  Social_Count  Social_%
0   1     12.0           30.0   0.400000    24.0          40.0     0.600
1   2      NaN            NaN        NaN    13.0          40.0     0.325
2   3     26.0           30.0   0.866667     NaN           NaN       NaN
3   4     23.0           30.0   0.766667    35.0          40.0     0.875
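The same steps can be wrapped in a small function that takes the course names from the dictionary rather than from the column positions. This is a minimal sketch: the name add_course_columns is hypothetical, and it assumes the dictionary keys are the names of the course columns.

```python
import numpy as np
import pandas as pd


def add_course_columns(df, count_dict):
    """Return a copy of df with <course>_Count and <course>_% columns added."""
    out = df.copy()
    # Only columns that appear in the dictionary are treated as courses
    courses = [col for col in out.columns if col in count_dict]
    for course in courses:
        # Dictionary value where the mark is present, NaN where it is missing
        out[course + "_Count"] = np.where(
            out[course].notna(), count_dict[course], np.nan
        )
        out[course + "_%"] = out[course] / out[course + "_Count"]
    # Non-course columns first, then each course followed by its two new columns
    others = [col for col in df.columns if col not in count_dict]
    order = others + [
        c for course in courses
        for c in (course, course + "_Count", course + "_%")
    ]
    return out[order]


df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Science": [12, np.nan, 26, 23],
    "Social": [24, 13, np.nan, 35],
})
result = add_course_columns(df, {"Science": 30, "Social": 40})
print(result)
```

Because the function works on a copy, the original frame is left unchanged, which makes it easier to rerun with a different dictionary.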
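Try this:
import numpy as np
import pandas as pd

# Loop over the course columns only (the keys of count_dict), not every column such as ID
for column in count_dict:
    # Use the dictionary value where the course mark is present, NaN otherwise
    df[f"{column}_Count"] = df[column].apply(lambda x: count_dict[column] if pd.notna(x) else np.nan)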
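If an apply-based version turns out to be slow on a large frame, the count columns can also be computed for all courses at once. The following is a sketch and not part of the answer above. It assumes the dictionary keys match the course column names exactly, and it uses DataFrame.where to blank out the entries where the mark is missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Science": [12, np.nan, 26, 23],
    "Social": [24, 13, np.nan, 35],
})
count_dict = {"Science": 30, "Social": 40}

courses = list(count_dict)

# One row of counts per row of df, then NaN wherever the course mark is missing
counts = pd.DataFrame(count_dict, index=df.index).where(df[courses].notna())
percent = df[courses] / counts

result = pd.concat(
    [df, counts.add_suffix("_Count"), percent.add_suffix("_%")], axis=1
)
print(result)
```

Here the new columns are appended after the original ones rather than interleaved with them, so reorder them afterwards if the layout matters.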