Modify a column according to another column of a dataframe in Python
I have two dataframes. One is the master dataframe, and the other df is used to fill my master dataframe.
What I want is to fill one column according to another column without altering the other columns.
This is an example of the master df:
| id | Purch. order | cost | size | code |
| 1 | G918282 | 8283 | large| hchs |
| 2 | EE18282 | 1283 | small| ueus |
| 3 | DD08282 | 5583 | large| kdks |
| 4 | GU88912 | 8232 | large| jdhd |
| 5 | NaN | 1283 | large| jdjd |
| 6 | NaN | 5583 | large| qqas |
| 7 | NaN | 8232 | large| djjs |
This is an example of the other df:
| id | Purch. order | cost |
| 1 | G918282 | 7728 |
| 2 | EE18282 | 2211 |
| 3 | DD08282 | 5321 |
| 4 | GU88912 | 4778 |
| 5 | NaN | 4283 |
| 6 | NaN | 9993 |
| 7 | NaN | 3442 |
This is the result I'd like:
| id | Purch. order | cost | size | code |
| 1 | G918282 | 7728 | large| hchs |
| 2 | EE18282 | 2211 | small| ueus |
| 3 | DD08282 | 5321 | large| kdks |
| 4 | GU88912 | 4778 | large| jdhd |
| 5 | NaN | 1283 | large| jdjd |
| 6 | NaN | 5583 | large| qqas |
| 7 | NaN | 8232 | large| djjs |
The cost column should be modified only where the Purch. order in the secondary df matches the one in the master df and is not NaN.
I hope you can help me... and I'm sorry if my English is basic; it is not my mother language. Thanks a lot.
You can do it with merge, followed by updating the cost column based on where the NaN values are:
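# df1 is the master dataframe, df2 is the secondary one
# left-merge on the purchase order, ignoring df2 rows whose purchase order is NaN
final_df = df1.merge(df2[~df2["Purch. order"].isna()], on = 'Purch. order', how="left")
# shared columns get suffixes: id_x/cost_x come from df1, id_y/cost_y from df2
final_df.loc[~final_df['Purch. order'].isnull(), "cost"] = final_df['cost_y'] # not nan
final_df.loc[final_df['Purch. order'].isnull(), "cost"] = final_df['cost_x'] # nan
final_df = final_df.drop(['id_y','cost_x','cost_y'],axis=1)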
Output:
   id_x Purch. order   size  code    cost
0     1      G918282  large  hchs  7728.0
1     2      EE18282  small  ueus  2211.0
2     3      DD08282  large  kdks  5321.0
3     4      GU88912  large  jdhd  4778.0
4     5          NaN  large  jdjd  1283.0
5     6          NaN  large  qqas  5583.0
6     7          NaN  large  djjs  8232.0
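For anyone who wants to run the merge idea end to end, here is a minimal self-contained sketch built from the sample tables in the question. It is a variation on the code above, not the same code: the names master_df, another_df, lookup and the "_new" suffix are my own choices, and it assumes each non-NaN purchase order appears at most once in the secondary frame (duplicates would duplicate rows in the merge).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's tables
master_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "Purch. order": ["G918282", "EE18282", "DD08282", "GU88912",
                     np.nan, np.nan, np.nan],
    "cost": [8283, 1283, 5583, 8232, 1283, 5583, 8232],
    "size": ["large", "small", "large", "large", "large", "large", "large"],
    "code": ["hchs", "ueus", "kdks", "jdhd", "jdjd", "qqas", "djjs"],
})
another_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "Purch. order": ["G918282", "EE18282", "DD08282", "GU88912",
                     np.nan, np.nan, np.nan],
    "cost": [7728, 2211, 5321, 4778, 4283, 9993, 3442],
})

# Keep only secondary rows that have a purchase order, and only the
# columns needed for the lookup (hypothetical helper frame)
lookup = another_df.dropna(subset=["Purch. order"])[["Purch. order", "cost"]]

# Left merge keeps every master row; the secondary cost arrives as "cost_new"
merged = master_df.merge(lookup, on="Purch. order", how="left",
                         suffixes=("", "_new"))

# Take the new cost where a match was found, otherwise keep the old cost
merged["cost"] = merged["cost_new"].fillna(merged["cost"])
result = merged.drop(columns="cost_new")

print(result)
```

Using suffixes=("", "_new") keeps the master column names unchanged, so there is no id_x left over and the column order matches the master dataframe.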
Let's try update, which aligns on indexes. By default, overwrite is set to True, which overwrites overlapping values in your target dataframe. Use overwrite=False if you only want to change NA values.
master_df = master_df.set_index(['id','Purch. order'])
another_df = another_df.dropna(subset=['Purch. order']).set_index(['id','Purch. order'])
master_df.update(another_df)
print(master_df)
cost size code
id Purch. order
1 G918282 7728.0 large hchs
2 EE18282 2211.0 small ueus
3 DD08282 5321.0 large kdks
4 GU88912 4778.0 large jdhd
5 NaN 1283.0 large jdjd
6 NaN 5583.0 large qqas
7 NaN 8232.0 large djjs
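To make the difference between the two overwrite settings concrete, here is a small sketch on a toy frame. The frames target and source and their values are made up for illustration and are not the data from the question.

```python
import numpy as np
import pandas as pd

# Toy frames (hypothetical data): "b" is missing in the target,
# and the source has no row for "c"
target = pd.DataFrame({"cost": [100.0, np.nan, 300.0]}, index=["a", "b", "c"])
source = pd.DataFrame({"cost": [1.0, 2.0]}, index=["a", "b"])

# Default overwrite=True: every overlapping index label takes the source value
overwritten = target.copy()
overwritten.update(source)   # update works in place and returns None

# overwrite=False: only values that are NA in the target are filled
filled = target.copy()
filled.update(source, overwrite=False)

print(overwritten)
print(filled)
```

With the default, rows a and b both take the source values. With overwrite=False, row a keeps 100.0 and only row b is filled. Row c is untouched in both cases because the source has no matching index label.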