Modify a DataFrame column based on another column in Python

Question

I have two dataframes. One is the master dataframe, and the other is used to fill the master dataframe.

I want to fill one column according to another column without altering the other columns.

This is an example of the master df:

| id | Purch. order | cost | size | code |
| 1  |    G918282   | 8283 | large| hchs |
| 2  |    EE18282   | 1283 | small| ueus |
| 3  |    DD08282   | 5583 | large| kdks |
| 4  |    GU88912   | 8232 | large| jdhd |
| 5  |     NaN      | 1283 | large| jdjd |
| 6  |     NaN      | 5583 | large| qqas |
| 7  |     NaN      | 8232 | large| djjs |

This is an example of the other df:

| id | Purch. order | cost |
| 1  |    G918282   | 7728 |
| 2  |    EE18282   | 2211 |
| 3  |    DD08282   | 5321 |
| 4  |    GU88912   | 4778 |
| 5  |     NaN      | 4283 |
| 6  |     NaN      | 9993 |
| 7  |     NaN      | 3442 |

This is the result I'd like:

| id | Purch. order | cost | size | code |
| 1  |    G918282   | 7728 | large| hchs |
| 2  |    EE18282   | 2211 | small| ueus |
| 3  |    DD08282   | 5321 | large| kdks |
| 4  |    GU88912   | 4778 | large| jdhd |
| 5  |     NaN      | 1283 | large| jdjd |
| 6  |     NaN      | 5583 | large| qqas |
| 7  |     NaN      | 8232 | large| djjs |

Only the cost column should be modified, and only where the Purch. order in the secondary df matches the one in the master df and is not NaN.

I hope you can help me, and I'm sorry if my English is basic; it is not my mother tongue. Thanks a lot.

Answer 1

You can do it with merge, then update the cost column depending on where the NaN values are:

# Left-merge on the order number, excluding df2 rows that have no order number
final_df = df1.merge(df2[~df2["Purch. order"].isna()], on='Purch. order', how="left")

# Rows with an order number take the new cost; rows without keep the original
final_df.loc[~final_df['Purch. order'].isnull(), "cost"] = final_df['cost_y']
final_df.loc[final_df['Purch. order'].isnull(), "cost"] = final_df['cost_x']
final_df = final_df.drop(['id_y','cost_x','cost_y'],axis=1)

Output:

   id_x Purch. order   size  code    cost
0     1      G918282  large  hchs  7728.0
1     2      EE18282  small  ueus  2211.0
2     3      DD08282  large  kdks  5321.0
3     4      GU88912  large  jdhd  4778.0
4     5          NaN  large  jdjd  1283.0
5     6          NaN  large  qqas  5583.0
6     7          NaN  large  djjs  8232.0
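For comparison, here is a minimal self-contained sketch of a different route to the same result, using Series.map with fillna instead of merge. It is not the method above, just an alternative under stated assumptions: the frames are rebuilt from the sample tables in the question, the names new_cost and result are made up for illustration, and order numbers in df2 are assumed to be unique (map with a Series needs a unique index).

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "Purch. order": ["G918282", "EE18282", "DD08282", "GU88912",
                     np.nan, np.nan, np.nan],
    "cost": [8283, 1283, 5583, 8232, 1283, 5583, 8232],
    "size": ["large", "small", "large", "large", "large", "large", "large"],
    "code": ["hchs", "ueus", "kdks", "jdhd", "jdjd", "qqas", "djjs"],
})
df2 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "Purch. order": ["G918282", "EE18282", "DD08282", "GU88912",
                     np.nan, np.nan, np.nan],
    "cost": [7728, 2211, 5321, 4778, 4283, 9993, 3442],
})

# Lookup Series: order number -> new cost, skipping rows without an order number
# (assumes each order number appears at most once in df2)
new_cost = df2.dropna(subset=["Purch. order"]).set_index("Purch. order")["cost"]

# map() yields NaN where the order is missing or has no match;
# fillna() then falls back to the original cost for those rows
result = df1.copy()
result["cost"] = (
    result["Purch. order"].map(new_cost).fillna(result["cost"]).astype(int)
)

print(result)
```

This variant leaves the id column name and the column order unchanged, and a row whose order number has no match in df2 keeps its original cost rather than becoming NaN.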

Answer 2

Let's try update, which aligns on the index. By default, overwrite is set to True, which overwrites overlapping values in the target dataframe. Use overwrite=False if you only want to change NA values.

master_df = master_df.set_index(['id','Purch. order'])
another_df = another_df.dropna(subset=['Purch. order']).set_index(['id','Purch. order'])

master_df.update(another_df)
print(master_df)
                   cost    size   code
id Purch. order                       
1  G918282       7728.0   large   hchs
2  EE18282       2211.0   small   ueus
3  DD08282       5321.0   large   kdks
4  GU88912       4778.0   large   jdhd
5  NaN           1283.0   large   jdjd
6  NaN           5583.0   large   qqas
7  NaN           8232.0   large   djjs
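To make the overwrite behaviour described in this answer concrete, here is a small sketch on toy data. The frames target and other and their values are invented for illustration and are not the data from the question; both use the default integer index so that rows align by position.

```python
import numpy as np
import pandas as pd

# Toy frames sharing the same default index (0, 1, 2)
target = pd.DataFrame({"cost": [100.0, np.nan, 300.0]})
other = pd.DataFrame({"cost": [1.0, 2.0, np.nan]})

# overwrite=True (the default): every non-NaN value in `other` replaces
# the aligned value in the target; NaN in `other` never overwrites anything
overwritten = target.copy()
overwritten.update(other)

# overwrite=False: only values that are NaN in the target get filled in
filled = target.copy()
filled.update(other, overwrite=False)

print(overwritten["cost"].tolist())  # [1.0, 2.0, 300.0]
print(filled["cost"].tolist())       # [100.0, 2.0, 300.0]
```

Note that update modifies the dataframe in place and returns None, which is why the result above is read from master_df rather than from the return value.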


Answer 1
1 2020-06-16 16:13:30

Answer 2
1 2020-06-16 16:30:26
