Update pandas dataframe with values from another dataframe
Assume a dataframe in which values in any column can change. Given another dataframe that contains the old value, the new value, and the column each change belongs to, how can the first dataframe be updated using this change information? For example:
>>> my_df
x y z
0 1 2 5
1 2 3 9
2 8 7 2
3 3 4 7
4 6 7 7
my_df_2 contains information about the changed values and their columns:
>>> my_df_2
changed_col old_value new_value
0 x 2 10
1 z 9 20
2 x 1 12
3 y 4 23
How can the information in my_df_2 be used to update my_df so that my_df becomes:
>>> my_df
x y z
0 12 2 5
1 10 3 20
2 8 7 2
3 3 23 7
4 6 7 7
You can create a dictionary for the changes as follows:
d = {i: dict(zip(j['old_value'], j['new_value'])) for i, j in my_df_2.groupby('changed_col')}
d
Out: {'x': {1: 12, 2: 10}, 'y': {4: 23}, 'z': {9: 20}}
Then pass it to DataFrame.replace, which returns a new dataframe with the replacements applied column by column:
my_df.replace(d)
Out:
x y z
0 12 2 5
1 10 3 20
2 8 7 2
3 3 23 7
4 6 7 7
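For reference, the dictionary approach above can be run end to end as a small self-contained script. This is only a sketch built from the sample data in the question; the variable name `updated` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
my_df = pd.DataFrame({
    "x": [1, 2, 8, 3, 6],
    "y": [2, 3, 7, 4, 7],
    "z": [5, 9, 2, 7, 7],
})
my_df_2 = pd.DataFrame({
    "changed_col": ["x", "z", "x", "y"],
    "old_value": [2, 9, 1, 4],
    "new_value": [10, 20, 12, 23],
})

# Build {column: {old_value: new_value}} from the change table.
d = {
    col: dict(zip(grp["old_value"], grp["new_value"]))
    for col, grp in my_df_2.groupby("changed_col")
}

# replace() returns a new dataframe; my_df itself is left unchanged.
updated = my_df.replace(d)
print(updated)
```

Because replace returns a new object here, assign the result back (my_df = my_df.replace(d)) if the original name should hold the updated values. Note that a replacement matches every occurrence of the old value in its column, not just one row.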
You can use the DataFrame.update method, which modifies the calling dataframe in place using the non-NA values of another dataframe, aligned on index and column labels. See http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.update.html
Example:
old_df = pd.DataFrame({"a":np.arange(5), "b": np.arange(4,9)})
+----+-----+-----+
| | a | b |
|----+-----+-----|
| 0 | 0 | 4 |
| 1 | 1 | 5 |
| 2 | 2 | 6 |
| 3 | 3 | 7 |
| 4 | 4 | 8 |
+----+-----+-----+
new_df = pd.DataFrame({"a":np.arange(7,8), "b": np.arange(10,11)})
+----+-----+-----+
| | a | b |
|----+-----+-----|
| 0 | 7 | 10 |
+----+-----+-----+
old_df.update(new_df)
+----+-----+-----+
| | a | b |
|----+-----+-----|
| 0 | 7 | 10 | #Changed row
| 1 | 1 | 5 |
| 2 | 2 | 6 |
| 3 | 3 | 7 |
| 4 | 4 | 8 |
+----+-----+-----+
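Since update aligns on index labels rather than on old values, it may help to see a case where the replacement rows are not at position 0. The following is a minimal sketch, not part of the original answer; the index labels 3 and 4 and the values 30 and 80 are chosen purely for illustration.

```python
import numpy as np
import pandas as pd

old_df = pd.DataFrame({"a": np.arange(5), "b": np.arange(4, 9)})

# new_df carries index labels 3 and 4; NaN marks cells that should be left alone.
new_df = pd.DataFrame(
    {"a": [30, np.nan], "b": [np.nan, 80]},
    index=[3, 4],
)

# update() works in place and returns None.
result = old_df.update(new_df)
print(old_df)
```

Only the cells at (3, "a") and (4, "b") change. This means update fits cases where the changes are keyed by row label; for changes keyed by old value, as in the question, the replace approach is the closer match.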