Join or merge with overwrite in pandas

Question

I want to perform a join/merge/append operation on a DataFrame with a datetime index.

Let's say I have df1 and I want to add df2 to it. df2 can have fewer or more columns, and overlapping indexes. For all rows where the indexes match, if df2 has the same column as df1, I want the values of df1 to be overwritten with those from df2.

How can I obtain the desired result?

Answer 1

How about df2.combine_first(df1)?

In [33]: df2
Out[33]: 
                   A         B         C         D
2000-01-03  0.638998  1.277361  0.193649  0.345063
2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
2000-01-05  0.435507 -0.025162 -1.112890  0.324111
2000-01-06 -0.210756 -1.027164  0.036664  0.884715
2000-01-07 -0.821631 -0.700394 -0.706505  1.193341
2000-01-10  1.015447 -0.909930  0.027548  0.258471
2000-01-11 -0.497239 -0.979071 -0.461560  0.447598

In [34]: df1
Out[34]: 
                   A         B         C
2000-01-03  2.288863  0.188175 -0.040928
2000-01-04  0.159107 -0.666861 -0.551628
2000-01-05 -0.356838 -0.231036 -1.211446
2000-01-06 -0.866475  1.113018 -0.001483
2000-01-07  0.303269  0.021034  0.471715
2000-01-10  1.149815  0.686696 -1.230991
2000-01-11 -1.296118 -0.172950 -0.603887
2000-01-12 -1.034574 -0.523238  0.626968
2000-01-13 -0.193280  1.857499 -0.046383
2000-01-14 -1.043492 -0.820525  0.868685

In [35]: df2.comb
df2.combine        df2.combineAdd     df2.combine_first  df2.combineMult    

In [35]: df2.combine_first(df1)
Out[35]: 
                   A         B         C         D
2000-01-03  0.638998  1.277361  0.193649  0.345063
2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
2000-01-05  0.435507 -0.025162 -1.112890  0.324111
2000-01-06 -0.210756 -1.027164  0.036664  0.884715
2000-01-07 -0.821631 -0.700394 -0.706505  1.193341
2000-01-10  1.015447 -0.909930  0.027548  0.258471
2000-01-11 -0.497239 -0.979071 -0.461560  0.447598
2000-01-12 -1.034574 -0.523238  0.626968       NaN
2000-01-13 -0.193280  1.857499 -0.046383       NaN
2000-01-14 -1.043492 -0.820525  0.868685       NaN

Note that it takes the values from df1 for indices that do not overlap with df2. If this doesn't do exactly what you want, I would be willing to improve this function or add options to it.
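For readers who want to try this without the random data above, here is a minimal sketch on a small datetime-indexed pair. The frames and the name `result` are made up for illustration and are not from the original session.

```python
import numpy as np
import pandas as pd

# Hypothetical data: df1 covers Jan 3-6, df2 covers Jan 5-7 and has an extra column D.
idx1 = pd.date_range("2000-01-03", periods=4, freq="D")
df1 = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                    "B": [10.0, 20.0, 30.0, 40.0]}, index=idx1)

idx2 = pd.date_range("2000-01-05", periods=3, freq="D")
df2 = pd.DataFrame({"A": [-3.0, -4.0, -5.0],
                    "D": [0.3, 0.4, 0.5]}, index=idx2)

# df2's values win wherever df2 has a non-NaN value;
# everything else is filled from df1.
result = df2.combine_first(df1)
print(result)
```

The result is a new DataFrame over the union of both indexes and both sets of columns; df1 and df2 are left unchanged. One behaviour worth knowing: combine_first treats NaN in the calling frame as missing, so a NaN in df2 is filled from df1 rather than overwriting it.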

Answer 2

For a merge like this, the update method of a DataFrame is useful.

Taking the examples from the documentation:

import pandas as pd
import numpy as np

df1 = pd.DataFrame([[np.nan, 3., 5.], [-4.6, 2.1, np.nan],
                   [np.nan, 7., np.nan]])
df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5., 1.6, 4]],
                   index=[1, 2])

Data before the update:

>>> df1
     0    1    2
0  NaN  3.0  5.0
1 -4.6  2.1  NaN
2  NaN  7.0  NaN
>>>
>>> df2
      0    1    2
1 -42.6  NaN -8.2
2  -5.0  1.6  4.0

Let's update df1 with data from df2:

df1.update(df2)

Data after the update:

>>> df1
      0    1    2
0   NaN  3.0  5.0
1 -42.6  2.1 -8.2
2  -5.0  1.6  4.0

Remarks:

It's important to notice that this is an "in place" operation: it modifies the DataFrame that calls update.
Also note that non-NaN values in df1 are not overwritten with NaN values from df2.
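One more point that may matter for the original question: update only touches labels that already exist in the calling DataFrame, so rows or columns that appear only in df2 are not added. A possible workaround, sketched here under that assumption, is to reindex a copy of df1 to the union of both indexes and columns before updating. The frames and the names `merged` and `ret` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: df2 overlaps df1 on Jan 4 and adds a new row and a new column D.
df1 = pd.DataFrame({"A": [1.0, 2.0], "B": [10.0, 20.0]},
                   index=pd.to_datetime(["2000-01-03", "2000-01-04"]))
df2 = pd.DataFrame({"A": [-2.0, np.nan], "D": [0.4, 0.5]},
                   index=pd.to_datetime(["2000-01-04", "2000-01-05"]))

# Widen a copy of df1 so it has every row and column label from both frames.
# reindex returns a new object, so df1 itself is not modified.
merged = df1.reindex(index=df1.index.union(df2.index),
                     columns=df1.columns.union(df2.columns))

# update works in place and returns None; NaN values in df2 are skipped.
ret = merged.update(df2)
print(merged)
```

Here the overlapping row takes A from df2, the new row and column D come from df2, and the NaN in df2's column A does not overwrite anything. Because update returns None, writing `df1 = df1.update(df2)` would discard the data.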

2 answers

Answer 1: 52, accepted, 2012-03-20 21:02:32

Answer 2: 33, 2017-03-29 03:32:43
