Concatenate two pandas dataframes with the same columns and merge rows with the same index

Question

I have two dataframes, df1 and df2, each with the same column names and using timestamps as indices. I want to concatenate the two dataframes while merging rows that share an index, preferring the values stored in df2. This is poorly worded, but see below. For example:

>>> df1= TimeStamp A_Output B_Output C_Output
          00:00:00  20       15       5
          00:00:06  20       NaN      3
          00:00:15  15       6      NaN
          00:00:20  20       NaN      5
          00:00:30  25       14      10


>>> df2= TimeStamp A_Output B_Output C_Output
          00:00:00  15       5        8
          00:00:04  16       NaN      NaN
          00:00:06  17       NaN      NaN
          00:00:15  NaN      NaN      2
          00:00:18  19       NaN      NaN
          00:00:21  14       NaN      NaN
          00:00:26  32       NaN      5
          

>>> df3= TimeStamp A_Output B_Output C_Output
          00:00:00  15       5        8
          00:00:04  16       NaN      NaN
          00:00:06  17       NaN      3
          00:00:15  15       6        2
          00:00:18  19       NaN      NaN
          00:00:21  14       NaN      NaN
          00:00:26  32       NaN      5
          00:00:30  25       14      10

df3 is what I would like to achieve. Here there is a row for every timestamp in the indices of df1 and df2. For each common index, where the value in df2 is not NaN we take that value; otherwise we keep the value stored in df1.

df1 >>> 00:00:15  15        6     NaN
df2 >>> 00:00:15  NaN      NaN     2
df3 >>> 00:00:15  15        6      2

df1 >>> 00:00:00  20        15     5
df2 >>> 00:00:00  15         5     8
df3 >>> 00:00:00  15         5     8

For clarification, see the examples above. I really can't find a way to do this. For reference, each dataframe has around 90 columns and 100k+ rows.

Answer 1

Try combine_first, which keeps the values in df2 and fills its NaN entries (and any rows it lacks) from df1:

df3 = df2.combine_first(df1)

print(df3)

           A_Output  B_Output  C_Output
TimeStamp                              
00:00:00       15.0       5.0       8.0
00:00:04       16.0       NaN       NaN
00:00:06       17.0       NaN       3.0
00:00:15       15.0       6.0       2.0
00:00:18       19.0       NaN       NaN
00:00:20       20.0       NaN       5.0
00:00:21       14.0       NaN       NaN
00:00:26       32.0       NaN       5.0
00:00:30       25.0      14.0      10.0
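
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the timestamps are plain strings used as the index (the question does not say which dtype they are) and rebuilds df1 and df2 by hand from the tables in the question. Note that the result holds the union of both indices, so the 00:00:20 row from df1 is kept even though it is absent from the df3 shown in the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
cols = ["A_Output", "B_Output", "C_Output"]

# Assumption: timestamps are stored as plain strings in the index.
df1 = pd.DataFrame(
    [[20, 15, 5], [20, nan, 3], [15, 6, nan], [20, nan, 5], [25, 14, 10]],
    index=pd.Index(
        ["00:00:00", "00:00:06", "00:00:15", "00:00:20", "00:00:30"],
        name="TimeStamp",
    ),
    columns=cols,
)

df2 = pd.DataFrame(
    [
        [15, 5, 8],
        [16, nan, nan],
        [17, nan, nan],
        [nan, nan, 2],
        [19, nan, nan],
        [14, nan, nan],
        [32, nan, 5],
    ],
    index=pd.Index(
        ["00:00:00", "00:00:04", "00:00:06", "00:00:15",
         "00:00:18", "00:00:21", "00:00:26"],
        name="TimeStamp",
    ),
    columns=cols,
)

# Values from df2 win; NaN cells and missing rows are filled from df1.
df3 = df2.combine_first(df1)
print(df3)
```

The caller (df2 here) is the one whose non-NaN values take priority, so swapping the two frames would prefer df1 instead.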
