Compare two dataframes and append new values from the second dataframe to the first

Question

I have two dataframes with the same headers.

df1

           Date  prix moyen  mini  maxi  H-Value  C-Value
    0  17/09/20           8     6     9      122  2110122
    1  15/09/20           8     6     9      122  2110122
    2  10/09/20           8     6     9      122  2110122

and

df2

           Date  prix moyen   mini   maxi  H-Value   C-Value
    1  07/09/17        1.80   1.50   2.00      170   3360170
    1  17/09/20        8.00   6.00   9.00      122   2110122
    2  17/09/20        9.00   8.00  12.00      122   2150122
    3  17/09/20       10.00   8.00  12.00      122  14210122

I want to compare the two dataframes along 3 parameters (Date, H-Value and C-Value), identify the new values present in df2 (values which do not occur in df1), and then append them to df1.

I am using

df_unique = df2[~(df2['Date'].isin(df1['Date']) & df2['H-Value'].isin(df1['H-Value']) & df2['C-Value'].isin(df1['C-Value']) )].dropna().reset_index(drop=True)

but it does not correctly identify the new values in df2. The resulting table picks up some of the new values and misses others.

Where am I going wrong?

Answer 1

What is your question?

In [4]: df2[~(df2['Date'].isin(df1['Date']) & df2['H-Value'].isin(df1['H-Value']) & df2['C-Value'].isin(df1['C-Value']))].dropna().reset_index(drop=True)
Out[4]: 
   Date      prix  moyen  mini  maxi  H-Value   C-Value
0     1  07/09/17    1.8   1.5   2.0      170   3360170
1     2  17/09/20    9.0   8.0  12.0      122   2150122
2     3  17/09/20   10.0   8.0  12.0      122  14210122

These are all the rows in df2 that are not present in df1. Looks good to me...
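One caveat worth checking: the expression tests each column independently, so a row of df2 is treated as "already present" whenever each of its three values occurs somewhere in df1, even if never together in one row. The sketch below uses a small made-up df2 row (hypothetical data, not from the question) to show the difference between the column-by-column test and a row-wise test on the (Date, H-Value, C-Value) combination.

```python
import pandas as pd

keys = ["Date", "H-Value", "C-Value"]

df1 = pd.DataFrame({
    "Date": ["17/09/20", "15/09/20"],
    "H-Value": [122, 170],
    "C-Value": [2110122, 3360170],
})

# Hypothetical row: every value occurs somewhere in df1,
# but this exact combination does not occur in any df1 row.
df2 = pd.DataFrame({
    "Date": ["17/09/20"],
    "H-Value": [170],
    "C-Value": [2110122],
})

# Column-by-column test, as in the question.
col_mask = (
    df2["Date"].isin(df1["Date"])
    & df2["H-Value"].isin(df1["H-Value"])
    & df2["C-Value"].isin(df1["C-Value"])
)
new_by_columns = df2[~col_mask]

# Row-wise test: compare whole (Date, H-Value, C-Value) tuples.
row_mask = df2.set_index(keys).index.isin(df1.set_index(keys).index)
new_by_rows = df2[~row_mask]

print(len(new_by_columns))  # the column-wise test finds no new row
print(len(new_by_rows))     # the row-wise test finds the new combination
```

With the sample data in the question both tests happen to agree, which may be why the output above looks right.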

Answer 2

I was actually able to solve the problem. The issue was not the command used to compare the two datasets, but the fact that one of the columns in df2 had a different data format from the same column in df1, which made a direct comparison impossible.
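For anyone hitting the same thing, here is a minimal sketch of how such a mismatch can be spotted and fixed. The answer does not say which column was affected; the example assumes, purely for illustration, that C-Value was read as strings in df2 and as integers in df1.

```python
import pandas as pd

df1 = pd.DataFrame({"Date": ["17/09/20"], "H-Value": [122], "C-Value": [2110122]})

# Assumption for illustration: C-Value was loaded as text in df2.
df2 = pd.DataFrame({"Date": ["17/09/20"], "H-Value": [122], "C-Value": ["2110122"]})

# Comparing the dtypes side by side makes the mismatch visible.
print(pd.concat([df1.dtypes, df2.dtypes], axis=1, keys=["df1", "df2"]))

# The string "2110122" is not equal to the integer 2110122, so isin finds no match.
before = df2["C-Value"].isin(df1["C-Value"])

# Convert the column to a numeric dtype so both sides are comparable.
df2["C-Value"] = pd.to_numeric(df2["C-Value"])
after = df2["C-Value"].isin(df1["C-Value"])

print(before.tolist(), after.tolist())
```

The same idea applies to dates: if one frame holds them as strings and the other as datetimes, converting both with `pd.to_datetime` (with an explicit `format`) before comparing is one option.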

Answer 3

Here's what I would try:

df1 = pd.concat([df1, df2[~df2.set_index(['Date', 'H-Value', 'C-Value']).index.isin(df1.set_index(['Date', 'H-Value', 'C-Value']).index)]])
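An alternative way to express the same row-wise comparison is a left merge with `indicator=True`, sketched below on the data from the question. It assumes "prix moyen" is a single column and that the key columns have matching dtypes in both frames; the variable names `marked`, `new_rows` and `result` are just for illustration.

```python
import pandas as pd

keys = ["Date", "H-Value", "C-Value"]
cols = ["Date", "prix moyen", "mini", "maxi", "H-Value", "C-Value"]

df1 = pd.DataFrame(
    [
        ["17/09/20", 8, 6, 9, 122, 2110122],
        ["15/09/20", 8, 6, 9, 122, 2110122],
        ["10/09/20", 8, 6, 9, 122, 2110122],
    ],
    columns=cols,
)
df2 = pd.DataFrame(
    [
        ["07/09/17", 1.80, 1.50, 2.00, 170, 3360170],
        ["17/09/20", 8.00, 6.00, 9.00, 122, 2110122],
        ["17/09/20", 9.00, 8.00, 12.00, 122, 2150122],
        ["17/09/20", 10.00, 8.00, 12.00, 122, 14210122],
    ],
    columns=cols,
)

# Left-join df2 against the distinct key combinations of df1.
# The "_merge" column records whether each df2 row found a match.
marked = df2.merge(
    df1[keys].drop_duplicates(), on=keys, how="left", indicator=True
)

# Rows marked "left_only" exist in df2 but not in df1.
new_rows = marked.loc[marked["_merge"] == "left_only", cols]

# Append the new rows to df1.
result = pd.concat([df1, new_rows], ignore_index=True)
print(result)
```

Compared with the `set_index(...).index.isin(...)` one-liner, the merge keeps the intermediate `_merge` column around, which can make it easier to inspect which rows matched.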

Answer 1
1 2021-03-18 10:56:26

Answer 2
1 2021-03-30 14:58:40

Answer 3
0 2021-03-18 10:57:08
