Compare two dataframes and add new values in second dataframe to the first dataframe
I have two dataframes with the same headers:
df1\
**Date prix moyen mini maxi H-Value C-Value**
0 17/09/20 8 6 9 122 2110122\
1 15/09/20 8 6 9 122 2110122\
2 10/09/20 8 6 9 122 2110122
and
df2
**Date prix moyen mini maxi H-Value C-Value**\
1 07/09/17 1.80 1.50 2.00 170 3360170\
1 17/09/20 8.00 6.00 9.00 122 2110122\
2 17/09/20 9.00 8.00 12.00 122 2150122\
3 17/09/20 10.00 8.00 12.00 122 14210122
I want to compare the two dataframes along 3 columns (Date, H-Value and C-Value), identify the new rows present in df2 (rows whose values do not occur in df1) and then append them to df1.
I am using
df_unique = df2[~(df2['Date'].isin(df1['Date']) & df2['H-Value'].isin(df1['H-Value']) & df2['C-Value'].isin(df1['C-Value']) )].dropna().reset_index(drop=True)
but it does not correctly identify the new values in df2. The resulting table picks up some of the new rows and misses others.
Where am I going wrong?
What is your question?
In [4]: df2[~(df2['Date'].isin(df1['Date']) & df2['H-Value'].isin(df1['H-Value']) & df2['C-Value'].isin(df1['C-Value']))].dropna().reset_index(drop=True)
Out[4]:
Date prix moyen mini maxi H-Value C-Value
0 1 07/09/17 1.8 1.5 2.0 170 3360170
1 2 17/09/20 9.0 8.0 12.0 122 2150122
2 3 17/09/20 10.0 8.0 12.0 122 14210122
These are all rows in df2 that are not present in df1. Looks good to me...
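One caveat on the command itself: the output is correct for the sample data, but three separate `isin` calls test each column independently rather than testing the (Date, H-Value, C-Value) combination. The sketch below uses small hypothetical frames (not the asker's data) in which every value of a df2 row occurs somewhere in df1, yet never together in one row. It contrasts the column-by-column test with a row-wise test based on `merge(..., indicator=True)`.

```python
import pandas as pd

# Hypothetical frames: every value of the df2 row occurs somewhere in df1,
# but never together in a single df1 row.
df1 = pd.DataFrame({
    "Date": ["17/09/20", "15/09/20"],
    "H-Value": [122, 170],
    "C-Value": [2110122, 3360170],
})
df2 = pd.DataFrame({"Date": ["17/09/20"], "H-Value": [170], "C-Value": [2110122]})

# Column-by-column test: each column matches on its own, so the row is dropped.
per_column = df2[~(df2["Date"].isin(df1["Date"])
                   & df2["H-Value"].isin(df1["H-Value"])
                   & df2["C-Value"].isin(df1["C-Value"]))]

# Row-wise test: a left merge with indicator=True marks rows absent from df1.
keys = ["Date", "H-Value", "C-Value"]
merged = df2.merge(df1[keys].drop_duplicates(), on=keys, how="left", indicator=True)
row_wise = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

print(len(per_column), len(row_wise))
```

Here the column-by-column filter returns no rows, while the row-wise filter reports the df2 row as new.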
I was actually able to solve the problem. The issue was not the command used to compare the two datasets, but the fact that one of the columns in df2 had a different data format from the same column in df1, which made a direct comparison impossible.
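The post does not say which column differed or how, so the following is only a sketch of one plausible case: a key column stored as integers in df1 but as strings in df2. The data is hypothetical. Checking `dtypes` on both frames and casting the key columns to a common type before comparing is one way to rule this out.

```python
import pandas as pd

# Hypothetical data: the same keys stored as int in df1 and as str in df2.
df1 = pd.DataFrame({"Date": ["17/09/20"], "H-Value": [122], "C-Value": [2110122]})
df2 = pd.DataFrame({"Date": ["17/09/20"], "H-Value": ["122"], "C-Value": ["2110122"]})

# Inspect the dtypes side by side to spot the mismatch.
print(pd.concat([df1.dtypes, df2.dtypes], axis=1, keys=["df1", "df2"]))

# With mismatched dtypes nothing matches, so an existing row looks "new".
before = df2["C-Value"].isin(df1["C-Value"]).tolist()

# Cast the key columns of df2 to the dtypes used in df1, then compare again.
keys = ["H-Value", "C-Value"]
df2 = df2.astype({k: df1[k].dtype for k in keys})
after = df2["C-Value"].isin(df1["C-Value"]).tolist()

print(before, after)
```

If the mismatch were in the Date column instead (for example strings versus datetimes), the same idea applies with `pd.to_datetime` on both frames.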
Here's what I tried:
df1 = pd.concat([df1, df2[~df2.set_index(['Date', 'H-Value', 'C-Value']).index.isin(df1.set_index(['Date', 'H-Value', 'C-Value']).index)]])
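For reference, here is a self-contained version of the same idea, run against the sample data from the question. It is a sketch, not the only way to do this: it builds the key index with `pd.MultiIndex.from_frame` instead of `set_index(...).index` (the two should be equivalent here), and the `keys`, `new_rows` and `combined` names are introduced just for illustration.

```python
import pandas as pd

cols = ["Date", "prix moyen", "mini", "maxi", "H-Value", "C-Value"]
df1 = pd.DataFrame(
    [["17/09/20", 8, 6, 9, 122, 2110122],
     ["15/09/20", 8, 6, 9, 122, 2110122],
     ["10/09/20", 8, 6, 9, 122, 2110122]],
    columns=cols,
)
df2 = pd.DataFrame(
    [["07/09/17", 1.8, 1.5, 2.0, 170, 3360170],
     ["17/09/20", 8.0, 6.0, 9.0, 122, 2110122],
     ["17/09/20", 9.0, 8.0, 12.0, 122, 2150122],
     ["17/09/20", 10.0, 8.0, 12.0, 122, 14210122]],
    columns=cols,
)

keys = ["Date", "H-Value", "C-Value"]

# One MultiIndex per frame, so each row is compared as a (Date, H, C) tuple.
idx1 = pd.MultiIndex.from_frame(df1[keys])
idx2 = pd.MultiIndex.from_frame(df2[keys])

# Keep the df2 rows whose key tuple does not occur in df1, then append them.
new_rows = df2[~idx2.isin(idx1)]
combined = pd.concat([df1, new_rows], ignore_index=True)
print(combined)
```

Because the comparison is done on whole key tuples, a df2 row counts as new unless the same Date, H-Value and C-Value appear together in a single df1 row.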