Compare two dataframes
I have two dataframes, shown below. I want to compare them row by row and, if a row of df1 is not equal to the corresponding row of df2, print an error.
df1
A B
0 1 3
1 2 4
2 3 5
3 4 6
4 5 7
df2
A B
0 1 3
1 2 4
2 3 5
3 4 5
4 5 7
I want to print an error for row 4 (index 3) because df1 has the value 6 for column 'B' and df2 has the value 5.
Did you take a look at the documentation?
df1.eq(df2)
A B
0 True True
1 True True
2 True True
3 True False
4 True True
If you want to see the specific rows and values that differ, you can do this:
df1[~df1.eq(df2)].dropna(how='all')
A B
3 NaN 6.0
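To turn that mask into the per-row error the question asks for, something along these lines could work. It is a minimal sketch that assumes both frames share the same index and columns; the message wording and the `messages` list are illustrative choices, not something from the answers here.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 5, 7]})

# Element-wise "not equal" mask; assumes identical index and columns.
# Note that NaN compares as not equal to NaN, so missing values are flagged too.
mask = df1.ne(df2)

messages = []  # illustrative: collect messages so they can also be logged
for idx in mask.index[mask.any(axis=1).to_numpy()]:
    # Columns that differ in this particular row
    for col in mask.columns[mask.loc[idx].to_numpy()]:
        messages.append(
            f"Error: row {idx}, column {col!r}: "
            f"df1={df1.at[idx, col]}, df2={df2.at[idx, col]}"
        )

for message in messages:
    print(message)
```

With the frames from the question, this reports only the row with index 3 and column 'B'.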
I like @aws_apprentice's answer. But, since you asked to "print an error", consider also pandas.testing.assert_frame_equal (docs), which will raise an AssertionError exception if the DataFrames are not identical and give you diagnostic output.
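A short sketch of how that might look with the frames from the question. Catching the exception and printing it is one way to "print an error" without stopping the script; the `error_message` variable is only an illustrative name.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 5, 7]})

error_message = None  # illustrative: stays None when the frames match
try:
    # Raises AssertionError on the first difference it finds
    assert_frame_equal(df1, df2)
except AssertionError as exc:
    error_message = str(exc)
    print(error_message)
```

Keep in mind that assert_frame_equal reports the first mismatch it encounters (here, column 'B') rather than listing every differing row, so it suits tests better than row-by-row reporting.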
You should check Andy Hayden's answer here: Outputting difference in two Pandas dataframes side by side - highlighting the difference
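For a quick side-by-side view, newer pandas versions (1.1 and later, which is an assumption about your environment) also offer DataFrame.compare. This is not the technique from the linked answer, just a built-in alternative, and it requires both frames to have identical labels.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 5, 7]})

# Keeps only the rows and columns that differ; "self" is df1, "other" is df2.
# Requires pandas >= 1.1 and identically labeled frames.
diff = df1.compare(df2)
print(diff)
```

The result has one row (index 3) and a two-level column header, with the df1 value under ('B', 'self') and the df2 value under ('B', 'other').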
What you are trying to do (print an error if a row is different) may not be the best option here. Which dataframe do you intend to use as the basis for comparison and add the error column to? Suppose you choose df1 and compare it to df2: what if df2 has additional rows that are not present in df1? In that case there is no row in df1 to attach the error message to.
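One way to handle that case is to separate rows that exist in only one frame from rows that exist in both but differ. The sketch below assumes rows are matched by index label, and the extra row in df2 is hypothetical data added for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [3, 4, 5, 6, 7]})
# Hypothetical: df2 carries one extra row (index 5) that df1 lacks
df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [3, 4, 5, 5, 7, 8]})

# Rows present in only one of the frames, matched by index label
only_in_df1 = df1.index.difference(df2.index)
only_in_df2 = df2.index.difference(df1.index)

# Rows present in both frames: compare values on the shared labels only
common = df1.index.intersection(df2.index)
differs = df1.loc[common].ne(df2.loc[common]).any(axis=1)
mismatched_rows = list(common[differs.to_numpy()])

print("Only in df1:", list(only_in_df1))
print("Only in df2:", list(only_in_df2))
print("Different values:", mismatched_rows)
```

Reporting the three groups separately avoids having to pick one frame as the basis and then having nowhere to record rows it does not contain.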