pandas/Python: check whether the values from two datasets are equal and change the 1 and 0 to True or False
I want to check whether the values in two datasets are equal. The datasets are not in the same order, so I need to loop through them to match the rows.
Dataset 1 (contract):
Part number | H50 | H51 | H53
---|---|---|---
ID001 | 1 | 1 | 1
ID002 | 1 | 1 | 1
ID003 | 0 | 1 | 0
ID004 | 1 | 1 | 1
ID005 | 1 | 1 | 1
Dataset 2 (anx):
The part numbers are not in the same order, so a value can only be compared after the part number from each file has been matched. If the part number is the same, check whether the H column (header) is the same too. If both the part number and the H column are the same, check whether the value is the same.
Part number | H50 | H51 | H53
---|---|---|---
ID001 | 1 | 1 | 1
ID003 | 0 | 0 | 1
ID004 | 0 | 1 | 1
ID002 | 1 | 0 | 1
ID005 | 1 | 1 | 1
Expected outcome:
If the value is the same in both datasets (1 == 1 or 0 == 0), change it to TRUE. If the value is 1 in dataset 1 but 0 in dataset 2, change it to FALSE. If the value is 0 in dataset 1 but 1 in dataset 2, change it to FALSE as well. Then save all the rows that contain a FALSE value into an Excel file named "Not in contract".
Example of the expected outcome:
Part number | H50 | H51 | H53
---|---|---|---
ID001 | TRUE | TRUE | TRUE
ID002 | TRUE | FALSE | TRUE
ID003 | TRUE | FALSE | FALSE
ID004 | FALSE | TRUE | TRUE
ID005 | TRUE | TRUE | TRUE
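import pandas as pd

# Align the two datasets on the part number; shared H columns get the suffixes _x (df1) and _y (df2).
df_merged = df1.merge(df2, on='Part number')
a = df_merged.filter(regex='_x$')  # H columns from df1
b = df_merged.filter(regex='_y$')  # H columns from df2
# Compare element-wise and relabel the result with the original H column names.
out = pd.concat([df_merged['Part number'], pd.DataFrame(a.values == b.values, columns=df1.columns[1:4])], axis=1)
out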
Part number H50 H51 H53
0 ID001 True True True
1 ID002 True False True
2 ID003 True False False
3 ID004 False True True
4 ID005 True True True
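
The merge above produces the True/False table but does not cover the last part of the question, saving the rows that contain a FALSE. A minimal sketch of one way to do that follows. It is an alternative that aligns on the index instead of merging, and it assumes part numbers are unique within each dataset. The names `left`, `right`, `matches`, `not_in_contract` and the helper `save_mismatches` are made up for illustration, and the `.xlsx` extension on the file name is an assumption.

```python
import pandas as pd

# Sample data copied from the two tables in the question.
df1 = pd.DataFrame({
    "Part number": ["ID001", "ID002", "ID003", "ID004", "ID005"],
    "H50": [1, 1, 0, 1, 1],
    "H51": [1, 1, 1, 1, 1],
    "H53": [1, 1, 0, 1, 1],
})
df2 = pd.DataFrame({
    "Part number": ["ID001", "ID003", "ID004", "ID002", "ID005"],
    "H50": [1, 0, 0, 1, 1],
    "H51": [1, 0, 1, 0, 1],
    "H53": [1, 1, 1, 1, 1],
})

# Index both frames by part number so rows line up by label, not by position.
# Assumption: each part number appears only once per dataset.
left = df1.set_index("Part number")
right = df2.set_index("Part number").reindex(index=left.index, columns=left.columns)

# Element-wise comparison; a part number missing from df2 becomes NaN and compares as False.
matches = left.eq(right)

# Keep only the rows where at least one H column differs.
not_in_contract = matches[~matches.all(axis=1)]


def save_mismatches(frame, path="Not in contract.xlsx"):
    """Hypothetical helper: write the mismatching rows to an Excel file."""
    # Writing .xlsx requires an Excel engine such as openpyxl to be installed.
    frame.to_excel(path)
```

Calling `save_mismatches(not_in_contract)` would then write the rows for ID002, ID003 and ID004, each with its True/False values per H column.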