SQLite compare query Python
I've been trying to figure out the best way to write a query to compare the rows in two tables. My goal is to see whether the two tuples in result set A are in the larger result set B. I only want to see the tuples that differ between the query results.
'''SELECT table1.field_b, table1.field_c, table1.field_d
   FROM table1
   ORDER BY field_b'''
results_a = [(101010101, 111111111, 999999999), (121212121, 222222222, 999999999)]
'''SELECT table2.field_a, table2.fieldb, table3.field3
   FROM table2
   ORDER BY field_a'''
results_b = [(101010101, 111111111, 999999999), (121212121, 333333333, 999999999), (303030303, 444444444, 999999999)]
So what I want to do is take results_a and make sure that each of its rows has an exact match somewhere in results_b. Since the second field in the second tuple of results_b differs from what is in results_a, I would like to return the second tuple in results_a.
Ultimately, I would like to return a collection that also contains the non-matching tuple from the other set, so that I can reference both in my program. Ideally, since the second tuple's primary key (field_b in table1) didn't match the corresponding primary key (field_a) in table2, I would want to display results_c = {(121212121, 222222222, 999999999): (121212121, 222222222, 999999999)}.
This is complicated by the fact that the results from the two tables will not be in the same order, so I can't write code that simply compares tuple2 in results_a to tuple2 in results_b. It is more like: compare tuple2 in results_a and see whether it matches any record in results_b. If the primary keys match and none of the tuples in results_b match completely, or no partial match is found, return the records that don't match.
I apologize that this is so wordy. I couldn't think of a better way to explain it. Any help would be much appreciated.
Thanks!
UPDATED EFFORT ON PARTIAL MATCHES
a = [(1, 2, 3), (4, 5, 7)]
b = [(1, 2, 3), (4, 5, 6)]
pmatch = {}
def partial_match(x, y):
    return sum(ea == eb for (ea, eb) in zip(x, y)) >= 2
for el_a in a:
    pmatch[el_a] = [el_b for el_b in b if partial_match(el_a, el_b)]
print(pmatch)
OUTPUT = {(4, 5, 7): [(4, 5, 6)], (1, 2, 3): [(1, 2, 3)]}. I would have expected it to be just {(4, 5, 7): (4, 5, 6)}, because those are the only tuples that are different. Any ideas?
Take results_a and make sure that they have an exact match somewhere in results_b:
for el in results_a:
    if el in results_b:
        ...
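If results_b is large, testing membership against a list rescans it for every row. A minimal sketch, assuming the rows are hashable tuples as in the question, that builds a set once and reuses it (the names rows_b and exact_matches are only illustrative):

```python
results_a = [(101010101, 111111111, 999999999), (121212121, 222222222, 999999999)]
results_b = [(101010101, 111111111, 999999999), (121212121, 333333333, 999999999),
             (303030303, 444444444, 999999999)]

# Build the set once so each membership test is O(1) on average.
rows_b = set(results_b)

# Keep the rows of results_a that appear unchanged in results_b.
exact_matches = [el for el in results_a if el in rows_b]
print(exact_matches)  # [(101010101, 111111111, 999999999)]
```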
Get partial matches:
pmatch = {}
def partial_match(a, b):
    # for instance: at least two fields are equal
    return sum(ea == eb for (ea, eb) in zip(a, b)) >= 2
for el_a in results_a:
    pmatch[el_a] = [el_b for el_b in results_b if partial_match(el_a, el_b)]
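Since the question describes the first field as a primary key, another option is to index results_b by that key and pair up rows whose key matches but whose other fields differ. This is only a sketch: it assumes the key is the first element of each tuple and is unique within results_b, and mismatched_by_key is a hypothetical helper name, not something from the question.

```python
def mismatched_by_key(rows_a, rows_b):
    """Map each row of rows_a to the row of rows_b that shares its key but differs."""
    # Assumption: element 0 is the primary key and is unique in rows_b.
    by_key = {row[0]: row for row in rows_b}
    result = {}
    for row in rows_a:
        other = by_key.get(row[0])
        # Skip rows with no counterpart and rows that match exactly.
        if other is not None and other != row:
            result[row] = other
    return result

results_a = [(101010101, 111111111, 999999999), (121212121, 222222222, 999999999)]
results_b = [(101010101, 111111111, 999999999), (121212121, 333333333, 999999999),
             (303030303, 444444444, 999999999)]

results_c = mismatched_by_key(results_a, results_b)
print(results_c)
# {(121212121, 222222222, 999999999): (121212121, 333333333, 999999999)}
```

Because the lookup goes through a dict, the order of the rows in either result set does not matter.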
Return the records that don't match:
no_match = [el for el in results_a if el not in results_b]
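The same difference can also be computed inside SQLite with the EXCEPT compound operator, before the rows reach Python. A minimal sketch using an in-memory database: the table names follow the question, but the column names of table2 are an assumption, since the second query in the question is ambiguous.

```python
import sqlite3

results_a = [(101010101, 111111111, 999999999), (121212121, 222222222, 999999999)]
results_b = [(101010101, 111111111, 999999999), (121212121, 333333333, 999999999),
             (303030303, 444444444, 999999999)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field_b INTEGER, field_c INTEGER, field_d INTEGER)")
# Assumed schema for table2; adjust the column names to the real table.
conn.execute("CREATE TABLE table2 (field_a INTEGER, field_b INTEGER, field_c INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", results_a)
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", results_b)

# EXCEPT keeps the distinct rows of the first SELECT that are absent from the second.
no_match = conn.execute(
    """
    SELECT field_b, field_c, field_d FROM table1
    EXCEPT
    SELECT field_a, field_b, field_c FROM table2
    """
).fetchall()
conn.close()
print(no_match)  # [(121212121, 222222222, 999999999)]
```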
-- EDIT / Another possible partial_match
def partial_match(x, y):
    nb_matches = sum(ea == eb for (ea, eb) in zip(x, y))
    # more than 60% of the fields match, but not all of them
    return 0.6 < float(nb_matches) / len(x) < 1
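Applied to the lists from the question's update, this stricter version excludes identical tuples because their match ratio is exactly 1. The sketch below also drops keys that have no partial match, which is an addition of mine and not part of the original snippet. The values are still lists, so the result is {(4, 5, 7): [(4, 5, 6)]} rather than a bare tuple.

```python
a = [(1, 2, 3), (4, 5, 7)]
b = [(1, 2, 3), (4, 5, 6)]

def partial_match(x, y):
    nb_matches = sum(ea == eb for ea, eb in zip(x, y))
    # In Python 3, / is true division, so float() is not needed.
    return 0.6 < nb_matches / len(x) < 1

pmatch = {}
for el_a in a:
    candidates = [el_b for el_b in b if partial_match(el_a, el_b)]
    if candidates:  # skip tuples with no partial match
        pmatch[el_a] = candidates
print(pmatch)  # {(4, 5, 7): [(4, 5, 6)]}
```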