
Comparing two sqlite3 tables using Python

Now the question is a little tricky. I have two tables whose contents I want to compare. The tables have the same number of columns, the same column names, and the same column ordering (if there is such a thing).

I want to compare their contents, but the catch is that the rows can be in a different order, i.e., row no. 1 in table 1 can be row no. 1000 in table 2. I want to compare the contents such that the ordering of the rows doesn't matter. Also keep in mind that there is no primary key.

I could design my own data structures, or I could use an existing library to do the job. I would prefer to use an existing API (if there is one). Can anyone point me in the right direction?

Make two text files. Sort them. Compare them with diff.

Alternatively, import them into SQLite tables. Then you can use queries like the following:

SELECT * FROM a INTERSECT SELECT * FROM b;
SELECT * FROM a EXCEPT    SELECT * FROM b;

to get the rows that exist in both tables, or the rows that exist only in a (swap a and b in the second query to get the rows that exist only in b).
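A minimal sketch of running the EXCEPT query in both directions from Python's sqlite3 module. The helper name `table_diff` and the sample tables are made up for illustration, and the table names are assumed to be trusted identifiers because they are interpolated into the SQL.

```python
import sqlite3

def table_diff(conn, table_a, table_b):
    """Return (rows only in table_a, rows only in table_b) using EXCEPT.

    Hypothetical helper: table names are pasted into the SQL text,
    so they must not come from untrusted input.
    """
    only_a = conn.execute(
        f'SELECT * FROM "{table_a}" EXCEPT SELECT * FROM "{table_b}"'
    ).fetchall()
    only_b = conn.execute(
        f'SELECT * FROM "{table_b}" EXCEPT SELECT * FROM "{table_a}"'
    ).fetchall()
    return only_a, only_b

# Sample data: same rows in a different order, plus one extra row per table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('apple', 1), ('pear', 2), ('plum', 3);
    INSERT INTO b VALUES ('plum', 3), ('apple', 1), ('fig', 9);
""")
only_a, only_b = table_diff(conn, "a", "b")
print(only_a)  # [('pear', 2)]
print(only_b)  # [('fig', 9)]
```

If both lists come back empty, the two tables hold the same set of rows. Note that EXCEPT works on distinct rows, so a row that appears twice in one table and once in the other is not reported as a difference.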

You'd need to be more precise about how you intend to compare the tables' content and what the expected outcome is. SQLite itself is a good tool for comparison, and you can easily query for the comparison results you want.

If these tables are located in different databases, however, you can dump them into a temporary database using Python's built-in sqlite3 module.
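One possible way to do this, sketched under assumptions: open an in-memory database, ATTACH the two source files, and copy the table from each into tables named `a` and `b`. The function name `load_into_temp_db`, the schema aliases `src_a` and `src_b`, and the demo file names are all hypothetical.

```python
import os
import sqlite3
import tempfile

def load_into_temp_db(path_a, path_b, table):
    """Copy `table` from two database files into one in-memory database.

    Hypothetical helper: the copies are named `a` and `b`, and `table`
    is assumed to be a trusted identifier.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ? AS src_a", (path_a,))
    conn.execute("ATTACH DATABASE ? AS src_b", (path_b,))
    conn.execute(f'CREATE TABLE a AS SELECT * FROM src_a."{table}"')
    conn.execute(f'CREATE TABLE b AS SELECT * FROM src_b."{table}"')
    return conn

# Demo with two throwaway database files that each contain a table `t`.
with tempfile.TemporaryDirectory() as tmp:
    paths = [os.path.join(tmp, name) for name in ("one.db", "two.db")]
    for path, rows in zip(paths, ([(1,), (2,)], [(2,), (3,)])):
        src = sqlite3.connect(path)
        src.execute("CREATE TABLE t (v INTEGER)")
        src.executemany("INSERT INTO t VALUES (?)", rows)
        src.commit()
        src.close()
    conn = load_into_temp_db(paths[0], paths[1], "t")
    only_in_a = conn.execute("SELECT * FROM a EXCEPT SELECT * FROM b").fetchall()
    only_in_b = conn.execute("SELECT * FROM b EXCEPT SELECT * FROM a").fetchall()
    conn.close()

print(only_in_a)  # [(1,)]
print(only_in_b)  # [(3,)]
```

Once both tables live in the same connection, any comparison query can be run against `a` and `b` directly.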

You can also dump the query results into a data collection such as a list and then perform your comparison, but then again we can't help you if we don't know the expected outcome.
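As one illustration of the in-memory approach, the sketch below loads each table into a `collections.Counter`, which ignores row order but still counts duplicate rows. It assumes both tables fit in memory; the helper name `row_differences` and the sample data are invented for the example.

```python
import sqlite3
from collections import Counter

def row_differences(conn, table_a, table_b):
    """Return (extra rows in table_a, extra rows in table_b) as Counters.

    Hypothetical helper: each row tuple is counted, so row order is
    ignored while duplicate rows still matter.
    """
    rows_a = Counter(conn.execute(f'SELECT * FROM "{table_a}"'))
    rows_b = Counter(conn.execute(f'SELECT * FROM "{table_b}"'))
    # Counter subtraction keeps only the positive counts.
    return rows_a - rows_b, rows_b - rows_a

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('x', 1), ('x', 1), ('y', 2);
    INSERT INTO b VALUES ('y', 2), ('x', 1);
""")
extra_in_a, extra_in_b = row_differences(conn, "a", "b")
print(extra_in_a)  # Counter({('x', 1): 1})
print(extra_in_b)  # Counter()
```

Two empty Counters mean the tables contain exactly the same rows with the same multiplicities.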

You say "there is no PRIMARY KEY". Does this mean there is truly no way to establish the identity of the item represented by each row? If that is true, your problem is insoluble, since you can never determine which row in one table to compare with each row in the other table.

If there is a set of columns that establishes identity, then you would read each row from table 1, read the row with the same identity from table 2, and compare the non-identity columns. If you find all the table 1 rows in table 2, and the non-identity columns are identical, then you finish up with a check for table 2 rows with identities that are not in table 1.
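The identity-based comparison could be sketched roughly as follows. This version loads both tables into dictionaries keyed by the identity columns instead of querying table 2 once per row, which is a simplification that assumes the tables fit in memory and that the identity columns are unique. The name `compare_by_identity` and the `id` column are hypothetical.

```python
import sqlite3

def compare_by_identity(conn, table_a, table_b, key_columns):
    """Compare two tables using `key_columns` as the row identity.

    Returns three sets of identity tuples: present only in table_a,
    present only in table_b, and present in both but with different
    values in the remaining columns.
    """
    def load(table):
        cur = conn.execute(f'SELECT * FROM "{table}"')
        names = [col[0] for col in cur.description]
        key_idx = [names.index(k) for k in key_columns]
        return {tuple(row[i] for i in key_idx): row for row in cur}

    rows_a, rows_b = load(table_a), load(table_b)
    only_a = rows_a.keys() - rows_b.keys()
    only_b = rows_b.keys() - rows_a.keys()
    # The keys already match here, so comparing whole rows is the same
    # as comparing the non-identity columns.
    changed = {k for k in rows_a.keys() & rows_b.keys() if rows_a[k] != rows_b[k]}
    return only_a, only_b, changed

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT, qty INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT, qty INTEGER);
    INSERT INTO a VALUES (1, 'apple', 10), (2, 'pear', 20), (3, 'plum', 30);
    INSERT INTO b VALUES (3, 'plum', 30), (1, 'apple', 11), (4, 'fig', 40);
""")
only_a, only_b, changed = compare_by_identity(conn, "a", "b", ["id"])
print(only_a)   # {(2,)}
print(only_b)   # {(4,)}
print(changed)  # {(1,)}
```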

If there is no identity and you don't care about identity, but only about whether the two tables would appear identical, then you would read the records from each table sorted in some particular order. Compare row 1 to row 1, row 2 to row 2, etc. When you hit a row that's different, you know the tables are not the same.

As a shortcut, you could just use SQLite to dump the data into two text files (again, ordered the same way for both tables) and compare the file contents.
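A rough Python stand-in for the dump-and-diff shortcut: each table is turned into a sorted list of text lines, and the standard library's `difflib` plays the role of the diff tool. The lines are kept in memory rather than written to files, and each line is simply the `repr` of a row; both are choices made for this sketch, as are the helper names.

```python
import difflib
import sqlite3

def dump_sorted(conn, table):
    """Return the table as sorted text lines, one line per row."""
    return sorted(repr(row) + "\n" for row in conn.execute(f'SELECT * FROM "{table}"'))

def diff_tables(conn, table_a, table_b):
    """Return unified-diff lines between the two sorted dumps."""
    return list(difflib.unified_diff(
        dump_sorted(conn, table_a),
        dump_sorted(conn, table_b),
        fromfile=table_a,
        tofile=table_b,
    ))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('apple', 1), ('pear', 2);
    INSERT INTO b VALUES ('pear', 2), ('apple', 1), ('fig', 9);
""")
diff = diff_tables(conn, "a", "b")
print("".join(diff))
```

An empty diff means the two dumps are identical. Here the output includes a `+('fig', 9)` line for the row that exists only in b.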

You may need to include all the columns in your ORDER BY clause if there is no subset of columns that guarantees a unique sort order. (If there is such a subset of columns, then those columns would constitute the identity for the rows and you would use the above algorithm.)
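The sorted row-by-row comparison, with every column in the ORDER BY clause, might look something like this. It is only a sketch: the helper name `first_difference` is invented, and the column names are taken from `PRAGMA table_info` on the first table on the assumption that both tables share the same columns.

```python
import sqlite3
from itertools import zip_longest

def first_difference(conn, table_a, table_b):
    """Return (row number, row from a, row from b) at the first mismatch,
    or None if the tables hold the same rows. A missing row shows up as None.
    """
    # Column 1 of each table_info row is the column name.
    cols = [info[1] for info in conn.execute(f'PRAGMA table_info("{table_a}")')]
    order_by = ", ".join(f'"{c}"' for c in cols)
    cur_a = conn.execute(f'SELECT * FROM "{table_a}" ORDER BY {order_by}')
    cur_b = conn.execute(f'SELECT * FROM "{table_b}" ORDER BY {order_by}')
    for number, (row_a, row_b) in enumerate(zip_longest(cur_a, cur_b), start=1):
        if row_a != row_b:
            return number, row_a, row_b
    return None

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    CREATE TABLE c (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('apple', 1), ('pear', 2);
    INSERT INTO b VALUES ('pear', 2), ('apple', 1);
    INSERT INTO c VALUES ('pear', 2), ('apple', 5);
""")
same = first_difference(conn, "a", "b")
mismatch = first_difference(conn, "a", "c")
print(same)      # None
print(mismatch)  # (1, ('apple', 1), ('apple', 5))
```

Because the rows are streamed from two cursors, this approach does not need to hold either table in memory.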

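Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.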

 