
Comparing two sqlite3 tables using python

The question is a little tricky. I have two tables whose contents I want to compare. The tables have the same number of columns, the same column names, and the same column order (if there is such a thing).

The trick is that the rows can be in a different order, i.e., row 1 in table 1 may appear as row 1000 in table 2. I want to compare the contents in a way that ignores row order. Also note that there is no primary key.

I could design my own data structures, or I could use an existing library to do the job. I would prefer an existing API (if there is one). Can anyone point me in the right direction?

Make two text files. Sort them. Compare them with diff.
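If you would rather stay inside Python than shell out to sort and diff, the same idea can be sketched with the standard difflib module. This is only an illustration: the function name `sorted_line_diff` and the `name|qty` line format are made up for the example, and each table is assumed to have already been dumped as one text line per row.

```python
import difflib

def sorted_line_diff(lines_a, lines_b):
    """Sort both dumps, then return a unified diff as a list of lines.

    An empty list means the two dumps contain the same lines,
    regardless of their original order.
    """
    return list(
        difflib.unified_diff(
            sorted(lines_a),
            sorted(lines_b),
            fromfile="table_a",
            tofile="table_b",
            lineterm="",
        )
    )

# Hypothetical dumps: one "name|qty" line per row, in arbitrary order.
dump_a = ["x|1", "y|2", "z|3"]
dump_b = ["z|3", "x|1", "w|9"]

diff_lines = sorted_line_diff(dump_a, dump_b)
for line in diff_lines:
    print(line)

# Same rows in a different order produce no diff at all.
same_diff = sorted_line_diff(dump_a, ["z|3", "y|2", "x|1"])
```

Lines starting with `-` exist only in the first dump and lines starting with `+` only in the second.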

Alternatively, import them into SQLite tables. Then you can use queries like the following:

SELECT * FROM a INTERSECT SELECT * FROM b;
SELECT * FROM a EXCEPT    SELECT * FROM b;

to get rows that exist in both tables, or only in one table.
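From Python, those queries can be run through the built-in sqlite3 module. The following is a minimal sketch, not a definitive implementation: the helper name `diff_tables` and the sample tables `a` and `b` with columns `name` and `qty` are assumptions made for the example.

```python
import sqlite3

def diff_tables(conn, table_a, table_b):
    """Return (rows only in table_a, rows only in table_b)."""
    # Table names cannot be bound as SQL parameters, so they are
    # interpolated here; only pass trusted names.
    only_a = conn.execute(
        f'SELECT * FROM "{table_a}" EXCEPT SELECT * FROM "{table_b}"'
    ).fetchall()
    only_b = conn.execute(
        f'SELECT * FROM "{table_b}" EXCEPT SELECT * FROM "{table_a}"'
    ).fetchall()
    return only_a, only_b

# Hypothetical sample data: same columns, rows in a different order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('x', 1), ('y', 2), ('z', 3);
    INSERT INTO b VALUES ('z', 3), ('x', 1), ('w', 9);
""")

only_a, only_b = diff_tables(conn, "a", "b")
print("only in a:", only_a)
print("only in b:", only_b)
```

If both lists come back empty, the tables hold the same set of rows. Note that EXCEPT and INTERSECT return distinct rows, so this check does not notice a row that appears twice in one table and once in the other.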

You'd need to be more precise about how you intend to compare the tables' content and what the expected outcome is. SQLite itself is a good tool for the comparison, and you can easily query for the results you want.

If, however, these tables are located in different databases, you can dump them into a temporary database using Python's built-in sqlite3 module.
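One way to bring two database files together, without copying any rows, is SQLite's ATTACH DATABASE statement. This is a swapped-in technique rather than the literal "dump into a temporary db" described above, and the file names, the `items` table, and the `make_db` helper are all hypothetical.

```python
import os
import sqlite3
import tempfile

def make_db(path, rows):
    """Create a small sample database (for this example only)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()
    conn.close()

# Two separate database files holding the same rows in a different order.
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "first.db")
path2 = os.path.join(tmpdir, "second.db")
make_db(path1, [("x", 1), ("y", 2)])
make_db(path2, [("y", 2), ("x", 1)])

# Open the first file and attach the second under the schema name "other".
conn = sqlite3.connect(path1)
conn.execute("ATTACH DATABASE ? AS other", (path2,))

missing_in_other = conn.execute(
    "SELECT * FROM main.items EXCEPT SELECT * FROM other.items"
).fetchall()
missing_in_main = conn.execute(
    "SELECT * FROM other.items EXCEPT SELECT * FROM main.items"
).fetchall()
conn.close()

print(missing_in_other, missing_in_main)
```

Once attached, the second file's tables are addressed as `other.<table>`, so the same set queries work across both files.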

You can also dump the query results into a data collection such as a list and then perform your comparison, but then again, we can't help you if we don't know the expected outcome.
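As a rough sketch of the in-memory route, assuming the expected outcome is "same rows, same number of times, in any order", the rows can be loaded into a `collections.Counter` instead of a plain list. The helper name `table_counter` and the sample tables are invented for the example.

```python
import sqlite3
from collections import Counter

def table_counter(conn, table):
    """Count how many times each row (as a tuple) occurs in the table."""
    # The table name is interpolated; only pass trusted names.
    return Counter(conn.execute(f'SELECT * FROM "{table}"'))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    CREATE TABLE c (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('x', 1), ('x', 1), ('y', 2);
    INSERT INTO b VALUES ('y', 2), ('x', 1), ('x', 1);
    INSERT INTO c VALUES ('x', 1), ('y', 2);
""")

counts_a = table_counter(conn, "a")
counts_b = table_counter(conn, "b")
counts_c = table_counter(conn, "c")

same_ab = counts_a == counts_b   # same rows, duplicates included
same_ac = counts_a == counts_c   # c has one fewer ('x', 1)
extra_in_a = counts_a - counts_c # rows a has more often than c

print(same_ab, same_ac, extra_in_a)
```

Unlike a set comparison, the counter notices duplicate rows. It holds every distinct row in memory, so it suits tables that fit comfortably in RAM.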

You say "there is no PRIMARY KEY". Does this mean there is truly no way to establish the identity of the item represented by each row? If that is true, your problem is insoluble since you can never determine which row in one table to compare with each row in the other table.

If there is a set of columns that establish identity, then you would read each row from table 1, read the row with the same identity from table 2, and compare the non-identity columns. If you find all the table 1 rows in table 2, and the non-identity columns are identical, then you finish up with a check for table 2 rows with identities that are not in table 1.
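The identity-based comparison can be sketched as follows. The column names (`id`, `name`, `qty`), the choice of `id` as the identity, and the helper names are assumptions for illustration; the sketch also assumes the identity columns are unique within each table and that both tables fit in memory.

```python
import sqlite3

def load_by_key(conn, table, key_cols):
    """Map each row's identity tuple to the full row."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    names = [d[0] for d in cur.description]
    key_idx = [names.index(c) for c in key_cols]
    return {tuple(row[i] for i in key_idx): row for row in cur}

def compare_by_key(conn, table_a, table_b, key_cols):
    """Return (keys missing in b, keys missing in a, keys whose rows differ)."""
    rows_a = load_by_key(conn, table_a, key_cols)
    rows_b = load_by_key(conn, table_b, key_cols)
    missing_in_b = sorted(rows_a.keys() - rows_b.keys())
    missing_in_a = sorted(rows_b.keys() - rows_a.keys())
    changed = sorted(
        k for k in rows_a.keys() & rows_b.keys() if rows_a[k] != rows_b[k]
    )
    return missing_in_b, missing_in_a, changed

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT, qty INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT, qty INTEGER);
    INSERT INTO a VALUES (1, 'x', 10), (2, 'y', 20), (3, 'z', 30);
    INSERT INTO b VALUES (3, 'z', 30), (1, 'x', 11), (4, 'w', 40);
""")

missing_in_b, missing_in_a, changed = compare_by_key(conn, "a", "b", ["id"])
print(missing_in_b, missing_in_a, changed)
```

Here identity 2 exists only in `a`, identity 4 only in `b`, and identity 1 exists in both but with different non-identity values.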

If there is no identity and if you don't care about identity, but just whether the two tables would appear identical, then you would read the records from each table sorted in some particular order. Compare row 1 to row 1, row 2 to row 2, etc. When you hit a row that's different, you know the tables are not the same.

As a shortcut, you could just use SQLite to dump the data into two text files (again, ordered the same way for both tables) and compare the file contents.

You may need to include all the columns in your ORDER BY clause if there is no subset of columns that guarantees a unique sort order. (If there is such a subset of columns, then those columns would constitute the identity for the rows and you would use the above algorithm.)
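A minimal sketch of the sorted, row-by-row comparison, ordering by every column as suggested above. The helper name `first_difference` and the sample tables are hypothetical, and both tables are assumed to have the same columns in the same order.

```python
import sqlite3
from itertools import zip_longest

def first_difference(conn, table_a, table_b):
    """Return (position, row_a, row_b) for the first mismatch, or None.

    Both tables are read sorted by all of their columns, so the original
    row order does not matter. A row of None means that table ran out of
    rows first.
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    cols = [r[1] for r in conn.execute(f'PRAGMA table_info("{table_a}")')]
    order_by = ", ".join(f'"{c}"' for c in cols)
    cur_a = conn.execute(f'SELECT * FROM "{table_a}" ORDER BY {order_by}')
    cur_b = conn.execute(f'SELECT * FROM "{table_b}" ORDER BY {order_by}')
    pairs = zip_longest(cur_a, cur_b)
    for position, (row_a, row_b) in enumerate(pairs, start=1):
        if row_a != row_b:
            return position, row_a, row_b
    return None

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, qty INTEGER);
    CREATE TABLE b (name TEXT, qty INTEGER);
    CREATE TABLE c (name TEXT, qty INTEGER);
    INSERT INTO a VALUES ('x', 1), ('y', 2), ('z', 3);
    INSERT INTO b VALUES ('z', 3), ('y', 2), ('x', 1);
    INSERT INTO c VALUES ('x', 1), ('y', 5);
""")

result_ab = first_difference(conn, "a", "b")
result_ac = first_difference(conn, "a", "c")
print(result_ab, result_ac)
```

Because the rows are streamed from two cursors, this avoids loading either table fully into Python, and it stops at the first mismatch.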

