How do I get the rows missing from one table compared to another table in a different database in MySQL?
I have two tables with more than 200k rows each. They are supposed to be identical, except that one has 3 more rows than the other. I am trying to figure out which rows those are. Each table is in a different database. How can I do this in MySQL?
I tried this:
SELECT t1.*
FROM db1.tb1 t1
LEFT JOIN db2.tb2 t2 ON t1.col_13 = t2.col_13
WHERE t2.col_13 IS NULL;
but it is taking FOREVER.
EDIT
Since col_13 contains only duplicate values, this won't work. The problem is that I can't find a common primary key between the tables. The closest thing to a shared key is the datetime column, which is almost identical in both tables, but because different scripts were used to insert the data into each table, some rows differ by a second due to rounding. For example, "2015-09-01 00:00:11" and "2015-09-01 00:00:12" are the same row, but rounding gave them different seconds.
Easton's comments are correct. You are executing the query properly. Your issue isn't with joining across two databases but with query performance. To solve that problem, more details about your table structure will be required. His suggestions are good starting points though: make sure to do the join on a column that is unique and indexed in both tables. That will allow the query to execute as quickly as possible.
If you can't easily fix the problem yourself, you may be best off asking a new question, this time focusing on performance and including more details about the table structure.
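Given the edit in the question (no shared unique column, and datetimes that can differ by a second), an exact-equality join on the datetime will not line the rows up. One possible workaround is an anti-join that treats two rows as the same when their datetimes are within one second of each other. This is only a sketch: the column name `dt` is hypothetical (the question does not name the datetime column), and the one-second tolerance is an assumption taken from the example values in the question.

```sql
-- Sketch only: `dt` is a hypothetical name for the datetime column.
-- Returns rows of db1.tb1 that have no counterpart in db2.tb2
-- within +/- 1 second and with the same col_13 value.
SELECT t1.*
FROM db1.tb1 AS t1
WHERE NOT EXISTS (
    SELECT 1
    FROM db2.tb2 AS t2
    WHERE t2.dt BETWEEN t1.dt - INTERVAL 1 SECOND
                    AND t1.dt + INTERVAL 1 SECOND
      AND t2.col_13 = t1.col_13
);
```

An index on `db2.tb2 (dt)` should let each lookup be a short range scan. Note that if genuinely different rows can fall within a second of each other, the tolerance may hide a missing row, so it may be worth adding more columns to the inner `WHERE` clause. Swap the two tables to check in the other direction.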
This should take just a second or two, since your query looks bang-on, so you may need to add indexes on col_13. If this only runs once or twice, then you may not need them, but if it runs often, I would add indexes.
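As a rough illustration of that suggestion, the statements below add an index on col_13 in each table and then ask MySQL for the query plan. The index names are made up for this example, and it assumes col_13 is not already indexed.

```sql
-- Index names are hypothetical; skip a statement if an equivalent
-- index on col_13 already exists in that table.
CREATE INDEX idx_tb1_col_13 ON db1.tb1 (col_13);
CREATE INDEX idx_tb2_col_13 ON db2.tb2 (col_13);

-- Check whether the join now uses the index on db2.tb2.
EXPLAIN
SELECT t1.*
FROM db1.tb1 AS t1
LEFT JOIN db2.tb2 AS t2 ON t1.col_13 = t2.col_13
WHERE t2.col_13 IS NULL;
```

In the `EXPLAIN` output, the row for `t2` should list the new index under `key`. If `key` is NULL there, the join is still scanning the whole table for every row of `t1`, which would explain the long run time.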