Improve delete with IN performance
I am struggling to write a DELETE query in a MariaDB 5.5.44 database.
The first of the two following code samples works great, but I need to add a further WHERE condition to it, as shown in the second code sample.
I need to delete only those rows from polozkyTransakci whose puvodFaktury in the transakce_tmp table is <> 'FAKTURA VO CZ'. I thought that the WHERE clause in the second sample would work OK with the inner SELECT, but it takes forever to process (about 40 minutes in my cloud-based ETL tool), and even then it does not leave the rows I want untouched.
1.
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy';
2.
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
AND idTransakce NOT IN (
SELECT idTransakce
FROM transakce_tmp
WHERE puvodFaktury = 'FAKTURA VO CZ');
Thanks a million for any help
David
IN with a subquery can be very bad for performance. Try using NOT EXISTS() instead:
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
AND NOT EXISTS (SELECT 1
FROM transakce_tmp r
WHERE r.puvodFaktury = 'FAKTURA VO CZ'
AND r.idTransakce = polozkyTransakci.idTransakce );
Before you can performance tune, you need to figure out why it is not deleting the correct rows.
So start by writing SELECTs until you have identified the right rows. Build your SELECT a bit at a time, checking the results at each stage to see whether you are getting the results you want.
Once you have the SELECT, you can convert it to a DELETE. When testing the DELETE, do it in a transaction and run some tests on the data that is left behind to ensure it deleted properly before rolling back or committing. Since you likely want to performance tune, I would suggest rolling back, so that you can then try again with the performance-tuned version to ensure you get the same results. Of course, you only want to do this on a dev server!
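The workflow described above could look roughly like the following. This is only a sketch under assumptions: it assumes polozkyTransakci uses a transactional engine such as InnoDB (on MyISAM the ROLLBACK would not undo the delete), and it uses the NOT EXISTS form from elsewhere in this thread as the candidate DELETE.

```sql
-- Step 1: look at the rows the DELETE would remove, without deleting anything.
SELECT pt.*
FROM polozkyTransakci pt
WHERE pt.typPolozky = 'odpocetZalohy'
  AND NOT EXISTS (SELECT 1
                  FROM transakce_tmp r
                  WHERE r.puvodFaktury = 'FAKTURA VO CZ'
                    AND r.idTransakce = pt.idTransakce);

-- Step 2: run the DELETE inside a transaction (dev server only).
START TRANSACTION;

DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
  AND NOT EXISTS (SELECT 1
                  FROM transakce_tmp r
                  WHERE r.puvodFaktury = 'FAKTURA VO CZ'
                    AND r.idTransakce = polozkyTransakci.idTransakce);

-- Step 3: inspect what is left behind; only rows tied to a
-- 'FAKTURA VO CZ' transaction should remain for this typPolozky.
SELECT COUNT(*) AS remaining_rows
FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy';

-- Step 4: undo the delete so the tuned version can be tested on the same data.
ROLLBACK;
```

If the count in step 3 does not match what the SELECT in step 1 led you to expect, fix the condition before worrying about speed.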
Now while I agree that NOT EXISTS may be faster, some of the other things you want to look at are:
Performance tuning is a complex thing, and it is best to read up on it in detail in one of the performance tuning books available for your specific database.
I might be inclined to write the query as a LEFT JOIN, although I'm guessing this would have the same execution plan as NOT EXISTS:
DELETE pt
FROM polozkyTransakci pt LEFT JOIN
transakce_tmp tt
ON pt.idTransakce = tt.idTransakce AND
tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
I would recommend indexes, if you don't have them: polozkyTransakci(typPolozky, idTransakce) and transakce_tmp(idTransakce, puvodFaktury). These would work on the NOT EXISTS version as well.
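As a rough sketch, the two suggested indexes could be created like this. The index names (idx_pt_typ_transakce, idx_tt_transakce_puvod) are made up for illustration; the column lists are the ones recommended above.

```sql
-- Hypothetical index names; choose names that fit your own schema.
CREATE INDEX idx_pt_typ_transakce
    ON polozkyTransakci (typPolozky, idTransakce);

CREATE INDEX idx_tt_transakce_puvod
    ON transakce_tmp (idTransakce, puvodFaktury);

-- Confirm which indexes now exist on each table.
SHOW INDEX FROM polozkyTransakci;
SHOW INDEX FROM transakce_tmp;
```

Building an index on a large table takes time of its own, so it is worth checking with SHOW INDEX first whether an equivalent index is already in place.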
You can test the performance of these queries using SELECT:
SELECT pt.*
FROM polozkyTransakci pt LEFT JOIN
transakce_tmp tt
ON pt.idTransakce = tt.idTransakce AND
tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
The DELETE should be slower (due to the cost of logging transactions), but the performance should be comparable.
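To check whether the indexes are actually being used, the SELECT form can be prefixed with EXPLAIN. Treat this as a sketch: as far as I know, MariaDB 5.5 only accepts EXPLAIN on SELECT statements (EXPLAIN for DELETE arrived in later versions), which is one more reason to tune with the SELECT first.

```sql
-- Show the execution plan for the SELECT equivalent of the DELETE.
-- In the output, look at the "key" column (which index is chosen)
-- and the "rows" column (estimated rows examined per table).
EXPLAIN
SELECT pt.*
FROM polozkyTransakci pt LEFT JOIN
     transakce_tmp tt
     ON pt.idTransakce = tt.idTransakce AND
        tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
```

If "key" is NULL for transakce_tmp, the join is probably scanning that table for every row of polozkyTransakci, which would be consistent with a run time in the tens of minutes.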