
How to delete heavy records from a table in the most performant way?

This is the case: I have a table with 16,000 rows and a child table with 4,000,000 rows. The parent table has a column with a lot of data (it's WKT, used for geometry). I need to clean up the data periodically, and at this moment I need to delete 5,685 parent rows along with 1,400,000 child rows. I'm struggling to write the most performant query to achieve this. My current method is this:

1) Get the ids of all the parent rows that need to be deleted.

    SELECT Id, ValidTo FROM ParentTable WHERE ValidTo < someDate;

2) For each id found, I execute the following commands:

    DELETE FROM ChildTable WHERE ParentId = IdFromStepOne;

    DELETE FROM ParentTable WHERE Id = IdFromStepOne;

This takes 15 minutes for 95-100 records, so the whole job will take about 14 hours. Can this be written in a more performant way? For your information, I'm coding in .NET Core and using Entity Framework.

Thanks in advance!

Your query shows that you are looping through each id and deleting the child and parent rows one id at a time.

Use an IN clause to perform the deletes for multiple values at once.

    DELETE FROM ChildTable WHERE ParentId IN (SELECT Id FROM ParentTable WHERE ValidTo < someDate);

    DELETE FROM ParentTable WHERE Id IN (SELECT Id FROM ParentTable WHERE ValidTo < someDate);
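
As a rough illustration of the two set-based statements above, here is a minimal sketch that runs them in a single transaction. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; the asker's actual database and Entity Framework setup are not shown in the question. The function name `delete_expired`, the sample rows, and the choice to store `ValidTo` as ISO-8601 text are assumptions made for the demo.

```python
import sqlite3


def delete_expired(conn, cutoff):
    """Delete expired parents and their children with two set-based statements.

    Hypothetical helper: returns (child_rows_deleted, parent_rows_deleted).
    """
    with conn:  # one transaction: commit on success, roll back on error
        # Children go first so no child is left pointing at a deleted parent.
        children = conn.execute(
            "DELETE FROM ChildTable WHERE ParentId IN "
            "(SELECT Id FROM ParentTable WHERE ValidTo < ?)",
            (cutoff,),
        ).rowcount
        parents = conn.execute(
            "DELETE FROM ParentTable WHERE ValidTo < ?", (cutoff,)
        ).rowcount
    return children, parents


# Tiny in-memory demo with made-up rows (ValidTo stored as ISO-8601 text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT);
    CREATE TABLE ChildTable (
        Id INTEGER PRIMARY KEY,
        ParentId INTEGER REFERENCES ParentTable (Id)
    );
""")
conn.executemany(
    "INSERT INTO ParentTable (Id, ValidTo) VALUES (?, ?)",
    [(1, "2019-01-01"), (2, "2019-06-01"), (3, "2021-01-01")],
)
conn.executemany(
    "INSERT INTO ChildTable (ParentId) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,)],
)

deleted = delete_expired(conn, "2020-01-01")
print(deleted)
```

The point of the sketch is that the database receives two statements in total, instead of two statements per parent id.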

Since you need to delete rows from two tables, you will need two queries. The SELECT subquery only needs to return the Id column, not ValidTo.

I would write these queries:

    DELETE FROM ChildTable ct
    WHERE EXISTS (SELECT pt.Id FROM ParentTable pt WHERE ct.ParentId = pt.Id AND pt.ValidTo < someDate);

    DELETE FROM ParentTable
    WHERE ValidTo < someDate;

Using PL/SQL, you should be able to select the ParentTable Ids to delete only once:

    Query1 => SELECT Id FROM ParentTable WHERE ValidTo < someDate
    Query2 => DELETE FROM ChildTable WHERE ParentId IN [results of Query 1]
    Query3 => DELETE FROM ParentTable WHERE Id IN [results of Query 1]
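
One possible way to reuse the results of Query 1 is to store the ids in a temporary table and let both deletes read from it. The sketch below is not PL/SQL: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in. The temporary table name `ExpiredIds`, the function name `delete_with_id_snapshot`, and the sample rows are all hypothetical.

```python
import sqlite3


def delete_with_id_snapshot(conn, cutoff):
    """Evaluate the ValidTo filter once, then reuse the ids for both deletes."""
    with conn:  # commit on success, roll back on error
        # Hypothetical scratch table that holds the results of Query 1.
        conn.execute("CREATE TEMP TABLE ExpiredIds (Id INTEGER PRIMARY KEY)")
        # Query 1: select the parent ids a single time.
        conn.execute(
            "INSERT INTO ExpiredIds (Id) "
            "SELECT Id FROM ParentTable WHERE ValidTo < ?",
            (cutoff,),
        )
        # Query 2: delete the children of those parents.
        children = conn.execute(
            "DELETE FROM ChildTable WHERE ParentId IN (SELECT Id FROM ExpiredIds)"
        ).rowcount
        # Query 3: delete the parents themselves.
        parents = conn.execute(
            "DELETE FROM ParentTable WHERE Id IN (SELECT Id FROM ExpiredIds)"
        ).rowcount
        conn.execute("DROP TABLE ExpiredIds")
    return children, parents


# Tiny in-memory demo with made-up rows (ValidTo stored as ISO-8601 text).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, ValidTo TEXT);
    CREATE TABLE ChildTable (Id INTEGER PRIMARY KEY, ParentId INTEGER);
""")
db.executemany(
    "INSERT INTO ParentTable (Id, ValidTo) VALUES (?, ?)",
    [(10, "2018-03-01"), (11, "2022-03-01")],
)
db.executemany(
    "INSERT INTO ChildTable (ParentId) VALUES (?)",
    [(10,), (10,), (10,), (11,)],
)

snapshot_result = delete_with_id_snapshot(db, "2020-01-01")
print(snapshot_result)
```

Compared with repeating the subquery in each DELETE, this keeps the set of ids fixed between the two statements, at the cost of one extra scratch table.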

