Is there any way to do a bulk/faster delete in MySQL?

I have a table with 10 million records. What is the fastest way to delete the old rows and retain only the last 30 days?

I know this can be done with the event scheduler, but my worry is that if it takes too long, it might lock the table for a long time.

It would be great if you could suggest an optimal approach.

Thanks.

Offhand, I would:

  1. Rename the table
  2. Create an empty table with the same name as your original table
  3. Grab the last 30 days from the renamed ("temp") table and insert them back into the new table
  4. Drop the temp table

This lets you keep the table live through (almost) the entire process and copy back the past 30 days' worth of data at your leisure.
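The four steps above could be sketched roughly as follows. The names events, events_old, and created_at are hypothetical, chosen only for illustration, and the sketch assumes a date/time column that marks each row's age.

```sql
-- Hypothetical names: events (live table), events_old, created_at.

-- 1. Rename the full table out of the way.
RENAME TABLE events TO events_old;

-- 2. Create an empty table with the original name and the same structure.
--    Between these two statements the name events briefly does not exist.
CREATE TABLE events LIKE events_old;

-- 3. Copy the last 30 days back at your leisure.
INSERT INTO events
SELECT * FROM events_old
WHERE created_at >= NOW() - INTERVAL 30 DAY;

-- 4. Drop the old table, discarding everything older than 30 days.
DROP TABLE events_old;
```

If the table has an AUTO_INCREMENT key, rows written to the new table before step 3 runs may collide with the ids being copied back, so that case needs extra care.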

You could try partitioned tables.

    PARTITION BY LIST (TO_DAYS( date_field ))

This would give you one partition per day, and when you need to prune data you just:

    ALTER TABLE tbl_name DROP PARTITION p#

http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
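As a minimal sketch of daily partitions, the following uses RANGE partitioning rather than the LIST form shown above, because RANGE does not need every day's value listed explicitly. The table name, column names, and dates are all assumptions. MySQL requires the partitioning column to be part of every unique key, which is why created_at appears in the primary key.

```sql
-- Hypothetical table; created_at must be DATE or DATETIME for TO_DAYS().
CREATE TABLE events (
    id BIGINT NOT NULL AUTO_INCREMENT,
    created_at DATETIME NOT NULL,
    payload VARCHAR(255),
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p20240101 VALUES LESS THAN (TO_DAYS('2024-01-02')),
    PARTITION p20240102 VALUES LESS THAN (TO_DAYS('2024-01-03')),
    PARTITION p20240103 VALUES LESS THAN (TO_DAYS('2024-01-04'))
);

-- Add tomorrow's partition ahead of time (for example from a scheduled event).
ALTER TABLE events ADD PARTITION (
    PARTITION p20240104 VALUES LESS THAN (TO_DAYS('2024-01-05'))
);

-- Prune the oldest day: this discards the partition's rows without a row-by-row DELETE.
ALTER TABLE events DROP PARTITION p20240101;
```

With this layout, rows dated on or after the upper bound of the last partition are rejected, so new partitions have to be added before they are needed.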

To expand on Michael Todd's answer:

If you have the space,

  1. Create a blank staging table with the same structure as the table you want to reduce in size
  2. Fill the staging table with only the records you want to have in your destination table
  3. Do a double rename like the following

Assuming: table is the name of the table you want to purge a large amount of data from; newtable is the staging table name; and no other table is called temptable.

    -- table is a reserved word in MySQL, so it is quoted with backticks here
    rename table `table` to temptable, newtable to `table`;
    drop table temptable;

The two renames are performed atomically as a single statement (not a transaction in the usual sense, since RENAME TABLE is DDL), which requires only a very brief schema lock. Most high-concurrency applications won't notice the change.
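Steps 1 and 2 of the list above, which come before the double rename, might look like the sketch below. It reuses the table and newtable names from the assumptions; the created_at column and the 30-day cutoff are hypothetical stand-ins for whatever rule decides which rows to keep.

```sql
-- 1. Create a blank staging table with the same columns and indexes.
CREATE TABLE newtable LIKE `table`;

-- 2. Fill it with only the rows you want to keep (created_at is a hypothetical column).
INSERT INTO newtable
SELECT * FROM `table`
WHERE created_at >= NOW() - INTERVAL 30 DAY;
```

Rows written to the original table after the copy and before the rename are not carried over, so this fits best when writes can be paused or replayed for that short window.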

Alternatively, if you don't have the space, and you have a long window in which to purge this data, you can use dynamic SQL to insert the primary keys into a temp table, and join the temp table in a delete statement. When you insert into the temp table, be aware of the max_allowed_packet setting. Many installations of MySQL use 16MB (16777216 bytes), but the default varies by version, so check your server. Each insert command for the temp table must be smaller than max_allowed_packet. This will not lock the whole table, assuming a row-locking engine such as InnoDB.

Afterwards, you'll want to run optimize table to reclaim space for the rest of the engine to use. You probably won't be able to reclaim disk space unless you shut down the engine and move the data files.
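A rough sketch of the temp-table approach follows. The names events, id, created_at, and purge_ids are hypothetical. Note one deliberate substitution: instead of building INSERT statements in client code, it fills the temp table with INSERT ... SELECT, which keeps the key list on the server so the packet limit (max_allowed_packet) matters less.

```sql
-- Check the packet limit if you build the INSERT statements client-side instead.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Hypothetical names: events, id (primary key), created_at.
CREATE TEMPORARY TABLE purge_ids (
    id BIGINT NOT NULL PRIMARY KEY
);

-- Collect the primary keys of the rows to purge.
INSERT INTO purge_ids (id)
SELECT id FROM events
WHERE created_at < NOW() - INTERVAL 30 DAY;

-- Delete by joining against the collected keys.
DELETE e
FROM events AS e
JOIN purge_ids AS p ON p.id = e.id;

-- Rebuild the table so the freed space can be reused.
OPTIMIZE TABLE events;
```

For a very large purge, the key collection and the delete could be repeated over smaller key ranges so that each statement stays short.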

Not that it helps you with your current problem, but if this is a regular occurrence, you might want to look into a merge table: just add tables for different periods of time, and remove them from the merge table definition when they are no longer needed. Another option is partitioning, in which it is equally trivial to drop the oldest partition.
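To illustrate the merge table idea, here is a minimal sketch with one underlying table per month. All table and column names are made up, and it assumes MyISAM underlying tables with identical definitions, which the MERGE engine requires.

```sql
-- One MyISAM table per period (hypothetical names and columns).
CREATE TABLE log_2024_01 (
    created_at DATETIME NOT NULL,
    message VARCHAR(255),
    INDEX (created_at)
) ENGINE=MyISAM;

CREATE TABLE log_2024_02 LIKE log_2024_01;

-- The merge table presents the monthly tables as one; inserts go to the last one.
CREATE TABLE log_all (
    created_at DATETIME NOT NULL,
    message VARCHAR(255),
    INDEX (created_at)
) ENGINE=MERGE UNION=(log_2024_01, log_2024_02) INSERT_METHOD=LAST;

-- Later: add a new period and retire the oldest one.
CREATE TABLE log_2024_03 LIKE log_2024_01;
ALTER TABLE log_all UNION=(log_2024_02, log_2024_03);
DROP TABLE log_2024_01;
```

Dropping a whole monthly table removes that month's rows without deleting them one by one.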

Shut down your resource, SELECT .. INTO OUTFILE, parse the output, delete the table, then LOAD DATA LOCAL INFILE optimized_db.txt. Re-creating is cheaper than updating.
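One possible reading of that export-and-reload sequence is sketched below. The table name, column name, and file path are assumptions. The sketch filters in the WHERE clause instead of parsing the file afterwards, and it uses plain LOAD DATA INFILE because SELECT ... INTO OUTFILE writes the file on the server host; the LOCAL variant would read it from the client machine instead.

```sql
-- Export only the rows to keep. The file must not already exist, and the
-- server's secure_file_priv setting may restrict where it can be written.
SELECT * FROM events
WHERE created_at >= NOW() - INTERVAL 30 DAY
INTO OUTFILE '/tmp/optimized_db.txt';

-- Empty the table in one step rather than deleting row by row.
TRUNCATE TABLE events;

-- Reload the kept rows.
LOAD DATA INFILE '/tmp/optimized_db.txt' INTO TABLE events;
```

The table is empty between the TRUNCATE and the end of the load, which is why the application should be offline while this runs.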
