Changing Large MySQL InnoDB Tables
Adding a new column or a new index can take hours or even days on large InnoDB tables in MySQL with more than 10 million rows. What is the best way to improve performance on large InnoDB tables in these two cases? More memory, tweaking the configuration (for example, increasing sort_buffer_size or innodb_buffer_pool_size), or some kind of trick?
Instead of altering a table directly, one could create a new one, change it, and copy the old data into the new one, like this, which is useful for ISAM tables and multiple changes:
CREATE TABLE tablename_tmp LIKE tablename;
ALTER TABLE tablename_tmp ADD fieldname fieldtype;
INSERT INTO tablename_tmp SELECT * FROM tablename;
ALTER TABLE tablename RENAME tablename_old;
ALTER TABLE tablename_tmp RENAME tablename;
Is this also recommended for InnoDB tables, or is it just what the ALTER TABLE command does anyway?
Edit 2016: we've recently (August 2016) released gh-ost, and I've modified my answer to reflect it.
Today there are several tools that let you do an online ALTER TABLE for MySQL. These include gh-ost, the openark-kit tool, the Percona tool, and Facebook's tool.
A large table will take a long time to ALTER. innodb_buffer_pool_size is important, and so are other variables, but on a very large table they are all negligible. It just takes time.
What MySQL does to ALTER a table is create a new table with the new format, copy all rows, then switch over. During this time the table is completely locked.
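The copy-and-lock behavior described above is what older servers do. As a hedged aside, assuming MySQL 5.6 or later (an assumption, not something the answer states), ALTER TABLE accepts ALGORITHM and LOCK clauses that request an in-place change without blocking writes. The index name idx_fieldname below is hypothetical.

```sql
-- Assumes MySQL 5.6+; idx_fieldname is a hypothetical index name.
-- If the server cannot perform the change this way, the statement
-- returns an error instead of silently taking a stronger lock.
ALTER TABLE tablename
  ADD INDEX idx_fieldname (fieldname),
  ALGORITHM=INPLACE, LOCK=NONE;
```

Even when this is accepted, the change still has to process every row, so it remains slow on a very large table; it only changes how much the table is blocked while that happens.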
Your own suggestion will most probably perform the worst of all the options. Why is that? Because you're using an InnoDB table, the INSERT INTO tablename_tmp SELECT * FROM tablename makes for a transaction. A huge transaction. It will create even more load than the normal ALTER TABLE.
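One way to avoid a single huge transaction is to copy in bounded ranges. This is only a sketch under assumptions: it supposes tablename has an integer primary key named id and columns col_a and col_b (all hypothetical names), that tablename_tmp already exists with the new column, and that autocommit is on so each statement commits on its own.

```sql
-- Hypothetical columns: id (integer primary key), col_a, col_b.
-- Each statement copies one bounded range, so each transaction stays small.
INSERT INTO tablename_tmp (id, col_a, col_b)
SELECT id, col_a, col_b
FROM tablename
WHERE id > 0 AND id <= 10000;

-- Next range; continue until the highest id in tablename has been copied.
INSERT INTO tablename_tmp (id, col_a, col_b)
SELECT id, col_a, col_b
FROM tablename
WHERE id > 10000 AND id <= 20000;
```

Chunking only reduces the size of each transaction. It does nothing about rows that are written to tablename while the copy is running.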
Moreover, you will have to shut down your application for the duration so that it does not write (INSERT, DELETE, UPDATE) to your table. If it does, your whole transaction is pointless.
The tools do not all work alike. However, the basics are shared, and the final step is to swap the two tables with RENAME. The openark-kit tool has been in use for 3.5 years now. The Percona tool is a few months old, but possibly more tested than the former. Facebook's tool is said to work well for Facebook, but does not provide a general solution for the average user. I haven't used it myself.
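To make the general idea concrete, here is a rough sketch of how a trigger-based online change could look in plain SQL. It is not the implementation of any of the tools named above. The columns id (an integer primary key that is never updated) and col_a, the trigger names, and the INT NULL column type are all assumptions made for illustration.

```sql
-- Build the altered copy while it is still empty.
CREATE TABLE tablename_tmp LIKE tablename;
ALTER TABLE tablename_tmp ADD fieldname INT NULL;  -- hypothetical type

-- Propagate ongoing writes from the original table to the copy.
CREATE TRIGGER tablename_ai AFTER INSERT ON tablename
FOR EACH ROW
  REPLACE INTO tablename_tmp (id, col_a) VALUES (NEW.id, NEW.col_a);

CREATE TRIGGER tablename_au AFTER UPDATE ON tablename
FOR EACH ROW
  REPLACE INTO tablename_tmp (id, col_a) VALUES (NEW.id, NEW.col_a);

CREATE TRIGGER tablename_ad AFTER DELETE ON tablename
FOR EACH ROW
  DELETE FROM tablename_tmp WHERE id = OLD.id;

-- Copy existing rows one range at a time (repeat for each range).
-- INSERT IGNORE skips rows that a trigger has already written.
INSERT IGNORE INTO tablename_tmp (id, col_a)
SELECT id, col_a FROM tablename WHERE id > 0 AND id <= 10000;

-- Swap both names in one atomic statement, then clean up the triggers,
-- which stay attached to the table now called tablename_old.
RENAME TABLE tablename TO tablename_old, tablename_tmp TO tablename;
DROP TRIGGER tablename_ai;
DROP TRIGGER tablename_au;
DROP TRIGGER tablename_ad;
```

The sketch leaves out the locking, throttling, and error handling that a real tool needs, which is a good reason to use one of the tools rather than hand-rolling this.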
Edit 2016: gh-ost is a triggerless solution, which significantly reduces write load on the master, decoupling the migration write load from the normal load. It is auditable, controllable, testable. We've developed it internally at GitHub and released it as open source; we're doing all our production migrations via gh-ost today.
Each tool has its own limitations; look closely at the documentation.
The conservative way is to use Active-Passive Master-Master replication: do the ALTER on the standby (passive) server, then switch roles and do the ALTER again on what used to be the active server, now turned passive. This is also a good option, but it requires an additional server and deeper knowledge of replication.
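A minimal sketch of the statements for one round of the replication approach, run on the passive server. Turning off sql_log_bin for the session is my assumption about how to keep the ALTER from replicating to the active server; it requires a suitably privileged account, and INT NULL is a hypothetical column type.

```sql
-- Run on the passive server only.
SET SESSION sql_log_bin = 0;   -- assumption: keep this ALTER out of the binary log
ALTER TABLE tablename ADD fieldname INT NULL;
SET SESSION sql_log_bin = 1;

-- Then switch roles and run the same three statements on the
-- former active server, which is now the passive one.
```

Between the two rounds the servers have different table definitions, so it is worth checking that replication keeps running with the schema difference before switching roles.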
Rename screws up referencing tables. If you have, say, table_2, which is a child of tablename, then on ALTER TABLE tablename RENAME tablename_old; table_2 will start pointing to tablename_old. Without altering table_2 you cannot point it back to tablename. You have to keep making alters in every child and referencing table.
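To see which child tables are affected after such a rename, you can query information_schema and then re-create each foreign key. The constraint name fk_table_2_parent and the columns parent_id and id are hypothetical; substitute the values the query returns.

```sql
-- List foreign keys that now reference the renamed table.
SELECT TABLE_NAME, CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME = 'tablename_old';

-- Point one child back at the new table (hypothetical names).
ALTER TABLE table_2 DROP FOREIGN KEY fk_table_2_parent;
ALTER TABLE table_2 ADD CONSTRAINT fk_table_2_parent
  FOREIGN KEY (parent_id) REFERENCES tablename (id);
```

Re-adding the constraint is itself an ALTER on the child table, so on a large child it carries the same cost this whole question is about.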