
Changing Large MySQL InnoDB Tables

Adding a new column or a new index can take hours or even days on large InnoDB tables in MySQL with more than 10 million rows. What is the best way to improve performance on large InnoDB tables in these two cases? More memory, tweaking the configuration (for example, increasing sort_buffer_size or innodb_buffer_pool_size), or some kind of trick?

Instead of altering a table directly, one could create a new table, change it, and copy the old data into the new one, like this (which is useful for ISAM tables and multiple changes):

CREATE TABLE tablename_tmp LIKE tablename;
ALTER TABLE tablename_tmp ADD fieldname fieldtype;
INSERT INTO tablename_tmp SELECT * FROM tablename;
ALTER TABLE tablename RENAME tablename_old;
ALTER TABLE tablename_tmp RENAME tablename;

Is this recommended for InnoDB tables too, or is it just what the ALTER TABLE command does anyway?

Edit 2016: we've recently (August 2016) released gh-ost; I've modified my answer to reflect it.

Today there are several tools that let you do an online ALTER TABLE for MySQL. These are gh-ost, the openark-kit tool, the Percona tool, and Facebook's tool, all discussed below.

Let's consider the "normal" `ALTER TABLE`:

A large table will take a long time to ALTER. innodb_buffer_pool_size is important, and so are other variables, but on a very large table their effect is negligible. It just takes time.

To ALTER a table, MySQL creates a new table with the new format, copies all rows, then switches over. During this time the table is completely locked.

Consider your own suggestion:

It will most probably perform worst of all the options. Why is that? Because you're using an InnoDB table, the INSERT INTO tablename_tmp SELECT * FROM tablename runs as a single transaction: a huge transaction. It will create even more load than the normal ALTER TABLE.

Moreover, you will have to shut down your application for the duration so that it does not write (INSERT, DELETE, UPDATE) to your table. If it does, the copy misses those changes and your whole transaction is pointless.

What the online tools provide

The tools do not all work alike. However, the basics are shared:

  • They create a "shadow" table with the altered schema.
  • They create and use triggers to propagate changes from the original table to the shadow table.
  • They slowly copy all the rows from your table to the shadow table. They do so in chunks: say, 1,000 rows at a time.
  • They do all of the above while you are still able to access and manipulate the original table.
  • When satisfied, they swap the two, using a RENAME.
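To make the steps above concrete, here is a rough hand-written sketch of the trigger-based approach. It is not what any of the tools literally runs. The `id` primary key, the `col_a` column, the `INT NULL` type of the new column, and the trigger names are all assumptions made up for illustration.

```sql
-- Sketch only. Assumes tablename has an integer primary key `id`
-- and one other column `col_a`; all names here are hypothetical.

-- 1. Shadow table with the altered schema (fast, the table is empty).
CREATE TABLE tablename_shadow LIKE tablename;
ALTER TABLE tablename_shadow ADD fieldname INT NULL;

-- 2. Triggers that replay writes on the original onto the shadow table.
--    Simplification: assumes UPDATEs never change the primary key.
CREATE TRIGGER tablename_ai AFTER INSERT ON tablename FOR EACH ROW
  REPLACE INTO tablename_shadow (id, col_a) VALUES (NEW.id, NEW.col_a);
CREATE TRIGGER tablename_au AFTER UPDATE ON tablename FOR EACH ROW
  REPLACE INTO tablename_shadow (id, col_a) VALUES (NEW.id, NEW.col_a);
CREATE TRIGGER tablename_ad AFTER DELETE ON tablename FOR EACH ROW
  DELETE FROM tablename_shadow WHERE id = OLD.id;

-- 3. Copy existing rows in chunks. Each statement is its own small
--    transaction. INSERT IGNORE skips rows the triggers already copied.
INSERT IGNORE INTO tablename_shadow (id, col_a)
  SELECT id, col_a FROM tablename WHERE id BETWEEN 1 AND 1000;
-- ...repeat with the next id range until the highest id has been passed.

-- 4. Swap both names in one atomic statement.
RENAME TABLE tablename TO tablename_old, tablename_shadow TO tablename;

-- 5. The triggers stay attached to the renamed original; drop them.
DROP TRIGGER tablename_ai;
DROP TRIGGER tablename_au;
DROP TRIGGER tablename_ad;
```

The real tools add throttling, chunk sizing, and safety checks on top of this, so treat the sketch as a way to understand them, not as a replacement.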

The openark-kit tool has been in use for 3.5 years now. The Percona tool is a few months old, but possibly more tested than the former. Facebook's tool is said to work well for Facebook, but does not provide a general solution for the average user. I haven't used it myself.

Edit 2016: gh-ost is a triggerless solution, which significantly reduces write load on the master, decoupling the migration write load from the normal load. It is auditable, controllable, testable. We developed it internally at GitHub and released it as open source; we're doing all our production migrations via gh-ost today.

Each tool has its own limitations; read the documentation closely.

The conservative way

The conservative way is to use Active-Passive Master-Master replication: do the ALTER on the standby (passive) server, then switch roles and do the ALTER again on what used to be the active server, now turned passive. This is also a good option, but it requires an additional server and deeper knowledge of replication.
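One common way to keep the ALTER on the passive server from replicating to the active one is to turn off binary logging for that session only. This is a sketch under assumptions: the column type is made up, your account needs the privilege to change sql_log_bin, and you should check that the new schema is still compatible with the events replicating in from the active server.

```sql
-- Run on the passive server only. Sketch; the column type is hypothetical.
SET SESSION sql_log_bin = 0;   -- statements in this session are not written to the binary log
ALTER TABLE tablename ADD fieldname INT NULL;
SET SESSION sql_log_bin = 1;   -- restore normal logging for this session
```

After switching roles, the same three statements would be run on the former active server.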

Rename screws up tables that reference the renamed one.

If you have, say, table_2, which is a child of tablename, then on ALTER TABLE tablename RENAME tablename_old; table_2 will start pointing to tablename_old.

Now, without altering table_2, you cannot point it back to tablename. You have to go on making alters in every child and referenced table.
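For illustration, re-pointing one child table might look like the sketch below. The constraint name `fk_table_2_tablename` and the columns `tablename_id` and `id` are hypothetical; SHOW CREATE TABLE shows the real constraint name.

```sql
-- Sketch only: constraint and column names are made up.
-- Find the actual foreign key name and its current target first.
SHOW CREATE TABLE table_2;

-- Drop the constraint that now references tablename_old...
ALTER TABLE table_2 DROP FOREIGN KEY fk_table_2_tablename;

-- ...and recreate it against the new tablename.
ALTER TABLE table_2 ADD CONSTRAINT fk_table_2_tablename
  FOREIGN KEY (tablename_id) REFERENCES tablename (id);
```

This has to be repeated for every child table, and each of those ALTERs can itself be expensive on a large table.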
