简体   繁体   English

为什么TokuDB和InnoDB插入比MyISAM这么慢

[英]Why TokuDB and InnoDB insert is so slow compared to MyISAM

I have prepared the following SQL statements to compare the performance behavior of MyISAM, InnoDB, and TokuDB (INSERT is executed for 100000 times): 我准备了以下SQL语句来比较MyISAM,InnoDB和TokuDB的性能行为(INSERT执行了100000次):

MyISAM: MyISAM:

CREATE TABLE `testtable_myisam` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `testtable_myisam` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000)); 

InnoDB: InnoDB:

CREATE TABLE `testtable_innodb` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `testtable_innodb` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));

TokuDB: TokuDB:

CREATE TABLE `testtable_tokudb` (`id` bigint(20) NOT NULL AUTO_INCREMENT, `value1` INT DEFAULT NULL, `value2` INT DEFAULT NULL, PRIMARY KEY (`id`), KEY `index1` (`value1`)) ENGINE=TokuDB DEFAULT CHARSET=utf8;

INSERT INTO `testtable_tokudb` (`value1`, `value2`) VALUES (FLOOR(RAND() * 1000), FLOOR(RAND() * 1000));

At the beginning, the INSERT performance of InnoDB is almost 50 times slower than MyISAM, and TokuDB is 40 times slower than MyISAM. 最初,InnoDB的INSERT性能比MyISAM慢50倍,而TokuDB则比MyISAM慢40倍。

Then I figure out the setting of "innodb-flush-log-at-trx-commit=2" on InnoDB, to make its INSERT behavior similar with MyISAM. 然后,我在InnoDB上确定“ innodb-flush-log-at-trx-commit = 2”的设置,以使其插入行为与MyISAM相似。

The question is, what should I do on the TokuDB? 问题是,我应该在TokuDB上做什么? I bet the poor INSERT performance of TokuDB is also caused by some inproper setting, but I cannot figure out the reason. 我敢打赌,TokuDB的INSERT性能不佳也是由某些不正确的设置引起的,但我不知道原因。

--------- UPDATE --------- ---------更新---------

Thanks to tmcallaghan's comments, I have modified my setting into "tokudb_commit_sync=OFF", now the insert rate of TokuDB on small dataset seems to be meaningful (I will execute them on large dataset once I figure out the following problem): 感谢tmcallaghan的评论,我将设置修改为“ tokudb_commit_sync = OFF”,现在TokuDB在小型数据集上的插入率似乎很有意义(一旦找出以下问题,我将在大型数据集上执行它们):

However, the select performance of TokuDB is still wired compared to MyISAM and InnoDB with following SQL (wherein the ? is replaced by a different Int by my simulator): 但是,与具有以下SQL的MyISAM和InnoDB相比,TokuDB的选择性能仍然是有线的(其中,?由我的模拟器替换为不同的Int):

SELECT id, value1, value2 FROM testtable_myisam WHERE value1=?; 
SELECT id, value1, value2 FROM testtable_innodb WHERE value1=?; 
SELECT id, value1, value2 FROM testtable_tokudb WHERE value1=?; 

Upon a million dataset, each 10k SELECT statments cost 10 and 15 seconds by MyISAM and InnoDB individually, but TokuDB requires about 40 seconds. 对于一百万个数据集,MyISAM和InnoDB分别花费每10k SELECT语句花费10和15秒,而TokuDB则需要40秒。

Did I miss some other settings? 我还错过其他设置吗?

Thanks in advance! 提前致谢!

This doesn't sound like a very interesting test (100,000 rows is not a lot, and your insertions are not concurrent), but here is the setting you are looking for. 这听起来不像是一个非常有趣的测试(100,000行不是很多,并且您的插入不是并发的),但这是您要寻找的设置。

Issuing "set tokudb_commit_sync=0;" 发出“设置为tokudb_commit_sync = 0;” will turn off fsync() on commit operations. 将在提交操作上关闭fsync()。 Note that there are no durability guarantees in this mode. 请注意,在这种模式下没有耐用性保证。

As I mentioned prior, TokuDB's strength is indexing data that is significantly larger than RAM, and this test is not. 如前所述,TokuDB的优势在于索引比RAM大得多的数据,而这项测试并非如此。

The reason why transactional engines are slower is because they force the hard disk to confirm it wrote the data down. 事务引擎较慢的原因是因为它们迫使硬盘确认它已将数据记下来。 For the HDD to write data down, it has to position the head over the magnetic disk plate and stream the data. 为了使HDD向下写入数据,它必须将磁头放置在磁盘板上,并传输数据。 Each transaction means that the disk will position the magnetic needle over the head, write the data down and tell the OS that it's there for sure. 每笔交易都意味着磁盘会将磁针放在磁头上方,将数据写下来,并告诉操作系统它肯定在那里。

The reason transactional engines do that is so they can conform to the D part of ACID. 事务引擎这样做的原因是,它们可以符合ACID的D部分。 They ensure you that data you wanted to be written down, is, in fact, written down permanently. 它们可确保您实际上要永久记录要记录的数据。 MyISAM doesn't do that. MyISAM不这样做。

Thus, the speed of insert is proportional to the number of Input Output Operations per Second (IOPS) of the hard disk. 因此,插入速度与硬盘的每秒输入输出操作数(IOPS)成正比。 That also means, if you wrap several queries in one transaction, you can exploit the write speed bandwith of the mentioned drives. 这也意味着,如果将多个查询包装在一个事务中,则可以利用上述驱动器的写入速度带宽。 Also, that implies that drives with high IOPS (SSD for example, have 40+ thousand IOPS and mechanical ones range at about 250 - 300, but don't take my word for exact numbers). 同样,这意味着具有高IOPS的驱动器(例如,SSD具有40+千次IOPS,而机械IOPS的范围约为250-300,但是我不相信确切的数字)。

Long story short, if you want really fast inserts using transactional engines - wrap multiple queries in a single transaction. 长话短说,如果您想使用事务引擎真正快速插入-将多个查询包装在一个事务中。 All the "optimizations" you do are slightly violating the D part of ACID, because the engines will try to exploit various fast memories lying around that can be used as buffers. 您所做的所有“优化”都略微违反了ACID的D部分,因为引擎将尝试利用周围的各种快速存储器作为缓冲区。 That means, if something goes wrong, such as you lose power - kiss your data goodbye. 这意味着,如果出现问题(例如断电),请告别数据。

Also, the tests conducted by you are actually bad because they're on small scale. 而且,您进行的测试实际上是不好的,因为它们规模很小。 Both InnoDB and especially TokuDB are designed to contain hundreds of gigabytes of data and to offer linear performance. InnoDB尤其是TokuDB均设计为包含数百GB的数据并提供线性性能。

I have updated my.cnf into something below, now the overall performance looks better. 我已经将my.cnf更新为以下内容,现在整体性能看起来更好。

For 10k times of SELECT from MyISAM, it takes 4 seconds, whereby InnoDB takes 5 seconds, and TokuDB takes 8 seconds. 对于MyISAM进行SELECT的10k次,它需要4秒,而InnoDB需要5秒,而TokuDB需要8秒。 So can I conclude under below configuration, TokuDB is behaving similar (even not necessary better) with MyISAM and InnoDB. 因此,我可以在以下配置下得出结论,TokuDB与MyISAM和InnoDB的行为相似(甚至不一定更好)。

Indeed, I am curious about tons of performance comparison between InnoDB vs TokuDB, but not MyISAM vs TokuDB, why? 确实,我对InnoDB与TokuDB之间的大量性能比较感到好奇,但对MyISAM与TokuDB之间的性能比较却感到好奇,为什么?


tokudb_commit_sync=0

max_allowed_packet = 1M
table_open_cache = 128
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8

innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size = 2G
innodb_additional_mem_pool_size = 20M
innodb_log_buffer_size = 8M
innodb_lock_wait_timeout = 50

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM