
MySQL Insert Query Optimization

Which of the two methods below will be faster for inserting a large number of rows into a table?

Query method 1: Execute the queries one by one.

INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'A', '9999999999');
INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'B', '9999999999');
INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'C', '9999999999');

Query method 2: Execute everything as a single query.

INSERT INTO tbl_user(id, name, number) VALUES(NULL, 'A', '9999999999'),
                                             (NULL, 'B', '9999999999'), 
                                             (NULL, 'C', '9999999999');

Since there is some disagreement, I thought I would try a benchmark. But first, the table definition:

CREATE TABLE `tbl_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `number` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

I then generated SQL queries of the form shown in the question with 2 lines of Python.
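The generator script itself is not shown in the answer. A minimal sketch of what it might look like follows; the function names `single_row_inserts` and `multi_row_insert` are hypothetical, and the literal values are the ones used in the benchmark scenarios.

```python
# Hypothetical helpers, not the author's actual script.
# Values are interpolated directly into the SQL text, which is only
# acceptable here because the benchmark data is fixed and trusted.

def single_row_inserts(n, name="A", number="9999999"):
    """Return n identical single-row INSERT statements, one per line."""
    stmt = f"INSERT INTO tbl_user VALUES(NULL,'{name}','{number}');"
    return "\n".join([stmt] * n)


def multi_row_insert(n, name="A", number="9999999"):
    """Return one INSERT statement carrying n VALUES lists."""
    row = f"(NULL,'{name}','{number}')"
    return "INSERT INTO tbl_user VALUES" + ",\n".join([row] * n) + ";"
```

Writing the returned string to a file gives a script that can be fed to the mysql client, for example `open("inserts.sql", "w").write(multi_row_insert(1000))`.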

Scenario 1:
Many single-row inserts, each query exactly the same:

INSERT INTO tbl_user VALUES(NULL,'A','9999999');
INSERT INTO tbl_user VALUES(NULL,'A','9999999');

1000 rows: mean running time of three executions, 45.80 seconds
5000 rows: single run, 220 seconds
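The answer does not say how the timings were taken. One plausible way to obtain a "mean of three executions" figure is sketched below; `mean_runtime` and its `run` argument are hypothetical names, and `run` stands for whatever callable actually executes the SQL script.

```python
import time
from statistics import mean


def mean_runtime(run, repeats=3):
    """Call run() `repeats` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()  # assumption: run() executes the whole SQL script once
        timings.append(time.perf_counter() - start)
    return mean(timings)
```

For a fair comparison the table would presumably be emptied between runs, though the answer does not state whether that was done.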

Scenario 2:
A single query that inserts 1000 rows. It looks like this:

INSERT INTO tbl_user VALUES(NULL,'A','9999999'),
(NULL,'A','9999999'),
(NULL,'A','9999999'),
(NULL,'A','9999999'),

1000 rows: mean running time of three executions, 0.17 seconds
5000 rows: mean running time of three executions, 0.48 seconds
10000 rows: mean running time of three executions, 1.06 seconds

Scenario 3:
Similar to scenario 1, but with START TRANSACTION and COMMIT wrapped around the insert statements.
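A sketch of how such a script could be generated is shown here; the function name `inserts_in_transaction` is hypothetical. A likely explanation for the speedup, though the answer does not state it, is that under autocommit every single-row INSERT is its own transaction and is committed separately, whereas here all rows are committed once.

```python
def inserts_in_transaction(n, name="A", number="9999999"):
    """Return n single-row INSERTs wrapped in one explicit transaction."""
    # Fixed, trusted benchmark values only; do not build SQL like this
    # from user input.
    stmt = f"INSERT INTO tbl_user VALUES(NULL,'{name}','{number}');"
    lines = ["START TRANSACTION;"] + [stmt] * n + ["COMMIT;"]
    return "\n".join(lines)
```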

1000 rows: mean running time of three executions, 0.16 seconds
5000 rows: mean running time of three executions, 0.48 seconds
10000 rows: mean running time of three executions, 0.91 seconds

Conclusion:
Scenario 2, which is what the two other answers propose, does outperform scenario 1 by a wide margin. With this data, though, it is hard to choose between scenarios 2 and 3. More rigorous testing with a larger number of inserts is required. Without that information I would probably go with scenario 3, because parsing a very large string usually has its overhead, and so does preparing one! I suspect that if we tried to insert about 50,000 records at once in a single statement, it might actually be a lot slower.
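One way to test that suspicion would be to split the rows into multi-row statements of a chosen size and time several sizes against each other. The sketch below is an illustration only; `batched_inserts` and `batch_size` are hypothetical names, and no batch size is claimed to be optimal.

```python
def batched_inserts(rows, batch_size):
    """Yield one multi-row INSERT per batch of at most batch_size rows.

    rows is a list of (name, number) string pairs with trusted values.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        values = ",\n".join(
            f"(NULL,'{name}','{number}')" for name, number in chunk
        )
        yield f"INSERT INTO tbl_user VALUES{values};"
```

Wrapping the generated statements in a single START TRANSACTION and COMMIT would combine the approaches of scenarios 2 and 3.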

The second method (query) is faster than the first one.

The first method executes three separate queries against the table, whereas the second executes only one query to insert multiple records.

You will see a major difference when you insert hundreds of rows at a time.

The second query is much faster than the first. According to the MySQL documentation, the factors that contribute to the better performance of multi-row inserts in a single statement are:

9.2.2.1 Speed of INSERT Statements

To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.

The time required for inserting a row is determined by the following factors, where the numbers indicate approximate proportions:

Connecting: (3)

Sending query to server: (2)

Parsing query: (2)

Inserting row: (1 × size of row)

Inserting indexes: (1 × number of indexes)

Closing: (1)
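The proportions above can be turned into a rough back-of-the-envelope comparison. The sketch below is a toy model and not something the manual provides: the function names are hypothetical, Connecting and Closing are assumed to be paid once per connection, and Sending and Parsing are assumed to be a fixed cost per statement (in reality they grow with statement length).

```python
# Approximate proportions quoted in the list above.
CONNECT, SEND, PARSE, CLOSE = 3, 2, 2, 1


def single_row_cost(n_rows, row_size=1, n_indexes=1, reconnect=False):
    """Relative cost of n_rows separate single-row INSERT statements."""
    per_row = row_size + n_indexes           # inserting row + inserting indexes
    per_statement = SEND + PARSE             # assumed fixed per statement
    connections = n_rows if reconnect else 1
    return connections * (CONNECT + CLOSE) + n_rows * (per_statement + per_row)


def multi_row_cost(n_rows, row_size=1, n_indexes=1):
    """Relative cost of one INSERT statement with n_rows VALUES lists."""
    # ASSUMPTION: send and parse are counted once for the whole statement.
    return CONNECT + CLOSE + SEND + PARSE + n_rows * (row_size + n_indexes)
```

Under these assumptions, `single_row_cost(1000)` is 6004 and `multi_row_cost(1000)` is 2008, roughly a threefold difference. The far larger gap measured in scenario 1 is not captured by these proportions.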

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster.
