简体   繁体   中英

Why do INNODB bulk inserts cause the auto increment primary key to skip values?

I am familiar with the fact that failed inserts in InnoDB leads to gaps in an auto-incremented primary key, I believe this is unrelated (it's not obvious to me). The problem that I am having is that I will insert 5 records into a table with an auto incremented primary key which cause the primary key to have a gap from 5 to 8. Here is a script to repeat the problem.

DROP TABLE IF EXISTS `test_table`;
CREATE TABLE `test_table` (
`A` int(10) unsigned NOT NULL AUTO_INCREMENT,
`B` int(10) unsigned NOT NULL,
PRIMARY KEY (`A`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

insert into `test_table` (`B`)
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5;

insert into `test_table` (`B`)
SELECT 6
UNION
SELECT 7
UNION
SELECT 8
UNION
SELECT 9
UNION
SELECT 10;

SELECT * FROM `test_table`;

I would expect both columns A and B to be the same in this case, but at the beginning of the second insert A will be 8, and B will be 6. Does anyone know why this gap occurs?

I've looked at the documentation here: http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html but these lock modes do not seem related to this gap (I could be missing something).

For some reason 'INSERT INTO test_table ( B ) SELECT 1 UNION SELECT 2' sets AUTO_INCREMENT = 4, but this inserts work as expected -

INSERT INTO `test_table` (`B`)
VALUES (1), (2), (3), (4), (5);

INSERT INTO `test_table` (`B`)
VALUES (6), (7), (8), (9), (10);

Try to set innodb_autoinc_lock_mode = 0 and run your script again, there won't be gaps.

AUTO_INCREMENT Handling in InnoDB .

I'd think the query analyzer was asked for a rough estimate on how many rows would be returned by the inner query, and it was slightly off.

The database engine needs to strive for three conflicting goals:

  • the transaction needs to be atomic, so any numbers generated from the sequence in the transaction must be consecutive.
  • the sequence should not be locked for an extended amount of time (otherwise, other INSERT operations would be blocked), and
  • the inner query should not need to realize the full result table, as that may be rather large.

As noone guaranteed sequence numbers to be consecutive globally, the best approach is to get an upper bound of the number of rows that have to be inserted, then increment the sequence (atomically) by that amount and use the block so reserved for the INSERT operation.

This guarantees that a concurrent INSERT will have either smaller or larger primary key values (ie the queries are properly serialized), but the data can be added in between rows (as there is no ordering unless you use an ORDER BY clause)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM