简体   繁体   中英

MySQL table - how to do the INSERTing fast on a billion row table?

I have a table with around 20 Million rows that will eventually be around 1 billion rows as I populate it.

I've noticed that the Inserting became really slow and I am barely at 2% of the completion.

The create table statement is:

    CREATE TABLE `slots` (
 `customerid` int(11) NOT NULL,
 `orderid` int(11) NOT NULL,
 `queueid` int(11) NOT NULL AUTO_INCREMENT,
 `item_id` int(3) NOT NULL,
 `variable1` int(3) NOT NULL,
 `variable2` int(3) NOT NULL,
 `variable3` int(3) NOT NULL,
 `variable4` int(3) NOT NULL,
 `variable5` int(3) NOT NULL,
 `variable6` int(3) NOT NULL,
 `variable7` tinyint(1) NOT NULL,
 `variable8` tinyint(1) NOT NULL,
 `variable9` tinyint(1) NOT NULL,
 PRIMARY KEY (`customerid`,`orderid`,`queueid`),
 UNIQUE KEY `queueid` (`queueid`),
 KEY `orderid` (`orderid`)
) ENGINE=InnoDB AUTO_INCREMENT=25883472 DEFAULT CHARSET=latin1

Right now the inserting PHP script works like this:

  1. Get the data.
  2. Create 500-1000 rows INSERT IGNORE statement ( to insert them in one "go" and not by having 500-1000 separate INSERT operations ).
  3. Execute the INSERT IGNORE query.

I am stuck at 22 INSERTs per minute( times 500-1000 - which equals around 11000-22000 rows inserted per minute or about 150-350 row inserts per second ) which I think could be faster? At this rate it would take me months to gather all of the data and I need to do it in a few days...

I am gathering the data via API - maybe I could get it all into a big file and then somehow import it in one batch via something else than batch INSERTs?

I am really puzzled with this and would really appreciate help of someone knowledgeable.

This is too long for a comment.

Drop the indexes and add them back again after you have loaded the table. It will take a while to build, but the loads will be much faster.

You should "get" the table as sets of files. Load the files individually, either directly into the table or into staging tables. If the latter, you should do a single insert statement as something like:

insert into t(...)
   select . . .
   from staging_table;

(there might be a where clause as well).

What I really don't understand is why you have a three part primary key, one of whose components is an auto-incremented column. It seems that queue_id should be the primary key (and that would further lead to the observation that the table should be called queues or queue_id should really be slot_id ).

To use the multiple-row insert syntax is the most performant way is

INSERT INTO table (field, field2, etc) VALUES (1, 2, etc),(1, 2, etc),(1, 2, etc);

Manual :

INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

The values list for each row must be enclosed within parentheses.

Edited.

See here, there are nice information about this: http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html

Including

  • Using multiple VALUES lists
  • Leveraging the INSERT DELAYED feature
  • "Concurrent Inserts"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM