Large background updates with MySQL

I'm building a web service, backed by MySQL, which caches and indexes data from an outside source at regular intervals (say, twice a day). The update routine is the only thing which modifies the cached data; to the rest of the service, this data is read-only. Additionally, the data is retrieved through multiple HTTP requests to the external source. The number of requests is proportional to the amount of data retrieved. Assume that when combined, the data does not fit into memory. I'm striving for the following:

  1. For the update to be atomic from the perspective of the rest of the service. The service should not serve half-updated data.
  2. For the bulk insertion of the new data to be reasonably fast. The updates and insertions should run in a single transaction with a single commit at the end, not in separate transactions.
  3. For these updates to interrupt the rest of the service as little as possible. The bulk update should not lock other sessions out of accessing the old data while the update is occurring.

I'm using InnoDB.

Let's say I have a database called webservice which contains a table named data. An obvious first attempt to update the data would be the following:

START TRANSACTION;
INSERT INTO `data`(`row1`, `row2`, `row3`) VALUES ('val1', 'val2', 'val3');
INSERT INTO `data`(`row1`, `row2`, `row3`) VALUES ('val4', 'val5', 'val6');
UPDATE `data` SET `row2` = 'val7' WHERE `id` = 3;
/* And so on for a very large number of INSERTs and UPDATEs. */
COMMIT;

As far as I know, this satisfies 1 and 2, but violates 3.

I have in mind another solution which appears to satisfy 1, 2, and 3. This uses "temp" tables in another database where the new data will be inserted, then swaps the tables.

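/* DDL statements commit implicitly in MySQL, so they run before the transaction starts. */
/* `data_old` is a placeholder name for the retired copy of the table. */
DROP TABLE IF EXISTS `webservice_temp`.`data`, `webservice_temp`.`data_old`;
CREATE TABLE `webservice_temp`.`data` LIKE `webservice`.`data`;
START TRANSACTION;
INSERT INTO `webservice_temp`.`data`
  SELECT * FROM `webservice`.`data`;
INSERT INTO `webservice_temp`.`data`(`row1`, `row2`, `row3`) VALUES ('val1', 'val2', 'val3');
/* etc. */
COMMIT;
/* A multi-table RENAME swaps both tables in one atomic statement. */
RENAME TABLE `webservice`.`data` TO `webservice_temp`.`data_old`,
             `webservice_temp`.`data` TO `webservice`.`data`;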

Is this a good solution to my problem?

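If you are using InnoDB, you can use your first approach, and it will satisfy all three requirements, by starting the update with START TRANSACTION WITH CONSISTENT SNAPSHOT. Ongoing read requests are then served from a snapshot of the original data as it was when the transaction started. Under InnoDB's default isolation level, plain SELECT statements in other sessions are nonlocking consistent reads, so they neither wait on the rows being updated nor see the uncommitted changes.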

The WITH CONSISTENT SNAPSHOT modifier starts a consistent read for storage engines that are capable of it. This applies only to InnoDB.

A consistent read means that InnoDB uses multi-versioning to present to a query a snapshot of the database at a point in time. The query sees the changes made by transactions that committed before that point of time, and no changes made by later or uncommitted transactions.

The MySQL glossary defines a consistent read as a read operation that uses snapshot information to present query results based on a point in time, regardless of changes performed by other transactions running at the same time. If queried data has been changed by another transaction, the original data is reconstructed based on the contents of the undo log. http://dev.mysql.com/doc/refman/5.7/en/glossary.html#glos_consistent_read
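
As a rough sketch of how this plays out, here are two sessions working on the `data` table from the question. It assumes the default REPEATABLE READ isolation level and that the reading session runs with autocommit enabled; the "Session A" and "Session B" labels are only for illustration, and the statements are meant to be run by hand in two separate connections, in the order shown.

```sql
-- Session A: the update routine (the only writer).
START TRANSACTION WITH CONSISTENT SNAPSHOT;
INSERT INTO `data`(`row1`, `row2`, `row3`) VALUES ('val1', 'val2', 'val3');
UPDATE `data` SET `row2` = 'val7' WHERE `id` = 3;
-- ... many more INSERTs and UPDATEs ...

-- Session B: a read request that arrives while session A is still open.
-- A plain SELECT (no FOR UPDATE / LOCK IN SHARE MODE) is a consistent
-- nonlocking read: it does not wait for session A's row locks and it
-- returns the last committed row versions, i.e. the old data.
SELECT `row1`, `row2`, `row3` FROM `data` WHERE `id` = 3;

-- Session A: publish every change at once.
COMMIT;

-- Session B: a SELECT issued after the COMMIT runs in a new autocommit
-- transaction, so it takes a fresh snapshot and sees the new data.
SELECT `row1`, `row2`, `row3` FROM `data` WHERE `id` = 3;
```

Note that this only holds for ordinary reads. A session that uses SELECT ... FOR UPDATE or LOCK IN SHARE MODE, or that runs under SERIALIZABLE, takes locks and can end up waiting on the update transaction.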
