How do I increase the speed of a large series of UPDATEs in MySQL vs SQL Server?
I have an application which I'm writing in Java with plain SQL, with nothing specific to MySQL or SQL Server, because it might have to run on either. One data persist operation has to grab the data out of the DB, compare it with what has been submitted, and then insert, update or delete accordingly.
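For readers who want the compare step spelled out, here is a minimal sketch of it for a single (parentId, seriesDate) series. The class name `SeriesDiff`, its fields and the map-based inputs are all hypothetical; the original post does not show this code. It only illustrates one way to decide which rows need an INSERT, an UPDATE or a DELETE.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original post: splits the submitted values
// of one (parentId, seriesDate) series into rows to INSERT, UPDATE and DELETE.
final class SeriesDiff {
    final List<LocalDate> toInsert = new ArrayList<>();
    final List<LocalDate> toUpdate = new ArrayList<>();
    final List<LocalDate> toDelete = new ArrayList<>();

    // existing:  valueDate -> value currently stored in the DB
    // submitted: valueDate -> value sent by the caller
    static SeriesDiff of(Map<LocalDate, Float> existing, Map<LocalDate, Float> submitted) {
        SeriesDiff diff = new SeriesDiff();
        for (Map.Entry<LocalDate, Float> e : submitted.entrySet()) {
            Float current = existing.get(e.getKey());
            if (current == null) {
                diff.toInsert.add(e.getKey());   // not in the DB yet
            } else if (!current.equals(e.getValue())) {
                diff.toUpdate.add(e.getKey());   // value changed
            }                                    // unchanged rows are skipped
        }
        for (LocalDate d : existing.keySet()) {
            if (!submitted.containsKey(d)) {
                diff.toDelete.add(d);            // no longer submitted
            }
        }
        return diff;
    }
}
```

Skipping rows whose value has not changed means no UPDATE is issued for them at all, which may matter when UPDATEs are the slow part.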
I've improved the performance of the operation considerably by batching the JDBC calls.
For the INSERTs, I just call the Statement.addBatch() method for the whole data set to be inserted, and the JDBC driver creates
INSERT INTO data (parentId, seriesDate, valueDate, value)
VALUES (a,b,c,d),(a,b,e,f),(a,b,g,h)... etc
For the DELETEs, I just delete the whole lot with
DELETE FROM data WHERE parentId = a AND seriesDate = b;
and I can re-insert them. It may be better to take another approach by composing one long statement such as
DELETE FROM data WHERE (parentId = 1 AND seriesDate = b)
OR (parentId = 2 AND seriesDate = c)
OR (parentId = 3 AND seriesDate = d) ...
but that's not the issue here. My main problem is that the UPDATEs are really slow: twice as slow as the INSERTs.
I get 1000 separate statements:
UPDATE data SET value = 4
WHERE parentId = 1 AND seriesDate = '' AND valueDate = '';
In SQL Server, the UPDATEs are just as quick as the INSERTs, but in MySQL I am seeing them run 10x slower.
I am hoping I've forgotten some mutually compatible approach, or missed some JDBC connection configuration I need to adjust, maybe in conjunction with the number of items I'm putting in each batch.
[UPDATE 2018-05-17] Here's the requested DDL. Unfortunately I can't change it (yet), so any suggestions that involve schema changes won't help, at least not this year :(
CREATE TABLE data (
parentId INT UNSIGNED NOT NULL,
seriesDate DATE NOT NULL,
valueDate DATE NOT NULL,
value FLOAT NOT NULL,
versionstamp INT UNSIGNED NOT NULL DEFAULT 1,
createdDate DATETIME DEFAULT CURRENT_TIMESTAMP,
last_modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
CONSTRAINT pk_data PRIMARY KEY (parentId, seriesDate, valueDate),
CONSTRAINT fk_data_forecastid FOREIGN KEY (parentId)
REFERENCES forecast (id)
) MAX_ROWS 222111000;
CREATE TRIGGER trg_data_update BEFORE UPDATE ON data
FOR EACH ROW SET NEW.versionstamp = OLD.versionstamp + 1;
CREATE INDEX ix_data_seriesdate ON `data` (seriesDate);
The INSERT:
INSERT INTO `data` (`parentId`, `valueDate`, `value`, `seriesDate`)
VALUES (52031,'2010-04-20',1.12344,'2013-01-10')
EXPLAIN PLAN:
id: 1
select_type: INSERT
table: data
partitions:
type: ALL
possible_keys: PRIMARY,ix_data_seriesdate
and the UPDATE:
UPDATE `data` SET `value` = -2367.0
WHERE `parentId` = 52005 AND `seriesDate` = '2018-04-20' AND `valueDate` = '2000-02-11'
EXPLAIN PLAN:
id: 1
select_type: UPDATE
table: data
partitions:
type: range
possible_keys: PRIMARY,ix_data_seriesdate
key: PRIMARY
key_len: 10
ref: const,const,const
rows: 1
filtered: 100
Extra: Using where
and the DELETE:
DELETE FROM `data` WHERE `parentId` = 52030 AND `seriesDate` = '2018-04-20'
EXPLAIN PLAN:
id: 1
select_type: DELETE
table: data
partitions:
type: range
possible_keys: PRIMARY,ix_data_seriesdate
key: PRIMARY
key_len: 7
ref: const,const
rows: 1
filtered: 100
Extra: Using where
FYI, two fields are updated automatically: last_modified by the ON UPDATE clause, and versionstamp by the trigger (and again, I can't ditch that functionality).
Ways I've found to improve UPDATE statements, e.g. loading the new values into a temporary table and updating from it in one statement:
CREATE TABLE #TempData (
parentId INT UNSIGNED NOT NULL,
seriesDate DATE NOT NULL,
valueDate DATE NOT NULL,
value FLOAT NOT NULL
);
INSERT INTO #TempData ( parentId, seriesDate, valueDate, value ) VALUES ( .... ), ( .... ), ( .... );
UPDATE
data
SET
value = #TempData.value
FROM
#TempData
WHERE
data.parentId = #TempData.parentId AND
data.seriesDate = #TempData.seriesDate AND
data.valueDate = #TempData.valueDate;
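Note that the #TempData name and the UPDATE ... FROM form above are SQL Server syntax; MySQL creates temporary tables with CREATE TEMPORARY TABLE and uses a multi-table UPDATE with a JOIN instead. Since the application has to run on both servers, here is a rough sketch of how the Java side might pick the statement per server. The class `StagedUpdateSql`, the `Dialect` enum and the staging table names are hypothetical, and this has not been benchmarked against the schema in the question.

```java
// Hypothetical helper: returns a set-based UPDATE that copies values from a
// staging table into `data`, in the syntax each server accepts.
final class StagedUpdateSql {
    enum Dialect { MYSQL, SQL_SERVER }

    // The three primary-key columns of `data`, matched against the staging table.
    private static final String JOIN_KEYS =
        "d.parentId = t.parentId AND d.seriesDate = t.seriesDate AND d.valueDate = t.valueDate";

    // stagingTable must be a trusted constant; it is concatenated, not bound.
    static String updateFromStaging(Dialect dialect, String stagingTable) {
        switch (dialect) {
            case MYSQL:
                // MySQL multi-table UPDATE: the JOIN comes before SET.
                return "UPDATE data d JOIN " + stagingTable + " t ON " + JOIN_KEYS
                     + " SET d.value = t.value";
            case SQL_SERVER:
                // T-SQL: UPDATE <alias> SET ... FROM ... JOIN.
                return "UPDATE d SET d.value = t.value FROM data d JOIN "
                     + stagingTable + " t ON " + JOIN_KEYS;
            default:
                throw new IllegalArgumentException("Unknown dialect: " + dialect);
        }
    }
}
```

For example, `StagedUpdateSql.updateFromStaging(StagedUpdateSql.Dialect.MYSQL, "tmp_data")` would be executed once after batch-inserting the new values into `tmp_data`, replacing the 1000 single-row UPDATEs with one statement.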