
How do you do an extended insert using JDBC without building strings?

I've got an application that parses log files and inserts a huge amount of data into a database. It's written in Java and talks to a MySQL database over JDBC. I've experimented with different ways of inserting the data to find the fastest for my particular use case. The one that currently seems to perform best is an extended insert (i.e. a single insert with multiple rows), like this:

INSERT INTO the_table (col1, col2, ..., colN) VALUES
(v1, v2, v3, ..., vN),
(v1, v2, v3, ..., vN),
...,
(v1, v2, v3, ..., vN);

The number of rows can be tens of thousands.

I've tried using prepared statements, but that's nowhere near as fast, probably because each insert is still sent to the DB separately and the table needs to be locked and whatnot. My colleague who worked on the code before me tried using batching, but that didn't perform well enough either.

The problem is that using extended inserts means that, as far as I can tell, I need to build the SQL string myself (since the number of rows is variable), and that opens up all sorts of SQL injection vectors that I'm nowhere near clever enough to find myself. There's got to be a better way to do this.

Obviously I escape the strings I insert, but only with something like str.replace("\"", "\\\""); (repeated for ', ? and \), but I'm sure that isn't enough.

Prepared statements + batch insert:

PreparedStatement stmt = con.prepareStatement(
"INSERT INTO employees VALUES (?, ?)");

stmt.setInt(1, 101);
stmt.setString(2, "Paolo Rossi");
stmt.addBatch();

stmt.setInt(1, 102);
stmt.setString(2, "Franco Bianchi");
stmt.addBatch();

// as many as you want   
stmt.executeBatch();

I would try batching your inserts and see how that performs.

Have a read of this ( http://www.onjava.com/pub/a/onjava/excerpt/javaentnut_2/index3.html?page=2 ) for more information on batching.

If you are loading tens of thousands of records then you're probably better off using a bulk loader.

http://dev.mysql.com/doc/refman/5.0/en/load-data.html
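A rough sketch of the bulk-loader route from Java, in case it helps. It writes the rows to a tab-separated file in LOAD DATA's default format (fields split by tabs, lines by newlines, backslash as the escape character, \N for NULL) and then asks the server to load it. The class name, the helper names and the reuse of the_table from the question are made up for illustration. It also assumes LOCAL loading is enabled on both the server and the driver, and that the file encoding matches what the server expects, so treat it as a starting point rather than a finished loader.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class BulkLoad {

    /** Escapes one field for LOAD DATA's default tab-separated format. */
    static String escapeField(String value) {
        if (value == null) {
            return "\\N"; // LOAD DATA reads \N as NULL
        }
        // The backslash has to be doubled first, otherwise the escapes
        // added for tab/newline would themselves get escaped again.
        return value.replace("\\", "\\\\")
                    .replace("\t", "\\t")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Formats one row as a tab-separated line ending in a newline. */
    static String toLine(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append('\t');
            }
            line.append(escapeField(fields.get(i)));
        }
        return line.append('\n').toString();
    }

    /** Writes all rows to the given file, one line per row. */
    static void writeRows(Path file, List<List<String>> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (List<String> row : rows) {
                out.write(toLine(row));
            }
        }
    }

    /** Loads the file into the_table (table name taken from the question). */
    static void load(Connection con, Path file) throws SQLException {
        // The path is generated by the program, not taken from log data.
        String path = file.toAbsolutePath().toString().replace('\\', '/');
        if (path.contains("'")) {
            throw new IllegalArgumentException("unsupported path: " + path);
        }
        try (Statement stmt = con.createStatement()) {
            stmt.execute("LOAD DATA LOCAL INFILE '" + path + "' INTO TABLE the_table");
        }
    }
}
```

The values never become part of an SQL statement this way; they only have to be escaped for the file format, which is a much smaller set of rules than SQL quoting.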

Regarding the difference between extended inserts and batching single inserts: I decided to use extended inserts because I noticed that my code took a lot longer to insert a lot of rows than mysql does from the terminal, even though I was batching inserts in batches of 5000. The solution in the end was to use extended inserts.

I quickly retested this theory.

I took two dumps of a table with 1.2 million rows, one using the default extended insert statements you get with mysqldump and the other using:

mysqldump --skip-extended-insert

Then I simply imported the files again into new tables and timed it.

The extended insert test finished in 1m35s and the other in 3m49s.
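If you do go the extended-insert route from Java, the values can still travel through placeholders, so only the shape of the statement is built as a string. Below is a minimal sketch of that idea. The class and method names are invented for illustration, and it assumes the table and column names are constants in your code (never taken from the log data) and that you split very large inputs into chunks small enough for your server's limits.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class ExtendedInsert {

    /** Builds e.g. "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" for two rows. */
    static String buildSql(String table, List<String> columns, int rowCount) {
        if (rowCount < 1 || columns.isEmpty()) {
            throw new IllegalArgumentException("need at least one row and one column");
        }
        // One "(?, ?, ...)" group per row; only placeholders are concatenated.
        String tuple = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, tuple));
    }

    /** Inserts all rows with a single multi-row statement. */
    static void insertRows(Connection con, String table, List<String> columns,
                           List<Object[]> rows) throws SQLException {
        if (rows.isEmpty()) {
            return;
        }
        try (PreparedStatement stmt = con.prepareStatement(buildSql(table, columns, rows.size()))) {
            int index = 1; // JDBC parameters are numbered from 1
            for (Object[] row : rows) {
                if (row.length != columns.size()) {
                    throw new IllegalArgumentException("row has " + row.length + " values");
                }
                for (Object value : row) {
                    stmt.setObject(index++, value); // the driver handles quoting
                }
            }
            stmt.executeUpdate();
        }
    }
}
```

For example, buildSql("the_table", Arrays.asList("col1", "col2"), 3) gives a statement with three (?, ?) groups, and the six values are bound in row order.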

The full answer is to use the rewriteBatchedStatements=true configuration option along with dfa's answer of using a batched statement.
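Put together, that could look roughly like the sketch below. Only rewriteBatchedStatements=true and the employees statement come from the answers here; the host, port, schema name, class name and the chunk-size parameter are placeholders picked for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class RewrittenBatchInsert {

    // Hypothetical host, port and schema; the query parameter is the part that matters.
    static final String URL =
            "jdbc:mysql://localhost:3306/logs?rewriteBatchedStatements=true";

    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(URL, user, password);
    }

    /** Queues one insert per name and sends them to the server in chunks. */
    static void insertEmployees(Connection con, List<String> names,
                                int firstId, int chunkSize) throws SQLException {
        try (PreparedStatement stmt =
                     con.prepareStatement("INSERT INTO employees VALUES (?, ?)")) {
            for (int i = 0; i < names.size(); i++) {
                stmt.setInt(1, firstId + i);
                stmt.setString(2, names.get(i));
                stmt.addBatch();
                if ((i + 1) % chunkSize == 0) {
                    stmt.executeBatch(); // send a full chunk
                }
            }
            stmt.executeBatch(); // send whatever is left over
        }
    }
}
```

The application code stays a plain batched prepared statement; with the option set, the driver is the one that combines the queued inserts into multi-row statements before sending them.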

The relevant MySQL documentation

A worked MySQL example
