
What's the most efficient way to bulk-copy to SQL Server from Java?

I have data that is streamed from disk and processed in memory by a Java application, and that finally needs to be copied into SQL Server. The data can be fairly large (hence the streaming) and can require up to several hundred thousand rows to be inserted. The fastest solution seems to be SQL Server's bulk-copy feature. However, I haven't found any way for Java programs to do this easily, or anywhere near fast enough.

Here are some ways that I've already investigated:

  • Using the SqlBulkCopy class in .NET. This is very efficient since you can stream data straight from a data source to SQL Server. The problem with this approach is that you need to be running .NET. Perhaps it could be used through a Java-to-.NET bridge, although I wonder about the cost of marshalling data between runtimes.

  • Using the BULK INSERT T-SQL statement. The problem with this is that you need to create a properly formatted file on disk. I've seen some small performance gains over JDBC's batch insert using this. Also, this is only useful locally.

  • Writing files to disk and using the bcp command-line utility. Still a little faster than JDBC batch insert, but not by much. I also lose the ability to use a transaction with this method.

  • Using the C API. Again, very efficient, but you need to be using C. There would be a way to use this through JNI. If there's some free Java library out there that does this, I'd like to know about it.

I'm looking for the fastest solution. Memory is not an issue.

Thanks!

  • For the .NET approach I would recommend IKVM. Then your Java code will be .NET code and you can call any .NET code.
  • BULK INSERT also requires that the bulk file is accessible from SQL Server, so it is only a local option. The performance of batch updates can vary between different JDBC drivers.
  • For native calls I would recommend using JNA (Java Native Access). Then you do not need to write any C code.
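
For the BULK INSERT route, a minimal sketch of staging rows in a delimited file and building the statement might look like the following. The class name, the table `dbo.Target`, and the file path are hypothetical, and the sketch assumes field values contain no tabs or newlines and that the path is readable by the SQL Server service itself. Non-ASCII data would also need a matching CODEPAGE option, which is not shown here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: stage rows in a tab-delimited file, then load it with BULK INSERT. */
final class BulkInsertFileSketch {

    /** Writes each row as tab-separated fields terminated by a single LF. */
    static void writeRows(Path file, Iterable<String[]> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                out.write(String.join("\t", row));
                out.write('\n');
            }
        }
    }

    /** Builds the T-SQL text; serverPath is the path as seen by SQL Server. */
    static String bulkInsertSql(String table, String serverPath) {
        return "BULK INSERT " + table
                + " FROM '" + serverPath.replace("'", "''") + "'"
                // '0x0a' asks for a bare LF row terminator, matching writeRows.
                + " WITH (FIELDTERMINATOR = '\\t', ROWTERMINATOR = '0x0a', TABLOCK)";
    }

    /** Runs the load over an existing JDBC connection. */
    static void load(Connection conn, String table, String serverPath) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(bulkInsertSql(table, serverPath));
        }
    }
}
```

Because the server opens the file, this only helps when the Java process can write somewhere SQL Server can read, which is the "local option" limitation noted above.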

The best option for me was to use the commercial SQL Server JDBC driver from DataDirect with the standard JDBC calls addBatch/executeBatch, which run on both Linux and Windows: https://blogs.datadirect.com/2012/05/how-to-bulk-insert-jdbc-batches-into-microsoft-sql-server-oracle-sybase.html

I've seen load times improve from 7 hours to under 30 minutes.
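
The addBatch/executeBatch pattern itself is plain JDBC, so a sketch of it is the same whichever driver sits underneath. The class name, the table `dbo.Target`, its two columns, and the batch size of 10,000 are assumptions for illustration only. Whether the driver turns the batch into a bulk load is driver-specific and not shown here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

/** Sketch: stream rows into a table with standard JDBC batching. */
final class JdbcBatchSketch {

    /** Binds each row, queues it, and flushes every batchSize rows. Returns rows queued. */
    static long insertAll(PreparedStatement ps, Iterator<Object[]> rows, int batchSize)
            throws SQLException {
        long total = 0;
        int pending = 0;
        while (rows.hasNext()) {
            Object[] row = rows.next();
            for (int i = 0; i < row.length; i++) {
                ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
            }
            ps.addBatch();
            total++;
            if (++pending == batchSize) {
                ps.executeBatch();
                pending = 0;
            }
        }
        if (pending > 0) {
            ps.executeBatch(); // flush the final partial batch
        }
        return total;
    }

    /** Loads all rows in one transaction; table and columns are hypothetical. */
    static long load(Connection conn, Iterator<Object[]> rows) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO dbo.Target (id, name) VALUES (?, ?)")) {
            long n = insertAll(ps, rows, 10_000);
            conn.commit();
            return n;
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Taking an Iterator rather than a List keeps the streaming property from the question: rows can be produced lazily and only one batch is held by the driver at a time.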

As of version 4.2 of the Microsoft JDBC Driver for SQL Server, there is a class named com.microsoft.sqlserver.jdbc.SQLServerBulkCopy that is the equivalent of .NET's SqlBulkCopy class.
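
A rough sketch of using that class with a CSV source is shown below. It needs the Microsoft JDBC driver on the classpath and a reachable server. The connection URL, the file `rows.csv`, the table `dbo.Target`, and the column definitions are all placeholders, and the option values are arbitrary rather than tuned.

```java
import com.microsoft.sqlserver.jdbc.SQLServerBulkCSVFileRecord;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopy;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopyOptions;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

/** Sketch: bulk-copy a CSV file into a table with SQLServerBulkCopy. */
final class BulkCopySketch {
    public static void main(String[] args) throws SQLException {
        // Assumption: placeholder URL; supply a real host, database and credentials.
        String url = "jdbc:sqlserver://localhost:1433;databaseName=Staging;"
                + "user=loader;password=secret";

        try (Connection conn = DriverManager.getConnection(url);
             // The file is read by this Java process, not by the server.
             SQLServerBulkCSVFileRecord source =
                     new SQLServerBulkCSVFileRecord("rows.csv", "UTF-8", ",", false);
             SQLServerBulkCopy bulkCopy = new SQLServerBulkCopy(conn)) {

            // Describe each CSV column: position (1-based), name, JDBC type, precision, scale.
            source.addColumnMetadata(1, "id", Types.INTEGER, 0, 0);
            source.addColumnMetadata(2, "name", Types.NVARCHAR, 100, 0);

            SQLServerBulkCopyOptions options = new SQLServerBulkCopyOptions();
            options.setBatchSize(10_000); // rows sent per batch (arbitrary choice)
            options.setTableLock(true);   // take a table lock for the duration of the load
            bulkCopy.setBulkCopyOptions(options);

            bulkCopy.setDestinationTableName("dbo.Target");
            bulkCopy.writeToServer(source);
        }
    }
}
```

Unlike BULK INSERT, the data travels over the JDBC connection, so the source file does not have to be visible to SQL Server. The class also accepts a java.sql.ResultSet as the source, which suits copying between databases.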

