How to read large data from a database using Java?

Question

I have more than 2 GB of data in a table and need to read more than 1 GB of it from that single table. I know there are various options on the database side, but I need a better approach in the Java code. Can anyone show an example in Java, for instance parallel processing with multiple threads?

Example code

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SelectRowsExample {

  public static void main(String[] args) {

    Connection connection = null;
    try {
      // Load the MySQL JDBC driver
      String driverName = "com.mysql.jdbc.Driver";
      Class.forName(driverName);

      String serverName = "localhost";
      String schema = "test";
      String url = "jdbc:mysql://" + serverName + "/" + schema;
      String username = "username";
      String password = "password";

      connection = DriverManager.getConnection(url, username, password);
      System.out.println("Successfully Connected to the database!");
    } catch (ClassNotFoundException e) {
      System.out.println("Could not find the database driver " + e.getMessage());
      return; // no connection, nothing to query
    } catch (SQLException e) {
      System.out.println("Could not connect to the database " + e.getMessage());
      return; // no connection, nothing to query
    }

    // try-with-resources closes the statement and the result set when done
    try (Statement statement = connection.createStatement();
         ResultSet results = statement.executeQuery("SELECT * FROM employee ORDER BY dept")) {

      while (results.next()) {
        String empname = results.getString("name");
        System.out.println("Fetching data by column name for row " + results.getRow() + " : " + empname);

        String department = results.getString("department");
        System.out.println("Fetching data by column name for row " + results.getRow() + " : " + department);
      }
    } catch (SQLException e) {
      System.out.println("Could not retrieve data from the database " + e.getMessage());
    }
  }
}

My query returns the name and department details, and each department will have more than 1 GB of data. Reading it this way will surely slow down the application, which is why I thought of parallel processing with multithreading. Can anyone suggest how to read this much data quickly?

Answer 1

For your example you don't need a high-calibre gun like parallelism. It also won't necessarily solve your problem because, as luk2302 mentioned, there can be many other bottlenecks (hardware, network, etc.).

There are two much easier tweaks:

Select only the data you really need. Even if your employee record has just 3 columns, selecting two of them spares you a third of the data, which means faster reads and lower memory consumption. The savings grow if the table has many more columns.

ResultSet results = statement.executeQuery("SELECT name, department FROM employee ORDER BY dept");

The default fetchSize won't be enough. Its value depends on the driver; for example, when Oracle JDBC runs a query, by default it retrieves 10 rows at a time from the database cursor. Increasing it reduces the number of round trips to the database cursor, which are costly. I therefore recommend raising it to 500 or 1000, and you can experiment with even higher values. Note that you are using MySQL, and Connector/J is a special case: by default it reads the whole result set into memory and ignores a positive fetch size unless cursor fetching is enabled with useCursorFetch=true in the connection URL (or row-by-row streaming is requested with a fetch size of Integer.MIN_VALUE). More info on fetchSize: What does Statement.setFetchSize(nSize) method really do in SQL Server JDBC driver?

Statement statement = connection.createStatement();
statement.setFetchSize(1000);
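To make the effect of the fetch size concrete, here is a minimal sketch rather than a definitive implementation. The class name FetchSizeExample and the helpers roundTrips and countRows are made up for illustration; the table and column names come from the question. roundTrips is only a rough model (one round trip per batch of rows), and countRows assumes the driver honors setFetchSize, which for MySQL Connector/J additionally requires useCursorFetch=true in the connection URL.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeExample {

    // Rough model: one round trip per batch of fetchSize rows (ceiling division).
    static long roundTrips(long rows, int fetchSize) {
        if (rows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and fetchSize > 0");
        }
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Reads only the two needed columns with the given fetch size and
    // returns the number of rows seen. Assumes an already open connection.
    static long countRows(Connection connection, int fetchSize) throws SQLException {
        long rows = 0;
        // A forward-only, read-only statement is the cheapest kind to iterate.
        try (Statement statement = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            statement.setFetchSize(fetchSize); // a hint; the driver may ignore it
            try (ResultSet results = statement.executeQuery(
                    "SELECT name, department FROM employee ORDER BY dept")) {
                while (results.next()) {
                    String name = results.getString("name");
                    String department = results.getString("department");
                    // Process name and department here instead of printing every row.
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

Under this model, reading 1,000,000 rows takes about 100,000 round trips with a fetch size of 10 but only about 1,000 with a fetch size of 1000, which is where the speed-up comes from. Calling countRows(connection, 1000) with the connection from the question would run the narrowed query with the larger fetch size.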

+1: System.out.println also slows down your code. You can read about it here: Why is System.out.println so slow? It's better to replace it with a logger library, or at least, for testing purposes, you can use something like this:

if(results.getRow()%1000 == 0) {
    System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
}
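If you go the logger route, the same throttling idea could look roughly like the sketch below. It uses java.util.logging from the JDK only so that it needs no extra dependency; any logger library works the same way. The class ProgressLogger and its methods are hypothetical names for illustration, and it keeps its own row counter instead of calling results.getRow().

```java
import java.util.logging.Logger;

public class ProgressLogger {

    private static final Logger LOG = Logger.getLogger(ProgressLogger.class.getName());

    private final int interval; // log once every `interval` rows
    private long rows = 0;      // rows seen so far
    private long messages = 0;  // log messages actually written

    public ProgressLogger(int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
    }

    // Call once per row; only every interval-th row produces a log message.
    public void rowRead(String sample) {
        rows++;
        if (rows % interval == 0) {
            LOG.info("Read " + rows + " rows so far, latest: " + sample);
            messages++;
        }
    }

    public long getRows() {
        return rows;
    }

    public long getMessages() {
        return messages;
    }
}
```

Inside the while (results.next()) loop you would create one new ProgressLogger(1000) before the loop and call rowRead(empname) for each row, so 2,500 rows produce 2 log lines instead of 2,500.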

Br, Nandor

Solution 1
1 2020-09-02 13:47:02
