
Memory leak when reading 8+ million records from a database

I have a database with 8+ million records that I need to process in particular ways, using code written in Java. After some searching, I found a number of related posts.

This is my code, which returns the items stored in the Tags column of my MySQL database:

public ResultSet getAllTags() {
    String query = "SELECT Tags FROM dataset";
    ResultSet rs = null;

    try {
        connection = ConnectionFactory.getConnection(DATABASE);
        preparedStatement = connection.prepareStatement(query, ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
        preparedStatement.setFetchSize(Integer.MIN_VALUE);
        rs = preparedStatement.executeQuery(query);
        // following line is for testing, to see what comes out of the resultset
        System.out.println("output: " + rs.getString(1));
        return rs;
    } catch (Exception ex) {
        ex.printStackTrace();
        return null;
    } finally {
        closeAll();
    }
}

Here I return the ResultSet so that I can process each row in an rs.next() loop. However, at the line rs = preparedStatement.executeQuery(query); it starts to eat all of my computer's free memory. I work on Mac OS X with 8 GB of RAM: with only Eclipse open I have roughly 5 GB free, and when running the application that drops to under 100 MB, forcing me to shut down the database connection, the application, and so on. So I assume this can be called a memory leak?

Can someone explain what I'm doing wrong, and why this issue occurs even though I followed the instructions found on other pages dealing with a similar number of records?

The only thing you're doing wrong is using a stupid database driver (MySQL), which by default reads the whole result set into memory.

Try using the useCursorFetch and defaultFetchSize properties described in http://dev.mysql.com/doc/connector-j/en/connector-j-reference-configuration-properties.html to avoid that, and you should be able to iterate through the rows without fetching everything into memory (not tested though).
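As a rough illustration of the suggestion above, the two properties can be passed as parameters in the JDBC URL. This is a minimal sketch that has not been run against a real server: the host, port, database name `mydb`, the credentials, and the fetch size of 1000 are placeholders, and the class and method names are made up for this example. Only `useCursorFetch` and `defaultFetchSize` come from the Connector/J documentation linked above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class CursorFetchUrl {

    // Builds a JDBC URL that sets the useCursorFetch and defaultFetchSize
    // connection properties. host, port and database are placeholders
    // supplied by the caller.
    static String buildUrl(String host, int port, String database, int fetchSize) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useCursorFetch=true&defaultFetchSize=" + fetchSize;
    }

    // Opens a connection using that URL. This needs the MySQL driver on the
    // classpath and a reachable server; "localhost", 3306, "mydb" and 1000
    // are assumptions made for this example.
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(
                buildUrl("localhost", 3306, "mydb", 1000), user, password);
    }

    public static void main(String[] args) {
        // Only prints the URL, so it runs without a database.
        System.out.println(buildUrl("localhost", 3306, "mydb", 1000));
    }
}
```

Whether the driver then really fetches in batches should be checked against the documentation for the Connector/J version in use.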

Note that the line

System.out.println("output: " + rs.getString(1));

will throw an exception, since you haven't called next() on the result set yet. Also note that, if closeAll() closes the connection, the caller won't be able to iterate through the result set, since it will be closed. You should execute the iteration before closing the connection.

Note that the documentation of the driver says:

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
          java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

But you used TYPE_SCROLL_SENSITIVE and not TYPE_FORWARD_ONLY.
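Combining the points made so far (forward-only statement, fetch size of Integer.MIN_VALUE, calling next() before reading, and iterating before anything is closed), a reading loop might look like the following. This is only a sketch and has not been run against a MySQL server: the class name `TagStreamer`, the method `forEachTag`, and the `Consumer<String>` callback are invented for illustration, and the caller is assumed to supply an open `Connection`. The query and column come from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.Consumer;

public class TagStreamer {

    // Reads the Tags column row by row and hands each value to the handler
    // while the statement and result set are still open.
    // Returns the number of rows processed.
    static long forEachTag(Connection connection, Consumer<String> handler)
            throws SQLException {
        String query = "SELECT Tags FROM dataset";
        long count = 0;
        // try-with-resources closes the statement and result set only after
        // the loop has finished, not before the caller sees the rows.
        try (PreparedStatement ps = connection.prepareStatement(
                query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Per the driver documentation quoted above, this asks the
            // driver to stream results instead of buffering them all.
            ps.setFetchSize(Integer.MIN_VALUE);
            // No-argument executeQuery(): the SQL was already given to
            // prepareStatement().
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) { // next() must be called before getString()
                    handler.accept(rs.getString(1));
                    count++;
                }
            }
        }
        return count;
    }
}
```

A caller would pass the per-row work as the handler, for example `TagStreamer.forEachTag(connection, tag -> process(tag))`, where `process` stands for whatever the application does with each row. Handing the rows to a callback, instead of returning the ResultSet, keeps all iteration inside the scope where the resources are still open.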

Did you consider positive values for the fetch size, such as 256, 512, 1024, or 2048? I would expect a negative fetch size to have no effect; however, this could vary by driver implementation, and you should verify the actual behavior in the driver documentation.

public void setFetchSize(int rows) throws SQLException {
    synchronized (checkClosed().getConnectionMutex()) {
        if (((rows < 0) && (rows != Integer.MIN_VALUE))
                || ((this.maxRows != 0) && (this.maxRows != -1) && (rows > this
                        .getMaxRows()))) {
            throw SQLError.createSQLException(
                    Messages.getString("Statement.7"), //$NON-NLS-1$
                    SQLError.SQL_STATE_ILLEGAL_ARGUMENT, getExceptionInterceptor()); //$NON-NLS-1$ //$NON-NLS-2$
        }

        this.fetchSize = rows;
    }
}
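Assuming the snippet above is the driver's own setFetchSize, its condition can be restated as a small standalone predicate to see which values it would reject. This is a simplified sketch and not driver code: the class `FetchSizeCheck` and the method `isRejected` are hypothetical, and `maxRows` is passed in as a plain parameter instead of being read from the statement.

```java
public class FetchSizeCheck {

    // Mirrors the condition in the setFetchSize snippet: returns true when
    // the snippet would throw. The snippet skips the upper-bound check when
    // maxRows is 0 or -1.
    static boolean isRejected(int rows, int maxRows) {
        boolean negativeButNotMinValue = rows < 0 && rows != Integer.MIN_VALUE;
        boolean aboveMaxRows = maxRows != 0 && maxRows != -1 && rows > maxRows;
        return negativeButNotMinValue || aboveMaxRows;
    }

    public static void main(String[] args) {
        System.out.println(isRejected(Integer.MIN_VALUE, 0)); // false
        System.out.println(isRejected(-1, 0));                // true
        System.out.println(isRejected(1024, 0));              // false
        System.out.println(isRejected(1024, 100));            // true
    }
}
```

Read this way, Integer.MIN_VALUE is the one negative value the check lets through, while any other negative fetch size would raise an exception instead of being silently ignored.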
