
java.lang.OutOfMemoryError: GC overhead limit exceeded while reading a large text file

I have a text file that is about 2 GB in size. Each line of the file has the following format:

some text possibly separated by commas, unique integer

I need to split each line into two parts (the text and the unique integer) and put them in a HashMap as a key-value pair.
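To make the intended split concrete, here is a minimal sketch of handling one line. It assumes the text is the key and the integer is the value (the question does not say which is which), and the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LineSplitSketch {
    // Splits a line at its LAST comma, so commas inside the text are preserved.
    // Assumption: text is the map key, the integer is the value.
    static void putLine(String line, Map<String, Integer> map) {
        int comma = line.lastIndexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("no comma in line: " + line);
        }
        String text = line.substring(0, comma).trim();
        // Storing the id as an Integer avoids keeping a second String per line.
        int id = Integer.parseInt(line.substring(comma + 1).trim());
        map.put(text, id);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        putLine("red, green , 17", map);
        putLine("plain text,42", map);
        System.out.println(map.get("red, green")); // 17
        System.out.println(map.get("plain text")); // 42
    }
}
```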

Now I am getting an OutOfMemoryError even when the heap size is set to 10 GB.

There could be two reasons for this:
1. The way I am reading the file is wrong.
2. I am creating too many unnecessary String objects.
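One way to tell these two suspicions apart is to stream the file once without keeping anything. If that pass finishes in a small heap, the reading itself is fine and the memory goes to what is retained in the map. The sketch below also prints a rough lower bound for the retained character data; the 2 bytes per character figure is an assumption (UTF-16 backed strings) and ignores per-object and per-entry overhead. The class name and sample input are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class FootprintEstimate {
    // Returns {lineCount, charCount}; no line is kept after it has been counted.
    static long[] count(Reader source) {
        long lines = 0;
        long chars = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines++;
                chars += line.length();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {lines, chars};
    }

    public static void main(String[] args) {
        // Stand-in input; replace with a reader over the real file.
        long[] c = count(new StringReader("a, b, 1\nc, 2\n"));
        // Assumed 2 bytes per char; object headers and map entries are not included.
        System.out.println(c[0] + " lines, at least " + (c[1] * 2) + " bytes of character data");
    }
}
```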

This is what I am doing:

InputStream is = Thread.currentThread().getContextClassLoader().getResourceAsStream("filename.txt");
InputStreamReader stream = new InputStreamReader(is, StandardCharsets.UTF_8);
BufferedReader reader = new BufferedReader(stream);

String line;
while (true) {
    line = reader.readLine();
    if (line == null) {
        break;
    }
    String text = line.substring(0, line.lastIndexOf(",")).trim();
    String id = line.substring(line.lastIndexOf(",") + 1).trim();

    // put this in a hashmap and other processing
}

Since I need to split each line into two parts, and the first part (the text) may itself contain commas, I split at the last comma using substring().

I call trim() because the text and id must go into the HashMap without leading or trailing whitespace.

Error message:

 Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.String.substring(String.java:1969)
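
The HotSpot JVM raises this error when it spends almost all of its time in garbage collection while reclaiming very little heap. It is therefore worth confirming that the 10 GB setting actually reached the JVM that runs this code, since an IDE or application server may launch it with different options. A minimal check using only standard Runtime calls follows; the class name HeapCheck is made up for illustration.

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Heap currently in use (allocated minus free), in MiB.
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap MiB:  " + maxHeapMiB());
        System.out.println("used heap MiB: " + usedHeapMiB());
    }
}
```

Calling usedHeapMiB() every few million lines inside the read loop would show whether usage climbs steadily toward the maximum.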

You should put the read in the loop condition. Please try again with the code below; it seems to work.

    try {
        String line;

        while ((line = reader.readLine()) != null) {
            String text = line.substring(0, line.lastIndexOf(",")).trim();

            String id = line.substring(line.lastIndexOf(",") + 1).trim();

            //put this in a hashmap and other processing
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
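
As a variation on the loop above, try-with-resources closes the reader automatically, which removes the nested try/finally. The sketch below also presizes the HashMap and stores the id as an Integer to cut down on allocations; these are assumptions about what might help, not a confirmed fix for the error. MapLoader, load, and expectedEntries are hypothetical names, and skipping lines without a comma is a choice made here, not something the question specifies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class MapLoader {
    static Map<String, Integer> load(Reader source, int expectedEntries) {
        // Presize for the default 0.75 load factor so the table is not rebuilt while loading.
        int capacity = (int) Math.min(1 << 30, (long) (expectedEntries / 0.75) + 1);
        Map<String, Integer> map = new HashMap<>(capacity);
        // The reader is closed automatically when this block exits, even on error.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int comma = line.lastIndexOf(',');
                if (comma < 0) {
                    continue; // assumption: lines without a comma are skipped
                }
                String text = line.substring(0, comma).trim();
                int id = Integer.parseInt(line.substring(comma + 1).trim());
                map.put(text, id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return map;
    }

    public static void main(String[] args) {
        // Stand-in input; in practice pass a reader over the real file.
        Map<String, Integer> map = load(new StringReader("a, b , 1\nno comma here\nc,2\n"), 4);
        System.out.println(map.size()); // 2
    }
}
```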

