
Best way to load a large file into an ArrayList in Java

I have a file of about 300 MB. I want to read its contents line by line and add each line to an ArrayList. I created an ArrayList named a1 and read the file with a BufferedReader, but when I add the lines from the file to the ArrayList I get this error: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space.

Please tell me what the solution for this should be.

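  import java.io.BufferedReader;
  import java.io.FileReader;
  import java.util.ArrayList;

  public class Main { // class name is a placeholder; the original post showed only main()
    public static void main(String[] args) {
      // try-with-resources closes the reader even if reading fails
      try (BufferedReader br = new BufferedReader(new FileReader(
          "/home/dmdd/Desktop/AsiaData/RawData/AllupperairVcomponent.txt"))) {
        ArrayList<String> a1 = new ArrayList<>();
        String line;
        while ((line = br.readLine()) != null) {
          a1.add(line); // all lines are retained here, which exhausts the heap
        }
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }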

Naively, increase the size of the heap via the -Xmx command-line argument.

This will only work up to a point, though. Instead, consider structuring your data so that the memory requirements are minimized. Do you need the whole thing in memory at once? If you only need to test whether an item is in the set, consider using a hash or a Bloom filter.
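To illustrate the membership-test idea, here is a minimal sketch of a Bloom filter that is filled while streaming the lines, so only the bit array stays in memory. The class name LineBloomFilter, the bit count, the number of hashes, and the double-hashing scheme are all assumptions made for illustration; in real code you would more likely size the filter from the expected line count or use an existing library implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.BitSet;

/** Hypothetical minimal Bloom filter over strings; sizes and hashing are illustrative only. */
final class LineBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    LineBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Second hash (FNV-1a over the chars), used together with hashCode() for double hashing. */
    private static int fnv1a(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x01000193;
        }
        return h;
    }

    /** Derives the i-th bit position from the two hashes. */
    private int index(String s, int i) {
        int combined = s.hashCode() + i * (fnv1a(s) | 1);
        return Math.floorMod(combined, numBits);
    }

    void add(String s) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(s, i));
        }
    }

    /** False means definitely absent; true means probably present. */
    boolean mightContain(String s) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(s, i))) {
                return false;
            }
        }
        return true;
    }

    /** Streams lines from the reader into the filter without retaining them. */
    static LineBloomFilter fromLines(BufferedReader reader, int numBits, int numHashes) {
        LineBloomFilter filter = new LineBloomFilter(numBits, numHashes);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                filter.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return filter;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the real file here.
        BufferedReader reader = new BufferedReader(new StringReader("alpha\nbeta\ngamma\n"));
        LineBloomFilter filter = fromLines(reader, 1 << 16, 3);
        System.out.println(filter.mightContain("beta"));  // added lines are always reported as present
        System.out.println(filter.mightContain("delta")); // usually false; false positives are possible
    }
}
```

For the real file you would pass java.nio.file.Files.newBufferedReader(path) instead of the StringReader. The trade-off is that a Bloom filter can report false positives, so it only fits if an occasional wrong "present" answer is acceptable.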

Just increase the Java heap size:

java -Xmx250m

If you are running your project from an IDE, set -Xmx250m in the VM arguments.

250m means 250 MB. To hold all the lines of a 300 MB file in memory you will need a larger value than that.

If you have to have it in memory, you could try increasing the heap size by passing the -Xmx option to the java executable.

It may also be worth asking whether you really need all that data in memory at the same time. You may be able to process it sequentially, or keep most or all of it on disk.
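As a rough sketch of what processing it sequentially could look like: read each line, fold it into a running summary, and let the line go. The class name SequentialScan and the choice of summary (line and character counts) are placeholders for whatever the real processing is.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class SequentialScan {
    /** Reads line by line and keeps only a running summary, never the lines themselves. */
    static long[] countLinesAndChars(BufferedReader reader) {
        long lines = 0;
        long chars = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines++;
                chars += line.length(); // the line becomes garbage after this iteration
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {lines, chars};
    }

    public static void main(String[] args) throws IOException {
        // For a real file, use java.nio.file.Files.newBufferedReader(path) instead of a StringReader.
        try (BufferedReader reader = new BufferedReader(new StringReader("12 3.5\n7 1.25\n"))) {
            long[] result = countLinesAndChars(reader);
            System.out.println(result[0] + " lines, " + result[1] + " characters");
        }
    }
}
```

Memory use then depends on the longest line rather than on the size of the file.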

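Pass -Xmx1024m to increase your heap space to 1024 MB.

java -Xms512m -Xmx1024m HelloWorld

Note that the initial heap size (-Xms) must not be larger than the maximum (-Xmx), or the JVM will refuse to start.

On a 32-bit JVM the theoretical limit is 4 GB, although in practice it is usually well below that, depending on the operating system. On a 64-bit JVM you can go much higher.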
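If you want to confirm which limit the JVM actually picked up, a small check along these lines can help. The class name HeapInfo is made up for the example; Runtime.maxMemory() is the standard call.

```java
final class HeapInfo {
    /** Maximum amount of memory the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with and without the -Xmx flag, for example java -Xmx1024m HeapInfo, and compare the reported values. The number can come out slightly lower than what you passed, depending on the garbage collector.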

Use java.nio.file.Files.readAllLines; it returns a List<String>. If you still get an OutOfMemoryError, increase the heap size, for example java -Xmx1024m.
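A minimal usage sketch follows. The class name ReadAllLinesExample and the temporary sample file are only there to make the example self-contained. Keep in mind that readAllLines still holds every line in memory, so it tidies up the code but does not lower the memory requirement.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class ReadAllLinesExample {
    /** Loads the whole file into a list, one element per line. */
    static List<String> load(Path path) {
        try {
            return Files.readAllLines(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A small temporary file stands in for the real data file.
        Path path = Files.createTempFile("sample", ".txt");
        Files.write(path, Arrays.asList("first", "second", "third"), StandardCharsets.UTF_8);
        List<String> lines = load(path);
        System.out.println(lines.size() + " lines loaded");
        Files.delete(path);
    }
}
```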

I partly agree with @Murali; this will fix the problem you are facing. But it is advisable to use caching when handling large files. What if the file grows to 500 MB in some rare case? Use a caching API such as Memcached, which keeps the data outside the JVM heap and so avoids running out of memory in the JVM.

If you can, process the file in batches of 10,000 lines or so:

read 10k lines
process
repeat until done
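The loop above could be sketched roughly like this. BatchReader and processInBatches are made-up names, and the Consumer stands in for whatever processing each batch needs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchReader {
    /** Reads up to batchSize lines at a time, hands each batch to the processor, returns the batch count. */
    static int processInBatches(BufferedReader reader, int batchSize,
                                Consumer<List<String>> processor) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    processor.accept(batch);
                    batches++;
                    // Start a fresh list so the processed batch can be garbage collected.
                    batch = new ArrayList<>(batchSize);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!batch.isEmpty()) { // final, possibly shorter batch
            processor.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the real file; a real run would use a batch size near 10000.
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\ne\n"));
        int batches = processInBatches(reader, 2,
            batch -> System.out.println("processing " + batch.size() + " lines"));
        System.out.println(batches + " batches");
    }
}
```

Only one batch is alive at a time, so the heap needs room for about 10,000 lines instead of the whole file, provided the processor does not hold on to the lists it receives.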
