OutOfMemory error when using Apache Commons lineIterator

I'm trying to iterate line by line over a 1.2 GB file using Apache Commons FileUtils.lineIterator. However, as soon as the LineIterator calls hasNext(), I get a java.lang.OutOfMemoryError: Java heap space. I've already allocated 1 GB to the Java heap.

What am I doing wrong here? From what I read in the docs, isn't LineIterator supposed to read the file from the file system line by line rather than load it all into memory?

Note that the code is in Scala:

  val file = new java.io.File("data_export.dat")
  val it = org.apache.commons.io.FileUtils.lineIterator(file, "UTF-8")
  var successCount = 0L
  var totalCount = 0L
  try {
    while (it.hasNext()) {
      try {
        val legacy = parse[LegacyEvent](it.nextLine())
        BehaviorEvent(legacy)
        successCount += 1L
      } catch {
        case e: Exception => println("Parse error")
      }
      totalCount += 1
    }
  } finally {
    it.close()
  }

Thanks for your help here!

The code looks good. Probably the iterator does not find a line ending anywhere in the file, so it tries to read a single very long line, larger than 1 GB, into memory.

Try wc -l on Unix and see how many lines you get.
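
If wc is not available, the same check can be sketched in Scala. This is only a minimal sketch, not part of the original code: the LineStats object and its lineStats helper are hypothetical names. Like wc -l, it counts only '\n' bytes, and it also reports the length of the longest line. It reads fixed-size chunks, so it should not run out of heap even if the whole file turns out to be one line.

```scala
import java.io.{BufferedInputStream, FileInputStream, InputStream}

// Hypothetical helper for diagnosing the file; not part of Apache Commons IO.
object LineStats {
  private val Newline: Byte = '\n'.toByte

  /** Returns (number of '\n' bytes, length in bytes of the longest line).
    * Only '\n' is treated as a terminator, mirroring what wc -l counts.
    * Memory use is bounded by the 64 KB chunk buffer, not by line length.
    */
  def lineStats(in: InputStream): (Long, Long) = {
    val buf = new Array[Byte](64 * 1024)
    var newlines = 0L
    var current = 0L // bytes seen since the last '\n'
    var longest = 0L
    var n = in.read(buf)
    while (n != -1) {
      var i = 0
      while (i < n) {
        if (buf(i) == Newline) {
          newlines += 1
          if (current > longest) longest = current
          current = 0L
        } else {
          current += 1
        }
        i += 1
      }
      n = in.read(buf)
    }
    // Account for a final line that has no trailing '\n'.
    if (current > longest) longest = current
    (newlines, longest)
  }

  def main(args: Array[String]): Unit =
    // Pass the file path as the first argument, e.g. data_export.dat.
    args.headOption.foreach { path =>
      val in = new BufferedInputStream(new FileInputStream(path))
      try {
        val (newlines, longest) = lineStats(in)
        println(s"newlines=$newlines longestLineBytes=$longest")
      } finally {
        in.close()
      }
    }
}
```

If the reported longest line is close to the file size, or the newline count is 0 or very small, the file effectively has no line breaks and any line-based reader will try to buffer a huge line.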

