OutOfMemory error when using Apache Commons lineIterator
I'm trying to iterate line-by-line over a 1.2 GB file using Apache Commons FileUtils.lineIterator. However, as soon as the LineIterator calls hasNext() I get a java.lang.OutOfMemoryError: Java heap space. I've already allocated 1 GB to the Java heap.
What am I doing wrong here? After reading some docs, isn't LineIterator supposed to stream the file from the file system rather than load it into memory?
Note the code is in Scala:
val file = new java.io.File("data_export.dat")
val it = org.apache.commons.io.FileUtils.lineIterator(file, "UTF-8")
var successCount = 0L
var totalCount = 0L
try {
  while (it.hasNext()) {
    try {
      val legacy = parse[LegacyEvent](it.nextLine())
      BehaviorEvent(legacy)
      successCount += 1L
    } catch {
      case e: Exception => println("Parse error")
    }
    totalCount += 1
  }
} finally {
  it.close()
}
Thanks for your help here!
The code looks good. Most likely the iterator never finds a line terminator in the file, so it tries to read a single "line" larger than 1 GB into memory.
Try wc -l on the file in Unix and see how many lines you actually get.
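If you want to check this from the JVM instead of the shell, you can count newline bytes in fixed-size chunks, which keeps heap usage constant no matter how long any "line" is. A minimal sketch (countNewlines is a hypothetical helper, not part of Commons IO; the 64 KiB buffer size is arbitrary):

```scala
import java.io.{File, FileInputStream}

// Rough equivalent of `wc -l`: count '\n' bytes without ever
// materializing a whole line in memory. Reads 64 KiB at a time,
// so heap use stays flat even if the file has no terminators at all.
def countNewlines(file: File): Long = {
  val in = new FileInputStream(file)
  try {
    val buf = new Array[Byte](64 * 1024)
    var count = 0L
    var read = in.read(buf)
    while (read != -1) {
      var i = 0
      while (i < read) {
        if (buf(i) == '\n'.toByte) count += 1
        i += 1
      }
      read = in.read(buf)
    }
    count
  } finally in.close()
}
```

If this returns 0 (or some tiny number) for your 1.2 GB file, the OutOfMemoryError is explained: hasNext() buffers the entire file looking for a line ending.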