
Android - OutOfMemory when reading text file

I'm making a dictionary app on Android. During startup, the app loads the content of an .index file (~2 MB, 100,000+ lines).

However, when I use BufferedReader.readLine() and do something with the returned string, the app fails with OutOfMemory.

// Read file snippet
Set<String> indexes = new HashSet<String>();

FileInputStream is = new FileInputStream(indexPath);
BufferedReader reader = new BufferedReader(new InputStreamReader(is));

String readLine;

while ( (readLine = reader.readLine()) != null) {
    indexes.add(extractHeadWord(readLine));
}

// And the extractHeadWord method
private String extractHeadWord(String string) {
    String[] splitted = string.split("\\t");
    return splitted[0];
}

When reading the log, I found that during execution the GC runs explicitly many times (GC_EXPLICIT freed xxx objects, where xxx is a large number such as 15000 or 20000).

I also tried another way:

final int BUFFER = 50;
char[] readChar = new char[BUFFER];

//.. construct BufferedReader

while (reader.read(readChar) != -1) {
    indexes.add(new String(readChar));
    readChar = new char[BUFFER];
}

...and it ran very fast. But it was not exactly what I wanted.

Is there a solution that runs as fast as the second snippet and is as easy to use as the first?

Regards.

The extractHeadWord method uses String.split. This method does not copy the character data: the strings it returns rely on the underlying string (in your case the line object) and use indexes into it to mark out each "new" string.

Since you are not interested in the rest of the string, you need to discard it so that it gets garbage collected. Otherwise the whole string stays in memory, even though you are only using a part of it.

Calling the constructor String(String) (the "copy constructor") discards the rest of the string:

private String extractHeadWord(String string) {
    String[] splitted = string.split("\\t");
    return new String(splitted[0]);
}
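
Putting the pieces together, the loading loop with the copying extractHeadWord might look like the sketch below. The class name HeadWordIndex and the load method are hypothetical names chosen for illustration. The reader is passed in so the sketch does not depend on a particular file path, and it assumes try-with-resources is available on the target platform.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the original question or answer.
class HeadWordIndex {

    // Reads every line and keeps only a copy of the head word.
    static Set<String> load(BufferedReader reader) throws IOException {
        Set<String> indexes = new HashSet<String>();
        // Closes the reader when the loop finishes or an exception is thrown.
        try (BufferedReader r = reader) {
            String line;
            while ((line = r.readLine()) != null) {
                indexes.add(extractHeadWord(line));
            }
        }
        return indexes;
    }

    // Copies the first tab-separated field so that the set does not
    // hold on to the rest of the line.
    static String extractHeadWord(String line) {
        String[] splitted = line.split("\\t");
        return new String(splitted[0]);
    }
}
```

With the file from the question, this could be called as HeadWordIndex.load(new BufferedReader(new InputStreamReader(new FileInputStream(indexPath)))).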

What happens if your extractHeadWord does this: return new String(splitted[0]);?

It will not reduce the number of temporary objects, but it might reduce the footprint of the application. I don't know whether split behaves the same way as substring, but I guess that it does. substring creates a new view over the original data, which means that the full character array is kept in memory. Explicitly invoking new String(string) will truncate the data.
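
If the temporary objects themselves are a concern, one option is to skip split and locate the first tab with indexOf, copying only that prefix. This is an alternative sketch, not something either answer proposes, and the class name HeadWords is hypothetical.

```java
// Hypothetical helper illustrating an alternative to String.split.
class HeadWords {

    // Returns a standalone copy of the text before the first tab,
    // or a copy of the whole line if it contains no tab.
    static String extractHeadWord(String line) {
        int tab = line.indexOf('\t');
        String head = (tab >= 0) ? line.substring(0, tab) : line;
        // The copy is intended to keep the result from referencing the
        // full line's character data on runtimes where substring shares
        // the backing array.
        return new String(head);
    }
}
```

This avoids allocating an array and one string per field for every line, at the cost of handling the tab search by hand.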

