Skip parts while reading and writing a file in Android/Java

I'm trying to learn Java/Android, and right now I'm experimenting with the replaceAll function. I've found that with large text files the process gets sluggish, so I was wondering if there is a way to skip the "useless" parts of a file to get better performance. (Note: just skip them, not delete them.)

Note: I am not trying to count lines or print anything with println or System.out. I'm just replacing strings and saving the changes in the same file.

Example

AAAA AAAA

CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20931'29093102102938938290290290938938290290290938938938 2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

AAAA AAAA

CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20931'29093102102938938290290290938938290290290938938938 2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

and so on, repeated a zillion times.

I want to replace all "AAAA" with "BBBB", but there are large portions of data between the strings I am replacing. These portions always begin with "CCCC" and end with "DDDD".

Here's the code I am using to replace the string.

File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null) {
   oldtext += line + "\r\n";
}
reader.close();

// Replacing "AAAA" strings
String newtext= oldtext.replaceAll("AAAA", "BBBB");

FileWriter writer = new FileWriter("my_file.txt");
writer.write(newtext);
writer.close();

I think reading every line is inefficient, especially when most of them won't be modified (those parts make up 90% of the file).

Does anyone know a solution?

You are wasting a lot of time on this line:

oldtext += line + "\r\n";

In Java, String is immutable, which means a String cannot be modified once it has been created. Therefore, every time you concatenate, Java makes a complete copy of oldtext. So, for every line in your file, you are recopying every line that came before it into a new String. Take a look at StringBuilder for a way to build a String that avoids these copies.
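For illustration, here is a minimal sketch of the reading loop rebuilt around StringBuilder. The class name ReadAllSketch and the helper readAll are made up for this example, and it reads from any Reader so it can be tried without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadAllSketch {
    // Hypothetical helper: collects every line into one String by appending
    // to a single StringBuilder instead of concatenating Strings in the loop.
    static String readAll(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() drops the line terminator, so add one back
                text.append(line).append("\r\n");
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readAll(new StringReader("AAAA AAAA\nCCCC- 1234 DDDD\n"));
        System.out.print(text.replace("AAAA", "BBBB"));
    }
}
```

With a real file you would pass new FileReader("my_file.txt") instead of the StringReader. This still holds the whole file in memory, so it only removes the repeated copying.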

However, in your case, you do not need the whole file in memory, because you can process it line by line. By moving your replaceAll and write calls into your loop, you can operate on each line as you read it. This keeps the memory footprint of the routine down, because you only hold a single line in memory at a time.
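Working per line also gives you a cheap way to "skip" the big blocks. The sketch below assumes, based on the example in the question, that every untouched block sits on a line that starts with "CCCC" and never contains "AAAA". If that assumption does not hold for your file, drop the check. The names LineStepSketch and processLine are made up for illustration. The file still has to be read and written in full, so this only avoids searching the long lines.

```java
class LineStepSketch {
    // Hypothetical per-line step: lines that start with "CCCC" are passed
    // through untouched (assumption: they never contain "AAAA"); every
    // other line gets the literal replacement.
    static String processLine(String line) {
        if (line.startsWith("CCCC")) {
            return line;
        }
        return line.replace("AAAA", "BBBB");
    }

    public static void main(String[] args) {
        System.out.println(processLine("AAAA AAAA"));
        System.out.println(processLine("CCCC- 9234802394 DDDD"));
    }
}
```

String.replace is used instead of replaceAll because the search text is a plain string, not a regular expression.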

Note that the FileWriter is now opened before you read the input file, and opening a FileWriter truncates the file it points to, so you need a different name for the output file. If you want to keep the same name, you can call renameTo on the output File after you close the writer.

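import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;

File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
FileWriter writer = new FileWriter("my_out_file.txt");
String line;
while ((line = reader.readLine()) != null) {
    // Replace "AAAA" on this line only; replace() treats the target as literal text
    String newtext = line.replace("AAAA", "BBBB");
    // readLine() strips the line terminator, so write it back
    writer.write(newtext + "\r\n");
}
reader.close();
writer.close();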
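If the result has to end up under the original name, one way to do the rename step is sketched below. It is not the only option: it swaps File.renameTo for Files.move with REPLACE_EXISTING, because renameTo is platform-dependent and may fail when the target already exists. The class and method names are made up, the UTF-8 charset is an assumption about your file, and the temp file is left behind if an exception interrupts the loop.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class ReplaceInPlaceSketch {
    // Hypothetical helper: streams the file line by line into a temp file in
    // the same directory, then moves the temp file over the original.
    static void replaceInFile(Path file, String target, String replacement)
            throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "replace", ".tmp");
        // Assumption: the file is UTF-8 encoded; change the charset if not
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(target, replacement));
                writer.write("\r\n"); // readLine() dropped the terminator
            }
        }
        // Both streams are closed here, so the temp file can replace the original
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        replaceInFile(Paths.get("my_file.txt"), "AAAA", "BBBB");
    }
}
```

Writing the temp file next to the original keeps the move on the same file system, which is usually cheaper than a copy across volumes.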
