How to download a large data file in Java when the data is fetched in chunks?
I need to write my bulk data to a file. When I tried to write the data to a file in one shot, I sometimes got an OutOfMemoryException in my Java code. To handle this case I am trying to write different code where I open the file once and write the data to it in chunks, so that my heap memory does not grow. So I am looking for the best approach for this case. My source data will be a REST service's response, and I will write that data to the destination file.
Please suggest the best approach to write the data into a file.
I am trying to handle this case with the following logic:
buffOut.write(arr, 0, available); // write the byte[] chunk to the file
buffOut.flush();
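A minimal sketch of that chunked approach, assuming the REST response is available as an InputStream (the class name, buffer size, and target file name here are illustrative, not from the original post):

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedFileWriter {

    // Copies the source stream to the target file in fixed-size chunks,
    // so only one buffer's worth of data is held in heap memory at a time.
    public static long copyInChunks(InputStream in, String targetFile) throws IOException {
        byte[] buffer = new byte[8192]; // 8 KB chunk size (illustrative)
        long total = 0;
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(targetFile))) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            out.flush();
        }
        return total;
    }
}
```

The file is opened once, and each `write` call only touches one buffer, so the heap footprint stays constant regardless of the total payload size.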
Java Streams looks like a very suitable option, taking your use case into consideration. Processing a file with Java streams yields better results compared to Scanner, BufferedReader, or Java NIO with memory-mapped files.
Here is a performance comparison of the processing ability of the various Java alternatives:
File size: 1 GB
Scanner approach: Total elapsed time: 15627 ms
Mapped byte buffer: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
Java 8 Stream: Total elapsed time: 3124 ms
Java 7 Files: Total elapsed time: 13657 ms
A sample processing example is below:
package com.large.file;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

public class Java8StreamRead {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        Path file = Paths.get("c:/temp/my-large-file.csv");

        // Java 8: Files.lines returns a lazy Stream; try-with-resources closes it
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            for (String line : (Iterable<String>) lines::iterator) {
                //System.out.println(line);
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }

        long endTime = System.nanoTime();
        long elapsedTimeInMillis = TimeUnit.MILLISECONDS.convert(endTime - startTime, TimeUnit.NANOSECONDS);
        System.out.println("Total elapsed time: " + elapsedTimeInMillis + " ms");
    }
}
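The example above only reads. Applied to the writing side of the question, the same streaming idea can be sketched as follows (a sketch assuming a line-oriented source; the class and method names are illustrative, not from the original answer):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.stream.Stream;

public class Java8StreamWrite {

    // Streams lines from a source file to a destination file without
    // ever materializing the whole content in memory.
    public static void streamCopyLines(Path source, Path target) throws IOException {
        try (Stream<String> lines = Files.lines(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            lines.forEach(line -> {
                try {
                    writer.write(line);
                    writer.newLine();
                } catch (IOException e) {
                    // lambdas cannot throw checked exceptions; rewrap
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Because `Files.lines` is lazy and `BufferedWriter` flushes incrementally, the heap usage stays bounded by one line plus the writer's buffer.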
Try the following:
URL url = new URL("http://large.file.dat");
Path path = Paths.get("/home/it/documents/large.file.dat");
Files.copy(url.openStream(), path);
Chunking should not matter, unless you want to work with parts of the file because the connection is likely to fail after a while. You can request compression via the request headers and wrap the InputStream in a GZIPInputStream, or use Apache's HttpClient, which supports this out of the box.
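A minimal sketch of the compression idea with plain HttpURLConnection, assuming the server honors Accept-Encoding: gzip (the class name, helper method, and URL handling are illustrative additions, not from the original answer):

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

public class GzipDownload {

    // Wraps the raw stream in a GZIPInputStream only when the server
    // actually responded with Content-Encoding: gzip.
    public static InputStream maybeDecompress(InputStream raw, String contentEncoding) throws IOException {
        return "gzip".equals(contentEncoding) ? new GZIPInputStream(raw) : raw;
    }

    // Downloads the resource to the target file, transparently decompressing
    // if the server compressed the response.
    public static void download(String fileUrl, Path target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(fileUrl).openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        try (InputStream raw = new BufferedInputStream(conn.getInputStream());
             InputStream in = maybeDecompress(raw, conn.getContentEncoding())) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }
}
```

Files.copy itself streams in chunks internally, so this keeps the same bounded-memory property while moving fewer bytes over the wire.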