How to read multiple times from an input stream?
I have an InputStream as input (it's a big .zip) which contains several files like:
First I need to determine which file I want to process (by lexicographic order), like:
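String getFileName(List<String> filenames) {
    return filenames.stream()
            .filter(PREDICATE)
            .max(Comparator.naturalOrder())
            // max() returns an Optional<String>, so it has to be unwrapped
            .orElseThrow(NoSuchElementException::new);
}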
Then I need to pass this .xml file as an InputStream for further parsing.
It would be easy to operate on objects in memory, but I don't know how to approach this with an InputStream. The solution should be memory efficient, so I cannot just save everything. Should I read it twice?
The answer depends on the target parser.
A DOM parser loads all the information into memory. A SAX parser is a stream processor.
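To illustrate the streaming side of that distinction, here is a minimal sketch of a SAX handler fed directly from an InputStream. The class name SaxElementCounter and the element-counting task are made up for illustration; the point is only that the handler sees events as they arrive and never holds the whole document. Note that a parser may close the stream it is given, which matters if that stream is a ZipInputStream.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original question or answers.
final class SaxElementCounter {
    /** Counts start tags while streaming through the XML. */
    static int countElements(InputStream xml) throws Exception {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++; // called once per opening tag as the parser reads
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return count[0];
    }
}
```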
If you're using ZipFile, you only need to open the ZIP file once. ZipFile lets you get an input stream based on an entry.
for (Enumeration<? extends ZipEntry> entries = zipFile.entries(); entries.hasMoreElements(); ) {
    ZipEntry entry = entries.nextElement();
    if (PREDICATE.test(entry.getName())) { // PREDICATE: the filter from the question
        InputStream inputStream = zipFile.getInputStream(entry);
        // use inputStream as needed
    }
}
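ZipFile is constructed from a file on disk, not from a stream. If only an InputStream is available, one option (an assumption here, since the question doesn't say where the stream comes from) is to copy it to a temporary file first. That keeps the archive out of heap memory at the cost of disk space. The class name ZipSpooler and the "upload-" prefix are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class ZipSpooler {
    /** Copies the stream to a temporary file that can then be opened with ZipFile. */
    static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".zip");
        // createTempFile already created the file, so the copy must replace it
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp; // the caller is responsible for deleting the file afterwards
    }
}
```

The returned path could then be opened with new ZipFile(path.toFile()) and deleted once processing is done.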
An alternative is to use ZipInputStream:
try (ZipInputStream zipStream = new ZipInputStream(...)) {
    ZipEntry entry;
    while ((entry = zipStream.getNextEntry()) != null) {
        if (PREDICATE.test(entry.getName())) { // PREDICATE: the filter from the question
            // use zipStream as needed, just don't close it!
        }
        zipStream.closeEntry();
    }
}
That remark about not closing the ZipInputStream is important; apart from the close() method, the stream works just like a stream for a single entry. Closing it will close the entire ZIP file, though.
If needed, Apache Commons IO has CloseShieldInputStream, which can be used to wrap the ZIP stream so that when the wrapper is closed the ZIP stream isn't.
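If adding Commons IO is not an option, the same idea can be sketched with the standard library alone. The class below is not the Commons IO implementation, just a hypothetical minimal wrapper (the name NonClosingInputStream is made up) that forwards reads and ignores close():

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in for a close-shielding wrapper; stdlib only.
final class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally a no-op: whoever owns the underlying stream closes it.
    }
}
```

A parser could then be handed new NonClosingInputStream(zipStream), so that even if it closes its input, the surrounding loop over entries can continue.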
Edit: the ZipInputStream solution does need two loops, because the file to use depends on all the other files. The ZipFile solution can still be used. Loop once to find the right file, then use zipFile.getInputStream(zipFile.getEntry(fileName)), or, if you store the entry instead of just its name, zipFile.getInputStream(entry).
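Put together, the ZipFile variant might look like the sketch below. It assumes the archive is available as a File; the names LatestEntryPicker, EntryReader and withPickedEntry are hypothetical. The first loop only touches entry metadata, so no file content is read until the chosen entry is opened.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.function.Predicate;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper illustrating the "loop once, then open" approach.
final class LatestEntryPicker {
    /** Callback that consumes the chosen entry's stream. */
    interface EntryReader<R> {
        R read(InputStream in) throws IOException;
    }

    /** Returns the matching entry with the greatest name, or null if none match. */
    static ZipEntry pick(ZipFile zipFile, Predicate<String> filter) {
        ZipEntry best = null;
        for (Enumeration<? extends ZipEntry> entries = zipFile.entries();
                entries.hasMoreElements(); ) {
            ZipEntry entry = entries.nextElement();
            if (entry.isDirectory() || !filter.test(entry.getName())) {
                continue;
            }
            if (best == null || entry.getName().compareTo(best.getName()) > 0) {
                best = entry;
            }
        }
        return best;
    }

    /** Opens the archive once, picks the entry, and streams it to the reader. */
    static <R> R withPickedEntry(File zip, Predicate<String> filter,
                                 EntryReader<R> reader) throws IOException {
        try (ZipFile zipFile = new ZipFile(zip)) {
            ZipEntry entry = pick(zipFile, filter);
            if (entry == null) {
                throw new NoSuchElementException("no matching entry");
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                return reader.read(in);
            }
        }
    }
}
```

Storing the ZipEntry rather than only its name avoids a second lookup via getEntry.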