
How to read multiple times from an input stream?

I have an InputStream as input (it's a big .zip) which contains several files, like:

  • xxx1.xml
  • xxx2.xml
  • xxx2_old.xml

First I need to determine which file I want to process (by lexicographic order), like:

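String getFileName(List<String> filenames) {
    return filenames.stream()
            .filter(PREDICATE)
            .max(Comparator.naturalOrder())
            // max() returns an Optional<String>; fail if no name passed the filter
            .orElseThrow(NoSuchElementException::new);
}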

Then I need to pass this .xml file as an InputStream for further parsing.

It would be easy to operate on objects in memory, but I don't know how to approach this with an InputStream. The solution should be memory-efficient, so I cannot just save everything. Should I read it twice?

The answer depends on how the XML will be parsed.

A DOM parser loads the whole document into memory. A SAX parser is a stream processor.
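To illustrate the streaming side of that distinction, here is a minimal sketch of feeding an InputStream to the JDK's SAX parser. The class name, the countElements helper and the sample XML are made up for illustration; the handler only counts elements, so no document tree is kept in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountExample {
    /** Counts start tags in an XML stream; events are handled one by one, nothing is stored. */
    static int countElements(InputStream xml) throws Exception {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                count[0]++; // called once per opening tag as the parser streams through the input
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input only; in the question this would be the stream of the selected ZIP entry.
        byte[] xml = "<root><item/><item/></root>".getBytes(StandardCharsets.UTF_8);
        System.out.println(countElements(new ByteArrayInputStream(xml)));
    }
}
```

The same handler could receive the stream of a ZIP entry instead of a byte array.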

If you're using ZipFile, you only need to open the ZIP file once. ZipFile lets you get an input stream based on an entry.

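for (Enumeration<? extends ZipEntry> entries = zipFile.entries(); entries.hasMoreElements(); ) {
    ZipEntry entry = entries.nextElement();
    // PREDICATE is assumed to be the name filter (Predicate<String>) from the question
    if (PREDICATE.test(entry.getName())) {
        // each entry gets its own stream, which can be closed independently of zipFile
        try (InputStream inputStream = zipFile.getInputStream(entry)) {
            // use inputStream as needed
        }
    }
}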

An alternative is to use ZipInputStream:

try (ZipInputStream zipStream = new ZipInputStream(...)) {
    ZipEntry entry;
    while ((entry = zipStream.getNextEntry()) != null) {
        if (PREDICATE.test(entry.getName())) { // PREDICATE: assumed name filter from the question
            // use zipStream as needed, just don't close it!
        }
        zipStream.closeEntry();
    }
}

That remark about not closing the ZipInputStream is important. Apart from close(), the stream behaves just like a stream over a single entry; closing it closes the entire ZIP stream, though.
If needed, Apache Commons IO has CloseShieldInputStream, which can be used to wrap the ZIP stream so that closing the wrapper does not close the ZIP stream.
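If pulling in Commons IO is not an option, the same idea can be sketched with a hand-rolled wrapper. The NonClosingInputStream class below is a hypothetical stand-in written for this example, not the Commons IO class, and entrySizes is just a demonstration helper that counts the bytes of each entry.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class NonClosingStreamExample {
    /** Hypothetical wrapper: forwards reads to the wrapped stream but ignores close(). */
    static final class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // intentionally empty, so consumers that close their input leave the ZIP stream open
        }
    }

    /** Reads every entry through a wrapper that gets closed, then moves on to the next entry. */
    static Map<String, Long> entrySizes(InputStream zipData) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (ZipInputStream zipStream = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zipStream.getNextEntry()) != null) {
                long total = 0;
                byte[] buffer = new byte[8192];
                try (InputStream in = new NonClosingInputStream(zipStream)) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        total += read; // read() reports end of stream at the end of the current entry
                    }
                } // closing the wrapper here does not close zipStream
                sizes.put(entry.getName(), total);
                zipStream.closeEntry();
            }
        }
        return sizes;
    }
}
```

This matters mostly when the stream is handed to code that closes its input when done, which some parsers do.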

Edit: the ZipInputStream solution indeed needs two loops, because the file to use depends on all other files. The ZipFile solution can still be used: loop once to find the right file, then use zipFile.getInputStream(zipFile.getEntry(fileName)), or, if you store the entry instead of just its name, zipFile.getInputStream(file).
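Put together, the ZipFile variant might look like the sketch below. It assumes the archive is available as a file on disk, since ZipFile cannot be built from a plain InputStream. The PREDICATE shown is a guess at the filter in the question (it skips names containing "_old"), and the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.function.Predicate;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class ZipFileSelectExample {
    // Hypothetical filter: the question does not say what PREDICATE actually checks.
    static final Predicate<String> PREDICATE =
            name -> name.endsWith(".xml") && !name.contains("_old");

    /** First loop: pick the matching entry whose name is lexicographically greatest. */
    static ZipEntry selectEntry(ZipFile zipFile) {
        ZipEntry best = null;
        for (Enumeration<? extends ZipEntry> entries = zipFile.entries(); entries.hasMoreElements(); ) {
            ZipEntry entry = entries.nextElement();
            boolean greater = best == null || entry.getName().compareTo(best.getName()) > 0;
            if (PREDICATE.test(entry.getName()) && greater) {
                best = entry;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no matching entry");
        }
        return best;
    }

    /** Opens only the selected entry and streams through it; a parser would go where the loop is. */
    static long countSelectedBytes(File zip) throws IOException {
        try (ZipFile zipFile = new ZipFile(zip)) {
            ZipEntry selected = selectEntry(zipFile);
            try (InputStream in = zipFile.getInputStream(selected)) {
                long total = 0;
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    total += read;
                }
                return total;
            }
        }
    }
}
```

Listing the entries only reads the archive's directory, so the entry contents are not decompressed until getInputStream is called for the chosen one.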
