Java: creating an InputStream from a ZipInputStream entry

Question

I would like to write a method that reads several XML files inside a ZIP, from a single InputStream.

The method would open a ZipInputStream and, for each XML file, get the corresponding InputStream and pass it to my XML parser. Here is the skeleton of the method:

private void readZip(InputStream is) throws IOException {

    ZipInputStream zis = new ZipInputStream(is);
    ZipEntry entry = zis.getNextEntry();

    while (entry != null) {

        if (entry.getName().endsWith(".xml")) {

            // READ THE STREAM
        }
        entry = zis.getNextEntry();
    }
}

The problematic part is the "// READ THE STREAM" step. I have a working solution, which consists of creating a ByteArrayInputStream and feeding it to my parser. But it buffers the whole entry in memory, and for large files I get an OutOfMemoryError. Here is the code, in case anyone is interested:

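// Copy the current entry into memory (this is what exhausts the heap on large files)
int count;
byte[] buffer = new byte[2048];
ByteArrayOutputStream out = new ByteArrayOutputStream();
while ((count = zis.read(buffer)) != -1) {
    out.write(buffer, 0, count);
}
// Named entryStream so it does not clash with the method parameter "is"
InputStream entryStream = new ByteArrayInputStream(out.toByteArray());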

The ideal solution would be to feed the parser with the original ZipInputStream. It should work, because it works if I just print the entry content with a Scanner:

Scanner sc = new Scanner(zis);
while (sc.hasNextLine())
{
    System.out.println(sc.nextLine());
}

But the parser I'm currently using (jdom2, though I also tried javax.xml.parsers.DocumentBuilderFactory) closes the stream after parsing the data, so I'm unable to get the next entry and continue.

So finally, the question is:

Does anybody know a DOM parser that doesn't close its stream?
Is there another way to get an InputStream from a ZipEntry?

Thanks.

Answer 1

You can wrap the ZipInputStream and intercept the call to close().
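
A minimal sketch of that idea, assuming a hypothetical helper named ignoreClose (not part of any library): java.io.FilterInputStream already forwards every read to the wrapped stream, so only close() needs to be overridden.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseInterceptor {

    // Hypothetical helper: returns a view of `in` whose close() does nothing.
    public static InputStream ignoreClose(InputStream in) {
        return new FilterInputStream(in) {
            @Override
            public void close() throws IOException {
                // Intentionally empty: the caller keeps ownership of `in`.
            }
        };
    }

    public static void main(String[] args) throws IOException {
        final boolean[] closed = {false};
        // A source stream that records whether close() reached it.
        InputStream source = new ByteArrayInputStream(new byte[] {1, 2, 3}) {
            @Override
            public void close() {
                closed[0] = true;
            }
        };
        InputStream view = ignoreClose(source);
        view.read();
        view.close(); // swallowed by the wrapper
        System.out.println("source closed: " + closed[0]);
    }
}
```

In the readZip skeleton, the parser would receive ignoreClose(zis) instead of zis, so its close() call never reaches the ZipInputStream.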

Answer 2

Thanks to halfbit, I ended up with my own ZipInputStream class, which overrides the close method:

import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

public class CustomZipInputStream extends ZipInputStream {

    private boolean _canBeClosed = false;

    public CustomZipInputStream(InputStream is) {
        super(is);
    }

    @Override
    public void close() throws IOException {

        if(_canBeClosed) super.close();
    }

    public void allowToBeClosed() { _canBeClosed = true; }
}
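
One way this class might be used, sketched under assumptions: the nested CustomZipInputStream below is a copy of the class above so that the example compiles on its own, and countXmlEntries and sampleZip are hypothetical names. The finally block is needed because close() does nothing until allowToBeClosed() has been called.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

public class CustomZipUsage {

    // Copy of the class from the answer above, nested so the sketch is self-contained.
    static class CustomZipInputStream extends ZipInputStream {
        private boolean canBeClosed = false;

        CustomZipInputStream(InputStream is) {
            super(is);
        }

        @Override
        public void close() throws IOException {
            if (canBeClosed) super.close();
        }

        void allowToBeClosed() {
            canBeClosed = true;
        }
    }

    // Hypothetical method: parses every .xml entry and returns how many were parsed.
    public static int countXmlEntries(InputStream is) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        CustomZipInputStream zis = new CustomZipInputStream(is);
        int parsed = 0;
        try {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.getName().endsWith(".xml")) {
                    db.parse(zis); // any close() call made by the parser is ignored
                    parsed++;
                }
            }
        } finally {
            zis.allowToBeClosed(); // without this, the close() below is a no-op
            zis.close();
        }
        return parsed;
    }

    // Builds a small in-memory ZIP with two XML entries, for demonstration only.
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (String name : new String[] {"a.xml", "b.xml"}) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write("<root/>".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countXmlEntries(new ByteArrayInputStream(sampleZip())));
    }
}
```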

Answer 3

A small improvement on Tim's solution: having to call allowToBeClosed() before close() makes it tricky to close the ZipInputStream properly when handling exceptions, and it breaks Java 7's try-with-resources statement.

I suggest creating a wrapper class as follows:

import java.io.IOException;
import java.io.InputStream;

public class UncloseableInputStream extends InputStream {
  private final InputStream input;

  public UncloseableInputStream(InputStream input) {
    this.input = input;
  }

  @Override
  public void close() throws IOException {} // do not close the wrapped stream

  @Override
  public int read() throws IOException {
    return input.read();
  }

  // delegate all other InputStream methods as with read above
}
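
The delegation that the final comment leaves out might look like the sketch below. It is one possible completion, not the answerer's code; the class name UncloseableInputStreamSketch is hypothetical and chosen so it does not clash with the class above. Delegating the bulk read(byte[], int, int) matters most, since the version inherited from InputStream calls read() once per byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class UncloseableInputStreamSketch extends InputStream {
    private final InputStream input;

    public UncloseableInputStreamSketch(InputStream input) {
        this.input = input;
    }

    @Override
    public void close() throws IOException {
        // Do not close the wrapped stream.
    }

    @Override
    public int read() throws IOException {
        return input.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // Bulk read: avoids the byte-at-a-time default in InputStream.
        return input.read(b, off, len);
    }

    @Override
    public long skip(long n) throws IOException {
        return input.skip(n);
    }

    @Override
    public int available() throws IOException {
        return input.available();
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        InputStream wrapped = new UncloseableInputStreamSketch(source);
        byte[] buf = new byte[2];
        System.out.println(wrapped.read(buf, 0, 2)); // number of bytes read by the bulk call
        wrapped.close();
        System.out.println(source.read()); // the source can still be read after close()
    }
}
```

In this sketch mark() and reset() are left at the InputStream defaults, which report mark as unsupported.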

which can then safely be used as follows:

try (ZipInputStream zipIn = new ZipInputStream(...))
{
  DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
  ZipEntry entry;
  while (null != (entry = zipIn.getNextEntry()))
  {
    if ("file.xml".equals(entry.getName()))
    {
      Document doc = db.parse(new UncloseableInputStream(zipIn));
    }
  }
}

Answer 4

If you don't mind external dependencies, Apache Commons IO provides a convenience class named CloseShieldInputStream that blocks the close() call.

private void readZip(InputStream is) throws IOException {

    ZipInputStream zis = new ZipInputStream(is);
    ZipEntry entry = zis.getNextEntry();

    while (entry != null) {

        if (entry.getName().endsWith(".xml")) {
            //commons-io 2.9 and later
            InputStream tempIs = CloseShieldInputStream.wrap(zis);
            //commons-io < 2.9
            //InputStream tempIs = new CloseShieldInputStream(zis);

            // READ THE STREAM

        }
        entry = zis.getNextEntry();
    }
}

Answer 1: 3, accepted, 2013-11-16 16:50:19
Answer 2: 3, 2013-11-16 17:01:10
Answer 3: 3, 2013-12-11 07:52:52
Answer 4: 0, 2022-10-06 14:52:02
