
Failure when using java.util.zip.* to unzip an xlsx blob created with Apache POI

The following code is a test program that creates an xlsx blob using Apache POI and unzips the blob (xlsx is in zip format) to check some values. The code is based on Stack Overflow questions but I have not been able to get unzipping to work properly. A requirement for this is that the filesystem must not be touched so we resort to streams.

The problem is that the line

int size = (int) entry.getSize();

always produces -1. However, see the comments before

zipInput.closeEntry();

for a variation that gets further into the program but seems incorrect.

We don't want to use POI itself to read the xlsx file. We are investigating a problem where POI does not appear to cooperate with Apache Struts, and we want to rule out the possibility that our use of POI, independent of Struts, is the culprit. Once this test passes, if the problem Struts reports remains, we know to look at Struts itself rather than at the way we use POI.

/**
 * Short, Self Contained, Correct Example of
 * 1) creating an xlsx blob with test data;
 * 2) sending it to a stream;
 * 3) unzipping the xlsx;
 * 4) checking the worksheet's xml file for
 * correct test values.
 * Heavily inlined from two other properly factored
 * files, one POI wrapper and one JUnit test case
 * (although the test is not a "unit test" since it
 * only tests third party library integration)
 * DEPENDENCIES
 * 1) poi-3.8.jar
 * 2) poi-ooxml-3.8.jar
 * 3) xmlbeans-2.6.0.jar
 * 4) dom4j-1.6.1.jar
 * 5) poi-ooxml-schemas-3.8.jar
 * Other solutions attempted:
 * 1) commons-compress: always returned byte array of proper size
 * but containing all zeros.
 * Note that we do NOT want to use the filesystem. This exercise's
 * purpose is to create dynamically generated xlsx files delivered
 * via HTTP. The filesystem is never involved.
 */
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class CreateWorkbook
{
    public static void main(String[] args) throws IOException
    {
        // create the data to place into workbook
        List<List<String>> resultset = new ArrayList<List<String>>();
        List<String> row = new ArrayList<String>();
        row.add("A1");
        row.add("B1");
        row.add("C1");
        resultset.add(row);
        row = new ArrayList<String>();
        row.add("A2");
        row.add("B2");
        row.add("C2");
        resultset.add(row);
        row = new ArrayList<String>();
        row.add("A3");
        row.add("B3");
        row.add("C3");
        resultset.add(row);
        // create POI workbook and fill it with resultSet
        XSSFWorkbook workbook = new XSSFWorkbook();
        Sheet s = workbook.createSheet();
        for (int i = 0; i < resultset.size(); i++)
        {
            List<String> record = resultset.get(i);
            Row r = s.createRow(i);
            for (int j = 0; j < record.size(); j++)
                r.createCell(j).setCellValue(record.get(j));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        workbook.write(out);
        out.flush();
        // retrieve workbook's data into a string and
        // test if the cell values (a1,b2,c3,etc) appear
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        // xlsx is a zipped archive so we need to unzip
        ZipInputStream zipInput = new ZipInputStream(in);
        assert zipInput.available() == 1;
        ZipEntry entry = zipInput.getNextEntry();
        assert entry != null;
        while (entry != null)
        {
            // since this "if" condition passes during execution we
            // know that the stream contains a zip archive with
            // an entry called "xl/worksheets/sheet1.xml". Therefore,
            // ZipEntry retrieval appears to work correctly and the
            // archive has at least a partially correct zip format.
            if (entry.getName().equals("xl/worksheets/sheet1.xml"))
            {
                int size = (int) entry.getSize();
                // ==============================================
                // always fails, reports -1
                assert size > 0 : size;
                // ==============================================
                byte[] bytes = new byte[size];
                int readbytes = zipInput.read(bytes, 0, size);
                assert readbytes > 0 : readbytes;
                // nothing after this line has been tested
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < size; i++)
                    builder.append(Character.toString((char) bytes[i]));
                String xml = builder.toString();
                assert xml.contains("A1");
                assert xml.contains("B2");
                assert xml.contains("C3");
                break;
            }
            // Swapping the order of these two lines actually
            // causes the program to go further, with size = 678
            // and failing at the readbytes assertion rather than
            // the size assertion.
            // I thought swapping would mess everything up since
            // the entry would be closed immediately after obtaining
            // it.
            zipInput.closeEntry();
            entry = zipInput.getNextEntry();
        }
        zipInput.close();
        in.close();
    }
}

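You are not reading from the zip stream quite correctly. ZipEntry.getSize() returns -1 when the size is not known, and a ZipInputStream may not know an entry's size until the entry has been read. A single read() call is also not guaranteed to return the whole entry.

Try this snippet, which copies the entry's bytes until the end of the entry is reached (IOUtils here is presumably org.apache.commons.io.IOUtils from Apache Commons IO):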

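        if (entry.getName().equals("xl/worksheets/sheet1.xml"))
        {
            // copy everything up to the end of the current entry;
            // no need to know the entry size in advance
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            IOUtils.copy(zipInput, bos);
            // decode explicitly rather than with the platform default charset
            String xml = bos.toString("UTF-8");
            assert xml.contains("A1");
            assert xml.contains("B2");
            assert xml.contains("C3");
            break;
        }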
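
If adding a dependency just for IOUtils is not an option, the same idea can be sketched with java.util.zip alone. This is a minimal sketch, not the only way to do it: the class name ZipEntryReader and the helpers readEntry and sampleZip are hypothetical names made up for this example, the sample archive is a stand-in built in memory rather than a real POI workbook, and Java 7 or later is assumed (try-with-resources, StandardCharsets).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryReader
{
    // Hypothetical helper: returns the content of the named entry as a
    // UTF-8 string, or null if the archive has no such entry.
    public static String readEntry(byte[] zipBytes, String name) throws IOException
    {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes)))
        {
            ZipEntry entry;
            // getNextEntry() closes the previous entry before moving on
            while ((entry = zip.getNextEntry()) != null)
            {
                if (!entry.getName().equals(name))
                    continue;
                // Do not size the buffer from entry.getSize(); read until
                // read() reports the end of the current entry with -1.
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = zip.read(buffer)) != -1)
                    content.write(buffer, 0, n);
                return new String(content.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    // Hypothetical stand-in for the workbook bytes: a small in-memory zip
    // with an entry named like the worksheet part of an xlsx file.
    public static byte[] sampleZip() throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out))
        {
            zip.putNextEntry(new ZipEntry("docProps/app.xml"));
            zip.write("<Properties/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zip.write("<row><c>A1</c><c>B2</c><c>C3</c></row>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException
    {
        String xml = readEntry(sampleZip(), "xl/worksheets/sheet1.xml");
        System.out.println(xml);
    }
}
```

In the test program from the question, the bytes passed to readEntry would be out.toByteArray() from the workbook. The loop matters because read() may return fewer bytes than requested, so a single call into a pre-sized array cannot be relied on even when the size is known.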
