
Apache POI generated xlsx file is larger than the same file saved manually in Microsoft Excel

I am using Apache POI to generate xlsx sheets for reports. I opened one of the POI-generated reports in Microsoft Excel and saved it as a new file. Comparing the two, the original file is 15 MB while the re-saved file is just 2.5 MB, a difference of about 12 MB. The workbook used is XSSFWorkbook.

Is it possible to reduce the size of the file created by Apache POI?

Here is the code snippet I have used:

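// Load the template workbook, then wrap it in a streaming SXSSFWorkbook
// that keeps at most maxRows rows in memory.
XSSFWorkbook workbookTitle = new XSSFWorkbook(fileInputStream);
workbook = new SXSSFWorkbook(workbookTitle, maxRows);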

font = workbook.createFont();
font.setFontHeightInPoints((short) 9);
font.setFontName(FONT_NAME);

cellTwoDecimal = workbook.createCellStyle();

DataFormat format = workbook.createDataFormat();

cellTwoDecimal.setDataFormat(format.getFormat("0.00"));
cellTwoDecimal.setFont(font);

cellCommon = workbook.createCellStyle();
cellCommon.setFont(font);

cellText = workbook.createCellStyle();
cellText.setDataFormat((short) BuiltinFormats.getBuiltinFormat("text"));
cellText.setFont(font);

cellWrpText = workbook.createCellStyle();
cellWrpText.setWrapText(true);
cellWrpText.setFont(font);

Row row;
Cell cell;

// excelSheet, size, rowIndex and rowHeader come from the omitted code.
for (int i = 0; i < size; i++) {
    row = excelSheet.createRow(rowIndex++);
    cell = row.createCell(i);
    cell.setCellValue(rowHeader);
    cell.setCellStyle(cellCommon);
}

I have removed some internal logic from the code. Please share your ideas.

[Edit 1] I am inserting a lot of blank cells where there is no value, i.e. some parts of the report will not have any value, so I put a blank cell there. I am also setting a style on each blank cell. Can this be the reason?

Thanks in advance.

Regarding your "Edit 1": if I understand you correctly, you create cells with no value. You do not have to do so. If you don't want to write anything, do not create the empty cell. In my experience with POI, you only have to create rows and cells where you actually want to write something.

From this point of view it is clear why your xlsx is so large: it contains a great many cell objects. I think MS Excel removes the empty cells when you save manually.
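As a rough sketch of the "only create what you write" idea, something like the following could be used. The class SparseReportWriter, the method fillSheet and the String[][] input are made up for illustration and are not from the question; the assumption is that a missing value shows up as null or an empty string.

```java
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;

// Hypothetical helper, not part of the asker's code.
final class SparseReportWriter {

    /**
     * Writes only the non-empty values of data into the sheet.
     * Null or empty entries produce no cell, and a row that holds
     * no values at all is never created. Returns the number of
     * cells that were actually created.
     */
    static int fillSheet(Sheet sheet, String[][] data) {
        int created = 0;
        for (int r = 0; r < data.length; r++) {
            Row row = null; // created lazily, on the first real value
            for (int c = 0; c < data[r].length; c++) {
                String value = data[r][c];
                if (value == null || value.isEmpty()) {
                    continue; // nothing to write, so no cell and no style
                }
                if (row == null) {
                    row = sheet.createRow(r);
                }
                row.createCell(c).setCellValue(value);
                created++;
            }
        }
        return created;
    }
}
```

Rows are still visited in increasing order, so this should also be usable with the streaming SXSSFWorkbook from the question.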

Addendum: there is also an issue with cell styling. Use as few CellStyle instances as possible. If several cells share the same style, do not create a new CellStyle with identical attributes for each of them; apply the same instance instead. Also, do not assign a style to plain text cells. Without one, Excel uses its default style (white background, black text, default font, default size, default format).
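To illustrate the point about sharing styles, here is a minimal sketch. SharedStyleExample, createTwoDecimalStyle and writeNumbers are hypothetical names; the style attributes simply mirror the "0.00" format and 9 pt font from the question.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.Font;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical helper, not part of the asker's code.
final class SharedStyleExample {

    /** Builds the style once; call this a single time per workbook. */
    static CellStyle createTwoDecimalStyle(Workbook wb) {
        Font font = wb.createFont();
        font.setFontHeightInPoints((short) 9);

        CellStyle style = wb.createCellStyle();
        style.setDataFormat(wb.createDataFormat().getFormat("0.00"));
        style.setFont(font);
        return style;
    }

    /** Writes one value per row in column 0, reusing the same style instance. */
    static void writeNumbers(Sheet sheet, double[] values, CellStyle shared) {
        for (int i = 0; i < values.length; i++) {
            Row row = sheet.createRow(i);
            Cell cell = row.createCell(0);
            cell.setCellValue(values[i]);
            cell.setCellStyle(shared); // no createCellStyle() inside the loop
        }
    }
}
```

However many cells are written, the workbook gains only one extra style entry, because createCellStyle() is called once, outside the loop.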

I had a similar problem and later figured out that I was opening the FileOutputStream in append mode (append=true). The file size grew sharply (say from 7 KB to 54 KB) every time I updated a single cell on the sheet. Once I removed the append flag, it worked just fine.
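The effect of append mode can be shown without POI at all. In this small sketch the byte array is only a stand-in (my assumption) for whatever workbook.write(out) would emit, and AppendModeDemo and save are made-up names.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; content stands in for the serialized workbook.
final class AppendModeDemo {

    /**
     * Writes content to file and returns the resulting file size.
     * With append=true the bytes are added after whatever the file
     * already holds; with append=false the file is truncated first.
     */
    static long save(Path file, byte[] content, boolean append) throws IOException {
        try (OutputStream out = new FileOutputStream(file.toFile(), append)) {
            out.write(content);
        }
        return Files.size(file);
    }
}
```

Saving the same content repeatedly with append=true makes the file grow by the full content length on every save, while append=false keeps it at the size of a single copy. So when writing a workbook, new FileOutputStream(file) or new FileOutputStream(file, false) is the form to use.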

