
How to get the size in bytes of an excel file while writing to it in Apache POI?

I have a bit of a problem here that I just can't get solved.

The thing is that I'm using POI for a project in Java and I have to produce the final outputs in .xls format (HSSF in Apache POI).

My business rules state that each file I generate can be at most 12 MB in size.

But I know .xls has its own internal way of storing the data (XML, I guess), so it takes more bytes than just putting the result in a plain text file. I can't get the size of the Excel workbook, since it is generated in a temporary location (which I can't find) and I can't read it while it is being written.

Is there any way to get the size in bytes of the Excel output file while Java writes to it using the HSSF Workbook Object?

Your best bet is probably to periodically write the file out, and see how big it is. The only way to know for sure how big the file will be is to write it out...
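As a rough sketch of "write it out and see how big it is", the workbook can be written to a stream that only counts bytes instead of to a real file. The class and method names below (WorkbookSizeProbe, StreamWriter, measure) are made up for illustration, the 12 MB limit is assumed to mean 12 * 1024 * 1024 bytes, and a lambda stands in for the workbook so the example runs without POI on the classpath.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper: measures how many bytes a writer emits without storing them. */
final class WorkbookSizeProbe {

    /** Anything that can serialize itself to a stream (assumed shape, e.g. workbook::write). */
    @FunctionalInterface
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Discards every byte and keeps only a running count. */
    static final class CountingOutputStream extends OutputStream {
        private long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            count += len;
        }

        long getCount() {
            return count;
        }
    }

    /** Runs the writer against a counting stream and returns the number of bytes written. */
    static long measure(StreamWriter writer) throws IOException {
        CountingOutputStream counter = new CountingOutputStream();
        writer.writeTo(counter);
        return counter.getCount();
    }

    public static void main(String[] args) throws IOException {
        long limit = 12L * 1024 * 1024; // assumption: 12 MB means 12 MiB

        // Stand-in for a workbook: writes 4096 bytes to whatever stream it is given.
        StreamWriter fakeWorkbook = out -> out.write(new byte[4096]);

        long size = measure(fakeWorkbook);
        System.out.println(size + " bytes, within limit: " + (size <= limit));
    }
}
```

With POI available, something like `WorkbookSizeProbe.measure(workbook::write)` should work, since HSSFWorkbook has a `write(OutputStream)` method. Each call serializes the whole workbook again, so it is probably worth checking only every few thousand rows rather than after every row.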

With HSSF, not all cells take up the same amount of space. String cells take up a different amount from numeric cells, formula cells vary depending on the number of operators and values in them, string cells vary depending on whether they reuse the same text as a previous cell, and so on. You can make some rough guesses based on the kinds of things you're adding (remembering to account for cell styles, named ranges, pictures etc.), but the only way to be sure is to write it out every so often and see how big it is.

For XSSF, it's even more complicated. Not only do different cells take up different numbers of characters in the XML (much as for HSSF), but the .xlsx file format is also compressed. So writing the same snippet of XML can take a variable amount of space in the output file, depending on how the compression algorithm handles it (the first occurrence will take more than subsequent ones, for example). There's even less hope of being certain without saving and testing. Again, you can probably come up with some rough guesses, but the only way to be sure is to save and see.
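The compression effect can be seen with plain DEFLATE from the JDK, which is the algorithm ZIP containers such as .xlsx normally use. This is only an illustration: the XML snippet is a made-up cell, and the class name CompressionGrowthDemo is hypothetical. It compresses one copy of the snippet and then two copies, and prints how much the second copy added.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

/** Hypothetical demo: repeated XML costs less after compression than its first occurrence. */
final class CompressionGrowthDemo {

    /** Returns the number of bytes the text occupies after DEFLATE compression. */
    static int deflatedSize(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(sink)) {
            deflater.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        // Made-up cell markup, roughly the shape of a string cell in sheet XML.
        String cell = "<c r=\"A1\" t=\"inlineStr\"><is><t>hello world</t></is></c>";

        int one = deflatedSize(cell);
        int two = deflatedSize(cell + cell);

        System.out.println("raw size of one cell:    " + cell.length());
        System.out.println("compressed, one copy:    " + one);
        System.out.println("compressed, two copies:  " + two);
        System.out.println("cost of the second copy: " + (two - one));
    }
}
```

The second copy should cost only a few bytes, because the compressor can refer back to the first one. That is why the size of an .xlsx file cannot be derived by adding up per-cell estimates.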

If you want a predictable file size, you'll have to use something purely text-based, e.g. a .csv file.

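Well, after doing some research into the API, I found that the method named getBytes() returns a byte array covering all of the data in the workbook (sheets, rows, cell data etc.), so taking its length gives a byte count very close to the size of the final workbook that is generated.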
