
Character encoding issue while reading an Excel file in a Java Web App

In a Java Web app, I'm using the JExcel API to read Excel files sent by clients.

I'm doing something like this:

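import java.io.ByteArrayInputStream;
import java.io.InputStream;
import jxl.Workbook;
import jxl.WorkbookSettings;

byte[] excelFile = ...  // bytes returned by FormFile#getFileData()
InputStream inputStream = new ByteArrayInputStream(excelFile);

// Tell JExcel which encoding to use when reading the workbook
WorkbookSettings ws = new WorkbookSettings();
ws.setEncoding("CP1252");

Workbook w = Workbook.getWorkbook(inputStream, ws);
...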

Struts gives me the Excel file as a byte array (I use the FormFile#getFileData() method).

It works OK on Windows. On Linux, however, the behavior is quite different. Cells are parsed correctly and their content is interpreted properly (even when it contains non-ASCII characters such as 'à', 'ê', etc.), but sheet names are not. I get bad characters such as '?' or ' '.
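For readers who want to see how symptoms like '?' can arise in general, here is a minimal sketch using only the JDK. It does not use JExcel and is not a diagnosis of this particular case; it only shows that the same bytes give different text depending on the charset used to decode them, and that the JVM default charset can differ between a Windows and a Linux machine. The class name CharsetMismatchDemo and its helper methods are made up for illustration, and the sketch assumes the windows-1252 charset is available in the JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Decode raw bytes with an explicitly named charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Encode text to US-ASCII; unmappable characters become '?'.
    public static String roundTripThroughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The default depends on the machine: often windows-1252 on a
        // Western Windows install, UTF-8 or US-ASCII on Linux.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Assumption: windows-1252 is supported by this JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        // 'à' (U+00E0) is the single byte 0xE0 in windows-1252.
        byte[] aGrave = "\u00E0".getBytes(cp1252);

        // Decoding with the matching charset restores the character.
        String right = decode(aGrave, cp1252);
        System.out.println("cp1252 -> U+" + Integer.toHexString(right.charAt(0)));

        // Decoding the same byte as UTF-8 is malformed input, so Java
        // substitutes the replacement character U+FFFD.
        String wrong = decode(aGrave, StandardCharsets.UTF_8);
        System.out.println("UTF-8  -> U+" + Integer.toHexString(wrong.charAt(0)));

        // Pushing the character through US-ASCII turns it into '?'.
        System.out.println("ASCII  -> " + roundTripThroughAscii("\u00E0"));
    }
}
```

Printing Charset.defaultCharset() on both servers is a cheap first check when the same code behaves differently on Windows and Linux.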

I forced the workbook encoding to UTF-8:

ws.setEncoding("UTF-8");

but it has no effect.

I changed the Excel file to UTF-8 too, but nothing changed. I really don't understand why it doesn't work, especially for sheet names, since the whole chain is in UTF-8 (I also have a Servlet Filter that forces the encoding of HTTP requests to UTF-8).

I had a similar problem, but with another Java Excel API. The problem is that Excel tries to be smart and replaces some characters for you. In my case, for example, Excel replaced three dots ('...') with a single ellipsis character from its own character set, which is not standard UTF-8. My framework didn't recognize it, and I got an undefined character similar to the ones you are getting now. To fix this, I had to manually edit all the Excel spreadsheets, and then it worked OK. The big problem was finding out which characters were affected. I am not sure whether this is an option for you, though.
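Finding the offending characters does not have to be done by eye. The following is a rough sketch, assuming the text (a sheet name or cell value) has already been read into a Java String. It lists every non-ASCII code point with its position and can swap the single-character ellipsis (U+2026) back to three plain dots. The class NonAsciiScanner, its methods, and the sample sheet name are all hypothetical, not part of any Excel library.

```java
import java.util.ArrayList;
import java.util.List;

public class NonAsciiScanner {

    // Report every code point above 0x7F together with its index in the text.
    public static List<String> findNonAscii(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                hits.add(String.format("U+%04X at index %d", cp, i));
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(cp);
        }
        return hits;
    }

    // Replace the single-character ellipsis (U+2026) with three plain dots.
    public static String replaceEllipsis(String text) {
        return text.replace("\u2026", "...");
    }

    public static void main(String[] args) {
        String sheetName = "Totals\u2026"; // hypothetical sheet name
        System.out.println(findNonAscii(sheetName));    // [U+2026 at index 6]
        System.out.println(replaceEllipsis(sheetName)); // Totals...
    }
}
```

Running something like this over the sheet names and cell contents would at least show which characters to look for before editing the spreadsheets by hand.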

It seems to be a bug in the version of JXL I am using. Indeed, if I upgrade the JAR to the latest version, the problem no longer occurs.
