Japanese characters not displayed properly in ReadOnlySharedStringsTable

I'm having trouble reading Japanese characters from an Excel file. The reader's constructor is:

public XExcelFileReader(final String excelPath) throws Exception {
    this.opcPkg = OPCPackage.open(excelPath, PackageAccess.READ);
    // Streaming, read-only view of the workbook's shared strings.
    this.stringsTable = new ReadOnlySharedStringsTable(this.opcPkg);

    // Open the first sheet's XML as a StAX stream.
    XSSFReader xssfReader = new XSSFReader(this.opcPkg);
    XMLInputFactory factory = XMLInputFactory.newInstance();
    InputStream inputStream = xssfReader.getSheetsData().next();
    this.xmlReader = factory.createXMLStreamReader(inputStream);

    // Advance the cursor to the <sheetData> element, where the rows begin.
    while (this.xmlReader.hasNext()) {
        this.xmlReader.next();
        if (this.xmlReader.isStartElement()
                && "sheetData".equals(this.xmlReader.getLocalName())) {
            break;
        }
    }
}

At this point, stringsTable contains strings such as 予算ヨサン, whereas the corresponding cell in the Excel file shows only 予算. Some strings match what Excel displays, but others do not. I'm not sure where it goes wrong; the encoding is UTF-8.

The Excel file is large. I tried creating a workbook instead, but that fails with a memory error, so it is not an option.

Any idea where it could be going wrong?

I found an answer. I modified the constructor as follows:

public XExcelFileReader(final String excelPath) throws Exception {
    this.opcPkg = OPCPackage.open(excelPath, PackageAccess.READ);
    // Obtain the shared strings through XSSFReader instead of
    // constructing a ReadOnlySharedStringsTable directly.
    XSSFReader xssfReader = new XSSFReader(this.opcPkg);
    this.stringsTable = xssfReader.getSharedStringsTable();

    // Open the first sheet's XML as a StAX stream.
    XMLInputFactory factory = XMLInputFactory.newInstance();
    InputStream inputStream = xssfReader.getSheetsData().next();
    this.xmlReader = factory.createXMLStreamReader(inputStream);

    // Advance the cursor to the <sheetData> element, where the rows begin.
    while (this.xmlReader.hasNext()) {
        this.xmlReader.next();
        if (this.xmlReader.isStartElement()
                && "sheetData".equals(this.xmlReader.getLocalName())) {
            break;
        }
    }
}

I also changed the type of stringsTable to SharedStringsTable. I'm not really sure why XSSFReader had to go first. If anyone can explain, please do.
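
A possible explanation, offered as an assumption rather than something confirmed in the post: the order of the calls may not matter at all, and the difference may come from how the two classes treat phonetic (furigana) runs. In the workbook's sharedStrings.xml, a string with a reading is stored as a base <t> element plus one or more <rPh> elements that carry the reading in their own <t>. A reader that concatenates every <t> would produce 予算ヨサン, while one that skips <rPh> would produce 予算. The sketch below reproduces that difference using only the JDK's StAX parser. The class name PhoneticRunDemo, the method readSharedString, and the sample XML are hypothetical and are not taken from POI.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PhoneticRunDemo {

    // Hypothetical shared-string item: base text plus a phonetic run (rPh).
    static final String SAMPLE_SI =
            "<si xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
            + "<t>予算</t>"
            + "<rPh sb=\"0\" eb=\"2\"><t>ヨサン</t></rPh>"
            + "<phoneticPr fontId=\"1\"/>"
            + "</si>";

    // Concatenates the text of <t> elements. When includePhoneticRuns is
    // false, any <t> nested inside an <rPh> element is skipped.
    static String readSharedString(String siXml, boolean includePhoneticRuns)
            throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(siXml));
        StringBuilder text = new StringBuilder();
        int phoneticDepth = 0;   // > 0 while inside an <rPh> element
        boolean inText = false;  // true while inside a <t> element
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("rPh")) {
                        phoneticDepth++;
                    } else if (name.equals("t")) {
                        inText = true;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("rPh")) {
                        phoneticDepth--;
                    } else if (name.equals("t")) {
                        inText = false;
                    }
                } else if (event == XMLStreamConstants.CHARACTERS && inText) {
                    if (includePhoneticRuns || phoneticDepth == 0) {
                        text.append(reader.getText());
                    }
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Every <t> concatenated: base text followed by the reading.
        System.out.println(readSharedString(SAMPLE_SI, true));
        // Phonetic runs skipped: only the text Excel shows in the cell.
        System.out.println(readSharedString(SAMPLE_SI, false));
    }
}
```

If this is indeed the cause, keeping the streaming table should also be possible: to my knowledge, newer POI releases provide a ReadOnlySharedStringsTable constructor with an extra boolean argument (includePhoneticRuns) for this purpose, but check the Javadoc for the POI version in use before relying on it.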
