Reading multiple tabs of a huge Excel file in Java using XSSF and the event API

I am using this code (by author: lchen), which reads the contents of an Excel file in batches, based on the number of rows I pass to the method 'readRows()'.

import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.xml.sax.InputSource;


public class TestLargeFileRead {
    private int rowNum = 0;
    private OPCPackage opcPkg;
    private ReadOnlySharedStringsTable stringsTable;
    private XMLStreamReader xmlReader;


    public void XExcelFileReader(String excelPath) throws Exception {
        opcPkg = OPCPackage.open(excelPath, PackageAccess.READ);
        this.stringsTable = new ReadOnlySharedStringsTable(opcPkg);

        XSSFReader xssfReader = new XSSFReader(opcPkg);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        InputStream inputStream = xssfReader.getSheetsData().next();
        xmlReader = factory.createXMLStreamReader(inputStream);


        while (xmlReader.hasNext()) {
            xmlReader.next();
            if (xmlReader.isStartElement()) {
                if (xmlReader.getLocalName().equals("sheetData"))
                    break;
            }
        }
    }


    public int rowNum() {
        return rowNum;
    }


    public List<String[]> readRows(int batchSize) throws XMLStreamException {
        String elementName = "row";
        List<String[]> dataRows = new ArrayList<String[]>();
        if (batchSize > 0) {
            while (xmlReader.hasNext()) {
                xmlReader.next();
                if (xmlReader.isStartElement()) {
                    if (xmlReader.getLocalName().equals(elementName)) {
                        rowNum++;
                        dataRows.add(getDataRow());
                        if (dataRows.size() == batchSize)
                            break;
                    }
                }
            }
        }
        return dataRows;
    }

    private String[] getDataRow() throws XMLStreamException {
        List<String> rowValues = new ArrayList<String>();
        while (xmlReader.hasNext()) {
            xmlReader.next();
            if (xmlReader.isStartElement()) {
                if (xmlReader.getLocalName().equals("c")) {
                    CellReference cellReference = new CellReference(
                            xmlReader.getAttributeValue(null, "r"));
                    // Fill in the possible blank cells!
                    while (rowValues.size() < cellReference.getCol()) {
                        rowValues.add("");
                    }
                    String cellType = xmlReader.getAttributeValue(null, "t");
                    rowValues.add(getCellValue(cellType));
                }
            } else if (xmlReader.isEndElement()
                    && xmlReader.getLocalName().equals("row")) {
                break;
            }
        }
        return rowValues.toArray(new String[rowValues.size()]);
    }

    private String getCellValue(String cellType) throws XMLStreamException {
        String value = ""; // by default
        while (xmlReader.hasNext()) {
            xmlReader.next();
            if (xmlReader.isStartElement()) {
                if (xmlReader.getLocalName().equals("v")) {
                    if (cellType != null && cellType.equals("s")) {
                        int idx = Integer.parseInt(xmlReader.getElementText());
                        return new XSSFRichTextString(
                                stringsTable.getEntryAt(idx)).toString();
                    } else {
                        return xmlReader.getElementText();
                    }
                }
            } else if (xmlReader.isEndElement()
                    && xmlReader.getLocalName().equals("c")) {
                break;
            }
        }
        return value;
    }

    @Override
    protected void finalize() throws Throwable {
        if (opcPkg != null)
            opcPkg.close();
        super.finalize();
    }

    public static void main(String[] args) {
        try {
            TestLargeFileRead howto = new TestLargeFileRead();
            howto.XExcelFileReader("D:\\TEMP_CATALOG\\H1.xlsx");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

But it reads only the first sheet's contents and ignores all subsequent sheets. My requirement is to read each sheet's name and, based on that name, read that sheet's contents. Can anyone help me customize the code above to fetch the sheet names and their contents?

The key class you need to work with, and tweak your use of, is XSSFReader. If you take a look at the Javadocs for it, you'll see it provides an Iterator of InputStreams, one for each sheet, and a way to get at the root Workbook stream.

If you want to access all the sheets, you need to change these lines:

    InputStream inputStream = xssfReader.getSheetsData().next();
    xmlReader = factory.createXMLStreamReader(inputStream);

Into something more like:

Iterator<InputStream> sheetsData = xssfReader.getSheetsData();
while (sheetsData.hasNext()) {
    InputStream inputStream = sheetsData.next();
    xmlReader = factory.createXMLStreamReader(inputStream);

    ....
}
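
As a rough sketch of what could go where the `....` is: each sheet stream needs its own XMLStreamReader, and the row parsing has to start afresh for every sheet. The example below uses only the JDK's StAX API, with in-memory XML standing in for the streams that xssfReader.getSheetsData() would return. The names PerSheetLoopSketch, readSheet, readAllSheets and xml are made up for illustration, and the shared-string lookup and blank-cell padding from the question's code are left out to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PerSheetLoopSketch {

    // Parses one sheet stream: every <row> becomes a list of the raw <v> texts.
    // Assumption: values are taken as-is, without resolving shared strings.
    static List<List<String>> readSheet(InputStream sheetXml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(sheetXml);
        List<List<String>> rows = new ArrayList<>();
        List<String> current = null;
        try {
            while (reader.hasNext()) {
                reader.next();
                if (reader.isStartElement()) {
                    String name = reader.getLocalName();
                    if (name.equals("row")) {
                        current = new ArrayList<>();
                        rows.add(current);
                    } else if (name.equals("v") && current != null) {
                        current.add(reader.getElementText());
                    }
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    // Walks every sheet stream in turn, creating a fresh reader for each one
    // and closing the stream once that sheet has been read.
    static List<List<List<String>>> readAllSheets(Iterator<InputStream> sheetsData) throws Exception {
        List<List<List<String>>> allSheets = new ArrayList<>();
        while (sheetsData.hasNext()) {
            try (InputStream inputStream = sheetsData.next()) {
                allSheets.add(readSheet(inputStream));
            }
        }
        return allSheets;
    }

    // Hypothetical helper: wraps an XML string as a stream, in place of a real sheet part.
    static InputStream xml(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
        InputStream first = xml("<worksheet xmlns=\"" + ns + "\"><sheetData>"
                + "<row r=\"1\"><c r=\"A1\"><v>1</v></c><c r=\"B1\"><v>2</v></c></row>"
                + "</sheetData></worksheet>");
        InputStream second = xml("<worksheet xmlns=\"" + ns + "\"><sheetData>"
                + "<row r=\"1\"><c r=\"A1\"><v>3</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\"><v>4</v></c></row>"
                + "</sheetData></worksheet>");
        System.out.println(readAllSheets(Arrays.asList(first, second).iterator()));
    }
}
```

With Apache POI on the classpath, the Iterator returned by xssfReader.getSheetsData() could be passed to readAllSheets in place of the in-memory streams.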

If you also want the sheet name, you'll want to do something like what is shown in the Apache POI XLSX event-based text extractor:

XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
while (iter.hasNext()) {
    InputStream inputStream = iter.next();
    String sheetName = iter.getSheetName();

    if (sheetName.equalsIgnoreCase("TheSheetIWant")) {
       xmlReader = factory.createXMLStreamReader(inputStream);

       ....
    }
}

If you want to know more about this approach, one of the best examples that is easy to read and follow is XSSFEventBasedExcelExtractor, which comes with Apache POI. Read the code for that and learn!
