
OpenXML, SAX, and Simply Reading an Xlsx file

I have been struggling to find a way to read a large xlsx file with OpenXML. I have tried the Microsoft samples without luck. I simply need to read an Excel file into a DataTable in C#. I am not concerned with value types in the DataTable; everything can be stored as string values.

The samples I have found so far don't retain the structure of the spreadsheet and only return the values of the cells.

Any ideas?

The Open XML SDK can be a little hard to understand. However, I have found the CodePlex project at http://simpleooxml.codeplex.com/ useful. It adds a thin layer over the SDK that makes it easier to parse Excel files and work with styles.

Then you can use something like the following with its worksheet reader to iterate through the sheet and grab the values you want:

System.IO.MemoryStream ms = Utility.StreamToMemory(xslxTemplate);
// Open read-only (second argument false), since the document is only being read.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(ms, false))
{
    IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
    if (!sheets.Any())
    {
        // The workbook contains no worksheets.
        return null;
    }
    // Resolve the first sheet's relationship ID to its worksheet part.
    string relationshipId = sheets.First().Id.Value;
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
    // Row indexes in a worksheet start at 1, so this reads cell A1.
    string myval = WorksheetReader.GetCell("A", 1, worksheetPart).CellValue.InnerText;
    // Put in a loop to go through contents of document
}

You can get a DataTable this way:

using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(fileName, false))
{
    DataTable data = ToDataTable(spreadsheet, "Employees");
}

This method reads the data of an Excel sheet into a DataTable:

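using System.Collections.Generic;
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public DataTable ToDataTable(SpreadsheetDocument spreadsheet, string worksheetName)
{
    var workbookPart = spreadsheet.WorkbookPart;

    var sheet = workbookPart
        .Workbook
        .Descendants<Sheet>()
        .FirstOrDefault(s => s.Name == worksheetName);

    var worksheetPart = sheet == null
        ? null
        : workbookPart.GetPartById(sheet.Id) as WorksheetPart;

    var dataTable = new DataTable();

    if (worksheetPart != null)
    {
        var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

        foreach (Row row in sheetData.Elements<Row>())
        {
            var values = new List<object>();

            foreach (Cell cell in row.Elements<Cell>())
            {
                // CellValue is null for empty and inline-string cells.
                var value = cell.CellValue != null ? cell.CellValue.InnerText : cell.InnerText;
                if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                {
                    value = workbookPart
                        .SharedStringTablePart
                        .SharedStringTable
                        .ChildElements[int.Parse(value)]
                        .InnerText;
                }

                // Empty cells are omitted from the file, so position each value by the
                // column letters of its reference ("C5" -> index 2). This assumes every
                // cell carries a reference, as files saved by Excel do.
                int column = cell.CellReference.Value
                    .TakeWhile(ch => char.IsLetter(ch))
                    .Aggregate(0, (n, ch) => n * 26 + (char.ToUpperInvariant(ch) - 'A' + 1)) - 1;

                while (values.Count <= column)
                    values.Add(null);
                values[column] = value;
            }

            // Rows.Add throws if the row has more values than the table has columns,
            // so add string columns (Column1, Column2, ...) on demand.
            while (dataTable.Columns.Count < values.Count)
                dataTable.Columns.Add();

            // Rows that contain no cells are skipped.
            if (values.Count > 0)
                dataTable.Rows.Add(values.ToArray());
        }
    }

    return dataTable;
}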
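
Since the question is about large files and SAX-style reading, here is a minimal sketch that streams the worksheet with the SDK's OpenXmlReader instead of loading it as a DOM. The names SaxSheetReader, ColumnIndex and Read are made up for this example. It assumes every cell carries a reference such as "C5" (as files saved by Excel do), stores everything as strings, and does not interpret dates or number formats. Only the worksheet is streamed; the shared string table is still loaded in full.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper class, not part of the Open XML SDK.
public static class SaxSheetReader
{
    // "C5" -> 2: zero-based column index taken from the letters of a cell reference.
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char c in cellReference.TakeWhile(ch => char.IsLetter(ch)))
            index = index * 26 + (char.ToUpperInvariant(c) - 'A' + 1);
        return index - 1;
    }

    public static DataTable Read(string fileName, string worksheetName)
    {
        var table = new DataTable();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .FirstOrDefault(s => s.Name == worksheetName);
            if (sheet == null) return table;
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

            // Copy the shared strings into an array once, so each lookup is by index.
            string[] shared = workbookPart.SharedStringTablePart == null
                ? new string[0]
                : workbookPart.SharedStringTablePart.SharedStringTable
                    .Elements<SharedStringItem>().Select(i => i.InnerText).ToArray();

            using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
            {
                while (reader.Read())
                {
                    // Only react to the start of a <row> element.
                    if (reader.ElementType != typeof(Row) || !reader.IsStartElement) continue;

                    var values = new List<object>();
                    if (reader.ReadFirstChild())   // false when the row has no cells
                    {
                        do
                        {
                            if (reader.ElementType != typeof(Cell)) continue;

                            // Materialise just this one cell; the rest of the sheet stays unloaded.
                            var cell = (Cell)reader.LoadCurrentElement();
                            string text = cell.CellValue != null ? cell.CellValue.Text : cell.InnerText;
                            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                                text = shared[int.Parse(text)];

                            // Pad the gaps left by empty cells so columns stay aligned.
                            int column = ColumnIndex(cell.CellReference.Value);
                            while (values.Count <= column) values.Add(null);
                            values[column] = text;
                        } while (reader.ReadNextSibling());
                    }

                    // Grow the table to fit the widest row seen so far.
                    while (table.Columns.Count < values.Count) table.Columns.Add();
                    if (values.Count > 0) table.Rows.Add(values.ToArray());
                }
            }
        }
        return table;
    }
}
```

It would be called as DataTable data = SaxSheetReader.Read(fileName, "Employees");. Rows that are missing from the file entirely are not padded, so the DataTable row number can differ from the Excel row number when the sheet has blank rows.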
