OpenXML, SAX, and Simply Reading an Xlsx file

I have been struggling to find a way to read a large xlsx file with OpenXml. I have tried the Microsoft samples without luck. I simply need to read an Excel file into a DataTable in C#. I am not concerned with value types in the DataTable; everything can be stored as string values.

The samples I have found so far don't retain the structure of the spreadsheet and only return the values of the cells.

Any ideas?

The Open XML SDK can be a little hard to understand. However, I have found this CodePlex project useful: http://simpleooxml.codeplex.com/. It adds a thin layer over the SDK that makes it easier to parse Excel files and work with styles.

Then you can use something like the following with their worksheet reader to walk through the sheet and grab the values you want:

System.IO.MemoryStream ms = Utility.StreamToMemory(xslxTemplate);
using (SpreadsheetDocument document = SpreadsheetDocument.Open(ms, true))
{
    IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
    if (!sheets.Any())
    {
        // The workbook contains no worksheets.
        return null;
    }
    // Resolve the first sheet's relationship id to its worksheet part.
    string relationshipId = sheets.First().Id.Value;
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
    string myval = WorksheetReader.GetCell("A", 0, worksheetPart).CellValue.InnerText;
    // Put in a loop to go through contents of document
}

You can get a DataTable this way:

using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(fileName, false))
{
    DataTable data = ToDataTable(spreadsheet, "Employees");
}

This method reads the data of an Excel sheet into a DataTable:

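public DataTable ToDataTable(SpreadsheetDocument spreadsheet, string worksheetName)
{
    var workbookPart = spreadsheet.WorkbookPart;

    // Look the sheet up by its display name.
    var sheet = workbookPart
        .Workbook
        .Descendants<Sheet>()
        .FirstOrDefault(s => s.Name == worksheetName);

    var worksheetPart = sheet == null
        ? null
        : workbookPart.GetPartById(sheet.Id) as WorksheetPart;

    var dataTable = new DataTable();

    if (worksheetPart != null)
    {
        var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
        // The shared string part is absent when the workbook contains no strings.
        var sharedStrings = workbookPart.SharedStringTablePart?.SharedStringTable;

        foreach (Row row in sheetData.Descendants<Row>())
        {
            var values = row
                .Descendants<Cell>()
                .Select(cell =>
                {
                    // A cell without a stored value has a null CellValue.
                    var value = cell.CellValue?.InnerText ?? string.Empty;
                    if (sharedStrings != null && cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                    {
                        // Shared-string cells store an index into the shared string table.
                        value = sharedStrings
                            .ChildElements[int.Parse(value)]
                            .InnerText;
                    }
                    return (object)value;
                })
                .ToArray();

            // Rows.Add throws if the array is longer than the column count,
            // so grow the table to fit the widest row seen so far.
            while (dataTable.Columns.Count < values.Length)
            {
                dataTable.Columns.Add();
            }

            dataTable.Rows.Add(values);
        }
    }

    return dataTable;
}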
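
The method above places values in the order the cells appear in the sheet XML. Excel typically omits empty cells from the file, so a row with gaps would shift to the left. One way to retain the spreadsheet's structure, as the question asks, is to derive the column position from each cell's reference (for example `C7`). The following is a minimal sketch, assuming a hypothetical helper named `CellRef.ColumnIndex`; it is not part of the Open XML SDK.

```csharp
using System;

public static class CellRef
{
    // Hypothetical helper (not an SDK API): converts a cell reference such as
    // "AB12" to a zero-based column index (A -> 0, Z -> 25, AA -> 26).
    public static int ColumnIndex(string cellReference)
    {
        if (string.IsNullOrEmpty(cellReference))
            throw new ArgumentException("Cell reference is empty.", nameof(cellReference));

        int index = 0;
        int letters = 0;
        foreach (char c in cellReference)
        {
            char upper = char.ToUpperInvariant(c);
            if (upper < 'A' || upper > 'Z')
                break; // the row number starts here

            // Column letters form a bijective base-26 number (A = 1 ... Z = 26).
            index = index * 26 + (upper - 'A' + 1);
            letters++;
        }

        if (letters == 0)
            throw new ArgumentException("Cell reference has no column letters.", nameof(cellReference));

        return index - 1;
    }
}
```

Inside `ToDataTable`, each value could then be stored at position `CellRef.ColumnIndex(cell.CellReference.Value)` of a row array, instead of relying on the order in which the cells are enumerated.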
