Best/Fastest way to read an Excel Sheet into a DataTable?

I'm hoping someone here can point me in the right direction - I'm trying to create a fairly robust utility program to read the data from an Excel sheet (may be .xls or .xlsx) into a DataTable as quickly and leanly as possible.

I came up with this routine in VB (although I'd be just as happy with a good C# answer):

Public Shared Function ReadExcelIntoDataTable(ByVal FileName As String, ByVal SheetName As String) As DataTable
    Dim RetVal As New DataTable

    Dim strConnString As String
    strConnString = "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" & FileName & ";"

    Dim strSQL As String 
    strSQL = "SELECT * FROM [" & SheetName & "$]"

    Dim y As New Odbc.OdbcDataAdapter(strSQL, strConnString)

    y.Fill(RetVal)

    Return RetVal

End Function

I'm wondering if this is the best way to do it or if there are better / more efficient ways (or just more intelligent ways - maybe LINQ / native .NET providers) to use instead?

Also, just a quick and silly additional question - do I need to include code such as y.Dispose() and y = Nothing, or will that be taken care of since the variable should die at the end of the routine?

Thanks!
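On the Dispose question above: a local variable going out of scope does not call Dispose() by itself, and setting it to Nothing/null does not either. A minimal C# sketch of the difference, using a hypothetical TrackedResource class as a stand-in for the OdbcDataAdapter so that it runs without any driver installed:

```csharp
using System;

// Hypothetical stand-in for a disposable ADO.NET object such as OdbcDataAdapter.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    // The using block calls Dispose() when it ends, even if the body throws.
    public static TrackedResource UseAndRelease()
    {
        var resource = new TrackedResource();
        using (resource)
        {
            // Work with the resource here (for the real adapter: y.Fill(RetVal)).
        }
        return resource; // already disposed at this point
    }

    // Without a using block nothing calls Dispose() when the method returns.
    public static TrackedResource UseWithoutRelease()
    {
        var resource = new TrackedResource();
        // Work with the resource here.
        return resource; // still not disposed
    }
}
```

In the VB routine the equivalent would be a Using ... End Using block around the adapter; y = Nothing is not needed in either case.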

If you want to do the same thing in C#, based on Ciarán's answer:

// Requires: using System.Data; using System.Data.OleDb;
string sSheetName = null;
string sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\Test.xls;Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";

using (OleDbConnection oleExcelConnection = new OleDbConnection(sConnection))
{
    oleExcelConnection.Open();

    // Take the first table name reported by the provider.
    using (DataTable dtTablesList = oleExcelConnection.GetSchema("Tables"))
    {
        if (dtTablesList.Rows.Count > 0)
        {
            sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
        }
    }

    if (!string.IsNullOrEmpty(sSheetName))
    {
        using (OleDbCommand oleExcelCommand = oleExcelConnection.CreateCommand())
        {
            oleExcelCommand.CommandText = "Select * From [" + sSheetName + "]";
            oleExcelCommand.CommandType = CommandType.Text;

            using (OleDbDataReader oleExcelReader = oleExcelCommand.ExecuteReader())
            {
                int nOutputRow = 0;
                while (oleExcelReader.Read())
                {
                    // Process the current row here.
                    nOutputRow++;
                }
            }
        }
    }
}

Here is another way to read Excel data into a DataTable without using OLEDB, and it is very quick. Keep in mind that the file extension would have to be .CSV for this to work properly.

// Requires: using System.Data; using Microsoft.VisualBasic.FileIO;
// defaultTableName and tableDelim are fields defined elsewhere in the class.
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    DataTable csvData = new DataTable(defaultTableName);
    using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
    {
        csvReader.SetDelimiters(new string[]
        {
            tableDelim
        });
        csvReader.HasFieldsEnclosedInQuotes = true;

        // The first line supplies the column names.
        string[] colFields = csvReader.ReadFields();
        foreach (string column in colFields)
        {
            DataColumn dataColumn = new DataColumn(column);
            dataColumn.AllowDBNull = true;
            csvData.Columns.Add(dataColumn);
        }

        while (!csvReader.EndOfData)
        {
            string[] fieldData = csvReader.ReadFields();
            // Skip rows that have any csv header information or blank rows in them.
            // The check sits outside any inner loop so that "continue" skips the whole row.
            if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
            {
                continue;
            }
            csvData.Rows.Add(fieldData);
        }
    }
    // Exceptions propagate to the caller instead of being silently swallowed.
    return csvData;
}
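The routine above depends on fields that are not shown. As a rough, self-contained sketch of the same idea (the class name CsvToDataTable, the method name Read and the delimiter parameter are made up for illustration), the parser can also be fed from any TextReader, which makes it easy to try without a file on disk:

```csharp
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvToDataTable
{
    // Hypothetical helper: reads delimited text from any TextReader into a DataTable.
    // Assumes the first line holds unique column names.
    public static DataTable Read(TextReader source, string delimiter = ",")
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(delimiter);
            parser.HasFieldsEnclosedInQuotes = true;

            // An empty input yields an empty table.
            string[] header = parser.ReadFields();
            if (header == null) return table;
            foreach (string name in header)
            {
                table.Columns.Add(name, typeof(string));
            }

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                // Skip rows whose first field is empty.
                if (fields == null || fields.Length == 0 || string.IsNullOrEmpty(fields[0]))
                {
                    continue;
                }
                table.Rows.Add(fields);
            }
        }
        return table;
    }
}
```

For a file on disk this would be called as CsvToDataTable.Read(new StreamReader(path)). A row with more fields than there are columns makes Rows.Add throw, so ragged files would need extra handling.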

I have always used OLEDB for this, something like...

    Dim sSheetName As String
    Dim sConnection As String
    Dim dtTablesList As DataTable
    Dim oleExcelCommand As OleDbCommand
    Dim oleExcelReader As OleDbDataReader
    Dim oleExcelConnection As OleDbConnection

    sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Test.xls;Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""

    oleExcelConnection = New OleDbConnection(sConnection)
    oleExcelConnection.Open()

    dtTablesList = oleExcelConnection.GetSchema("Tables")

    If dtTablesList.Rows.Count > 0 Then
        sSheetName = dtTablesList.Rows(0)("TABLE_NAME").ToString
    End If

    dtTablesList.Clear()
    dtTablesList.Dispose()

    If sSheetName <> "" Then

        oleExcelCommand = oleExcelConnection.CreateCommand()
        oleExcelCommand.CommandText = "Select * From [" & sSheetName & "]"
        oleExcelCommand.CommandType = CommandType.Text

        oleExcelReader = oleExcelCommand.ExecuteReader

        Dim nOutputRow As Integer = 0

        While oleExcelReader.Read
            ' Process the current row here.
            nOutputRow += 1
        End While

        oleExcelReader.Close()

    End If

    oleExcelConnection.Close()

The ACE.OLEDB provider will read both .xls and .xlsx files and I have always found the speed quite good.
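Since one provider covers both formats, the connection string can be built from the file extension. A small sketch, assuming the Extended Properties values that appear in connection strings elsewhere in this thread ("Excel 8.0" and "Excel 12.0 Xml"); the class and method names are made up for illustration:

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Hypothetical helper: builds an ACE OLEDB connection string for .xls or .xlsx.
    public static string ForFile(string path, bool hasHeader)
    {
        string extension = Path.GetExtension(path);
        string format;
        if (string.Equals(extension, ".xls", StringComparison.OrdinalIgnoreCase))
        {
            format = "Excel 8.0";
        }
        else if (string.Equals(extension, ".xlsx", StringComparison.OrdinalIgnoreCase))
        {
            format = "Excel 12.0 Xml";
        }
        else
        {
            throw new NotSupportedException("Unsupported Excel extension: " + extension);
        }

        // IMEX=1 asks the provider to read mixed-type columns as text.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + format +
               ";HDR=" + (hasHeader ? "YES" : "NO") + ";IMEX=1\"";
    }
}
```

The result can be passed straight to new OleDbConnection(...). Comparing the extension case-insensitively means that files named .XLS or .XLSX are also recognised.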

public DataTable ImportExceltoDatatable(string filepath)
{
    // string sqlquery = "Select * From [SheetName$] Where YourCondition";
    string sqlquery = "Select * From [SheetName$] Where Id='ID_007'";
    string constring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;HDR=YES;\"";

    using (OleDbConnection con = new OleDbConnection(constring))
    using (OleDbDataAdapter da = new OleDbDataAdapter(sqlquery, con))
    {
        // Fill opens and closes the connection on its own.
        DataSet ds = new DataSet();
        da.Fill(ds);
        return ds.Tables[0];
    }
}

This seemed to work pretty well for me.

private DataTable ReadExcelFile(string sheetName, string path)
{

    using (OleDbConnection conn = new OleDbConnection())
    {
        DataTable dt = new DataTable();
        string Import_FileName = path;
        string fileExtension = Path.GetExtension(Import_FileName);
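        // Compare case-insensitively so that .XLS / .XLSX are also recognised (needs using System;).
        if (string.Equals(fileExtension, ".xls", StringComparison.OrdinalIgnoreCase))
            conn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 8.0;HDR=YES;'";
        else if (string.Equals(fileExtension, ".xlsx", StringComparison.OrdinalIgnoreCase))
            conn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'";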
        using (OleDbCommand comm = new OleDbCommand())
        {
            comm.CommandText = "Select * from [" + sheetName + "$]";
            comm.Connection = conn;
            using (OleDbDataAdapter da = new OleDbDataAdapter())
            {
                da.SelectCommand = comm;
                da.Fill(dt);
                return dt;
            }
        }
    }
}

You can use the OpenXml SDK for *.xlsx files. It works very quickly. I made a simple C# IDataReader implementation for this SDK. See here. Now you can easily read an Excel file into a DataTable, and you can import an Excel file into a SQL Server database (use SqlBulkCopy). ExcelDataReader reads very fast. On my machine, 10,000 records take less than 3 seconds and 60,000 take less than 8 seconds.

Read to DataTable example:

class Program
{
    static void Main(string[] args)
    {
        var dt = new DataTable();
        using (var reader = new ExcelDataReader(@"data.xlsx"))
            dt.Load(reader);

        Console.WriteLine("done: " + dt.Rows.Count);
        Console.ReadKey();
   }
}

I found it pretty easy like this:

    using System;
    using System.Data;
    using System.IO;
    using Excel;

    public DataTable ExcelToDataTableUsingExcelDataReader(string storePath)
    {
        using (FileStream stream = File.Open(storePath, FileMode.Open, FileAccess.Read))
        {
            string fileExtension = Path.GetExtension(storePath);
            IExcelDataReader excelReader;
            if (fileExtension == ".xls")
            {
                excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
            }
            else if (fileExtension == ".xlsx")
            {
                excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
            }
            else
            {
                // Fail with a clear message instead of a NullReferenceException below.
                throw new NotSupportedException("Unsupported file extension: " + fileExtension);
            }

            using (excelReader)
            {
                // Treat the first row as column headers.
                excelReader.IsFirstRowAsColumnNames = true;
                DataSet result = excelReader.AsDataSet();
                return result.Tables[0];
            }
        }
    }

Note: you need to install the SharpZipLib package for this:

Install-Package SharpZipLib

Neat and clean ;)

This is the way to read from Excel with OLEDB:

try
{
    System.Data.OleDb.OleDbConnection MyConnection;
    System.Data.DataSet DtSet;
    System.Data.OleDb.OleDbDataAdapter MyCommand;
    string strHeader7 = "";
    strHeader7 = (hdr7) ? "Yes" : "No";
    MyConnection = new System.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fn + ";Extended Properties=\"Excel 12.0;HDR=" + strHeader7 + ";IMEX=1\"");
    MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [" + wks + "$]", MyConnection);
    MyCommand.TableMappings.Add("Table", "TestTable");
    DtSet = new System.Data.DataSet();
    MyCommand.Fill(DtSet);
    dgv7.DataSource = DtSet.Tables[0];
    MyConnection.Close();
}
catch (Exception ex)
{
    MessageBox.Show(ex.ToString());
}

''' <summary>
''' ReadToDataTable reads the given Excel file into a DataTable.
''' </summary>
''' <param name="table">The table to be populated.</param>
''' <param name="incomingFileName">The file to attempt to read from.</param>
''' <returns>TRUE if success, FALSE otherwise.</returns>
''' <remarks></remarks>
Public Function ReadToDataTable(ByRef table As DataTable,
                                incomingFileName As String) As Boolean
    Dim returnValue As Boolean = False
    Try

        Dim sheetName As String = ""
        Dim connectionString As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & incomingFileName & ";Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""
        Dim tablesInFile As DataTable
        Dim oleExcelCommand As OleDbCommand
        Dim oleExcelReader As OleDbDataReader
        Dim oleExcelConnection As OleDbConnection

        oleExcelConnection = New OleDbConnection(connectionString)
        oleExcelConnection.Open()

        tablesInFile = oleExcelConnection.GetSchema("Tables")

        If tablesInFile.Rows.Count > 0 Then
            sheetName = tablesInFile.Rows(0)("TABLE_NAME").ToString
        End If

        If sheetName <> "" Then

            oleExcelCommand = oleExcelConnection.CreateCommand()
            oleExcelCommand.CommandText = "Select * From [" & sheetName & "]"
            oleExcelCommand.CommandType = CommandType.Text

            oleExcelReader = oleExcelCommand.ExecuteReader

            'Determine what row of the Excel file we are on
            Dim currentRowIndex As Integer = 0

            While oleExcelReader.Read
                'If we are on the First Row, then add the item as Columns in the DataTable
                If currentRowIndex = 0 Then
                    For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
                        Dim currentColumnName As String = oleExcelReader.Item(currentFieldIndex).ToString
                        table.Columns.Add(currentColumnName, GetType(String))
                        table.AcceptChanges()
                    Next
                End If
                'If we are on a Row with Data, add the data to the SheetTable
                If currentRowIndex > 0 Then
                    Dim newRow As DataRow = table.NewRow
                    For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
                        Dim currentColumnName As String = table.Columns(currentFieldIndex).ColumnName
                        newRow(currentColumnName) = oleExcelReader.Item(currentFieldIndex)
                        If IsDBNull(newRow(currentFieldIndex)) Then
                            newRow(currentFieldIndex) = ""
                        End If
                    Next
                    table.Rows.Add(newRow)
                    table.AcceptChanges()
                End If

                'Increment the CurrentRowIndex
                currentRowIndex += 1
            End While

            oleExcelReader.Close()

        End If

        oleExcelConnection.Close()
        returnValue = True
    Catch ex As Exception
        'LastError = ex.ToString
        Return False
    End Try


    Return returnValue
End Function

The code below was tested by me and is very simple, understandable, usable and fast. It first gets all sheet names, then puts all tables of that Excel file into a DataSet.

    public static DataSet ToDataSet(string exceladdress, int startRecord = 0, int maxRecord = -1, string condition = "")
    {
        DataSet result = new DataSet();
        using (OleDbConnection connection = new OleDbConnection(
                (exceladdress.TrimEnd().ToLower().EndsWith("x"))
                ? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + exceladdress + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
                : "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + exceladdress + "';Extended Properties=Excel 8.0;"))
            try
            {
                connection.Open();
                DataTable schema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                foreach (DataRow drSheet in schema.Rows)
                    if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                    {
                        string s = drSheet["TABLE_NAME"].ToString();
                        if (s.StartsWith("'")) s = s.Substring(1, s.Length - 2);
                        System.Data.OleDb.OleDbDataAdapter command =
                            new System.Data.OleDb.OleDbDataAdapter(string.Join("", "SELECT * FROM [", s, "] ", condition), connection);
                        DataTable dt = new DataTable();
                        if (maxRecord > -1 && startRecord > -1) command.Fill(startRecord, maxRecord, dt);
                        else command.Fill(dt);
                        result.Tables.Add(dt);
                    }
                return result;
            }
            catch (Exception ex) { return null; }
            finally { connection.Close(); }
    }

Enjoy...
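The sheet-name handling in the code above (keep entries containing "$", strip the surrounding single quotes) can be pulled out into a small pure function. This is only a sketch: the names are made up, and it assumes that worksheet entries end in "$" once the quotes are removed, while entries with text after the "$" are named ranges, filters or print areas:

```csharp
using System;
using System.Collections.Generic;

public static class SheetNames
{
    // Hypothetical helper: keeps only worksheet entries from a list of TABLE_NAME values.
    public static List<string> FromTableNames(IEnumerable<string> tableNames)
    {
        var sheets = new List<string>();
        foreach (string raw in tableNames)
        {
            string name = raw;
            // Names with spaces or punctuation come back wrapped in single quotes.
            if (name.Length >= 2
                && name.StartsWith("'", StringComparison.Ordinal)
                && name.EndsWith("'", StringComparison.Ordinal))
            {
                name = name.Substring(1, name.Length - 2);
            }
            // Assumption: a worksheet ends in "$"; anything after the "$" is not a sheet.
            if (name.EndsWith("$", StringComparison.Ordinal))
            {
                sheets.Add(name);
            }
        }
        return sheets;
    }
}
```

Each returned name can be used directly inside the brackets of a query such as SELECT * FROM [name], without appending another "$".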

Use the snippet below; it will be helpful.

string POCpath = @"G:\Althaf\abc.xlsx";

string POCConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + POCpath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=1\";";

OleDbConnection POCcon = new OleDbConnection(POCConnection);
DataTable dt = new DataTable();
// The adapter opens and closes the connection itself during Fill.
OleDbDataAdapter POCadapter = new OleDbDataAdapter("select * from [Sheet1$] ", POCcon);
POCadapter.Fill(dt);
Console.WriteLine(dt.Rows.Count);

I've used this method and, for me, it is efficient and fast.

// Step 1. Download NuGet source of Generic Parsing by Andrew Rissing
// Step 2. Reference this to your project
// Step 3. Reference Microsoft.Office.Interop.Excel to your project
//         (aliased as XL below: using XL = Microsoft.Office.Interop.Excel;)
// Step 4. Follow the logic below

public static DataTable ExcelSheetToDataTable(string filePath) {

    // Save a copy of the Excel file as CSV
    var xlApp = new XL.Application();
    var xlWbk = xlApp.Workbooks.Open(filePath);
    var tempPath =
        Path.Combine(Environment
            .GetFolderPath(Environment.SpecialFolder.UserProfile)
            , "AppData"
            , "Local"
            , "Temp"
            , Path.GetFileNameWithoutExtension(filePath) + ".csv");

    xlApp.DisplayAlerts = false;
    xlWbk.SaveAs(tempPath, XL.XlFileFormat.xlCSV);
    xlWbk.Close(SaveChanges: false);
    xlApp.Quit();

    // The actual parsing
    using (var parser = new GenericParserAdapter(tempPath)) {
        parser.FirstRowHasHeader = true;
        return parser.GetDataTable();
    }

}

Generic Parsing by Andrew Rissing

Here is another way of doing it:

public DataSet CreateTable(string source)
{
    using (var connection = new OleDbConnection(GetConnectionString(source, true)))
    {
        var dataSet = new DataSet();
        connection.Open();
        var schemaTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        if (schemaTable == null)
            return dataSet;

        var sheetName = "";
        foreach (DataRow row in schemaTable.Rows)
        {
            sheetName = row["TABLE_NAME"].ToString();
            break;
        }

        // TABLE_NAME already ends in "$" (e.g. "Sheet1$"), so no extra "$" is appended.
        var command = string.Format("SELECT * FROM [{0}]", sheetName);
        var adapter = new OleDbDataAdapter(command, connection);
        adapter.TableMappings.Add("TABLE", "TestTable");
        adapter.Fill(dataSet);
        connection.Close();

        return dataSet;
    }
}

//

private string GetConnectionString(string source, bool hasHeader)
{
    return string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
        "Extended Properties=\"Excel 12.0;HDR={1};IMEX=1\"", source, (hasHeader ? "YES" : "NO"));
}

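Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost this content, please cite this site's URL or the original address. For any questions contact: yoyou2525@163.com.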

 