Best/Fastest way to read an Excel Sheet into a DataTable?
I'm hoping someone here can point me in the right direction - I'm trying to create a fairly robust utility program to read the data from an Excel sheet (may be .xls OR .xlsx) into a DataTable as quickly and leanly as possible.
I came up with this routine in VB (although I'd be just as happy with a good C# answer):
Public Shared Function ReadExcelIntoDataTable(ByVal FileName As String, ByVal SheetName As String) As DataTable
Dim RetVal As New DataTable
Dim strConnString As String
strConnString = "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" & FileName & ";"
Dim strSQL As String
strSQL = "SELECT * FROM [" & SheetName & "$]"
Dim y As New Odbc.OdbcDataAdapter(strSQL, strConnString)
y.Fill(RetVal)
Return RetVal
End Function
I'm wondering if this is the best way to do it, or if there are better / more efficient ways (or just more intelligent ways - maybe Linq / native .Net providers) to use instead?
ALSO, just a quick and silly additional question - do I need to include code such as y.Dispose() and y = Nothing, or will that be taken care of, since the variable should die at the end of the routine?
Thanks!!
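On the Dispose question, a minimal sketch in C#. OdbcDataAdapter implements IDisposable, so wrapping it in a using block (Using ... End Using in VB) calls Dispose deterministically when the block exits, even if an exception is thrown; setting a local variable to Nothing afterwards adds nothing, because the local goes out of scope anyway. To keep the sketch runnable without an Excel driver, a hypothetical TrackedResource class stands in for the adapter; it is an illustration only and not part of any library.

```csharp
using System;

// Hypothetical stand-in for a disposable object such as OdbcDataAdapter.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    // Pretends to fill a table and reports how many rows were read.
    public int Fill()
    {
        if (Disposed) throw new ObjectDisposedException(nameof(TrackedResource));
        return 42;
    }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class DisposeDemo
{
    // Returns the resource so the caller can inspect its state afterwards.
    public static TrackedResource UseAndRelease(out int rows)
    {
        var resource = new TrackedResource();
        using (resource)
        {
            rows = resource.Fill();
        } // Dispose() runs here, even if Fill() had thrown
        return resource;
    }
}
```

In the original routine the same pattern would wrap the adapter (Using y As New Odbc.OdbcDataAdapter(strSQL, strConnString) ... End Using), with no explicit y.Dispose() or y = Nothing needed.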
If you want to do the same thing in C#, based on Ciarán's answer:
string sSheetName = null;
string sConnection = null;
DataTable dtTablesList = default(DataTable);
OleDbCommand oleExcelCommand = default(OleDbCommand);
OleDbDataReader oleExcelReader = default(OleDbDataReader);
OleDbConnection oleExcelConnection = default(OleDbConnection);
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\Test.xls;Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";
oleExcelConnection = new OleDbConnection(sConnection);
oleExcelConnection.Open();
dtTablesList = oleExcelConnection.GetSchema("Tables");
if (dtTablesList.Rows.Count > 0)
{
sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
}
dtTablesList.Clear();
dtTablesList.Dispose();
if (!string.IsNullOrEmpty(sSheetName)) {
oleExcelCommand = oleExcelConnection.CreateCommand();
oleExcelCommand.CommandText = "Select * From [" + sSheetName + "]";
oleExcelCommand.CommandType = CommandType.Text;
oleExcelReader = oleExcelCommand.ExecuteReader();
int nOutputRow = 0; // row counter; was not declared in the original snippet
while (oleExcelReader.Read())
{
}
oleExcelReader.Close();
}
oleExcelConnection.Close();
Here is another, very quick way to read Excel data into a DataTable without using OLEDB. Keep in mind that the file extension has to be .CSV for this to work properly.
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
csvData = new DataTable(defaultTableName);
try
{
using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[]
{
tableDelim
});
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Skip rows that have any csv header information or blank rows in them
//(checked before the field loop so that "continue" skips the whole row)
if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
{
continue;
}
//Normalize empty values (switch to null if you prefer DBNull-style handling)
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == string.Empty)
{
fieldData[i] = string.Empty; //fieldData[i] = null
}
}
csvData.Rows.Add(fieldData);
}
}
}
catch (Exception ex)
{
}
return csvData;
}
I have always used OLEDB for this, something like...
Dim sSheetName As String
Dim sConnection As String
Dim dtTablesList As DataTable
Dim oleExcelCommand As OleDbCommand
Dim oleExcelReader As OleDbDataReader
Dim oleExcelConnection As OleDbConnection
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Test.xls;Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""
oleExcelConnection = New OleDbConnection(sConnection)
oleExcelConnection.Open()
dtTablesList = oleExcelConnection.GetSchema("Tables")
If dtTablesList.Rows.Count > 0 Then
sSheetName = dtTablesList.Rows(0)("TABLE_NAME").ToString
End If
dtTablesList.Clear()
dtTablesList.Dispose()
If sSheetName <> "" Then
oleExcelCommand = oleExcelConnection.CreateCommand()
oleExcelCommand.CommandText = "Select * From [" & sSheetName & "]"
oleExcelCommand.CommandType = CommandType.Text
oleExcelReader = oleExcelCommand.ExecuteReader
Dim nOutputRow As Integer = 0 ' row counter; was not declared in the original snippet
While oleExcelReader.Read
End While
oleExcelReader.Close()
End If
oleExcelConnection.Close()
The ACE.OLEDB provider will read both .xls and .xlsx files and I have always found the speed quite good.
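Since the connection string is the part that changes between .xls and .xlsx, it can help to build it in one place. The following is only a sketch: ExcelConnectionStrings.Build is a hypothetical helper name, and the Extended Properties values (Excel 8.0 for .xls, Excel 12.0 Xml for .xlsx) are the ones used in other answers in this thread. HDR=YES tells the driver to treat the first row as column names, and IMEX=1 asks it to read mixed-type columns as text.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Builds an ACE OLEDB connection string for an .xls or .xlsx workbook.
    // hasHeader = true -> HDR=YES (first row holds the column names).
    public static string Build(string path, bool hasHeader)
    {
        string ext = Path.GetExtension(path).ToLowerInvariant();
        string excelVersion;
        switch (ext)
        {
            case ".xls":
                excelVersion = "Excel 8.0";
                break;
            case ".xlsx":
                excelVersion = "Excel 12.0 Xml";
                break;
            default:
                throw new ArgumentException("Unsupported extension: " + ext, nameof(path));
        }

        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + excelVersion +
               ";HDR=" + (hasHeader ? "YES" : "NO") + ";IMEX=1\"";
    }
}
```

The result can be passed straight to new OleDbConnection(...); rejecting unknown extensions up front avoids a less helpful provider error later.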
public DataTable ImportExceltoDatatable(string filepath)
{
// string sqlquery= "Select * From [SheetName$] Where YourCondition";
string sqlquery = "Select * From [SheetName$] Where Id='ID_007'";
DataSet ds = new DataSet();
string constring = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;HDR=YES;\"";
OleDbConnection con = new OleDbConnection(constring + "");
OleDbDataAdapter da = new OleDbDataAdapter(sqlquery, con);
da.Fill(ds);
DataTable dt = ds.Tables[0];
return dt;
}
This seemed to work pretty well for me.
private DataTable ReadExcelFile(string sheetName, string path)
{
using (OleDbConnection conn = new OleDbConnection())
{
DataTable dt = new DataTable();
string Import_FileName = path;
string fileExtension = Path.GetExtension(Import_FileName);
if (fileExtension == ".xls")
conn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 8.0;HDR=YES;'";
if (fileExtension == ".xlsx")
conn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'";
using (OleDbCommand comm = new OleDbCommand())
{
comm.CommandText = "Select * from [" + sheetName + "$]";
comm.Connection = conn;
using (OleDbDataAdapter da = new OleDbDataAdapter())
{
da.SelectCommand = comm;
da.Fill(dt);
return dt;
}
}
}
}
You can use the OpenXml SDK for *.xlsx files. It works very quickly. I made a simple C# IDataReader implementation for this SDK. See here. Now you can easily read an Excel file into a DataTable, and you can import an Excel file into a SQL Server database (use SqlBulkCopy). ExcelDataReader reads very fast. On my machine, 10000 records take less than 3 sec and 60000 take less than 8 sec.
Read to DataTable example:
class Program
{
static void Main(string[] args)
{
var dt = new DataTable();
using (var reader = new ExcelDataReader(@"data.xlsx"))
dt.Load(reader);
Console.WriteLine("done: " + dt.Rows.Count);
Console.ReadKey();
}
}
I found it pretty easy like this:
using System;
using System.Data;
using System.IO;
using Excel;
public DataTable ExcelToDataTableUsingExcelDataReader(string storePath)
{
FileStream stream = File.Open(storePath, FileMode.Open, FileAccess.Read);
string fileExtension = Path.GetExtension(storePath);
IExcelDataReader excelReader = null;
if (fileExtension == ".xls")
{
excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
}
else if (fileExtension == ".xlsx")
{
excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
}
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
var test = result.Tables[0];
return result.Tables[0];
}
Note: you need to install the SharpZipLib package for this:
Install-Package SharpZipLib
Neat and clean ;)
This is the way to read from Excel using OLEDB:
try
{
System.Data.OleDb.OleDbConnection MyConnection;
System.Data.DataSet DtSet;
System.Data.OleDb.OleDbDataAdapter MyCommand;
string strHeader7 = "";
strHeader7 = (hdr7) ? "Yes" : "No";
MyConnection = new System.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fn + ";Extended Properties=\"Excel 12.0;HDR=" + strHeader7 + ";IMEX=1\"");
MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [" + wks + "$]", MyConnection);
MyCommand.TableMappings.Add("Table", "TestTable");
DtSet = new System.Data.DataSet();
MyCommand.Fill(DtSet);
dgv7.DataSource = DtSet.Tables[0];
MyConnection.Close();
}
catch (Exception ex)
{
MessageBox.Show(ex.ToString());
}
''' <summary>
''' ReadToDataTable reads the given Excel file to a datatable.
''' </summary>
''' <param name="table">The table to be populated.</param>
''' <param name="incomingFileName">The file to attempt to read from.</param>
''' <returns>TRUE if success, FALSE otherwise.</returns>
''' <remarks></remarks>
Public Function ReadToDataTable(ByRef table As DataTable,
incomingFileName As String) As Boolean
Dim returnValue As Boolean = False
Try
Dim sheetName As String = ""
Dim connectionString As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & incomingFileName & ";Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""
Dim tablesInFile As DataTable
Dim oleExcelCommand As OleDbCommand
Dim oleExcelReader As OleDbDataReader
Dim oleExcelConnection As OleDbConnection
oleExcelConnection = New OleDbConnection(connectionString)
oleExcelConnection.Open()
tablesInFile = oleExcelConnection.GetSchema("Tables")
If tablesInFile.Rows.Count > 0 Then
sheetName = tablesInFile.Rows(0)("TABLE_NAME").ToString
End If
If sheetName <> "" Then
oleExcelCommand = oleExcelConnection.CreateCommand()
oleExcelCommand.CommandText = "Select * From [" & sheetName & "]"
oleExcelCommand.CommandType = CommandType.Text
oleExcelReader = oleExcelCommand.ExecuteReader
'Determine what row of the Excel file we are on
Dim currentRowIndex As Integer = 0
While oleExcelReader.Read
'If we are on the First Row, then add the item as Columns in the DataTable
If currentRowIndex = 0 Then
For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
Dim currentColumnName As String = oleExcelReader.Item(currentFieldIndex).ToString
table.Columns.Add(currentColumnName, GetType(String))
table.AcceptChanges()
Next
End If
'If we are on a Row with Data, add the data to the SheetTable
If currentRowIndex > 0 Then
Dim newRow As DataRow = table.NewRow
For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
Dim currentColumnName As String = table.Columns(currentFieldIndex).ColumnName
newRow(currentColumnName) = oleExcelReader.Item(currentFieldIndex)
If IsDBNull(newRow(currentFieldIndex)) Then
newRow(currentFieldIndex) = ""
End If
Next
table.Rows.Add(newRow)
table.AcceptChanges()
End If
'Increment the CurrentRowIndex
currentRowIndex += 1
End While
oleExcelReader.Close()
End If
oleExcelConnection.Close()
returnValue = True
Catch ex As Exception
'LastError = ex.ToString
Return False
End Try
Return returnValue
End Function
I have tested the code below myself; it is simple, understandable, usable and fast. It first collects all the sheet names, then loads every sheet of the Excel file as a table in a DataSet.
public static DataSet ToDataSet(string exceladdress, int startRecord = 0, int maxRecord = -1, string condition = "")
{
DataSet result = new DataSet();
using (OleDbConnection connection = new OleDbConnection(
(exceladdress.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + exceladdress + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + exceladdress + "';Extended Properties=Excel 8.0;"))
try
{
connection.Open();
DataTable schema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in schema.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
if (s.StartsWith("'")) s = s.Substring(1, s.Length - 2);
System.Data.OleDb.OleDbDataAdapter command =
new System.Data.OleDb.OleDbDataAdapter(string.Join("", "SELECT * FROM [", s, "] ", condition), connection);
DataTable dt = new DataTable();
if (maxRecord > -1 && startRecord > -1) command.Fill(startRecord, maxRecord, dt);
else command.Fill(dt);
result.Tables.Add(dt);
}
return result;
}
catch (Exception ex) { return null; }
finally { connection.Close(); }
}
Enjoy...
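The sheet-name handling in the code above (keep only schema entries that contain "$", then strip the surrounding single quotes) can be pulled out into a small helper so it can be checked without an Excel file. This is a sketch only: SheetNames.Clean and SheetNames.FromSchemaNames are hypothetical names, and the logic simply mirrors the filter in the answer.

```csharp
using System;
using System.Collections.Generic;

public static class SheetNames
{
    // Returns a name usable inside [ ] in a query, or null for entries without "$".
    public static string Clean(string tableName)
    {
        if (string.IsNullOrEmpty(tableName) || !tableName.Contains("$"))
            return null;

        // Quoted entries such as 'My Sheet$' lose their surrounding quotes,
        // as the Substring call in the answer above does.
        if (tableName.Length >= 2 &&
            tableName.StartsWith("'", StringComparison.Ordinal) &&
            tableName.EndsWith("'", StringComparison.Ordinal))
        {
            return tableName.Substring(1, tableName.Length - 2);
        }

        return tableName;
    }

    // Filters a list of TABLE_NAME values down to queryable sheet names.
    public static List<string> FromSchemaNames(IEnumerable<string> tableNames)
    {
        var result = new List<string>();
        foreach (string name in tableNames)
        {
            string cleaned = Clean(name);
            if (cleaned != null) result.Add(cleaned);
        }
        return result;
    }
}
```

Each returned name can then be used as "SELECT * FROM [" + name + "]", exactly as the loop above does.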
Use the snippet below; it will be helpful.
string POCpath = @"G:\Althaf\abc.xlsx";
string POCConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + POCpath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=1\";";
OleDbConnection POCcon = new OleDbConnection(POCConnection);
OleDbCommand POCcommand = new OleDbCommand();
DataTable dt = new DataTable();
OleDbDataAdapter POCCommand = new OleDbDataAdapter("select * from [Sheet1$] ", POCcon);
POCCommand.Fill(dt);
Console.WriteLine(dt.Rows.Count);
I've used this method, and for me it is very efficient and fast.
// Step 1. Download NuGet source of Generic Parsing by Andrew Rissing
// Step 2. Reference this to your project
// Step 3. Reference Microsoft.Office.Interop.Excel to your project
// Step 4. Follow the logic below
public static DataTable ExcelSheetToDataTable(string filePath) {
// Save a copy of the Excel file as CSV
var xlApp = new XL.Application();
var xlWbk = xlApp.Workbooks.Open(filePath);
var tempPath =
Path.Combine(Environment
.GetFolderPath(Environment.SpecialFolder.UserProfile)
, "AppData"
, "Local"
, "Temp"
, Path.GetFileNameWithoutExtension(filePath) + ".csv");
xlApp.DisplayAlerts = false;
xlWbk.SaveAs(tempPath, XL.XlFileFormat.xlCSV);
xlWbk.Close(SaveChanges: false);
xlApp.Quit();
// The actual parsing
using (var parser = new GenericParserAdapter(tempPath)) {
parser.FirstRowHasHeader = true;
return parser.GetDataTable();
}
}
Here is another way of doing it:
public DataSet CreateTable(string source)
{
using (var connection = new OleDbConnection(GetConnectionString(source, true)))
{
var dataSet = new DataSet();
connection.Open();
var schemaTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (schemaTable == null)
return dataSet;
var sheetName = "";
foreach (DataRow row in schemaTable.Rows)
{
sheetName = row["TABLE_NAME"].ToString();
break;
}
var command = string.Format("SELECT * FROM [{0}]", sheetName); // TABLE_NAME from the schema already ends with "$"
var adapter = new OleDbDataAdapter(command, connection);
adapter.TableMappings.Add("TABLE", "TestTable");
adapter.Fill(dataSet);
connection.Close();
return dataSet;
}
}
//
private string GetConnectionString(string source, bool hasHeader)
{
return string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
"Extended Properties=\"Excel 12.0;HDR={1};IMEX=1\"", source, (hasHeader ? "YES" : "NO"));
}