Quirky SELECT from Excel file via OleDbDataAdapter method (C#)

I have an Excel file in this form:

Column 1    Column 2    Column 3  
 data1        data2    
 data1        data2  
 data1        data2  
 data1        data2  
 data1        data2       data3  

That is, the whole of Column 3 is empty except for the last row. I am accessing the Excel file via OleDbDataAdapter and returning a DataTable. Here's the code:

query = "SELECT * FROM [" + query + "]";
objDT = new DataTable();
objCmdSQL = this.GetCommand();
objCmdSQL.CommandText = query;
objSQLDad = new OleDbDataAdapter(objCmdSQL);
objSQLDad.Fill(objDT);
return objDT;

The point is, in this scenario my code returns a DataTable with just Column 1 and Column 2.
My guess is that the JET engine tries to infer each column's type from the very first cell in that column; since the first value is null, the whole column is ignored.
I tried filling the empty cells with zeros, and the code then returns all three columns. This is obviously the least preferable solution, because I have to process a large number of small files.
Inverting the selection range (i.e. from "A1:C5" to "C5:A1") doesn't work either. I'm looking for something more elegant.
I have already found a couple of posts discussing type mismatch (varchar cells in int columns and vice versa), but nothing related to this particular problem.
Thanks for reading!

Edit

Weird behavior again. I mostly have to work with Excel 2003 .xls files, but since this question has been answered I thought I could test my code against Excel 2007 .xlsx files. The connection string is the following:

string strConn = @"Provider=Microsoft.ACE.OLEDB.12.0; Data Source=" + _fileName.Trim() + @";Extended Properties=""Excel 12.0;HDR=No;IMEX=1;""";

I get the "External table is not in the expected format" exception, which I reckon is the standard exception when there is a version mismatch between ACE/JET and the file being opened.

The string

Provider=Microsoft.ACE.OLEDB.12.0 

means that I am using the most recent version of the OLE DB provider. I took a quick look around, and this version is used wherever a connection to .xlsx files is needed.
I have also tried a vanilla connection string (just Excel 12.0, without IMEX or HDR), but I get the same exception.
I am on .NET 2.0.50727 SP2; maybe it's time to upgrade?
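One thing worth ruling out for a format-mismatch exception like this is whether the Extended Properties value matches the file type. The sketch below is only an illustration, not the asker's code: `ExcelConnectionStrings` and `Build` are hypothetical names, and it assumes the commonly documented pairing of `Excel 8.0` with .xls and `Excel 12.0 Xml` with .xlsx when going through the ACE provider. It does not by itself explain the exception above.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Hypothetical helper (not from the original post): picks the
    // Extended Properties version token from the file extension.
    public static string Build(string fileName, bool hasHeader)
    {
        string path = fileName.Trim();
        string ext = Path.GetExtension(path).ToLowerInvariant();

        string excelVersion;
        switch (ext)
        {
            case ".xls":
                excelVersion = "Excel 8.0";       // Excel 97-2003 workbook
                break;
            case ".xlsx":
                excelVersion = "Excel 12.0 Xml";  // Excel 2007+ workbook
                break;
            default:
                throw new ArgumentException("Unsupported extension: " + ext, "fileName");
        }

        string hdr = hasHeader ? "Yes" : "No";

        // The inner quotes around Extended Properties are required because
        // the value itself contains semicolons.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + excelVersion +
               ";HDR=" + hdr + ";IMEX=1\";";
    }
}
```

Calling `ExcelConnectionStrings.Build(_fileName, false)` would then replace the hand-built `strConn`, so a mistyped extension fails early with an `ArgumentException` rather than at `Fill` time.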

I recreated your situation, and the following returned all 3 columns correctly. That is, the first two columns were fully populated with data, and the third contained null until the last row, which had data.

string connString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\MyExcel.xls;Extended Properties=""Excel 8.0;HDR=No;IMEX=1"";";
DataTable dt = new DataTable();
OleDbConnection conn = new OleDbConnection(connString);
OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", conn);

adapter.Fill(dt);

Note that I used the Access Database Engine (ACE) provider, which succeeded the old Joint Engine Technology (JET) provider, so my results may reflect a behavior difference between the two. Of course, if you aren't already using it, I suggest using the ACE provider, as I believe Microsoft would too. Also, note the connection's Extended Properties:

"HDR=Yes;" indicates that the first row contains column names, not data. "HDR=No;" indicates the opposite.

"IMEX=1;" tells the driver to always read "intermixed" data columns (numbers, dates, strings, etc.) as text. Note that this option may negatively affect write access to the Excel sheet.

Let me know if this helps.
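For reuse across many small files, the snippet above could be wrapped in a helper that disposes the connection and adapter. This is a minimal sketch under stated assumptions: `ExcelReader`, `BuildSelect` and `ReadSheet` are hypothetical names, and the optional `range` argument relies on the `[Sheet1$A1:C5]` range syntax mentioned in the question. It needs the ACE provider installed and has not been checked against the asker's files.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

public static class ExcelReader
{
    // Hypothetical helper: builds "SELECT * FROM [Sheet1$]" or, when a
    // range such as "A1:C5" is supplied, "SELECT * FROM [Sheet1$A1:C5]".
    public static string BuildSelect(string sheetName, string range)
    {
        string target = sheetName + "$" + (range ?? string.Empty);
        return "SELECT * FROM [" + target + "]";
    }

    // Hypothetical helper: fills a DataTable from one sheet or range.
    public static DataTable ReadSheet(string connString, string sheetName, string range)
    {
        DataTable table = new DataTable();

        using (OleDbConnection conn = new OleDbConnection(connString))
        using (OleDbDataAdapter adapter =
                   new OleDbDataAdapter(BuildSelect(sheetName, range), conn))
        {
            // Fill opens the connection if it is closed and closes it again
            // afterwards, so no explicit conn.Open() is needed here.
            adapter.Fill(table);
        }

        return table;
    }
}
```

A call such as `DataTable dt = ExcelReader.ReadSheet(connString, "Sheet1", "A1:C5");` followed by a check of `dt.Columns.Count` is a quick way to confirm whether the third column came through.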
