Problem with using OleDbDataAdapter to fetch data from an Excel sheet

First, I should say that I'm out of my depth here. I'm making some changes to code written by someone else in the company that uses OleDbDataAdapter to "talk" to Excel, and I'm not familiar with it. There is one bug I just can't track down.

I'm trying to use an OleDbDataAdapter to read an Excel file with around 450 rows.

In the code it's done like this:

connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source='" + path + "';" + "Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1;\"");
connection.Open();
OleDbDataAdapter objAdapter = new OleDbDataAdapter(objCommand.CommandText, connection);
objAdapter.Fill(objDataSet, "Excel");

foreach (DataColumn dataColumn in objTable.Columns) {
  if (dataColumn.Ordinal > objDataSet.Tables[0].Columns.Count - 1) {
    objDataSet.Tables[0].Columns.Add();
  }
  objDataSet.Tables[0].Columns[dataColumn.Ordinal].ColumnName = dataColumn.ColumnName;
  objImport.Columns.Add(dataColumn.ColumnName);
}

foreach (DataRow dataRow in objDataSet.Tables[0].Rows) {
   ...
}

Everything seems to work fine except for one thing. The second column contains mostly four-digit numbers like 6739, 3920 and so on, but five rows have alphanumeric values like 8201NO and 8205NO. Those five cells are reported as blank instead of returning their alphanumeric content. I have checked in Excel, and all the cells in this column are formatted as Text.

This is an xls file, by the way, not xlsx.

Does anyone have a clue as to why these cells show up as blank in the DataRow while the numeric ones come through fine? Other columns with alphanumeric content are read correctly.

What's happening is that the Jet driver (rather than Excel itself) assigns a data type to each spreadsheet column based on the first several values in that column. I suspect that if you look at the column's properties, it will be reported as numeric.

The problem comes when you query that spreadsheet through Jet. When it thinks it's dealing with a numeric column and it finds a text value, it quietly returns nothing. There isn't even a cryptic error message to go on.
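One way to check this suspicion is to print the type that each column ended up with after Fill, along with how many cells came back as null. The following is a minimal sketch, not part of the original code; the class and method names (ColumnTypeReport, Describe) are made up for illustration, and it only relies on standard System.Data types.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper: summarises what the provider inferred for each column.
public static class ColumnTypeReport
{
    // Returns one line per column: ordinal, name, .NET type, number of null cells.
    public static List<string> Describe(DataTable table)
    {
        var lines = new List<string>();
        foreach (DataColumn column in table.Columns)
        {
            int blanks = 0;
            foreach (DataRow row in table.Rows)
            {
                // Cells Jet could not coerce to the guessed type arrive as DBNull.
                if (row.IsNull(column)) blanks++;
            }
            lines.Add(string.Format("{0}: {1} ({2}), {3} null cell(s)",
                column.Ordinal, column.ColumnName, column.DataType.Name, blanks));
        }
        return lines;
    }
}
```

Calling ColumnTypeReport.Describe(objDataSet.Tables[0]) right after objAdapter.Fill(...) and writing the lines to the console should show whether the second column was typed as a number (typically Double) rather than String, and whether the count of null cells matches the five missing values.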

As a possible workaround, can you move one of the alphanumeric values to the first row of data and then try parsing again? I suspect you will then start getting values for the alphanumeric rows.

Take a look at this article. It goes into more detail on this issue. It also describes a possible workaround:

However, as per the JET documentation, we can override the registry setting through the connection string: if we set IMEX=1 (as part of Extended Properties), JET will set all column types to UNICODE VARCHAR or ADVARWCHAR irrespective of the 'ImportMixedTypes' key value.

IMEX=1 means "Read mixed data as text."

There are some gotchas, however. Jet samples only the first several rows to determine whether the data is mixed, and if it so happens that these rows are all numeric, you'll get exactly this behaviour.
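Since the outcome depends on the HDR and IMEX settings in Extended Properties, it can help to build the connection string in one place so the settings are easy to vary while experimenting. This is a small sketch under the assumption that the same Jet 4.0 provider and "Excel 8.0" format from the question are used; the class and method names (ExcelConnectionStrings, ForXls) are hypothetical.

```csharp
using System;

// Hypothetical helper: builds a Jet connection string for an .xls file.
public static class ExcelConnectionStrings
{
    // hasHeaderRow maps to HDR=Yes/No; imex is passed through as IMEX=<n>.
    public static string ForXls(string path, bool hasHeaderRow, int imex)
    {
        string extended = string.Format("Excel 8.0;HDR={0};IMEX={1}",
            hasHeaderRow ? "Yes" : "No", imex);

        // Extended Properties contains semicolons, so it must be quoted as a whole.
        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';Extended Properties=\"{1}\"",
            path, extended);
    }
}
```

For example, new OleDbConnection(ExcelConnectionStrings.ForXls(path, true, 1)) reproduces the settings used in the question, and changing the last two arguments lets you compare what the adapter returns under other combinations.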

See connectionstrings.com for details:

Check out the REG_DWORD value "TypeGuessRows" located under the registry key [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]. That's the key to stopping the driver from using only the first 8 rows to guess a column's data type. Set this value to 0 to scan all rows. This might hurt performance. Please also note that adding the IMEX=1 option might cause the IMEX feature to set in after just 8 rows. Use IMEX=0 instead to be sure to force the registry TypeGuessRows=0 (scan all rows) to work.
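If you prefer to inspect or change that value from code rather than in regedit, something along these lines might work. It is only a sketch: the class name JetTypeGuessRows is made up, writing under HKEY_LOCAL_MACHINE normally requires administrator rights, and on 64-bit Windows a 64-bit process may see a different registry view than the 32-bit Jet provider does, so treat the key location as an assumption to verify on your machine.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper around the TypeGuessRows value mentioned above.
public static class JetTypeGuessRows
{
    private const string ExcelEngineKey = @"SOFTWARE\Microsoft\Jet\4.0\Engines\Excel";

    // Returns the current value, or null if the key or value is missing.
    public static int? Read()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(ExcelEngineKey))
        {
            if (key == null) return null;
            object value = key.GetValue("TypeGuessRows");
            return value == null ? (int?)null : Convert.ToInt32(value);
        }
    }

    // Writes the value as a REG_DWORD; 0 asks the driver to scan all rows.
    public static void Set(int rows)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(ExcelEngineKey, true))
        {
            if (key == null)
                throw new InvalidOperationException("Jet Excel engine key not found.");
            key.SetValue("TypeGuessRows", rows, RegistryValueKind.DWord);
        }
    }
}
```

Reading the value first with JetTypeGuessRows.Read() is a cheap way to confirm what the driver is currently configured to sample before deciding whether to change a machine-wide setting.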

I would advise against using the OleDb data provider to access Excel if you can help it. I've had nothing but problems, for exactly the reasons that others have pointed out. Performance also tends to be atrocious when you are dealing with large spreadsheets.

You might try this open source solution: http://exceldatareader.codeplex.com/
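As a rough idea of what that looks like, here is a sketch that loads the first worksheet into a DataTable. It assumes the current NuGet packages ExcelDataReader and ExcelDataReader.DataSet rather than the older CodePlex release linked above, whose API differs; the class and method names (XlsLoader, LoadFirstSheet) are hypothetical.

```csharp
using System.Data;
using System.IO;
using ExcelDataReader;

// Hypothetical helper: reads the first sheet of a workbook without OleDb.
public static class XlsLoader
{
    public static DataTable LoadFirstSheet(string path)
    {
        using (FileStream stream = File.Open(path, FileMode.Open, FileAccess.Read))
        using (IExcelDataReader reader = ExcelReaderFactory.CreateReader(stream))
        {
            // UseHeaderRow = true treats the first row as column names,
            // comparable to HDR=Yes in the Jet connection string.
            DataSet result = reader.AsDataSet(new ExcelDataSetConfiguration
            {
                ConfigureDataTable = _ => new ExcelDataTableConfiguration
                {
                    UseHeaderRow = true
                }
            });
            return result.Tables[0];
        }
    }
}
```

As far as I know, this library reads values cell by cell instead of guessing one type per column, so values like 8201NO should come back as strings next to the numeric ones. On .NET Core and later you may also need to register the code pages encoding provider before opening xls files; check the project's README for the details.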


 