
SSIS Excel Import Forcing Incorrect Column Type

I'm trying to import a spreadsheet to our database using SSIS. For some reason SSIS wants to believe two of the columns are of type Double, when they contain character data. I've tried remapping the columns to be nvarchar(255), but it still doesn't want to select the data it thinks is double, because there are characters in it. If I try to edit the SSIS package and change the column types in the Excel Source, it won't let me change the type of the columns in the Error Output, and it gives me an error if the regular output and error output columns don't match.

Why is SSIS insisting that these columns are Double? How can I force it to realize these are strings? Why does everything from Microsoft have to not quite work correctly?

EDIT:

I found this:

I sorted my data so that mixed data types would be at the top, and guess what: the problem reversed. Instead of not importing character data, it stopped importing purely numeric data. Apparently someone doesn't think 12345 can be represented as a string...

I've seen this issue before; it's Excel (strictly speaking, the Excel OLE DB driver) that is the issue, not SSIS. The driver samples the first few rows and then infers the data type, even if you explicitly set the column to text. What you need to do is put this into the Excel file connection string in the SSIS package. This instruction tells the driver that the columns contain mixed data types and hints it to do extra checking before deciding that a column is numeric when in fact it's not.

;Extended Properties="IMEX=1"

It should work with this (in most cases). The safer thing to do is export the Excel data to tab-delimited text and use SSIS to import that.
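As a rough illustration of why the tab-delimited route is safer: a text file carries no cell types, so every field arrives as a string and nothing is guessed. Below is a minimal Python sketch using only the standard library. The sample data and the column names `PartNo` and `Price` are made up for illustration.

```python
import csv
import io

# Hypothetical tab-delimited export of a sheet with a mixed column.
SAMPLE = "PartNo\tPrice\n12345\t9.99\nAB-77\t12.50\n00612\t3.00\n"

def read_tab_delimited(stream):
    """Return every row as a dict of strings; no type inference happens."""
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)

rows = read_tab_delimited(io.StringIO(SAMPLE))
for row in rows:
    # Leading zeros and letters survive because nothing is converted.
    print(row["PartNo"], row["Price"])
```

Any conversion to numbers then becomes an explicit step that you control, rather than a decision made by the driver.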

You can convert (i.e. force) the column data to text. Try this (note: these instructions are based on Excel 2007).

The following steps should force Excel to treat the column as text:

  1. Open your spreadsheet with Excel.
  2. Select the whole column that contains your "mostly numeric data" by clicking on the column header.
  3. Click on the Data tab on the ribbon menu.
  4. Select Text to Columns. This will bring up the Convert Text to Columns Wizard.
     - On Step 1: click Next.
     - On Step 2: click Next.
     - On Step 3: select Text and click Finish.
  5. Save your Excel sheet.
  6. Retry the import using the SQL Server 2005 Import Data Wizard.

Also, here's a link to another question which has additional responses:

Import Data Wizard Does Not Like Data Type I Choose For A Column

One thing not mentioned in the accepted answer is that the "IMEX=1" parameter has to go inside the quoted portion of:

...;Extended Properties="...";
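To make the placement concrete, here is a small Python sketch that assembles such a connection string with IMEX=1 inside the quoted Extended Properties value. The helper name `build_excel_connection_string` and the file path are hypothetical. The provider and Excel version values mirror the ACE connection string quoted in another answer on this page, so adjust them to your own driver.

```python
def build_excel_connection_string(path, header=True, imex=True):
    """Assemble an ACE OLE DB connection string for an .xlsx file.

    IMEX=1 is placed INSIDE the quoted Extended Properties value,
    next to the Excel version and HDR settings.
    """
    extended = ["Excel 12.0 Xml", "HDR=YES" if header else "HDR=NO"]
    if imex:
        extended.append("IMEX=1")
    props = ";".join(extended)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{props}";'
    )

# Hypothetical path, for illustration only.
print(build_excel_connection_string(r"C:\data\prices.xlsx"))
```

If IMEX=1 ends up outside the quotes, it is no longer part of the Extended Properties value, which is presumably why it appears to have no effect in that case.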

;IMEX=1; is not always working... Everything about mixed data types in Excel: Mixed data types in Excel column


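Another workaround is to sort the spreadsheet so that the character data is at the top, which makes Excel treat the column as a string and import everything.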

You can also alter the registry to look at more values than just the first 8 rows. I have used this method and it works quite well.

http://support.microsoft.com/kb/281517
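The behaviour discussed in this thread can be approximated with a toy model. The Python sketch below is not the driver's actual code, only a simplified simulation under stated assumptions: the driver samples the first N rows (8 by default), picks the majority type, and returns NULL for values that do not fit; with IMEX=1, a sample that contains both kinds is read as text. All function names are made up.

```python
def is_number(value):
    """True if the cell text parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def guess_column_type(cells, sample_rows=8, imex=False):
    """Toy model of type guessing from the first `sample_rows` cells."""
    sample = cells[:sample_rows]
    numeric = sum(1 for c in sample if is_number(c))
    text = len(sample) - numeric
    if imex and numeric and text:
        return "text"  # assumption: a mixed sample is read as text with IMEX=1
    return "double" if numeric >= text else "text"

def read_column(cells, sample_rows=8, imex=False):
    """Return cells as the guessed type; non-conforming values become None."""
    kind = guess_column_type(cells, sample_rows, imex)
    if kind == "double":
        return [float(c) if is_number(c) else None for c in cells]
    if imex:
        return list(cells)  # everything kept as text
    return [None if is_number(c) else c for c in cells]

column = ["1", "2", "3", "4", "5", "6", "7", "8", "AB-9", "10"]
print(read_column(column))                             # default sampling
print(read_column(column, imex=True))                  # IMEX=1, still 8 rows sampled
print(read_column(column, sample_rows=10, imex=True))  # IMEX=1, more rows sampled
```

In this model IMEX=1 alone does not rescue the text value in row 9, because the sampled rows were all numeric; it only helps once the sample is large enough to see the mixed data, which is what the registry change is for.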

Well, IMEX=1 did not work for me. Neither did Reynier Booysen's suggestion. (I don't know if it makes a difference, but I'm using SQL Server 2008 R2.) A good explanation of some workarounds, and of why IMEX=1 is limited to the first eight rows of each spreadsheet, can be found at http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/78b87712-8ffe-4c72-914b-f1c031ba6c75

Hope this helps.

I've used the following recipe:

  1. Import data from Excel to Access
  2. Import data from Access to SQL Server

and it worked for me...

I was banging my head against a wall with this issue for a while. In our environment, we consume price files from our suppliers in various formats, some of which have upward of a million records. This issue usually occurs where:

  • The rows scanned by the OLEDB driver appear to contain numbers, but the column does contain mixed values later on in the record set, or
  • Fields do contain only numbers, but the source has some of them formatted as text (usually Excel files).

The problem is that even if you set your external input column to the desired data type, the file gets scanned every time you run the package, and the column is dynamically changed to whatever the OLEDB driver thinks the field should be.

Our source files typically contain field headers (text) and prices (numeric fields), which gives me an easy solution:

First step:

  • Change your SQL statement to include the header fields. This forces SSIS to see all fields as text, including the price fields.

For mixed fields:

  • Your initial problem is solved because your fields are now text, but you still have a header row in your output.
  • Prevent the header row from making it into your output by changing the SQL WHERE clause to exclude the header values, e.g. "WHERE NOT([F4]='Price')".

For numeric fields:

  • Using the advanced editor for the OLE DB source, set the output column for the price field (or any other numeric field) to a numeric DataType. This causes any records that contain text in these fields to fail, including the header record, but forces a conversion on numeric values saved as text.
  • Set the Error Output to ignore failures on your numeric fields.
  • Alternatively, if you still need errors on the numeric fields redirected, remove the header row by changing the SQL WHERE clause to exclude the header values, then set the Error Output to redirect failures on this field.

Obviously this method only works where you have header fields, but hopefully this helps some of you.
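The idea in this answer (read everything as text, drop the header row, then convert the price field and decide what to do with failures) can be sketched outside SSIS. The Python below is only an analogy for the data flow, not SSIS code. The field name `F4` and the header value `Price` come from the WHERE clause above, while the sample rows and the function name are invented.

```python
from decimal import Decimal, InvalidOperation

# Rows as they arrive when every field is read as text (header included).
ROWS = [
    {"F1": "Item", "F4": "Price"},    # header row
    {"F1": "Widget", "F4": "9.99"},
    {"F1": "Gadget", "F4": "n/a"},    # price that cannot be converted
    {"F1": "Gizmo", "F4": "12"},
]

def split_rows(rows, field="F4", header_value="Price"):
    """Drop the header row, convert `field`, and separate failures."""
    good, errors = [], []
    for row in rows:
        if row[field] == header_value:  # same effect as WHERE NOT([F4]='Price')
            continue
        try:
            good.append({**row, field: Decimal(row[field])})
        except InvalidOperation:
            errors.append(row)          # analogous to redirecting the row
    return good, errors

good, errors = split_rows(ROWS)
print(good)
print(errors)
```

Discarding `errors` corresponds to the "ignore failures" setting; keeping it corresponds to redirecting failures, which is why the header row has to be filtered first in that case.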

Option 1. Use Visual Basic to iterate through each column and format each column as Text.

Use the Text to Columns menu, don't change the delimiter settings, and change "General" to "Text".

I had the same issue: multiple data types in a single column, and the package loaded only the numeric values. All the remaining values were loaded as NULL.

Solution

Changing the Excel data type is one way to fix this. In Excel, copy the column data and paste it into a different file. Delete that column, insert a new column with the Text data type, and paste the copied data into the new column.

Now, in the SSIS package, delete and recreate the Excel source, and in the destination table change the column data type to varchar.

This will work.

If multiple columns in the Excel spreadsheet have the same name, this kind of error occurs. The package will work after making the column names distinct. Sometimes hidden columns are overlooked when checking the column names.
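A quick way to spot such clashes is to compare the header names after normalising them. Below is a minimal Python sketch, assuming you have already pulled the header row out of the sheet as a list of strings (hidden columns included). The function name and sample headers are hypothetical, and treating names as equal regardless of case or surrounding spaces is a cautious assumption rather than a documented rule.

```python
from collections import Counter

def duplicate_headers(headers):
    """Return header names that occur more than once, ignoring case and outer spaces."""
    counts = Counter(h.strip().lower() for h in headers)
    return sorted(name for name, n in counts.items() if n > 1)

# Hypothetical header row; the second "price" could sit in a hidden column.
print(duplicate_headers(["PartNo", "Price", "Qty", "price "]))
```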

  1. Click File on the ribbon menu, and then click Options.
  2. Click Advanced, and then under "When calculating this workbook", select the "Set precision as displayed" check box, and then click OK.
  3. Click OK.
  4. In the worksheet, select the cells that you want to format.
  5. On the Home tab, click the Dialog Box Launcher next to Number.
  6. In the Category box, click Number.
  7. In the Decimal places box, enter the number of decimal places that you want to display.

This worked for me. Select the problematic column in Excel - highlight the whole column. Change the format to "Text". Save the Excel file.

In your SSIS package, go to the Data Flow pane for your import. Double-click the Excel Source node. It should warn you that the types have changed and ask you if you want to remap them. Click Yes. Executing should now work and bring in all values.

Note: I'm using Excel 2013 and Visual Studio 2015, but I assume these instructions would work for earlier versions too.

It took me a bit to realize the source of the error in my package. Ultimately I found that data was being converted to null (example: from "06" to "NULL"), and I found this via Preview in the source file connection (Excel Source > Edit > Connection Manager > Sheet='MySheet' > Preview...). I got excited when I read the post by James about editing the connection string to have the extended properties ;Extended Properties="IMEX=1". But that did not work for me.

I was able to resolve the error by changing the cell format in the Excel worksheet from "Number" to "Text". After changing the format, the upload process ran successfully! My connection string looks like: Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\\\myServer\\d$\\Folder1\\Folder2\\myFile.xlsx;Extended Properties="EXCEL 12.0 XML;HDR=NO";

Here are some screenshots that resolved my error message.

Error: Metadata of Excel file connection

Source of error: "General" format

Source of error changed: "Text" format

Error fixed: Metadata of Excel file connection

I saw your question today. I was having the same problem, and I found the easiest way is to save the Excel sheet in 97-2003 format; the import will then keep the columns with the same data type you specified. Hope this helps!

I had the same problem. The problem sits in the Excel Source task. When you set up this task the first time, the task connects to the specified Excel file (via the Excel connection) and decides what type each column is, based on the current spreadsheet.

Thus, when you set up the Excel Source task, just make sure that the columns that should be text have only text in them. The Excel Source task will then assume that any subsequent spreadsheets have the same format, and it will read 12345 as text because the column was text when the task was set up.

Hope it makes sense!
