
C# SqlBulkCopy resume on error

I am trying to bulk insert 10 million rows from Oracle into SQL Server using SqlBulkCopy.WriteToServer().

I have made sure that:

  • The table columns and data types on both sides match. That is, Oracle's DATE maps to SQL Server's datetime, VARCHAR2 maps to varchar, and so on.
  • There are no triggers or indexes on the destination table.

At about 1.4 million rows it failed with System.ArgumentOutOfRangeException: Hour, Minute, and Second parameters describe an un-representable DateTime. at System.DateTime.DateToTicks(Int32 year, Int32 month, Int32 day)

Here is my code:

      // destConn, DestTable, SrcTable and rd (the source data reader) are defined elsewhere.
      SqlBulkCopy copy = new SqlBulkCopy(destConn, SqlBulkCopyOptions.TableLock, null);
      // ColumnMappings property is used to map column positions, not data type
      copy.DestinationTableName = DestTable;
      copy.NotifyAfter = 5000; // raise SqlRowsCopied every 5000 rows
      copy.SqlRowsCopied += new SqlRowsCopiedEventHandler(OnSqlRowsCopied);
      copy.BulkCopyTimeout = 0; // 0 = no timeout
      try { copy.WriteToServer((IDataReader)rd); }
      catch (Exception ex)
      {
        AppInfo.TableMsg[SrcTable] = AppInfo.TableMsg[SrcTable] + "\r\n" + "bulkcopy.WriteToServer(rd) failed. " + ex.Message;
        throw; // rethrow as-is; "throw ex" would reset the stack trace
      }

My table has over 100 columns, 26 of which are DATE columns, so it is hard to work out where the bad data is.

So I have 3 questions:

  1. Is there any setting or option that makes WriteToServer() continue or ignore the exception? Or is there anything I can do in the catch block to make it continue? I don't mind leaving the bad data behind. I am looking for a way to tell it to continue on insertion errors.
  2. Is there any way to prevent this from happening? For example, anything I can do in the select query that fills the OracleDataReader?
  3. If there is no solution to the above 2 questions, is there a good way to cleanse the "bad dates" on the Oracle side?

Thanks,

Update: I have done the following:

  1. Changed the destination table data type from datetime to datetime2.
  2. Modified the select list to

    CASE WHEN my_date_column < To_Date('01/01/1753', 'mm/dd/yyyy') THEN To_Date('01/01/1753','mm/dd/yyyy') ELSE my_date_column END

    for all the columns with the DATE data type.

But the error still persists. Here is the complete error message.

System.ArgumentOutOfRangeException was caught
  HResult=-2146233086
  Message=Hour, Minute, and Second parameters describe an un-representable DateTime.
  Source=mscorlib
  StackTrace:
       at System.DateTime.TimeToTicks(Int32 hour, Int32 minute, Int32 second)
       at Oracle.DataAccess.Client.OracleDataReader.GetDateTime(Int32 i)
       at Oracle.DataAccess.Client.OracleDataReader.GetValue(Int32 i)
       at System.Data.SqlClient.SqlBulkCopy.GetValueFromSourceRow(Int32 destRowIndex, Boolean& isSqlType, Boolean& isDataFeed, Boolean& isNull)
       at System.Data.SqlClient.SqlBulkCopy.ReadWriteColumnValueAsync(Int32 col)
       at System.Data.SqlClient.SqlBulkCopy.CopyColumnsAsync(Int32 col, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.CopyRowsAsync(Int32 rowsSoFar, Int32 totalRows, CancellationToken cts, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinued(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsync(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestContinuedAsync(BulkCopySimpleResultSet internalResults, CancellationToken cts, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestAsync(CancellationToken cts, TaskCompletionSource`1 source)
       at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalAsync(CancellationToken ctoken)
       at System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerAsync(Int32 columnCount, CancellationToken ctoken)
       at System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader)

From the error message it looks like the offending part is OracleDataReader rather than SqlBulkCopy.

How can I quickly spot these offending values using an Oracle query? Any further suggestions?

Oracle Database can store dates in the Julian era, ranging from January 1, 4712 BCE through December 31, 9999 CE (Common Era, or 'AD'). Unless BCE ('BC' in the format mask) is specifically used, CE date entries are the default.

SQL Server's datetime cannot do that: it only covers January 1, 1753 through December 31, 9999. datetime2 is recommended for new development and covers 0001-01-01 through 9999-12-31, which is enough for all practical date and time values. If you still hit a range limit, run Oracle queries of the style SELECT * FROM T WHERE SomeDateCol < DATE '0001-01-01' to find the invalid data.

TL;DR: Research the exact supported value ranges and find any values that cannot be mapped.

Your questions:

  1. No, alas. SqlBulkCopy has no option to skip failing rows and continue.
  2. Yes, treat invalid rows differently. Maybe filter them out or convert the invalid values to NULL. Your choice.
  3. See above.
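On question 1: SqlBulkCopy has no skip-on-error option, but the stack trace in the question shows the exception coming out of OracleDataReader.GetValue, not from the SQL Server side. One possible workaround, sketched below, is to wrap the source reader so that a value which cannot be read is handed to SqlBulkCopy as NULL. The class name NullOnErrorReader and its FailedReads counter are hypothetical, not part of ADO.NET or ODP.NET. The sketch assumes the inner reader stays usable after a failed GetValue call and that the destination columns accept NULL; neither has been verified against ODP.NET.

```csharp
using System;
using System.Data;

// Hypothetical wrapper: forwards every call to the inner reader, but returns
// DBNull when reading a single value throws ArgumentOutOfRangeException.
public sealed class NullOnErrorReader : IDataReader
{
    private readonly IDataReader _inner;

    // Number of values replaced by NULL so far (illustrative diagnostic only).
    public int FailedReads { get; private set; }

    public NullOnErrorReader(IDataReader inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public object GetValue(int i)
    {
        try { return _inner.GetValue(i); }
        catch (ArgumentOutOfRangeException) { FailedReads++; return DBNull.Value; }
    }

    public int GetValues(object[] values)
    {
        int n = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < n; i++) values[i] = GetValue(i);
        return n;
    }

    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));

    // Everything below is plain delegation to the wrapped reader.
    public bool IsDBNull(int i) => _inner.IsDBNull(i);
    public int FieldCount => _inner.FieldCount;
    public int Depth => _inner.Depth;
    public bool IsClosed => _inner.IsClosed;
    public int RecordsAffected => _inner.RecordsAffected;
    public bool Read() => _inner.Read();
    public bool NextResult() => _inner.NextResult();
    public void Close() => _inner.Close();
    public void Dispose() => _inner.Dispose();
    public DataTable GetSchemaTable() => _inner.GetSchemaTable();
    public string GetName(int i) => _inner.GetName(i);
    public int GetOrdinal(string name) => _inner.GetOrdinal(name);
    public Type GetFieldType(int i) => _inner.GetFieldType(i);
    public string GetDataTypeName(int i) => _inner.GetDataTypeName(i);
    public IDataReader GetData(int i) => _inner.GetData(i);
    public bool GetBoolean(int i) => _inner.GetBoolean(i);
    public byte GetByte(int i) => _inner.GetByte(i);
    public char GetChar(int i) => _inner.GetChar(i);
    public Guid GetGuid(int i) => _inner.GetGuid(i);
    public short GetInt16(int i) => _inner.GetInt16(i);
    public int GetInt32(int i) => _inner.GetInt32(i);
    public long GetInt64(int i) => _inner.GetInt64(i);
    public float GetFloat(int i) => _inner.GetFloat(i);
    public double GetDouble(int i) => _inner.GetDouble(i);
    public decimal GetDecimal(int i) => _inner.GetDecimal(i);
    public DateTime GetDateTime(int i) => _inner.GetDateTime(i);
    public string GetString(int i) => _inner.GetString(i);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length)
        => _inner.GetBytes(i, fieldOffset, buffer, bufferOffset, length);
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length)
        => _inner.GetChars(i, fieldOffset, buffer, bufferOffset, length);
}
```

With the code from the question, usage would be copy.WriteToServer(new NullOnErrorReader(rd)). This keeps the rest of the row and only nulls the unreadable value, so it hides the bad data rather than fixing it; checking FailedReads afterwards at least shows how many values were affected.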

OK, I figured it out. I am answering my own 2nd and 3rd questions.

The cause is a bad date in Oracle that looks like '01/26/2006 17:94:00'.

To_char(my_column,'hh24:mi:ss') shows '00:00:00' and To_char(my_column,'mi') shows '00'.

So the value appears to be valid and cannot be identified as invalid by filtering on to_char().

What I can do is use the dump function, which exposes the raw stored bytes:

DELETE FROM my_table
WHERE my_column IS NOT NULL
  AND (To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),':',1,1)+2, InStr(Dump(my_column),',',1,1)-InStr(Dump(my_column),':',1,1)-2))-100 < 0
   OR To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),',',1,2)+1, InStr(Dump(my_column),',',1,3)-1-InStr(Dump(my_column),',',1,2))) NOT BETWEEN 1 AND 12
   OR To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),',',1,3)+1, InStr(Dump(my_column),',',1,4)-1-InStr(Dump(my_column),',',1,3))) NOT BETWEEN 1 AND 31
   OR To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),',',1,4)+1, InStr(Dump(my_column),',',1,5)-1-InStr(Dump(my_column),',',1,4))) NOT BETWEEN 1 AND 24
   OR To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),',',1,5)+1, InStr(Dump(my_column),',',1,6)-1-InStr(Dump(my_column),',',1,5))) NOT BETWEEN 1 AND 60
   OR To_Number(SubStr(Dump(my_column), InStr(Dump(my_column),',',-1)+1)) NOT BETWEEN 1 AND 60)

And that cleans up the bad data.
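The same byte-level check can be sketched on the client side, for example to inspect rows selected with DUMP(my_column) before deciding whether to delete them. The helper below is hypothetical. It assumes the usual 7-byte layout of an Oracle DATE as shown by DUMP: century + 100, year within the century + 100, month, day, hour + 1, minute + 1, second + 1. It is slightly stricter than the DELETE above because it also checks the day against the actual length of the month.

```csharp
using System;

public static class OracleDateDump
{
    // Hypothetical helper: takes the text returned by DUMP(date_column), e.g.
    // "Typ=12 Len=7: 120,106,1,26,18,95,1", and reports whether the seven
    // bytes describe a date and time that .NET's DateTime can represent.
    public static bool IsRepresentable(string dump)
    {
        if (dump == null) return false;
        int colon = dump.IndexOf(':');
        if (colon < 0) return false;

        string[] parts = dump.Substring(colon + 1).Split(',');
        if (parts.Length != 7) return false;

        int[] b = new int[7];
        for (int i = 0; i < 7; i++)
        {
            if (!int.TryParse(parts[i].Trim(), out b[i])) return false;
        }

        // Assumed layout: excess-100 century and year, then month and day,
        // then hour, minute and second each stored with an offset of 1.
        int year = (b[0] - 100) * 100 + (b[1] - 100);
        int month = b[2], day = b[3];
        int hour = b[4] - 1, minute = b[5] - 1, second = b[6] - 1;

        return year >= 1 && year <= 9999
            && month >= 1 && month <= 12
            && day >= 1 && day <= DateTime.DaysInMonth(year, month)
            && hour >= 0 && hour <= 23
            && minute >= 0 && minute <= 59
            && second >= 0 && second <= 59;
    }
}
```

Under that layout, the bad value from above, '01/26/2006 17:94:00', would be stored as 120,106,1,26,18,95,1, and the minute byte 95 is what fails the check.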

The problem is well described in usr's answer. You can apply a CASE expression to the problem date columns in the Oracle source query to convert invalid dates to NULL or a default value. Identifying the rows or columns with the problem, however, is a big issue. I have come up with a method to identify them.

Please read this blog post of mine to identify the problem columns, so that you can apply the appropriate DECODE to convert the problem dates to NULL or valid defaults: https://sqljana.wordpress.com/tag/datetime-odp-net-oracle/
