
Task Parallel Library throws NullReferenceException with IEnumerable

I have an IEnumerable of entities holding around 100 thousand records, and I want to use Parallel.ForEach to insert this data.

Here is the code I have, followed by the class (Employee.cs):

SqlConnection conn = base.GetConnection();
conn.Open();
IEnumerable<Employee> employeeList = GetListofEmployeesFromDB();
Parallel.ForEach(employeeList
, employee =>
{
    employee.add(conn, sqlTransaction);
});

Employee.cs
{
      public void add(SqlConnection conn, SqlTransaction sqlTransaction)
      {
             using (SqlCommand insertCmd = new SqlCommand("EmployeeInsert", conn))
             {
                 insertCmd.CommandType = CommandType.StoredProcedure;
                 insertCmd.Transaction = sqlTransaction;
                 insertCmd.Parameters["@Name"].Value = this.Name;
                 insertCmd.ExecuteNonQuery();
                 this.id = (int)insertCmd.Parameters["@Id"].Value;
             }
      }
}

While the data is being inserted, I get a NullReferenceException at:

this.id = (int)insertCmd.Parameters["@Id"].Value;

I'm not sure whether I am missing something. Here is the exception that I see:

System.AggregateException was unhandled
  Message=AggregateException_ctor_DefaultMessage
  Source=System.Threading
  StackTrace:
       at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
       at System.Threading.Tasks.Task.Wait()
       at System.Threading.Tasks.Parallel.PartitionerForEachWorker[TSource,TLocal](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 simpleBody, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
       at System.Threading.Tasks.Parallel.ForEach[TSource](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 body)
        :
        :
        :
       at System.Threading.ExecutionContext.runTryCode(Object userData)
       at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart(Object obj)
  InnerException: System.NullReferenceException
       Message=Object reference not set to an instance of an object.
       Source=Jobvite.Library
       StackTrace:            
            :
            :
            :
            at System.Threading.Tasks.Parallel.<>c__DisplayClass32`2.<PartitionerForEachWorker>b__30()
            at System.Threading.Tasks.Task.InnerInvoke()
            at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
            at System.Threading.Tasks.Task.<>c__DisplayClass3.<ExecuteSelfReplicating>b__2(Object )
       InnerException: 

System.AggregateException is raised because several exceptions may be thrown by the parallel tasks at once; it wraps them, and the real failure is the inner NullReferenceException.
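To see how the wrapping works, here is a minimal sketch with no database involved. The class and method names are made up for illustration. The loop body fails by unboxing a null object to int, which is the same kind of cast as `(int)insertCmd.Parameters["@Id"].Value` and throws NullReferenceException when the value is null.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the original code.
public static class AggregateExceptionDemo
{
    // Runs a parallel loop whose body fails for odd items and returns
    // the type names of the exceptions wrapped by the AggregateException.
    public static List<string> CollectFailures(IEnumerable<int> items)
    {
        var failures = new List<string>();
        try
        {
            Parallel.ForEach(items, item =>
            {
                // Odd items produce a null object reference.
                object value = item % 2 == 0 ? (object)item : null;
                // Unboxing null to int throws NullReferenceException.
                int id = (int)value;
            });
        }
        catch (AggregateException ex)
        {
            // Flatten() unwraps nested AggregateExceptions, if there are any.
            foreach (Exception inner in ex.Flatten().InnerExceptions)
            {
                failures.Add(inner.GetType().Name);
            }
        }
        return failures;
    }
}
```

How many inner exceptions you get for a large input is not deterministic, because Parallel.ForEach stops scheduling new iterations once one of them has failed.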

Reason
You are accessing the connection object in parallel. Multiple tasks try to use it at the same time, and exceptions are raised when they cannot get hold of it. Only one thread can use a database connection at a time.
Creating multiple threads to insert data into the database will not speed things up anyway (even if you manage to find a way to parallelize it), because the database is locked for each write and the data ends up being inserted sequentially.

Go with a normal insert process and it will be much faster.
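A rough sketch of what a plain sequential insert could look like, with one connection and one command reused on a single thread. Several things here are assumptions rather than facts from the question: that the EmployeeInsert procedure takes @Name and returns the new key through an @Id OUTPUT parameter, the NVARCHAR(100) size, and the Employee shape. It uses System.Data.SqlClient; Microsoft.Data.SqlClient exposes the same types.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Assumed shape of the entity; the real class has more fields.
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SequentialInsert
{
    // Inserts all employees one after another on the caller's thread.
    public static void InsertAll(SqlConnection conn, SqlTransaction tx,
                                 IEnumerable<Employee> employees)
    {
        using (var cmd = new SqlCommand("EmployeeInsert", conn, tx))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // Declare the parameters once; assumed to match the procedure.
            SqlParameter name = cmd.Parameters.Add("@Name", SqlDbType.NVarChar, 100);
            SqlParameter id = cmd.Parameters.Add("@Id", SqlDbType.Int);
            id.Direction = ParameterDirection.Output;

            foreach (Employee e in employees)
            {
                name.Value = (object)e.Name ?? DBNull.Value;
                cmd.ExecuteNonQuery();
                // The output value is populated after the call completes.
                e.Id = (int)id.Value;
            }
        }
    }
}
```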

(Once I figured out what "1 lac" is) It looks like you want to do a bulk insert. You can do that using SqlBulkCopy, which was designed to load a SQL Server table efficiently.

However, I see you also want the ids back, so SqlBulkCopy alone won't get you all the way. Since you are already using a stored procedure, here is one way of doing it (assuming SQL Server 2008 or later):

  1. Create a table-valued data type to contain the data you want to insert.

     CREATE TYPE [dbo].[EmployeeDataType] AS TABLE
     (
         ID INT,
         -- employee details
     )
  2. Change your stored procedure to take this table-valued parameter as input and, when it performs the insert, to return the inserted rows with an OUTPUT clause, e.g.

     CREATE PROCEDURE [dbo].[EmployeeInsert]
     (
         @EmployeeInsertParameter AS [dbo].[EmployeeDataType] READONLY
     )
     AS
     ...
     INSERT INTO Employee
     OUTPUT INSERTED.*
     SELECT * FROM @EmployeeInsertParameter e

(Obviously you would name the columns rather than use *.)

  3. Change your code to not use Parallel.ForEach and instead do this:

     DataTable employeeDataTable = new DataTable("EmployeeDataType");
     // fill in the rows using ...
     insertCmd.Parameters["@EmployeeInsertParameter"].Value = employeeDataTable;
     ...
  4. Read the result of the stored procedure execution into List<Employee>.
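For the "fill in the rows" part, something along these lines should work. This is only a sketch: the Employee shape and the Name column are assumptions, since the answer leaves the employee details open, and the DataTable columns need to line up with the columns of the table type you actually create.

```csharp
using System.Collections.Generic;
using System.Data;

// Assumed shape of the entity; the real class has more fields.
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class EmployeeTableBuilder
{
    // Copies the employees into an in-memory DataTable whose columns
    // mirror the (assumed) columns of dbo.EmployeeDataType.
    public static DataTable ToDataTable(IEnumerable<Employee> employees)
    {
        var table = new DataTable("EmployeeDataType");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("Name", typeof(string));

        foreach (Employee e in employees)
        {
            // One row per employee, values in column order.
            table.Rows.Add(e.Id, e.Name);
        }
        return table;
    }
}
```

The resulting table is what gets assigned to the @EmployeeInsertParameter parameter. With SqlClient that parameter also needs its SqlDbType set to SqlDbType.Structured so the table is sent as a table-valued parameter.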

Conclusion: basically, don't use Parallel.For with DB connections. This approach uses a single connection correctly (so no "NPEs"), does most of the processing in memory and, as long as you have the RAM, will be orders of magnitude quicker.


Here is another possible approach, though it is more involved: https://stackoverflow.com/a/21689413/3419825
