简体   繁体   中英

Non thread-safe code using async and DataRow

When executing the following code.

using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

namespace AsyncDataRow
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var program = new Program();

            try
            {
                Console.WriteLine("Execute");

                for (int i = 0; i < 100; i++)
                {
                    program.Execute();
                }
            }
            catch (AggregateException aggregateException)
            {
                foreach (var exception in aggregateException.InnerExceptions)
                {
                    Console.WriteLine("AggregateException.InnerExceptions: " + exception.Message);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Exception.Message: " + ex.Message);
            }

            Console.ReadKey();
        }

        private void Execute()
        {
            DataTable dataTable = new DataTable();
            dataTable.Columns.Add("ID", typeof(int));
            dataTable.Columns.Add("Name", typeof(string));

            for (int i = 0; i < 1000; i++)
            {
                dataTable.Rows.Add(i, i.ToString());
            }

            var taskList = new List<Task>();

            foreach (DataRow dataRow in dataTable.Rows)
            {
                taskList.Add(ChangeStringValue(dataRow));
            }

            Task.WaitAll(taskList.ToArray());
        }

        private async Task ChangeStringValue(DataRow dataRow)
        {
            var id = (int)dataRow["ID"];

            var newValue = await Task.Run(() => CreateNewValue(id));

            dataRow["Name"] = newValue;
        }

        private string CreateNewValue(int id)
        {
            Console.WriteLine(string.Format("{0} - CreateNewValue", id));
            return id.ToString() + "New StringValue";
        }
    }
}

Intermittently a System.ArgumentOutOfRangeException is thrown on this line of code

dataRow["Name"] = newValue

Exception message:

Exception thrown: 'System.ArgumentOutOfRangeException' in mscorlib.dll
Additional information: Index was out of range. Must be non-negative and less than the size of the collection.

I'm trying to be more specific identifying the issue, but the closest I got was to simplify the code to reproduce the error.

I think it has something to do with the DataRow reference type being passed into the async method, but I would like to know why this code is not thread-safe.

The number of thread-safe classes in .Net framework is very small and each would explicitly call out its safety (most are found in System.Collecions.Concurrent namespace). For all other types you need to provide your own thread safety mechanisms. DB related classes definitely fall into category of non-thread safe for write. Frequently classes explicitly allow multiple thread-safe reads.

In your case DataRow can be read from multiple threads and you only need to synchronize writes:

Thread Safety

This type is safe for multithreaded read operations. You must synchronize any write operations.

Standard approach for building a lot items - split items in ranges and generate each independently and then merge after all are done.

Partial sample based on original code, MyRowResult is class to hold all data for row that needs to be updated.

        var taskList = new List<Task<MyDataRow>();

        foreach (DataRow dataRow in dataTable.Rows)
        {
            taskList.Add(ChangeStringValue(dataRow));
        }

        // await for proper async code, WaitAll is fin for console app.
        await Task.WhenAll(taskList.ToArray());

        // back to single thread - safe to update rows
        foreach (var task  in taskList)
        {
            task.Result.DataRow["Name"] = task.Result.Name;
        }
    }

    private async Task<MyRowResult> ChangeStringValue(DataRow dataRow)
    {
        // on multiple threads - perform read only operations of rows
        // which is safe as explicitly called out in the documentation
        // "This type is safe for multithreaded read operations."
        var id = (int)dataRow["ID"];

        var newValue = await Task.Run(() => CreateNewValue(id));

        return new MyRowResult { Name = newValue, DataRow = dataRow };
    }

Notes:

  • some classes (like UI controls) require even stricter thread safety- can be called on original thread only
  • Helpers in Parallel class may simplify running code in parallel (like Parallel.ForEach ).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM