SqlBulkCopy.WriteToServerAsync does not respect the `await` keyword. Why?

Here is my code:

public async Task UpdateDBWithXML(Action<Func<DataTable, Task>> readXmlInBatches, string hashKey, string hash)
{
    using (var transaction = this.Context.Database.BeginTransaction(IsolationLevel.ReadUncommitted))
    using (var bulk = new SqlBulkCopy((SqlConnection)this.Connection, SqlBulkCopyOptions.Default, (SqlTransaction)transaction.UnderlyingTransaction))
    {
        //this.Context.Database.ExecuteSqlCommand("DELETE FROM [dbo].[LegalContractorTemps]");

        bulk.DestinationTableName = "LegalContractorTemps";
        readXmlInBatches(async (DataTable table) =>
        {
            if (bulk.ColumnMappings.Count == 0)
            {
                foreach (DataColumn column in table.Columns)
                {
                    bulk.ColumnMappings.Add(new SqlBulkCopyColumnMapping(column.ColumnName, column.ColumnName));
                }
            }

            await bulk.WriteToServerAsync(table);
        });

        await this.Context.Database.ExecuteSqlCommandAsync(
            "EXECUTE dbo.LegalContractorsDataSynchronize @hashKey, @hash",
            new SqlParameter("@hashKey", hashKey),
            new SqlParameter("@hash", hash)
        );

        transaction.Commit();
    }
}

I pass the following function as the readXmlInBatches argument:

public void ReadXMLInBatches(Func<DataTable, Task> processBatch)
{
    int batchSize = 10000;
    var table = new DataTable();
    foreach (var col in columnNames)
    {
        table.Columns.Add(col);
    }

    using (var reader = new StreamReader(pathToXml, Encoding.GetEncoding(encoding)))
    using (var xmlReader = XmlReader.Create(reader))
    {
        string lastElement = null;
        DataRow lastRow = null;
        while (xmlReader.Read())
        {
            switch (xmlReader.NodeType)
            {
                case XmlNodeType.Element:
                    if (xmlReader.Name == "RECORD")
                    {
                        if (table.Rows.Count >= batchSize)
                        {
                            processBatch(table);
                            table.Rows.Clear();
                        }

                        lastRow = table.Rows.Add();
                    }
                    lastElement = xmlReader.Name;
                    break;
                case XmlNodeType.Text:
                    ReadMember(lastRow, lastElement, xmlReader.Value);
                    break;
            }
        }
        if (table.Rows.Count > 0)
        {
            processBatch(table);
            table.Rows.Clear();
        }
    }
}

The XML contains about 1.7 million records. After my program has read a few batches, I get the following error:

System.Data.RowNotInTableException : 'This row has been removed from a table and does not have any data. BeginEdit() will allow creation of new data in this row.'

I looked through the source code of SqlBulkCopy and found the method that throws the error:

public Task WriteToServerAsync(DataTable table, DataRowState rowState, CancellationToken cancellationToken) {
    Task resultTask = null;
    SqlConnection.ExecutePermission.Demand();

    if (table == null) {
        throw new ArgumentNullException("table");
    }

    if (_isBulkCopyingInProgress) {
        throw SQL.BulkLoadPendingOperation();
    }

    SqlStatistics statistics = Statistics;
    try {
        statistics = SqlStatistics.StartTimer(Statistics);
        _rowStateToSkip = ((rowState == 0) || (rowState == DataRowState.Deleted)) ? DataRowState.Deleted : ~rowState | DataRowState.Deleted;
        _rowSource = table;
        _SqlDataReaderRowSource = null;
        _dataTableSource = table;
        _rowSourceType = ValueSourceType.DataTable;
        _rowEnumerator = table.Rows.GetEnumerator();
        _isAsyncBulkCopy = true;
        resultTask = WriteRowSourceToServerAsync(table.Columns.Count, cancellationToken); // It returns Task since _isAsyncBulkCopy = true;
    }
    finally {
        SqlStatistics.StopTimer(statistics);
    }
    return resultTask;
}

I noticed the field _isBulkCopyingInProgress and decided to check it while debugging. I found that the field is true when the error is thrown. How is that possible? I would expect the bulk insert to finish first (before execution continues and WriteToServerAsync is called a second time), since I await the call here: await bulk.WriteToServerAsync(table);

What could I be missing?

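You are passing an asynchronous function to ReadXMLInBatches, but its execution isn't awaited inside that method. The await inside the lambda only suspends the lambda itself: processBatch(table) hands a still-running Task back to ReadXMLInBatches, which discards it and carries on. As a result, table.Rows.Clear() runs while the bulk copy is still enumerating the rows (which would explain the RowNotInTableException), the next batch can call WriteToServerAsync while _isBulkCopyingInProgress is still true, and ReadXMLInBatches may return before all the calls to WriteToServerAsync have completed.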
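To see the effect in isolation, here is a minimal sketch that does not touch SqlBulkCopy at all. The class UnawaitedCallDemo, the log list, and the gate are all made up for illustration; gate.Task merely stands in for a WriteToServerAsync call that has not finished yet.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class UnawaitedCallDemo
{
    // Hypothetical stand-in for the caller: it invokes the async callback
    // without awaiting it right away, like ReadXMLInBatches in the question.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>();

        Func<Task> processBatch = async () =>
        {
            log.Add("batch started");
            await gate.Task;            // assumption: plays the role of WriteToServerAsync
            log.Add("batch finished");
        };

        Task pending = processBatch();  // not awaited: returns at the first incomplete await
        log.Add("caller continued");    // the caller runs on while the "write" is pending

        gate.SetResult(true);           // let the simulated write complete
        await pending;                  // only now is the batch guaranteed to be done
        return log;
    }
}
```

Calling await UnawaitedCallDemo.RunAsync() should give "batch started", "caller continued", "batch finished" in that order: the caller moves on before the batch completes, which is the window in which your rows get cleared.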

Try the following changes:

public async Task ReadXMLInBatchesAsync(Func<DataTable, Task> processBatch)
{
    //...
    await processBatch(table);
    //...
}

public async Task UpdateDBWithXML(Func<Func<DataTable, Task>, Task> readXmlInBatches, string hashKey, string hash)
{
    //...
    await readXmlInBatches(async (DataTable table) =>
    {
        //...
    });
    //...
}
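For a version you can run without a database, here is a rough sketch of the awaited batching loop. It is a simplification, not your actual reader: AwaitedBatchDemo, the in-memory records array, and the single "Value" column are assumptions for illustration, and await Task.Yield() stands in for bulk.WriteToServerAsync(table).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

public static class AwaitedBatchDemo
{
    // Hypothetical simplified reader: fills a DataTable from an in-memory
    // sequence instead of an XmlReader and awaits each batch before reusing the table.
    public static async Task ReadInBatchesAsync(
        IEnumerable<string> records, int batchSize, Func<DataTable, Task> processBatch)
    {
        var table = new DataTable();
        table.Columns.Add("Value");

        foreach (var record in records)
        {
            if (table.Rows.Count >= batchSize)
            {
                await processBatch(table);   // the batch is fully handled here...
                table.Rows.Clear();          // ...so clearing the rows is safe
            }
            table.Rows.Add(record);
        }

        if (table.Rows.Count > 0)
        {
            await processBatch(table);       // flush the final partial batch
            table.Rows.Clear();
        }
    }

    // Runs the reader with a fake consumer and returns the row count seen per batch.
    public static async Task<List<int>> RunAsync()
    {
        var sizes = new List<int>();
        var records = new[] { "a", "b", "c", "d", "e" };

        await ReadInBatchesAsync(records, 2, async table =>
        {
            await Task.Yield();              // assumption: stands in for WriteToServerAsync
            sizes.Add(table.Rows.Count);     // rows are still present after the await
        });

        return sizes;
    }
}
```

With five records and a batch size of 2, RunAsync should report batches of 2, 2, and 1 rows. Because every processBatch call is awaited, no two batches overlap and the table is never cleared while a batch is still being consumed.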
