SqlBulkCopy.WriteToServerAsync() does not write to target SQL Server table, bulkCopy.WriteToServer() does

Just as the title states. I am trying to load a ~8.45 GB CSV file with ~330 columns (~7.5 million rows) into a SQL Server instance, but I'm doing the parsing internally because the file has some quirks (comma delimiters, quotes, etc.). The heavy-duty bulk insert and line parsing is done as below:

var dataTable = new DataTable(TargetTable);
using var streamReader = new StreamReader(FilePath);
using var bulkCopy = new SqlBulkCopy(this._connection, SqlBulkCopyOptions.TableLock, null)
{
    DestinationTableName = TargetTable,
    BulkCopyTimeout = 0,
    BatchSize = BatchSize,

};

/// ...
var outputFields = new string[columnsInCsv];
this._connection.Open();
while ((line = streamReader.ReadLine()) != null)
{
    //get data
    CsvTools.ParseCsvLineWriteDirect(line, ref outputFields);

    // insert into datatable
    dataTable.LoadDataRow(outputFields, true);

    // update counters
    totalRows++;
    rowCounter++;

    if (rowCounter >= BatchSize)
    {
        try
        {
            // load data
            bulkCopy.WriteToServer(dataTable); // this works.
            //Task.Run(async () => await bulkCopy.WriteToServerAsync(dataTable)); // this does not.
            //bulkCopy.WriteToServerAsync(dataTable); // this does not write to the table either.
            rowCounter = 0;
            dataTable.Clear();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(ex.ToString());
            return;
        }
    }
}
// check if we have any remnants to load
if (dataTable.Rows.Count > 0)
{
    
    bulkCopy.WriteToServer(dataTable); // same here as above
    //Task.Run(async () => await bulkCopy.WriteToServerAsync(dataTable));
    //bulkCopy.WriteToServerAsync(dataTable);
    dataTable.Clear();
}
this._connection.Close();

Obviously I would like this to be as fast as possible. I noticed via profiling that the WriteToServerAsync method was almost 2x as fast (in its execution duration) as the WriteToServer method, but when I use the async version, no data appears to be written to the target table (whereas the non-async version seems to commit the data fine, but much more slowly). I'm assuming there is something here I forgot (to somehow trigger the commit to the table), but I am not sure what could prevent committing the data to the target table.

Note that I am aware that SQL Server has a BULK INSERT statement, but I need more control over the data for other reasons and would prefer to do this in C#. Also perhaps relevant is that I am using SQL Server 2022 Developer edition.

Fire and forget tasks

Calling Task.Run(...) or DoSomethingAsync() without a corresponding await essentially makes the task a fire-and-forget task. The "fire" refers to the creation of the task, and the "forget" to the fact that the coder appears not to be interested in any return value (if applicable) or in knowing when the task completes.

Though not immediately problematic, it becomes a problem if the calling thread or process exits before the task completes: the task is terminated before completion. This problem typically occurs in short-lived processes such as console apps, and less so in, say, Windows Services or web apps with 20-minute App Domain timeouts.

Example

  • Sending an asynchronous keep-alive/heartbeat to a remote service or monitor.

    • There is no return value to monitor, asynchronous or otherwise.
    • It won't matter if it fails, as a more up-to-date call will eventually replace it.
    • It won't matter if it doesn't complete in time because the hosting process exits for some reason (after all, we are a heartbeat; if the process has ended naturally there is no heart to beat).
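
The difference between starting a task and waiting for it can be seen without a database. The following is only a minimal sketch: SimulatedWriteAsync is a hypothetical stand-in for bulkCopy.WriteToServerAsync (it "commits" after a short delay), and the class and member names are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Counts rows that the simulated write has "committed".
    private static int _rowsWritten;

    public static int RowsWritten => Volatile.Read(ref _rowsWritten);

    // Hypothetical stand-in for bulkCopy.WriteToServerAsync(dataTable):
    // the rows only count as written once the delay has elapsed.
    public static async Task SimulatedWriteAsync(int rows)
    {
        await Task.Delay(200);                    // pretend network/server work
        Interlocked.Add(ref _rowsWritten, rows);  // the "commit"
    }

    public static async Task Main()
    {
        // Fire and forget: the task is started, but nothing waits for it yet.
        Task pending = SimulatedWriteAsync(1000);
        Console.WriteLine($"Right after starting the task: {RowsWritten} rows");

        // Awaiting the task guarantees the write has finished before continuing.
        await pending;
        Console.WriteLine($"After awaiting the task: {RowsWritten} rows");
    }
}
```

If Main returned right after the first Console.WriteLine instead of awaiting, a console process could exit while the simulated write was still pending, which is the situation described above.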

Awaited tasks

Consider prefixing it with an await, as in await bulkCopy.WriteToServerAsync(...);. This way the task is linked to the parent task/thread, which ensures that the parent (unless it is terminated by some other means) does not exit before the task completes.

Naturally, the containing method and those up the call stack will need to be marked async and also have await prefixes on the corresponding calls. This "async all the way" approach creates a nice daisy chain of linked tasks all the way to the parent (or at least to the last method in the call chain with an await or a legacy ContinueWith()).
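
Applied to the loop in the question, an awaited version might look roughly like the sketch below. Several things here are assumptions rather than the asker's actual code: the Microsoft.Data.SqlClient namespace (System.Data.SqlClient exposes the same members), the BulkLoader/LoadAsync names, a DataTable whose columns the caller has already defined, and the parseLine delegate, which is a hypothetical stand-in for CsvTools.ParseCsvLineWriteDirect.

```csharp
using System;
using System.Data;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient; // assumption: the question does not say which SqlClient package is used

public static class BulkLoader
{
    // Assumes: connection is closed on entry, dataTable already has its columns,
    // and parseLine (hypothetical) turns one CSV line into one row of values.
    public static async Task<long> LoadAsync(
        SqlConnection connection,
        DataTable dataTable,
        TextReader reader,
        int batchSize,
        Func<string, object[]> parseLine)
    {
        using var bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null)
        {
            DestinationTableName = dataTable.TableName,
            BulkCopyTimeout = 0,
            BatchSize = batchSize,
        };

        long totalRows = 0;
        await connection.OpenAsync();
        try
        {
            string? line;
            while ((line = reader.ReadLine()) != null)
            {
                dataTable.LoadDataRow(parseLine(line), true);
                totalRows++;

                if (dataTable.Rows.Count >= batchSize)
                {
                    // Awaited: Clear() only runs once the batch has been sent.
                    await bulkCopy.WriteToServerAsync(dataTable);
                    dataTable.Clear();
                }
            }

            // Flush any remaining rows that did not fill a whole batch.
            if (dataTable.Rows.Count > 0)
            {
                await bulkCopy.WriteToServerAsync(dataTable);
                dataTable.Clear();
            }
        }
        finally
        {
            connection.Close();
        }

        return totalRows;
    }
}
```

A caller would then await BulkLoader.LoadAsync(...) from an async method, up to an async Task Main. Note also that in the question's loop dataTable.Clear() runs immediately after the un-awaited call, so the rows may already be gone before they are sent; that reading is an inference from the code shown, not something the question confirms.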
