SqlBulkCopy.WriteToServerAsync() does not write to target SQL Server table, bulkCopy.WriteToServer() does
Just as the title states. I am trying to load a ~8.45GB CSV file with ~330 columns (~7.5 million rows) into a SQL Server instance, but I'm doing the parsing internally because the file has some quirks (comma delimiters, quotes, etc.). The heavy-duty bulk insert and line parsing is done as below:
var dataTable = new DataTable(TargetTable);
using var streamReader = new StreamReader(FilePath);
using var bulkCopy = new SqlBulkCopy(this._connection, SqlBulkCopyOptions.TableLock, null)
{
    DestinationTableName = TargetTable,
    BulkCopyTimeout = 0,
    BatchSize = BatchSize,
};
/// ...
var outputFields = new string[columnsInCsv];
this._connection.Open();
while ((line = streamReader.ReadLine()) != null)
{
    //get data
    CsvTools.ParseCsvLineWriteDirect(line, ref outputFields);
    // insert into datatable
    dataTable.LoadDataRow(outputFields, true);
    // update counters
    totalRows++;
    rowCounter++;
    if (rowCounter >= BatchSize)
    {
        try
        {
            // load data
            bulkCopy.WriteToServer(dataTable); // this works.
            //Task.Run(async () => await bulkCopy.WriteToServerAsync(dataTable)); // this does not.
            //bulkCopy.WriteToServerAsync(dataTable)) // this does not write to the table either.
            rowCounter = 0;
            dataTable.Clear();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(ex.ToString());
            return;
        }
    }
}
// check if we have any remnants to load
if (dataTable.Rows.Count > 0)
{
    bulkCopy.WriteToServer(dataTable); // same here as above
    //Task.Run(async () => await bulkCopy.WriteToServerAsync(dataTable));
    //bulkCopy.WriteToServerAsync(dataTable));
    dataTable.Clear();
}
this._connection.Close();
Obviously I would like this to be as fast as possible. I noticed via profiling that the WriteToServerAsync method was almost 2x as fast (in its execution duration) as the WriteToServer method, but when I use the async version, no data appears to be written to the target table (whereas the non-async version seems to commit the data fine, but much more slowly). I'm assuming there is something I forgot here (to somehow trigger the commit to the table), but I am not sure what could prevent the data from being committed to the target table.
Note that I am aware that SQL Server has a BULK INSERT statement, but I need more control over the data for other reasons and would prefer to do this in C#. Also perhaps relevant is that I am using SQL Server 2022 Developer edition.
Calling Task.Run(...) or DoSomethingAsync() without a corresponding await essentially makes the task a fire-and-forget task. The "fire" refers to the creation of the task, and the "forget" to the fact that the caller appears not to be interested in any return value (if applicable) or in knowing when the task completes.
Though not immediately problematic, it becomes a problem if the calling thread or process exits before the task completes: the task is then terminated before completion. This typically occurs in short-lived processes such as console apps, and less so in, say, Windows Services or web apps with 20-minute App Domain timeouts and the like.
Example
Sending an asynchronous keep-alive/heartbeat to a remote service or monitor.
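One consequence of fire-and-forget that matters for the code in the question is that a try/catch around an un-awaited call does not see failures that happen later inside the task. The following is a minimal sketch of that behavior, not of SqlBulkCopy itself: FakeWriteAsync is a hypothetical stand-in for bulkCopy.WriteToServerAsync, and the delay and exception are simulated.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for bulkCopy.WriteToServerAsync: takes a while, then fails.
async Task FakeWriteAsync()
{
    await Task.Delay(100);
    throw new InvalidOperationException("simulated write failure");
}

bool caught = false;
Task pending = Task.CompletedTask;
try
{
    pending = FakeWriteAsync();    // fired, not awaited
}
catch (Exception)
{
    caught = true;                 // not reached: the failure is stored on the task
}
Console.WriteLine($"finished right after the call: {pending.IsCompleted}");
Console.WriteLine($"caught by the surrounding try/catch: {caught}");

bool observed = false;
try
{
    await pending;                 // awaiting waits for the task and rethrows its exception
}
catch (InvalidOperationException)
{
    observed = true;
}
Console.WriteLine($"observed once awaited: {observed}");
```

If the same thing happens with the real call, the catch block in the question would log nothing, which may be part of why the failure looks silent.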
Consider prefixing it with an await, as in await bulkCopy.WriteToServerAsync(...);. This way the task is linked to the parent task/thread, which ensures that the parent (unless it is terminated by some other means) does not exit before the task completes.
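Awaiting also matters because the question's loop calls dataTable.Clear() immediately after starting the write. The sketch below simulates that ordering with a plain list. FakeWriteAsync is a hypothetical stand-in for WriteToServerAsync(dataTable), and the assumption that the write reads its input only after some delay is a simplification, not a description of SqlBulkCopy's internals.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

int rowsWritten = 0;

// Hypothetical stand-in for WriteToServerAsync(dataTable): it reads the
// buffer only after a delay, as a simulated slow write.
async Task FakeWriteAsync(List<string> buffer)
{
    await Task.Delay(50);
    rowsWritten += buffer.Count;
}

var batch = new List<string> { "row1", "row2", "row3" };

// Fire and forget: the buffer is cleared while the write is still pending.
Task forgotten = FakeWriteAsync(batch);
batch.Clear();
await forgotten;   // awaited here only so the demo can inspect the result
int afterForget = rowsWritten;

// Awaited: the write finishes before the buffer is reused.
batch.AddRange(new[] { "row1", "row2", "row3" });
await FakeWriteAsync(batch);
batch.Clear();
int afterAwait = rowsWritten - afterForget;

Console.WriteLine($"fire-and-forget wrote {afterForget} rows, awaited wrote {afterAwait} rows");
```

In this simulation the un-awaited write normally finds the buffer already emptied, while the awaited write sees all three rows.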
Naturally, the containing method and those above it in the call stack will need to be marked async and also have await prefixes on the corresponding calls. This "async all the way" approach creates a nice daisy chain of linked tasks all the way to the parent (or at least to the last method in the call chain with an await or a legacy ContinueWith()).
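As a rough sketch of what "async all the way" could look like for a batching loop shaped like the one in the question: WriteBatchAsync is a hypothetical stand-in for bulkCopy.WriteToServerAsync(dataTable), LoadAsync and the tiny BatchSize are illustrative names and values, and the list replaces the DataTable so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

const int BatchSize = 2;          // hypothetical, tiny for the demo
int totalWritten = 0;

// Hypothetical stand-in for bulkCopy.WriteToServerAsync(dataTable).
async Task WriteBatchAsync(List<string> batch)
{
    await Task.Delay(10);
    totalWritten += batch.Count;
}

// The containing method is async and awaits each write before reusing the batch.
async Task LoadAsync(IEnumerable<string> lines)
{
    var batch = new List<string>();
    foreach (var line in lines)
    {
        batch.Add(line);
        if (batch.Count >= BatchSize)
        {
            await WriteBatchAsync(batch);
            batch.Clear();
        }
    }
    // Write any remnants that did not fill a whole batch.
    if (batch.Count > 0)
    {
        await WriteBatchAsync(batch);
        batch.Clear();
    }
}

// The caller awaits as well, up to the program entry point.
await LoadAsync(new[] { "a", "b", "c", "d", "e" });
Console.WriteLine($"rows written: {totalWritten}");
```

Because every level awaits the one below it, the program cannot reach its end (or close a connection) while a write is still in flight.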