Deleting rows by batches. How to open / reuse SQL Server connection?

What would be the most effective way to open/use a SQL Server connection if we're reading rows to be deleted in batches?

foreach(IEnumerable<Log> logsPage in LogsPages)
{
    foreach(Log logEntry in logsPage)
    {
        // 1. get associated filenames
        // 2. delete row
        // 3. try delete each file
    }
}
  • Log page size is about 5000 rows.
  • Files associated with the log entries may vary in size. I don't think they are larger than, say, 500 MB.
  • We use Dapper.

Should we let Dapper open connections on each step of the foreach loop? I suppose SQL Server connection pooling takes place here?

Or should we open an explicit connection per batch?

If you're performing multiple database operations in a tight loop, it is usually preferable to keep the connection open for the duration of all the operations. Returning the connection to the pool can be beneficial in contended systems where there can be an indeterminate interval before the next database operation. But if you're doing lots of sequential operations, constantly fetching and returning connections from the pool (and executing sp_reset_connection, which happens behind the scenes) adds overhead for no good reason.

So to be explicit, I'd have the Open[Async]() here above the first foreach.
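As a rough sketch of that placement, the loop from the question could look something like the following. The Logs and LogFiles tables, the Path column, the Log class shape, and the LogCleanup name are all hypothetical, and Microsoft.Data.SqlClient is assumed as the provider. Dapper only opens and closes a connection itself when it is handed a closed one, so a connection opened up front stays open across every call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Dapper;
using Microsoft.Data.SqlClient;

// Hypothetical shape of a log row; only the key is needed here.
public sealed class Log
{
    public int Id { get; set; }
}

public static class LogCleanup
{
    public static void DeleteLogs(string connectionString,
                                  IEnumerable<IEnumerable<Log>> logsPages)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open(); // opened once, above the first foreach

        foreach (IEnumerable<Log> logsPage in logsPages)
        {
            foreach (Log logEntry in logsPage)
            {
                // 1. get associated filenames (hypothetical LogFiles table)
                List<string> paths = connection.Query<string>(
                    "select Path from LogFiles where LogId = @id",
                    new { id = logEntry.Id }).ToList();

                // 2. delete row (hypothetical Logs table)
                connection.Execute(
                    "delete from Logs where Id = @id",
                    new { id = logEntry.Id });

                // 3. try delete each file; a failure here does not stop the loop
                foreach (string path in paths)
                {
                    try
                    {
                        File.Delete(path);
                    }
                    catch (Exception ex) when (ex is IOException
                                               || ex is UnauthorizedAccessException)
                    {
                        // Best effort: the file may be locked or read-only.
                    }
                }
            }
        }
    } // the connection returns to the pool here, once, when it is disposed
}
```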

Note: for batching, you might find that there are ways of doing this with fewer round-trips, in particular by making use of Dapper's IN re-writing based on the ids. Since you mention SQL Server, this can be combined with setting SqlMapper.Settings.InListStringSplitCount to something positive (5, 10, etc. are reasonable choices; note that this is a global setting). For example, for a simple scenario:

connection.Execute("delete from Foo where Id in @ids",
    new { ids = rows.Select(x => x.Id) });

is much more efficient than:

foreach (var row in rows)
{
    connection.Execute("delete from Foo where Id = @id",
        new { id = row.Id });
}

Without InListStringSplitCount, the first version will be re-written as something like:

delete from Foo where Id in (@ids0, @ids1, @ids2, ..., @idsN)

With InListStringSplitCount, the first version will be re-written as something like:

delete from Foo where Id in (select cast([value] as int) from string_split(@ids,','))

which allows the exact same query to be used many times, which is good for query-plan re-use.
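Putting the pieces together, a minimal sketch of a batched delete might look like this. BatchDelete, Configure, Batches, DeleteByIds, and the default batch size of 1000 are illustrative names and values, not part of Dapper. Enumerable.Chunk assumes .NET 6 or later, and string_split needs SQL Server 2016 or later (database compatibility level 130). Chunking also matters when the setting is off: SQL Server accepts at most 2100 parameters per command, so a full 5000-row page expanded to one parameter per id would be rejected.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;
using Dapper;

public static class BatchDelete
{
    // Call once at application start-up; this is a global Dapper setting.
    public static void Configure()
    {
        // Lists with at least this many items are sent via string_split
        // instead of being expanded to one parameter per id.
        SqlMapper.Settings.InListStringSplitCount = 10;
    }

    // Splits ids into arrays of at most `size` items (Enumerable.Chunk, .NET 6+).
    public static IEnumerable<int[]> Batches(IEnumerable<int> ids, int size)
        => ids.Chunk(size);

    // Deletes rows from the example Foo table, one round-trip per batch.
    // Returns the total number of rows affected.
    public static int DeleteByIds(IDbConnection connection,
                                  IEnumerable<int> ids,
                                  int batchSize = 1000)
    {
        int deleted = 0;
        foreach (int[] batch in Batches(ids, batchSize))
        {
            deleted += connection.Execute(
                "delete from Foo where Id in @ids",
                new { ids = batch });
        }
        return deleted;
    }
}
```

With a page of 5000 log rows this issues five round-trips instead of 5000. The file deletions still have to happen per row, so the filenames would need to be read before the rows are removed.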
