
Why are SQL Server inserts so slow?

I'm trying to insert rows of in-memory data into a table on SQL Server Express 2005. It runs what seems to me very slowly - about 5 seconds per 1000 rows inserted. I am just using a basic "INSERT INTO" command. The slowness does not depend on the table data - it is still slow with a table with one int column and no index. It has nothing to do with my software - it is just as slow running SQL in a loop from Management Studio. Nothing else is accessing the database at the same time. On a 3 GHz Xeon (old, I know), this takes about 10 seconds to execute:

declare @i int
set @i = 0
set nocount on
while @i < 2000
begin
    insert into testdb(testcolumn)
    values (1)
    set @i = @i + 1
end

Is there a better way to bulk-insert in-memory data than looping on INSERT? Or some configuration I should change in SQL Server?

You perform each insert inside its own transaction.

Beginning and committing a transaction is very expensive in SQL Server.

Enclose everything in a single transaction block:

declare @i int
set @i = 0
set nocount on
BEGIN TRANSACTION
while @i < 2000
begin
insert into testdb(testcolumn)
values (1)
set @i = @i + 1
end
COMMIT

To generate the sample data, you can use a recursive CTE:

WITH    q (num) AS
        (
        SELECT  1
        UNION ALL
        SELECT  num + 1
        FROM    q
        WHERE   num < 2000
        )
INSERT
INTO    testdb(testcolumn)
SELECT  1
FROM    q
OPTION (MAXRECURSION 0)

This will be faster than the loop.

1) Log flush on commit. Every transaction has to ensure the log is flushed to disk before the commit returns, and every INSERT statement is an implicit transaction. Batch the commits:

declare @i int
set @i = 0
set nocount on
begin transaction
while @i < 2000
begin
  insert into testdb(testcolumn)
  values (1)
  set @i = @i + 1
  if (@i % 1000 = 0)
  begin
   commit;
   begin transaction;
  end
end
commit

2) Slow disk. Check the Avg. Disk sec/Transfer performance counter for your data and log disks.
3) Too many indexes (unlikely on a test table). For inserts, each index is nearly as expensive as a 'table'.
4) Triggers (again, unlikely).

Ultimately, measure. If you don't know where to start, follow the guidelines of a whitepaper like Troubleshooting Performance Problems in SQL Server 2005.

You have plenty of tools and techniques to get more performance out of this type of workload.

  1. If appropriate, bulk load anything you can. Some things you can't - for example, when records need to be validated, or the destination table has nullable columns...
  2. Consider moving complex data warehousing/ETL operations to a staging database with minimal transaction logging (i.e. the simple recovery model). This will improve performance greatly. Then batch/bulk the data into the destination system.
  3. Batch non-bulk-load insert operations. Commit every n records - start with 1,000 and tune performance from there.
  4. Improve the speed of your disk storage. Smaller, faster disks are much better than bigger, slower ones. On the last database performance tuning project I worked on, we moved from local 10,000 RPM disks to a SAN, then back to solid-state disks on the server for some operations. Solid state most definitely rocks! But it is expensive.
  5. Use the force - um, the performance tuning tools for SQL Server - to find less obvious bottlenecks. Sometimes the best course of action might be to drop and rebuild indexes based on what percentage of records is being inserted/deleted relative to the table size; disable triggers during certain operations; or modify the sparseness of records in data blocks.
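As a minimal sketch of point 1, a bulk load from a flat file might look like this (the file path and format options are hypothetical - adjust them for your data):

```sql
-- Minimal bulk load sketch; C:\data\rows.csv is a hypothetical file
-- with one value per line. TABLOCK and BATCHSIZE help throughput.
BULK INSERT testdb
FROM 'C:\data\rows.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK,
    BATCHSIZE = 10000
);
```

From client code, SqlBulkCopy in ADO.NET achieves the same effect without going through an intermediate file.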

In addition to indexes, if your actual scenario is as per your example, you could use a set-based approach to insert 2000 records like this:

INSERT testdb(testcolumn)
SELECT 1
FROM master..spt_values
WHERE type = 'P'
  AND number BETWEEN 1 AND 2000

Insert speed is driven by the following things:

  1. The speed of your log disk. In particular, it's important that the log be on a volume by itself, so that disk seeks don't slow things down (this can be a 40x effect).
  2. The structure of your table and its associated indexes, keys, triggers, etc.
  3. The size of your transactions. Larger transactions require fewer round-trips to the log disk and carry less associated overhead.
  4. The size of your command batches. Larger batches are more efficient than many individual ones.

In case it's of any interest, I go through this in detail in my book (Ultra-Fast ASP.NET), including benchmarks and example code.

Having a clustered index (usually the primary key) actually increases insert speed, so verify you have one. And running 1000 individual transactions against a table isn't the fastest way if you can gather all of the data at once and insert it into the table in one go (this can be accomplished by using table-valued parameters in SQL Server 2008, or XML parameters in 2005).
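A minimal sketch of the table-valued parameter approach on SQL Server 2008 (the type and procedure names here are made up for illustration):

```sql
-- Hypothetical table type holding the rows to insert
CREATE TYPE dbo.TestRows AS TABLE (testcolumn int);
GO
-- Procedure receiving the whole batch in a single call
CREATE PROCEDURE dbo.InsertTestRows
    @rows dbo.TestRows READONLY
AS
    INSERT INTO testdb (testcolumn)
    SELECT testcolumn FROM @rows;
```

The client fills a DataTable (or similar) and passes it as one parameter, so the whole batch is a single round-trip and a single transaction.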

I would google "SQL Server tuning"... There are many books written on the subject. It is a very hard thing to solve, as there are MANY things that affect speed: query syntax, the RAM allocated to the server, the proportions of that RAM (which parts of SQL Server you allocate RAM to), RAID array configuration, and MANY other factors. You can have a database server optimized for inserts/updates (OLTP) or for querying (data-warehouse-type stuff). In other words, don't expect a single, simple answer to this, even though your problem seems straightforward.

This is why you have database server administrators.

Or, if timing is not very important to you, you could just not sweat the server-side issues and optimize your client code as much as possible.

I would look into prepared statements and transactions as a way to begin optimizing. Then look at indexing (if this is a set of inserts that does not happen very often, I would consider dropping the indexes, doing the import, and creating the indexes again).
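As a sketch of the prepared-statement idea, a parameterized insert sent through sp_executesql lets SQL Server cache one plan and reuse it across calls (the value here is illustrative):

```sql
-- The parameterized statement is compiled once and reused,
-- avoiding a fresh parse/compile for every insert.
EXEC sp_executesql
    N'INSERT INTO testdb (testcolumn) VALUES (@v)',
    N'@v int',
    @v = 1;
```

Client libraries expose the same mechanism through parameterized commands (e.g. SqlCommand with SqlParameter in ADO.NET).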
