
SQL - Update query performance for large table (900 million) records

I have a database table with 900 million records. I need to update 4 different keys in that table by joining it to a dimension and setting each key of the fact table to the key of the dimension. I have written 4 separate SQL scripts (see the example below) to perform the update, but they take far too long to execute. The query has been running for more than 20 hours, and I cannot tell how far it has gotten or how much longer it will take. Is there any way to improve this so that it completes in a few hours? Would adding indexes help?

UPDATE f
SET f.ClientKey = c.ClientKey
FROM dbo.FactSales f
JOIN dbo.DimClient c
ON f.ClientId = c.ClientId
  1. Script out the foreign keys, then drop them.
  2. Script out the indexes on the updated columns (those that are not part of the join condition), then drop them.
  3. Disable triggers, if any exist.
  4. Stop all processes that can take locks on the table (all of them, including SELECTs).
  5. Update your keys.
  6. Recreate the foreign keys and indexes, and re-enable the triggers.
  7. Be happy.
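
The steps above could be sketched in T-SQL roughly as follows. This is only an illustration under assumptions: the constraint name FK_FactSales_DimClient and the index name IX_FactSales_ClientKey are hypothetical, and the real definitions should be scripted out before anything is dropped. Step 4 (stopping other processes) is operational and is not shown.

```sql
-- Hypothetical object names; substitute the ones scripted from your database.

-- 1. Drop the foreign key on the column being updated.
ALTER TABLE dbo.FactSales DROP CONSTRAINT FK_FactSales_DimClient;

-- 2. Drop the nonclustered index on the updated column.
DROP INDEX IX_FactSales_ClientKey ON dbo.FactSales;

-- 3. Disable all triggers on the fact table.
DISABLE TRIGGER ALL ON dbo.FactSales;

-- 5. Run the update.
UPDATE f
SET f.ClientKey = c.ClientKey
FROM dbo.FactSales AS f
JOIN dbo.DimClient AS c
  ON f.ClientId = c.ClientId;

-- 6. Put everything back: triggers, index, then the foreign key.
ENABLE TRIGGER ALL ON dbo.FactSales;

CREATE NONCLUSTERED INDEX IX_FactSales_ClientKey
    ON dbo.FactSales (ClientKey);

ALTER TABLE dbo.FactSales WITH CHECK
    ADD CONSTRAINT FK_FactSales_DimClient
    FOREIGN KEY (ClientKey) REFERENCES dbo.DimClient (ClientKey);
```

An index on the join column (f.ClientId here) is deliberately left in place, since the update still needs it to find matching rows.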

A comment on step 5: first prepare a set that pairs the destination table's primary key with all of the new key values, then apply it in a single statement. This lowers the join cost, because the update itself needs only one join.

You can use this to avoid filling up the transaction log:

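-- Prime @@ROWCOUNT so the loop body runs at least once.
SELECT 1;
WHILE (@@ROWCOUNT > 0)
BEGIN
    -- TOP belongs to UPDATE, not to the SET clause.
    UPDATE TOP (100000) f
    SET f.ClientKey = c.ClientKey
    FROM dbo.FactSales f
    JOIN dbo.DimClient c
    ON  f.ClientId = c.ClientId
    -- Only touch rows that still need the new key, so the loop terminates.
    -- The IS NULL test is needed if ClientKey starts out NULL,
    -- because NULL != value is never true.
    AND (f.ClientKey != c.ClientKey OR f.ClientKey IS NULL);
END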
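
Since the question mentions not knowing how far the update has progressed, a variant of the batched loop that reports progress after each batch might look like the sketch below. The variable names and the message text are illustrative assumptions, not part of the original answer.

```sql
DECLARE @batch INT    = 100000;  -- rows per batch (tune to your log size)
DECLARE @rows  INT    = 1;       -- rows touched by the last batch
DECLARE @total BIGINT = 0;       -- running total for progress messages

WHILE @rows > 0
BEGIN
    UPDATE TOP (@batch) f
    SET f.ClientKey = c.ClientKey
    FROM dbo.FactSales AS f
    JOIN dbo.DimClient AS c
      ON f.ClientId = c.ClientId
     AND (f.ClientKey <> c.ClientKey OR f.ClientKey IS NULL);

    -- Capture the count immediately; later statements reset @@ROWCOUNT.
    SET @rows  = @@ROWCOUNT;
    SET @total = @total + @rows;

    -- Severity 0 with NOWAIT sends the message to the client right away.
    RAISERROR('Updated %I64d rows so far', 0, 1, @total) WITH NOWAIT;
END
```

Each batch commits on its own, so if the loop is interrupted the rows already updated stay updated and a rerun picks up where it stopped.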

If you need to update 4 different keys, do them all at once in a single statement.
Most of the cost is in acquiring locks.
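
A single statement that sets all four keys might look like the sketch below. Only DimClient appears in the question; DimProduct, DimStore, and DimEmployee and their columns are hypothetical stand-ins for the other three dimensions.

```sql
-- Hypothetical dimensions other than DimClient; adjust names to your schema.
UPDATE f
SET f.ClientKey   = COALESCE(c.ClientKey,   f.ClientKey),
    f.ProductKey  = COALESCE(p.ProductKey,  f.ProductKey),
    f.StoreKey    = COALESCE(s.StoreKey,    f.StoreKey),
    f.EmployeeKey = COALESCE(e.EmployeeKey, f.EmployeeKey)
FROM dbo.FactSales AS f
-- LEFT JOINs so a row missing from one dimension still gets its other keys;
-- COALESCE keeps the existing value when there is no match.
LEFT JOIN dbo.DimClient   AS c ON f.ClientId   = c.ClientId
LEFT JOIN dbo.DimProduct  AS p ON f.ProductId  = p.ProductId
LEFT JOIN dbo.DimStore    AS s ON f.StoreId    = s.StoreId
LEFT JOIN dbo.DimEmployee AS e ON f.EmployeeId = e.EmployeeId;
```

With inner joins instead, any fact row lacking a match in even one dimension would be skipped entirely, which may or may not be what you want.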

Disable any index on f.ClientKey, run the update, and then rebuild it.
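
As a rough sketch, assuming a nonclustered index with the hypothetical name IX_FactSales_ClientKey exists on the column:

```sql
-- Hypothetical index name. Only disable nonclustered indexes:
-- disabling a clustered index makes the table inaccessible.
ALTER INDEX IX_FactSales_ClientKey ON dbo.FactSales DISABLE;

UPDATE f
SET f.ClientKey = c.ClientKey
FROM dbo.FactSales AS f
JOIN dbo.DimClient AS c
  ON f.ClientId = c.ClientId;

-- REBUILD re-enables the index and repopulates it in one pass.
ALTER INDEX IX_FactSales_ClientKey ON dbo.FactSales REBUILD;
```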

If you are sure DimClient is not going to change, read it with (nolock), but you need to be sure.

If yours is the only process that needs to update FactSales, take a tablock holdlock.
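
The two locking hints could be combined on the update roughly like this. It is a sketch that assumes nothing else writes to either table while it runs:

```sql
UPDATE f
SET f.ClientKey = c.ClientKey
-- One table-level lock held for the whole statement,
-- instead of many row or page locks.
FROM dbo.FactSales AS f WITH (TABLOCK, HOLDLOCK)
-- NOLOCK skips shared locks on the dimension; only safe if it is not changing.
JOIN dbo.DimClient AS c WITH (NOLOCK)
  ON f.ClientId = c.ClientId;
```

NOLOCK is allowed here because DimClient is only read; SQL Server does not accept it on the table being modified.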

Create a new table with the correct values. Add indexes and constraints afterwards. Then drop the existing table and rename the new one to the existing name, in a single transaction if possible.
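
A minimal sketch of this rebuild-and-swap approach follows. The table name FactSales_New and the columns SalesId and Amount are hypothetical; the real SELECT list would name every column of the fact table.

```sql
-- Build the new table with the correct key in one pass.
SELECT f.SalesId,          -- hypothetical column
       f.ClientId,
       c.ClientKey,        -- taken from the dimension, not the old fact row
       f.Amount            -- hypothetical column
INTO dbo.FactSales_New
FROM dbo.FactSales AS f
LEFT JOIN dbo.DimClient AS c
  ON f.ClientId = c.ClientId;

-- Add indexes and constraints on dbo.FactSales_New here, before the swap.

-- Swap the tables atomically.
BEGIN TRANSACTION;
    DROP TABLE dbo.FactSales;
    EXEC sp_rename 'dbo.FactSales_New', 'FactSales';
COMMIT TRANSACTION;
```

A more cautious variant renames the old table (for example to FactSales_Old) instead of dropping it, and drops it only after the new table has been verified.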

