
SQL Server update query extremely slow

I want to run a simple update statement on a table with roughly 200 million records. However, it seems to be taking ages.

UPDATE a
SET hybrid_trade_flag = CASE WHEN b.trade_id IS NOT NULL THEN 'Y' ELSE 'N' END
FROM [tbl_master_trades] a
LEFT JOIN [tbl_hybrid_trades_subset] b
    ON a.trade_id = b.trade_id

The first table, tbl_master_trades, has roughly 200 million records and a composite index on trade_id and one other column. The second table, tbl_hybrid_trades_subset, has around 200k records. The query ran for over 40 minutes before I had to cancel it, and the cancellation itself took around 30 minutes.

I thought that converting the second table into a temp table and splitting the statement might help, so I rewrote it as follows:

UPDATE a
SET hybrid_trade_flag = 'Y'
FROM [tbl_master_trades] a
INNER JOIN #tmp_hybrid_trades b
    ON a.trade_id = b.trade_id

UPDATE a
SET hybrid_trade_flag = 'N'
FROM [tbl_master_trades] a
WHERE hybrid_trade_flag IS NULL

Even these two queries took 30 minutes to run. I need to run about 80 such updates on the first table, so I'm not sure this is viable, as it would take days. Can someone please advise whether and how I can speed this up?

I would start by rewriting the query to use exists:

update t
set hybrid_trade_flag = case 
    when exists(select 1 from tbl_hybrid_trades_subset ts where ts.trade_id = t.trade_id)
    then 'Y'
    else 'N'
end
from tbl_master_trades t

Then, I would recommend an index on tbl_hybrid_trades_subset(trade_id) so the subquery can execute quickly.

An index on tbl_master_trades(trade_id) alone (without any other column in the index) might also help, but the index on the table that the subquery addresses seems more important.
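As a rough sketch of that optional second index: the index name below is made up for illustration, and whether the index pays off depends on the existing composite index. If trade_id is already the leading column of that composite index, SQL Server can seek on it and a separate index probably adds little. Building any index on a table of this size will itself take time.

```sql
-- Hypothetical index name; single-column index on the large table.
-- Only worth creating if trade_id is NOT already the leading column
-- of the existing composite index.
CREATE INDEX idx_master_trade_id
    ON tbl_master_trades (trade_id);
```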

That said, 200M rows is still a large number of rows to process, so the query will probably take quite a lot of time anyway.
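One common way to make a long-running update of this size more manageable is to run it in batches, so each transaction stays small and a cancellation only has to roll back the current batch rather than all the work done so far. This is a different technique from the single statement above, shown only as a minimal sketch. It assumes hybrid_trade_flag is NULL for every row that has not been processed yet (as the question's second query suggests), and the batch size is an arbitrary starting point to tune.

```sql
-- Assumption: unprocessed rows have hybrid_trade_flag = NULL.
DECLARE @batch_size int = 500000;  -- arbitrary; tune for your log and lock behaviour
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    -- Each iteration flags at most @batch_size still-unprocessed rows.
    UPDATE TOP (@batch_size) t
    SET hybrid_trade_flag = CASE
            WHEN EXISTS (SELECT 1
                         FROM tbl_hybrid_trades_subset ts
                         WHERE ts.trade_id = t.trade_id)
            THEN 'Y'
            ELSE 'N'
        END
    FROM tbl_master_trades t
    WHERE t.hybrid_trade_flag IS NULL;

    -- Stop once a pass finds nothing left to update.
    SET @rows = @@ROWCOUNT;
END
```

Because every processed row ends up as 'Y' or 'N', the WHERE clause shrinks the remaining work on each pass and the loop terminates when no NULL flags are left.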

You're running into two problems:

  • you're checking the existence of roughly 200M values in another table
  • you're updating roughly 200M values in the base table

To work around this you can:

  • add the appropriate index to the lookup table
  • avoid updating rows where it is not strictly needed

The index in question would be:

CREATE INDEX idx_trade_id ON tbl_hybrid_trades_subset (trade_id)

To limit the number of rows updated, use this:

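UPDATE a
   SET hybrid_trade_flag = CASE WHEN b.trade_id IS NOT NULL THEN 'Y' ELSE 'N' END
  FROM [tbl_master_trades] a
  LEFT OUTER JOIN [tbl_hybrid_trades_subset] b
               ON b.trade_id = a.trade_id
 -- Only touch rows whose flag would actually change. The IS NULL check is
 -- needed because NULL != 'Y' evaluates to unknown, which would skip unset rows.
 WHERE a.hybrid_trade_flag IS NULL
    OR a.hybrid_trade_flag != CASE WHEN b.trade_id IS NOT NULL THEN 'Y' ELSE 'N' END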

The first time might still take a while, but subsequent updates should be quite a bit faster.
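To get a feel for how much work a re-run would actually do, one option is to count the rows whose flag would change before running the update. This is only a sketch: it uses the same join and comparison as the update, also counts rows whose flag is still NULL, and assumes trade_id is unique in tbl_hybrid_trades_subset (otherwise the join would count some master rows more than once).

```sql
-- Counts rows whose hybrid_trade_flag is unset or differs from the value
-- the update would assign. Assumes trade_id is unique in the subset table.
SELECT COUNT(*) AS rows_to_change
FROM tbl_master_trades a
LEFT OUTER JOIN tbl_hybrid_trades_subset b
    ON b.trade_id = a.trade_id
WHERE a.hybrid_trade_flag IS NULL
   OR a.hybrid_trade_flag <> CASE WHEN b.trade_id IS NOT NULL THEN 'Y' ELSE 'N' END;
```

A small count here suggests the filtered update will finish quickly; a count close to the table size means it will cost about as much as the first run.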
