The query below, which I'm running through SQL Server Management Studio, is painfully slow.
The input table tbl_sb12_bhs has about 40000 records, and after an hour only 40 records have been processed.
What can be changed here to make this run faster?
DECLARE @bsrange INT
SET @bsrange = 0
WHILE @bsrange <= (SELECT max([p_a_l_out])
FROM [DB001].[FD\f7].[tbl_sb12_bhs])
BEGIN
INSERT INTO [FD\f7].tbl_sb13_b_lin1
(aId,
p_a_l_out,
bs_id,
bs_db,
bs_tbl,
bs_column,
Int1,
cd1,
Hop1,
Int2,
cd2,
Hop2,
Int3,
cd3,
Hop3,
Int4,
cd4,
Hop4,
Int5,
cd5,
Hop5,
Int6,
cd6,
Hop6,
Int7,
cd7,
Hop7,
Int8,
cd8,
Hop8,
Int9,
cd9,
Hop9,
Int10,
cd10,
Hop10,
Int11,
cd11,
Hop11,
Int12,
cd12,
Hop12,
Int13,
cd13,
Hop13,
Int14,
cd14,
Hop14,
Int15,
cd15,
Hop15,
Int16,
cd16,
Hop16)
SELECT DISTINCT tbl_sb12_bhs.aId,
tbl_sb12_bhs.p_a_l_out,
tbl_sb12_bhs.bs_id,
tbl_sb12_bhs.bs_db,
tbl_sb12_bhs.bs_tbl,
tbl_sb12_bhs.bs_column,
tbl_rpt_val_pt_crl.pt_el_Int AS Int1,
tbl_rpt_val_pt_crl.user_cd AS cd1,
tbl_rpt_val_pt_crl.cfk_upel AS Hop1,
tbl_rpt_val_pt_crl_1.pt_el_Int AS Int2,
tbl_rpt_val_pt_crl_1.user_cd AS cd2,
tbl_rpt_val_pt_crl_1.cfk_upel AS Hop2,
tbl_rpt_val_pt_crl_2.pt_el_Int AS Int3,
tbl_rpt_val_pt_crl_2.user_cd AS cd3,
tbl_rpt_val_pt_crl_2.cfk_upel AS Hop3,
tbl_rpt_val_pt_crl_3.pt_el_Int AS Int4,
tbl_rpt_val_pt_crl_3.user_cd AS cd4,
tbl_rpt_val_pt_crl_3.cfk_upel AS Hop4,
tbl_rpt_val_pt_crl_4.pt_el_Int AS Int5,
tbl_rpt_val_pt_crl_4.user_cd AS cd5,
tbl_rpt_val_pt_crl_4.cfk_upel AS Hop5,
tbl_rpt_val_pt_crl_5.pt_el_Int AS Int6,
tbl_rpt_val_pt_crl_5.user_cd AS cd6,
tbl_rpt_val_pt_crl_5.cfk_upel AS Hop6,
tbl_rpt_val_pt_crl_6.pt_el_Int AS Int7,
tbl_rpt_val_pt_crl_6.user_cd AS cd7,
tbl_rpt_val_pt_crl_6.cfk_upel AS Hop7,
tbl_rpt_val_pt_crl_7.pt_el_Int AS Int8,
tbl_rpt_val_pt_crl_7.user_cd AS cd8,
tbl_rpt_val_pt_crl_7.cfk_upel AS Hop8,
tbl_rpt_val_pt_crl_8.pt_el_Int AS Int9,
tbl_rpt_val_pt_crl_8.user_cd AS cd9,
tbl_rpt_val_pt_crl_8.cfk_upel AS Hop9,
tbl_rpt_val_pt_crl_9.pt_el_Int AS Int10,
tbl_rpt_val_pt_crl_9.user_cd AS cd10,
tbl_rpt_val_pt_crl_9.cfk_upel AS Hop10,
tbl_rpt_val_pt_crl_10.pt_el_Int AS Int11,
tbl_rpt_val_pt_crl_10.user_cd AS cd11,
tbl_rpt_val_pt_crl_10.cfk_upel AS Hop11,
tbl_rpt_val_pt_crl_11.pt_el_Int AS Int12,
tbl_rpt_val_pt_crl_11.user_cd AS cd12,
tbl_rpt_val_pt_crl_11.cfk_upel AS Hop12,
tbl_rpt_val_pt_crl_12.pt_el_Int AS Int13,
tbl_rpt_val_pt_crl_12.user_cd AS cd13,
tbl_rpt_val_pt_crl_12.cfk_upel AS Hop13,
tbl_rpt_val_pt_crl_13.pt_el_Int AS Int14,
tbl_rpt_val_pt_crl_13.user_cd AS cd14,
tbl_rpt_val_pt_crl_13.cfk_upel AS Hop14,
tbl_rpt_val_pt_crl_14.pt_el_Int AS Int15,
tbl_rpt_val_pt_crl_14.user_cd AS cd15,
tbl_rpt_val_pt_crl_14.cfk_upel AS Hop15,
tbl_rpt_val_pt_crl_15.pt_el_Int AS Int16,
tbl_rpt_val_pt_crl_15.user_cd AS cd16,
tbl_rpt_val_pt_crl_15.cfk_upel AS Hop16
FROM (SELECT DISTINCT pk_a AS aId,
p_a_l_out,
bs_id,
bs_db,
bs_tbl,
bs_column,
hop_pt_id_1,
hop_pt_id_2,
hop_pt_id_3,
hop_pt_id_4,
hop_pt_id_5,
hop_pt_id_6,
hop_pt_id_7,
hop_pt_id_8,
hop_pt_id_9,
hop_pt_id_10,
hop_pt_id_11,
hop_pt_id_12,
hop_pt_id_13,
hop_pt_id_14,
hop_pt_id_15,
hop_pt_id_16
FROM [FD\f7].tbl_sb12_bhs
WHERE [p_a_l_out] >= @bsrange
AND [p_a_l_out] < ( @bsrange + 1 )) AS tbl_sb12_bhs
LEFT JOIN tbl_rpt_val_pt_crl
ON tbl_sb12_bhs.hop_pt_id_1 = tbl_rpt_val_pt_crl.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_1
ON tbl_sb12_bhs.hop_pt_id_2 = tbl_rpt_val_pt_crl_1.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_2
ON tbl_sb12_bhs.hop_pt_id_3 = tbl_rpt_val_pt_crl_2.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_3
ON tbl_sb12_bhs.hop_pt_id_4 = tbl_rpt_val_pt_crl_3.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_4
ON tbl_sb12_bhs.hop_pt_id_5 = tbl_rpt_val_pt_crl_4.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_5
ON tbl_sb12_bhs.hop_pt_id_6 = tbl_rpt_val_pt_crl_5.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_6
ON tbl_sb12_bhs.hop_pt_id_7 = tbl_rpt_val_pt_crl_6.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_7
ON tbl_sb12_bhs.hop_pt_id_8 = tbl_rpt_val_pt_crl_7.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_8
ON tbl_sb12_bhs.hop_pt_id_9 = tbl_rpt_val_pt_crl_8.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_9
ON tbl_sb12_bhs.hop_pt_id_10 = tbl_rpt_val_pt_crl_9.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_10
ON tbl_sb12_bhs.hop_pt_id_11 = tbl_rpt_val_pt_crl_10.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_11
ON tbl_sb12_bhs.hop_pt_id_12 = tbl_rpt_val_pt_crl_11.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_12
ON tbl_sb12_bhs.hop_pt_id_13 = tbl_rpt_val_pt_crl_12.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_13
ON tbl_sb12_bhs.hop_pt_id_14 = tbl_rpt_val_pt_crl_13.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_14
ON tbl_sb12_bhs.hop_pt_id_15 = tbl_rpt_val_pt_crl_14.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_15
ON tbl_sb12_bhs.hop_pt_id_16 = tbl_rpt_val_pt_crl_15.sk_el_pt
SET @bsrange = @bsrange + 1
END
Well, if you have one or more indexes on the target table, SQL Server has to maintain them for every row inserted. I'd disable any indexes on the target table and then re-enable them when the insert is complete. I'd batch the inserts into ranges of (say) 5k records so any blocking is reduced, or I'd write the result of the select to a temp file and bcp the results in. As it stands, you're doing that horrendous set of left joins on every pass, sometimes for as little as one inserted record. In my experience SQL Server struggles to optimise more than about 7 or 8 left or right joins. My guess is that there are few or no indexes on the tables being selected from, which means a table scan for each join, or around 17 table scans for each row inserted. Sorry, but this approach is wrong at every stage. Or you could get your boss to buy you a datacentre....
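A minimal sketch of the two index suggestions above. The index names `IX_crl_sk_el_pt` and `IX_lin1_example` are hypothetical, and whether the target table has any nonclustered indexes at all is an assumption, so check the actual schema first. Note that only nonclustered indexes should be disabled this way: disabling a clustered index makes the table inaccessible until it is rebuilt.

```sql
-- Assumption: the lookup table has no index on the join column yet.
-- A covering index lets each of the 16 joins seek instead of scan.
CREATE NONCLUSTERED INDEX IX_crl_sk_el_pt            -- hypothetical name
    ON tbl_rpt_val_pt_crl (sk_el_pt)
    INCLUDE (pt_el_Int, user_cd, cfk_upel);

-- Assumption: IX_lin1_example is a NONCLUSTERED index on the target table.
-- Disable it before the bulk insert ...
ALTER INDEX IX_lin1_example ON [FD\f7].tbl_sb13_b_lin1 DISABLE;

-- ... run the INSERT here ...

-- ... and rebuild it once afterwards, which also re-enables it.
ALTER INDEX IX_lin1_example ON [FD\f7].tbl_sb13_b_lin1 REBUILD;
```

Rebuilding once at the end replaces per-row index maintenance with a single pass, which tends to pay off when most of the table is being loaded.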
My best guess is that it's slow because you're doing a number of intensive operations all in one go. Without any sample data it's tough, but I can try to make a few suggestions.
Given that it has processed only 40 records after an hour, it's whatever is going on inside the loop that's slowing you down.
SELECT DISTINCT isn't cheap because it has to compare all the data, and you're comparing quite a lot of columns as well. If you can, it might run quicker to limit the DISTINCT to the bare minimum of columns required and then join that result back to the original table. It should be simple enough to test this in isolation from the rest of the query, to check that you're getting the same results and whether or not it's quicker.
Also, the more joins you have, the worse the performance is in general... the price we pay for normalisation.
Anyway, I would take a step back from it and try to break this down into its smallest units of work and then you can test each one individually until you find the culprit. In doing so, you might think of a much better way to do this. Again, without any sample data this is a difficult one for me to help with.
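One way to test the pieces individually, as a rough sketch: turn on SQL Server's I/O and timing statistics and run the inner DISTINCT on its own, then add joins back one at a time. The aliases `b` and `c1` are introduced here for brevity, and the column list is trimmed to the first hop only.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @bsrange INT;
SET @bsrange = 0;

-- Step 1: the inner DISTINCT by itself, for a single range value.
SELECT DISTINCT pk_a AS aId, p_a_l_out, bs_id, bs_db, bs_tbl, bs_column,
       hop_pt_id_1
FROM   [FD\f7].tbl_sb12_bhs
WHERE  p_a_l_out >= @bsrange
  AND  p_a_l_out < ( @bsrange + 1 );

-- Step 2: the same range with one lookup join added.
SELECT DISTINCT b.pk_a AS aId, b.p_a_l_out,
       c1.pt_el_Int AS Int1, c1.user_cd AS cd1, c1.cfk_upel AS Hop1
FROM   [FD\f7].tbl_sb12_bhs AS b
       LEFT JOIN tbl_rpt_val_pt_crl AS c1
              ON b.hop_pt_id_1 = c1.sk_el_pt
WHERE  b.p_a_l_out >= @bsrange
  AND  b.p_a_l_out < ( @bsrange + 1 );

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages tab in Management Studio then shows logical reads and elapsed time per statement. A large jump in reads on `tbl_rpt_val_pt_crl` between the two steps would point at the join rather than the DISTINCT.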
Another thing you might want to do is run the joins once, up front, into a temp table and then reference that. You would not have to repeat the DISTINCT or the joins on every iteration; just add the WHERE clause for the bsrange.
So it would be something like:
-- Create a temporary table with as much of the joins/DISTINCT as you can.
WHILE .....
    INSERT INTO [FD\f7].tbl_sb13_b_lin1
    SELECT * FROM temptable
    WHERE [p_a_l_out] >= @bsrange
      AND [p_a_l_out] < ( @bsrange + 1 )
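A fuller sketch of the temp-table idea, under some loud assumptions: only hops 1 and 2 are written out (hops 3 to 16 would follow the same pattern), the temp table name `#bhs_joined`, the index name, the variable `@maxrange`, and the aliases `b`, `c1`, `c2` are all made up for illustration, and the target columns left out here are assumed to accept NULL.

```sql
-- Stage the joined, de-duplicated rows once, outside the loop.
SELECT DISTINCT
       b.pk_a AS aId, b.p_a_l_out, b.bs_id, b.bs_db, b.bs_tbl, b.bs_column,
       c1.pt_el_Int AS Int1, c1.user_cd AS cd1, c1.cfk_upel AS Hop1,
       c2.pt_el_Int AS Int2, c2.user_cd AS cd2, c2.cfk_upel AS Hop2
       -- hops 3-16 omitted in this sketch
INTO   #bhs_joined
FROM   [FD\f7].tbl_sb12_bhs AS b
       LEFT JOIN tbl_rpt_val_pt_crl AS c1 ON b.hop_pt_id_1 = c1.sk_el_pt
       LEFT JOIN tbl_rpt_val_pt_crl AS c2 ON b.hop_pt_id_2 = c2.sk_el_pt;

-- Index the range column so each loop pass is a seek, not a scan.
CREATE CLUSTERED INDEX IX_bhs_joined_range ON #bhs_joined (p_a_l_out);

DECLARE @bsrange INT, @maxrange INT;
SET @bsrange = 0;
-- Compute the upper bound once instead of on every loop test.
SELECT @maxrange = MAX(p_a_l_out) FROM #bhs_joined;

WHILE @bsrange <= @maxrange
BEGIN
    INSERT INTO [FD\f7].tbl_sb13_b_lin1
           (aId, p_a_l_out, bs_id, bs_db, bs_tbl, bs_column,
            Int1, cd1, Hop1, Int2, cd2, Hop2)
    SELECT aId, p_a_l_out, bs_id, bs_db, bs_tbl, bs_column,
           Int1, cd1, Hop1, Int2, cd2, Hop2
    FROM   #bhs_joined
    WHERE  p_a_l_out >= @bsrange
      AND  p_a_l_out < ( @bsrange + 1 );

    SET @bsrange = @bsrange + 1;
END

DROP TABLE #bhs_joined;
```

Listing the columns explicitly rather than using `SELECT *` keeps the insert from silently breaking if the column order of the two tables differs. Once the staged table exists, the loop itself may be unnecessary: a single `INSERT ... SELECT` from `#bhs_joined` without the WHERE clause would load the same rows in one statement, if batching is not needed to limit blocking.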