How to speed up a join and sum product query in SQL Server 2008?

I have two tables, A and B. A has float variables X1, X2, X3, ..., X9 and sumprod, and 14,000 rows. B has float variables X1, X2, X3, ..., X9, a text variable 'Model' with values such as 'Model 1', 'Model 2' and so on, and 50 rows.
I am trying to join them and perform a sumproduct operation using the following code:

Update A set a.sumprod = a.X1*b.X1 + a.X2*b.X2 + ... + a.X9*b.X9
from a left join b
on b.Model = 'Model 2';

I have multiple such queries with different tables as A, and corresponding different join conditions on the Model variable in table B. I have identified these queries as taking the longest time in my stored procedure and am looking for a way to make them faster.

I have tried the variants of this query shown below, without any material change in runtime:

Variant 1:

Update A
set a.sumprod = a.X1*b.X1 + a.X2*b.X2 + ... + a.X9*b.X9
from a left join b
on 1 = 1
where b.Model = 'Model 2';

Variant 2:

merge A
using (select X1, X2, ..., X9 from B where Model = 'Model 2') C
on 1 = 1
when matched then update
set sumprod = a.X1*c.X1 + a.X2*c.X2 + ... + a.X9*c.X9;

Edit for greater clarity:

There are multiple table A's: A1, A2, A3, ... Each table A# contains explanatory variables (X1, X2 etc) for a model (corresponding to the model number in table B).

So table A1 may be:

X1 | X2 | X3 | X4 | Sumprod
6  | 7  | 3  | 5  |
5  | 3  | 4  | 4  |
...

Table A2 would have a different number of explanatory variables, and the explanatory variables themselves would be different. Also, the number of rows would be different from A1.

Table B has model coefficients for each model like so:

Model   | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9
Model 1 | 3  | 2  | 5  | 9  | 0  | 0  | 0  | 0  | 0
Model 2 | 4  | 7  | 8  | 3  | 5  | 8  | 0  | 0  | 0
...

Model 1 has four explanatory variables, so the Model 1 row in table B has zero coefficients for columns X5 onwards.

What I want to do in the sumprod column of each table A is take the sum product of the explanatory variables and the coefficients from the correct row in table B. There is no common row identifier between the table A's and the coefficient table B. I am taking the sum product of EACH row in A1 with a SINGLE row in B.

After the join, I want the sumprod column of table A1 to be populated as below:

X1 | X2 | X3 | X4 | Sumprod
6  | 7  | 3  | 5  | 6*3 + 7*2 + 3*5 + 5*9 = 92
5  | 3  | 4  | 4  | 5*3 + 3*2 + 4*5 + 4*9 = 77
...
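
The expected values can be reproduced with a small self-contained sketch. It assumes hypothetical table variables @A1 and @B in place of the real tables, keeps only columns X1 to X4, and uses the Model 1 coefficients from the example above; it illustrates the intended result rather than a faster query.

```sql
-- Hypothetical table variables standing in for A1 and B; only X1..X4 are kept.
DECLARE @A1 TABLE (X1 float, X2 float, X3 float, X4 float, sumprod float NULL);
DECLARE @B  TABLE (Model varchar(20), X1 float, X2 float, X3 float, X4 float);

INSERT INTO @A1 (X1, X2, X3, X4) VALUES (6, 7, 3, 5), (5, 3, 4, 4);
INSERT INTO @B (Model, X1, X2, X3, X4)
VALUES ('Model 1', 3, 2, 5, 9), ('Model 2', 4, 7, 8, 3);

-- Every row of @A1 is paired with the single 'Model 1' row of @B.
UPDATE a
SET a.sumprod = a.X1 * b.X1 + a.X2 * b.X2 + a.X3 * b.X3 + a.X4 * b.X4
FROM @A1 AS a
CROSS JOIN @B AS b
WHERE b.Model = 'Model 1';

-- sumprod should come back as 92 for the first row and 77 for the second.
SELECT X1, X2, X3, X4, sumprod FROM @A1;
```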

Values for the explanatory variables are fixed but values for the model coefficients are user inputs and are expected to change fairly often.

From the initial comments, it seems that this is not a good database structure for what I want to do. Any suggestions for how I can make this faster?

FROM a LEFT JOIN b ON b.Model = 'Model 2'? I have no idea what behavior you're expecting from this join, and I suspect the query engine is equally confused. Do you actually want a CROSS JOIN? You should just say CROSS JOIN then.
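
To illustrate how the two forms differ, here is a minimal sketch using hypothetical table variables @a and @b with a single column each. The forms only diverge when no row of b matches the model name: the LEFT JOIN still returns every row of a with NULL coefficients (so an UPDATE would set sumprod to NULL), while the CROSS JOIN with a WHERE filter returns nothing. The question's Variant 1 (LEFT JOIN ON 1 = 1 plus a WHERE on b.Model) behaves like the second form, because the WHERE discards the NULL-extended rows.

```sql
-- Hypothetical one-column stand-ins for a and b; note there is no 'Model 2' row.
DECLARE @a TABLE (X1 float);
DECLARE @b TABLE (Model varchar(20), X1 float);
INSERT INTO @a VALUES (6), (5);
INSERT INTO @b VALUES ('Model 1', 3);

-- LEFT JOIN with the filter in ON: both rows of @a come back, coef is NULL.
SELECT a.X1, b.X1 AS coef
FROM @a AS a
LEFT JOIN @b AS b ON b.Model = 'Model 2';

-- CROSS JOIN with the filter in WHERE: no rows come back.
SELECT a.X1, b.X1 AS coef
FROM @a AS a
CROSS JOIN @b AS b
WHERE b.Model = 'Model 2';
```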

Here's what I would do:

UPDATE a
SET a.sumprod = a.X1 * b1.X1 + a.X2 * b1.X2 +...+ a.X9 * b1.X9
FROM a
CROSS JOIN (
    SELECT Model, X1, X2, ..., X9 FROM b where Model = 'Model 2'
    ) b1
WHERE a.sumprod <> a.X1 * b1.X1 + a.X2 * b1.X2 +...+ a.X9 * b1.X9
    OR a.sumprod is NULL;
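
Because only one row of b is involved, another option is to read that row's coefficients into local variables first, so that the UPDATE references only a. This is an alternative sketch rather than the query above: the table variables @a and @b, the variable names @c1 to @c4, and the trimming to four columns with illustrative values are all assumptions, and whether it runs faster depends on the actual execution plan.

```sql
-- Hypothetical stand-ins for a and b, trimmed to X1..X4 with illustrative values.
DECLARE @a TABLE (X1 float, X2 float, X3 float, X4 float, sumprod float NULL);
DECLARE @b TABLE (Model varchar(20), X1 float, X2 float, X3 float, X4 float);
INSERT INTO @a (X1, X2, X3, X4) VALUES (6, 7, 3, 5), (5, 3, 4, 4);
INSERT INTO @b VALUES ('Model 1', 3, 2, 5, 9), ('Model 2', 4, 7, 8, 3);

-- Read the single coefficient row into scalar variables.
DECLARE @c1 float, @c2 float, @c3 float, @c4 float;
SELECT @c1 = X1, @c2 = X2, @c3 = X3, @c4 = X4
FROM @b
WHERE Model = 'Model 2';

-- The UPDATE now touches only @a; no join is involved.
UPDATE @a
SET sumprod = X1 * @c1 + X2 * @c2 + X3 * @c3 + X4 * @c4;

SELECT X1, X2, X3, X4, sumprod FROM @a;
```

If no row matches the model name, the variables stay NULL and every sumprod becomes NULL, so a check on the variables before the UPDATE may be worth adding.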

Is there a reason that you have to CROSS JOIN? Is there truly no relation between a and b? That seems like a design problem. You want to make everything in a.sumprod be a function of what's in one row of b? Are you planning to change that repeatedly? You've abstracted too far to tell what you're trying to accomplish.

Personally, I would create a VIEW that returns the necessary sum products rather than updating a field in a, since storing aggregates is generally a poor idea; but if you're already having performance issues, that may not be wise.
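
A rough sketch of the view idea follows. The names dbo.DemoA1, dbo.DemoB and dbo.DemoA1Sumprod are hypothetical, only X1 to X4 are kept, and GO is the batch separator used by SSMS and sqlcmd (CREATE VIEW has to be the first statement in its batch). The sum product is computed when the view is queried, so a change to the coefficients shows up without re-running an UPDATE.

```sql
-- Hypothetical demo tables; the names and the X1..X4 trimming are assumptions.
CREATE TABLE dbo.DemoA1 (X1 float, X2 float, X3 float, X4 float);
CREATE TABLE dbo.DemoB (Model varchar(20) PRIMARY KEY,
                        X1 float, X2 float, X3 float, X4 float);
INSERT INTO dbo.DemoA1 VALUES (6, 7, 3, 5), (5, 3, 4, 4);
INSERT INTO dbo.DemoB VALUES ('Model 1', 3, 2, 5, 9), ('Model 2', 4, 7, 8, 3);
GO

-- The sum product is derived at query time instead of being stored in the table.
CREATE VIEW dbo.DemoA1Sumprod
AS
SELECT a.X1, a.X2, a.X3, a.X4,
       a.X1 * b.X1 + a.X2 * b.X2 + a.X3 * b.X3 + a.X4 * b.X4 AS sumprod
FROM dbo.DemoA1 AS a
CROSS JOIN dbo.DemoB AS b
WHERE b.Model = 'Model 1';
GO

SELECT X1, X2, X3, X4, sumprod FROM dbo.DemoA1Sumprod;
```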


 