
Quickly calculating running totals in SQL Server using set-based operations

I have some data that looks like this:

+---+--------+-------------+---------------+--------------+
|   |   A    |      B      |       C       |      D       |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1      | 1           | 0             | 30           |
| 3 | 2      | 1           | 10            | 30           |
| 4 | 3      | 1           | 0             | 30           |
| 5 | 4      | 2           | 5             | 50           |
| 6 | 5      | 2           | 0             | 50           |
| 7 | 6      | 2           | 15            | 50           |
| 8 | 7      | 2           | 5             | 50           |
| 9 | 8      | 2           | 5             | 50           |
+---+--------+-------------+---------------+--------------+

And I am transforming it to look like this:

+---+--------+-------------+---------------+--------------+
|   |   A    |      B      |       C       |      D       |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1      | 1           | 0             | 30           |
| 3 | 2      | 1           | 10            | 30           |
| 4 | 3      | 1           | 0             | 20           |
| 5 | 4      | 2           | 5             | 50           |
| 6 | 5      | 2           | 0             | 45           |
| 7 | 6      | 2           | 15            | 45           |
| 8 | 7      | 2           | 5             | 30           |
| 9 | 8      | 2           | 5             | 25           |
+---+--------+-------------+---------------+--------------+

Basically, I need to update the total_weight column by subtracting the sum of the excess_weight values from previous rows in the table that belong to the same disposal_id.

I'm currently using a cursor because it's faster than the other solutions I've tried (CTE, triangular join, CROSS APPLY). My cursor solution keeps a running total that is reset to zero for each new disposal_id, increments it by the excess weight, and performs updates when needed; it runs in about 40 seconds. The other solutions I've tried took anywhere from 3 to 5 minutes. Is there a relatively performant way to do this using set-based operations?

I've spent a lot of time optimizing such queries and ended up with two performant options: either store precalculated running totals, as described in "Denormalizing to enforce business rules: Running Totals", or calculate them on the client, which is also fast and easy.
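To illustrate the second option, here is a minimal sketch of computing the adjusted totals on the client in a single pass. It assumes the rows have already been fetched as tuples in the column order shown in the question; the function name `adjust_totals` is made up for this example and is not from the linked article.

```python
from itertools import groupby
from operator import itemgetter

# Rows as (row_id, disposal_id, excess_weight, total_weight), copied from the
# sample data in the question.
rows = [
    (1, 1, 0, 30),
    (2, 1, 10, 30),
    (3, 1, 0, 30),
    (4, 2, 5, 50),
    (5, 2, 0, 50),
    (6, 2, 15, 50),
    (7, 2, 5, 50),
    (8, 2, 5, 50),
]


def adjust_totals(rows):
    """Subtract the excess weight of earlier rows in the same disposal_id."""
    adjusted = []
    # Sort by (disposal_id, row_id) so each group is contiguous and ordered.
    ordered = sorted(rows, key=itemgetter(1, 0))
    for _, group in groupby(ordered, key=itemgetter(1)):
        running_excess = 0  # reset for every new disposal_id
        for row_id, disposal_id, excess, total in group:
            adjusted.append((row_id, disposal_id, excess, total - running_excess))
            running_excess += excess  # only affects rows that come later
    return adjusted


result = adjust_totals(rows)
```

After the sort this is a single O(n) pass, which is the same bookkeeping the cursor in the question performs: one running total per disposal_id, reset when the id changes.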

The other solution you probably already tried is to do something like the answers found here.

Unless you are using Oracle, which has decent aggregates for cumulative sums, you're better off using a cursor. At best, you're going to have to rejoin the table to itself or use another method for what should be an O(n) operation. In general, set-based solutions for problems like these are messy or really messy.
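For reference, "rejoining the table to itself" looks roughly like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server purely so the example can be run; the table name `disposals` is an assumption, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE disposals ("
    " row_id INTEGER PRIMARY KEY, disposal_id INTEGER,"
    " excess_weight INTEGER, total_weight INTEGER)"
)
# Sample data from the question.
conn.executemany(
    "INSERT INTO disposals VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 30), (2, 1, 10, 30), (3, 1, 0, 30),
        (4, 2, 5, 50), (5, 2, 0, 50), (6, 2, 15, 50),
        (7, 2, 5, 50), (8, 2, 5, 50),
    ],
)

# Triangular join: every row is matched with all earlier rows that share its
# disposal_id, so the work grows quadratically with the size of each group.
TRIANGULAR_JOIN = """
SELECT cur.row_id,
       cur.total_weight - COALESCE(SUM(prev.excess_weight), 0) AS adjusted_weight
FROM disposals AS cur
LEFT JOIN disposals AS prev
       ON prev.disposal_id = cur.disposal_id
      AND prev.row_id < cur.row_id
GROUP BY cur.row_id, cur.total_weight
ORDER BY cur.row_id
"""
triangular = conn.execute(TRIANGULAR_JOIN).fetchall()
```

The `prev.row_id < cur.row_id` condition is what makes the join triangular: a group of k rows produces on the order of k²/2 joined pairs, which is consistent with it losing to a single-pass cursor on large groups.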

'Previous rows' implies an ordering, so no, there are no set-based operations there.

Oracle's LEAD and LAG are built for this, but SQL Server forces you into triangular joins... which I suppose you have investigated.
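Whether this limitation applies depends on the engine version: SQL Server added ORDER BY and frame support for aggregate window functions in the 2012 release. On an engine with that support, a windowed SUM expresses "sum of the previous rows in the same group" directly. The sketch below runs the standard-SQL form against Python's built-in `sqlite3` (window functions need SQLite 3.25 or later) as a stand-in; the table name `disposals` is an assumption, and the query has not been benchmarked against the cursor from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE disposals ("
    " row_id INTEGER PRIMARY KEY, disposal_id INTEGER,"
    " excess_weight INTEGER, total_weight INTEGER)"
)
# Sample data from the question.
conn.executemany(
    "INSERT INTO disposals VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 30), (2, 1, 10, 30), (3, 1, 0, 30),
        (4, 2, 5, 50), (5, 2, 0, 50), (6, 2, 15, 50),
        (7, 2, 5, 50), (8, 2, 5, 50),
    ],
)

# The frame ends at "1 PRECEDING", so the current row's own excess_weight is
# excluded. For the first row of each disposal_id the frame is empty and SUM
# yields NULL, hence the COALESCE to 0.
WINDOWED_SUM = """
SELECT row_id,
       total_weight - COALESCE(
           SUM(excess_weight) OVER (
               PARTITION BY disposal_id
               ORDER BY row_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ), 0) AS adjusted_weight
FROM disposals
ORDER BY row_id
"""
windowed = conn.execute(WINDOWED_SUM).fetchall()
```

PARTITION BY plays the role of the cursor's "reset for each new disposal_id", and ORDER BY row_id supplies the ordering that "previous rows" depends on.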
