简体   繁体   中英

Running SUM in T-SQL

Sorry for bad topic but I wasn't sure what to call it..

I have a table looking like this:

+-----++-----+
| Id  ||Count|
+-----++-----+
| 1   || 1   |
+-----++-----+
| 2   || 5   |
+-----++-----+
| 3   || 8   |
+-----++-----+
| 4   || 3   |
+-----++-----+
| 5   || 6   |
+-----++-----+
| 6   || 8   |
+-----++-----+
| 7   || 3   |
+-----++-----+
| 8   || 1   |
+-----++-----+

I'm trying to make a select from this table where every time the SUM of row1 + row2 + row3 (etc) reaches 10, then it's a "HIT", and the count starts over again.

Requested output:

+-----++-----++-----+
| Id  ||Count|| HIT |
+-----++-----++-----+
| 1   || 1   ||  N  | Count = 1
+-----++-----++-----+
| 2   || 5   ||  N  | Count = 6
+-----++-----++-----+
| 3   || 8   ||  Y  | Count = 14 (over 10)
+-----++-----++-----+
| 4   || 3   ||  N  | Count = 3
+-----++-----++-----+
| 5   || 6   ||  N  | Count = 9
+-----++-----++-----+
| 6   || 8   ||  Y  | Count = 17 (over 10..)
+-----++-----++-----+
| 7   || 3   ||  N  | Count = 3
+-----++-----++-----+
| 8   || 1   ||  N  | Count = 4
+-----++-----++-----+

How do I do this, and with best performance? I have no idea..

This is too long for a comment.

You can't do this using window/analytic functions, because the breakpoints are not known in advance. Sometimes, it is possible to calculate the breakpoints. However, in this case, the breakpoints depend on a non-linear function of the previous values (I can't think of a better word than "non-linear" right now). That is, sometimes adding "1" to an earlier value has zero effect on the calculation for the current row. Sometimes it has a big effect. The implication is that the calculation has to start at the beginning and iterate through the data.

A minor modification to the problem would be solvable using such functions. If the problem were, instead, to carry over the excess amount for each group (instead of restarting the sum), the problem would be solvable using cumulative sums (and some other trickery).

Recursive queries (which others have provided) or a sequential operation is the best way to approach this problem. Unfortunately, it doesn't have a set-based method for solving it.

You could use Recursive Queries

Please note the following query assuming the id value are all in sequence, otherwise, please use ROW_NUMBER() to create a new id

WITH cte AS (
  SELECT id, [Count], [Count] AS total_count
  FROM Table1 
  WHERE id = 1
  UNION ALL
  SELECT t2.id,t2.[Count], CASE WHEN t1.total_count >= 10 THEN t2.[Count] ELSE t1.total_count + t2.[Count] END
  FROM Table1 t2
  INNER JOIN cte t1 
    ON t2.id = t1.id + 1
  )
SELECT *
FROM cte
ORDER BY id

SQL Fiddle

I'm really hoping someone can show us if it's possible to do this using straight-forward window functions. That's the real challenge.

In the meantime, here is how I would do it using recursion. This handles gaps in the sequence, and handles the edge case of the first row already being >= 10 .

I also added the maxrecursion hint to remove the default recursion limit. But I honestly don't know how well it will run with larger amounts of data.

with NumberedRows as (
  select Id, Cnt,
         row_number() over(order by id) as rn
    from CountTable
), RecursiveCTE as (
  select Id, Cnt, rn, 
         case when Cnt >= 10 then 0 else Cnt end as CumulativeSum,
         case when Cnt >= 10 then 'Y' else 'N' end as hit
    from NumberedRows
   where rn = 1
  union all
  select n.Id, n.Cnt, n.rn,
         case when (n.Cnt + r.CumulativeSum) >= 10 then 0 else n.Cnt + r.CumulativeSum end as CumulativeSum,
         case when (n.Cnt + r.CumulativeSum) >= 10 then 'Y' else 'N' end as hit
    from RecursiveCTE r
    join NumberedRows n
      on n.rn = r.rn + 1
)
select Id, Cnt, hit
from RecursiveCTE
order by Id
option (maxrecursion 0)

SQLFiddle Demo

How about this using Running Totals:

DECLARE @Data TABLE(
    Id INT
    ,SubTotal INT
)


INSERT INTO @Data
    VALUES(1, 5)

INSERT INTO @Data
    VALUES(2, 3)

INSERT INTO @Data
    VALUES(3, 4)

INSERT INTO @Data
    VALUES(4, 4)

INSERT INTO @Data
    VALUES(5, 7)

DECLARE @RunningTotal INT = 0
DECLARE @HitCount INT = 0    

SELECT  
        @RunningTotal = CASE WHEN @RunningTotal < 10 THEN @RunningTotal + SubTotal ELSE SubTotal END
        ,@HitCount = @HitCount + CASE WHEN @RunningTotal >= 10 THEN 1 ELSE 0 END
        FROM @Data ORDER BY Id

SELECT @HitCount -- Outputs 2

Having re-read the question I see this does not meet the required output - I'll leave the answer as it may be of some use to someone looking for an example of a running total solution to this type of problem that doesn't need each row tagged with a Y or an N.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM