
SQL Server - SUM Preceding Rows With Condition Linked to Original Row

For each row in the example data set below, the code sums the previous 5 rows when a certain condition is met.

The problem I'm having is that the condition needs to reference the rating of the original row, e.g. I need to sum preceding rows only if their rating is within 1 of the current row's rating.

Example data:

DECLARE @tbl TABLE
(
    Team   varchar(1),
    Date   date,
    Rating int,
    Score  int
);

INSERT INTO @tbl (Team, Date, Rating, Score)
VALUES
    ('a', '2020-12-05', 20, 1),
    ('a', '2020-12-04', 18, 8),
    ('a', '2020-12-03', 21, 3),
    ('a', '2020-12-02', 19, 4),
    ('a', '2020-12-01', 19, 3);

Current code:

SELECT
    Rating, 
    SUM(CASE WHEN Rating >= (Rating-1) AND  Rating <= (Rating+1) THEN SCORE END) 
        OVER (partition by Team ORDER BY Date ASC ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING) AS SUM
FROM
    @tbl
ORDER BY 
    Date DESC

Output:

    +------------------+------------+------------+
    |  Rating          | Current    | Required   | 
    +------------------+------------+------------+
    | 20               | 18         |     7      |
    | 18               | 10         |     7      |
    | 21               | 7          |     NULL   |
    | 19               | 3          |     3      |
    | 18               | NULL       |     NULL   |
    +------------------+------------+------------+

The problem is that the following section of the code does not work, because the rating is assessed on a row-by-row basis, so each preceding row's rating is only ever compared with itself:

CASE WHEN Rating >= (Rating-1) AND  Rating <= (Rating+1)
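
To illustrate why that condition has no effect, here is a minimal Python sketch (not T-SQL) that mimics the window sum from the question. The `rows` list copies the example data, and the function name `windowed_sum_self_compare` is made up for illustration. Because every row in the frame is compared with its own rating, the filter always passes and the result is just the plain sum of the preceding rows.

```python
# Example data from the question, oldest first: (date, rating, score).
rows = [
    ("2020-12-01", 19, 3),
    ("2020-12-02", 19, 4),
    ("2020-12-03", 21, 3),
    ("2020-12-04", 18, 8),
    ("2020-12-05", 20, 1),
]


def windowed_sum_self_compare(rows, window=5):
    """Mimic SUM(CASE ...) OVER (ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING)."""
    result = []
    for i in range(len(rows)):
        # Up to `window` rows before the current one, excluding the current row.
        frame = rows[max(0, i - window):i]
        # The CASE compares each frame row's rating with itself,
        # so this condition is true for every row.
        kept = [score for _, rating, score in frame
                if rating - 1 <= rating <= rating + 1]
        # SQL's SUM yields NULL when there is nothing to add up.
        result.append(sum(kept) if kept else None)
    return result


current = windowed_sum_self_compare(rows)
print(current)  # [None, 3, 7, 10, 18]
```

Read from newest to oldest, these values match the "Current" column in the output table above.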

I need it to compare against the rating of the original row (I've looked into TOP, but that isn't working):

CASE WHEN Rating >= ((SELECT TOP 1 Rating) - 1) AND Rating <= ((SELECT TOP 1 Rating) + 1)

Any help appreciated as always.

What you are describing sounds like a lateral join, which SQL Server expresses with APPLY:

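SELECT t.*, s.score_5
FROM @tbl t
OUTER APPLY
     (-- Sum the scores of the qualifying rows among the 5 preceding ones
      SELECT SUM(p.score) AS score_5
      FROM (-- The 5 rows immediately before the current row, for the same team
            SELECT TOP (5) t2.*
            FROM @tbl t2
            WHERE t2.team = t.team
              AND t2.date < t.date
            ORDER BY t2.date DESC
           ) p
      -- Keep only rows whose rating is within 1 of the current row's rating
      WHERE p.rating BETWEEN t.rating - 1 AND t.rating + 1
     ) s
ORDER BY t.date DESC;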
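
As a cross-check of that logic, here is a small Python sketch of the same idea: for each row, take up to 5 earlier rows, then sum only those whose rating is within 1 of the current row's rating. The function name `sum_preceding_within_rating` and its parameters are hypothetical, and restricting the lookup to the same team is an assumption based on the `PARTITION BY Team` in the question.

```python
# Example data from the question, newest first: (team, date, rating, score).
rows = [
    ("a", "2020-12-05", 20, 1),
    ("a", "2020-12-04", 18, 8),
    ("a", "2020-12-03", 21, 3),
    ("a", "2020-12-02", 19, 4),
    ("a", "2020-12-01", 19, 3),
]


def sum_preceding_within_rating(rows, window=5, tolerance=1):
    """For each row, sum scores of the `window` preceding rows with a close rating."""
    results = []
    for team, date, rating, _score in rows:
        # Earlier rows of the same team, most recent first, capped at `window`.
        # ISO-formatted date strings sort chronologically.
        earlier = sorted(
            (r for r in rows if r[0] == team and r[1] < date),
            key=lambda r: r[1],
            reverse=True,
        )[:window]
        # Compare each earlier rating against the CURRENT row's rating.
        matching = [r[3] for r in earlier if abs(r[2] - rating) <= tolerance]
        # Use None where SQL would return NULL (nothing to sum).
        results.append((date, rating, sum(matching) if matching else None))
    return results


for date, rating, total in sum_preceding_within_rating(rows):
    print(date, rating, total)
```

The key difference from the window-function attempt is that the rating filter is applied per outer row, after the preceding rows have been selected.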

I'm not familiar with SQL Server syntax, but here's how to do it using Spark SQL. Essentially, the idea is to create a row for each pair of rows that are within 5 rows of each other, and then do a conditional sum (a "sum if").

select
    Team, date, Rating,
    sum(case when old_score[0] between rating-1 and rating+1 then old_score[1] end) as sum
from (
    select
        *,
        explode_outer(scores) as old_score
    from (
        select
            *,
            collect_list(array(rating, score))
            over (partition by Team order by date rows between 5 preceding and 1 preceding) scores
        from tbl
    )
)
group by Team, date, Rating
order by Team, date, Rating;

which gives

a       2020-12-01      19      NULL
a       2020-12-02      19      3
a       2020-12-03      21      NULL
a       2020-12-04      18      7
a       2020-12-05      20      10

and reveals that you've probably made a mistake in your expected output ;)
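
The three stages of the Spark SQL approach (collect the preceding pairs, explode them into one row per pair, then group with a conditional sum) can be sketched in plain Python as follows. This is only an illustration of the idea, not Spark code: the helper names `collect_preceding`, `explode_outer` and `conditional_sum` are hypothetical stand-ins for the window `collect_list`, `explode_outer` and the grouped `sum(case ...)`.

```python
from collections import defaultdict

# Example data from the question: (team, date, rating, score).
rows = [
    ("a", "2020-12-05", 20, 1),
    ("a", "2020-12-04", 18, 8),
    ("a", "2020-12-03", 21, 3),
    ("a", "2020-12-02", 19, 4),
    ("a", "2020-12-01", 19, 3),
]


def collect_preceding(rows, window=5):
    """Attach to each row the (rating, score) pairs of its preceding rows."""
    by_team = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r[0], r[1])):
        by_team[row[0]].append(row)
    collected = []
    for team_rows in by_team.values():
        for i, row in enumerate(team_rows):
            frame = team_rows[max(0, i - window):i]
            collected.append((row, [(r[2], r[3]) for r in frame]))
    return collected


def explode_outer(collected):
    """Yield one (row, pair) per collected pair; keep rows with no pairs."""
    for row, pairs in collected:
        if not pairs:
            yield row, None
        for pair in pairs:
            yield row, pair


def conditional_sum(exploded, tolerance=1):
    """Group by (team, date, rating) and sum scores whose old rating is close."""
    totals = {}
    for (team, date, rating, _score), old in exploded:
        key = (team, date, rating)
        totals.setdefault(key, None)
        if old is not None and abs(old[0] - rating) <= tolerance:
            totals[key] = (totals[key] or 0) + old[1]
    return totals


totals = conditional_sum(explode_outer(collect_preceding(rows)))
for key in sorted(totals):
    print(*key, totals[key])
```

Keeping the row's own rating alongside each exploded pair is what lets the condition compare old ratings against the original row.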
