SQL - How to compare changing column values without multiple sub-selects

I'm writing a T-SQL query.

I have the following table, where columns A and B will occasionally change. I'm interested in every row where either A or B has changed compared to the previous row (or where the previous row doesn't exist, that is to say, the first row). Each date will always be unique.

Date                    A   B       SysId
2015-02-01 00:00:00.000 2   1201    949410
2015-01-01 00:00:00.000 3   1201    949410
2014-01-01 00:00:00.000 2   1201    949410
2013-01-01 00:00:00.000 2   1200    949410
2012-01-01 00:00:00.000 2   1200    949410
2011-01-01 00:00:00.000 2   1200    949410
2010-01-01 00:00:00.000 2   1200    949410
2009-01-01 00:00:00.000 2   1200    949410
2008-01-01 00:00:00.000 2   1200    949410
2007-01-01 00:00:00.000 2   1200    949410
2006-01-01 00:00:00.000 2   1200    949410
2005-01-01 00:00:00.000 2   1200    949410
2004-01-01 00:00:00.000 2   1200    949410
2003-01-01 00:00:00.000 2   1200    949410
2002-01-01 00:00:00.000 3   1200    949410
2001-01-01 00:00:00.000 2   1200    949410
2000-01-01 00:00:00.000 3   1200    949410
1999-01-01 00:00:00.000 3   1200    949410
1998-01-01 00:00:00.000 3   1200    949410
1997-01-01 00:00:00.000 3   1200    949410
1996-01-01 00:00:00.000 3   1200    949410
1995-01-01 00:00:00.000 3   1200    949410
1994-01-01 00:00:00.000 3   1200    949410
1993-01-01 00:00:00.000 3   1200    949410
1992-01-01 00:00:00.000 3   1200    949410
1991-01-01 00:00:00.000 3   1200    949410
1990-01-01 00:00:00.000 3   1200    949410
1989-01-01 00:00:00.000 3   1200    949410
1988-01-01 00:00:00.000 3   1200    949410
1987-01-01 00:00:00.000 3   1200    949410
1986-01-01 00:00:00.000 3   1200    949410
1985-01-01 00:00:00.000 3   1200    949410
1984-01-01 00:00:00.000 2   1200    949410

In this case, the result should be:

Date                    A   B       SysId
2015-02-01 00:00:00.000 2   1201    949410
2015-01-01 00:00:00.000 3   1201    949410
2014-01-01 00:00:00.000 2   1201    949410
2003-01-01 00:00:00.000 2   1200    949410
2002-01-01 00:00:00.000 3   1200    949410
2001-01-01 00:00:00.000 2   1200    949410
1985-01-01 00:00:00.000 3   1200    949410
1984-01-01 00:00:00.000 2   1200    949410

These are the rows we want because each one is the first row in which A or B has changed.

I have an extremely ugly and expensive select which does this for me:

SELECT Date, A, B, SysId
FROM SysHistory fb1
WHERE fb1.SysId = 949410
AND 
(
    (
        ((
            SELECT TOP 1 fb2b.A
            FROM SysHistory fb2b
            WHERE fb2b.Date < fb1.Date 
            AND fb2b.SysId = 949410
            order by Date DESC
        )) <> fb1.StatusId
        OR 
        ((
            SELECT TOP 1 fb2a.A
            FROM SysHistory fb2a
            WHERE fb2a.Date < fb1.Date 
            AND fb2a.SysId= 949410
            order by Date  DESC
        )) IS NULL
    )
    OR
    (
        ((
            SELECT TOP 1 fb3b.B
            FROM SysHistory fb3b
            WHERE fb3b.Date < fb3b.Date 
            AND fb3b.SysId= 949410
            order by Date DESC
        )) <> fb1.StatusId
        OR 
        ((
            SELECT TOP 1 fb3a.B
            FROM SysHistory fb3a
            WHERE fb3a.Date < fb1.Date 
            AND fb3a.SysId = 949410
            order by Date DESC
        )) IS NULL
    )
)
order by Date DESC

Notice that for each row I fetch the A or B value from the previous row with a TOP 1 subquery. Since the previous row might not exist (when we are on the first row in the table), I also have an OR condition for A and for B which checks for NULL.

I feel like there must be a better way to do this.

Is it possible, in T-SQL, to compare multiple columns in the same subselect? Or just generally, how would you improve this query? Is there any way to make it more compact or potentially faster?

I guess my question is bordering on best practice, but I feel that this is technically a syntax question.

Important update: I've now noticed that my query doesn't actually give me the results I want, so the SQL query above doesn't seem to work. The result in this case should be:

Date                    A   B       SysId
2015-02-01 00:00:00.000 2   1201    949410
2015-01-01 00:00:00.000 3   1201    949410
2014-01-01 00:00:00.000 2   1201    949410
2003-01-01 00:00:00.000 2   1200    949410
2002-01-01 00:00:00.000 3   1200    949410
2001-01-01 00:00:00.000 2   1200    949410
1985-01-01 00:00:00.000 3   1200    949410
1984-01-01 00:00:00.000 2   1200    949410

Instead, the result is:

Date                    A   B       SysId
2015-02-01 00:00:00.000 2   1201    949410
2015-01-01 00:00:00.000 3   1201    949410
2003-01-01 00:00:00.000 2   1200    949410
2002-01-01 00:00:00.000 3   1200    949410
2001-01-01 00:00:00.000 2   1200    949410
1985-01-01 00:00:00.000 3   1200    949410
1984-01-01 00:00:00.000 2   1200    949410

You can apply ROW_NUMBER() against the data so that you can perform a self-join to find previous rows:

;WITH Numbered as (
  SELECT Date, A, B, SysId,
    ROW_NUMBER() OVER (ORDER BY Date desc) as rn
  FROM SysHistory fb1
  WHERE fb1.SysId = 949410
)
select n1.*
from Numbered n1
   left join
     Numbered n2
        on n1.rn = n2.rn - 1
where
  n2.Date is null or --If you want to include the earliest row
  n1.A <> n2.A or
  n1.B <> n2.B

Results (having put your sample data in a table variable called @SysHistory, changed the above query to reference it, and escaped the Date column as [Date], since using type names as column names is usually a bad idea):

Date                    A           B           SysId       rn
----------------------- ----------- ----------- ----------- --------------------
2015-02-01 00:00:00.000 2           1201        949410      1
2015-01-01 00:00:00.000 3           1201        949410      2
2014-01-01 00:00:00.000 2           1201        949410      3
2003-01-01 00:00:00.000 2           1200        949410      14
2002-01-01 00:00:00.000 3           1200        949410      15
2001-01-01 00:00:00.000 2           1200        949410      16
1985-01-01 00:00:00.000 3           1200        949410      32
1984-01-01 00:00:00.000 2           1200        949410      33

This seems to match your expected result (except for my extra rn column).
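
As a possible alternative, on SQL Server 2012 or later the previous row's values can be read directly with LAG(), which avoids the self-join. The following is only a sketch under a few assumptions: it reuses the @SysHistory table variable mentioned above, loads just six of the sample rows to keep it short, and the names WithPrev, PrevA and PrevB are made up for illustration. It also assumes A and B are never NULL, since a change to or from NULL would not be caught by the <> comparisons.

```sql
-- Table variable standing in for the real SysHistory table (assumed column types).
DECLARE @SysHistory TABLE ([Date] datetime PRIMARY KEY, A int, B int, SysId int);

-- A six-row subset of the sample data from the question.
INSERT INTO @SysHistory ([Date], A, B, SysId) VALUES
  ('20150201', 2, 1201, 949410),
  ('20150101', 3, 1201, 949410),
  ('20140101', 2, 1201, 949410),
  ('20130101', 2, 1200, 949410),
  ('20120101', 2, 1200, 949410),
  ('20110101', 2, 1200, 949410);

WITH WithPrev AS (
  SELECT [Date], A, B, SysId,
         -- Values from the chronologically previous row; NULL on the earliest row.
         LAG(A) OVER (PARTITION BY SysId ORDER BY [Date]) AS PrevA,
         LAG(B) OVER (PARTITION BY SysId ORDER BY [Date]) AS PrevB,
         -- rn = 1 marks the earliest row, which has no previous row.
         ROW_NUMBER() OVER (PARTITION BY SysId ORDER BY [Date]) AS rn
  FROM @SysHistory
  WHERE SysId = 949410
)
SELECT [Date], A, B, SysId
FROM WithPrev
WHERE rn = 1          -- keep the earliest row
   OR A <> PrevA      -- A changed compared to the previous row
   OR B <> PrevB      -- B changed compared to the previous row
ORDER BY [Date] DESC;
```

With this subset the query should return the rows dated 2015-02-01, 2015-01-01, 2014-01-01 and 2011-01-01. The 2012 and 2013 rows are dropped because neither A nor B differs from the row before them. Because both columns are compared in a single pass over the data, this also covers the "multiple columns in one subselect" part of the question.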
