Select all rows from a table where value on a column differs from previous value on that column

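First, let me start by describing my table.

One column is the company ID column (integer value), and another is a date in the format yyyymmdd (integer value). Together, these two columns uniquely identify an entry in the table. The table (at least the way I think of it) is ordered by company_id, date.

The table has several other columns. I will call the one I am interested in mycolumn (integer value). Also, sorry for the formatting below; I don't know how to create a proper table here.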

Company_id   Date       mycolumn 
1            20121015   1 
1            20121113   1 
1            20130108   2 
1            20130207   2 
1            20130409   2 
1            20130815   1 
2            20050611   7 
2            20080719   7 
4            20091114   3 
4            20091215   3 
4            20100304   5 
4            20110215   5 

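What I am interested in is the changes in mycolumn for each company ID, together with the dates just before and just after each change. For example, the company with ID 1 has 2 changes (from 1 to 2, then from 2 back to 1), the company with ID 2 has no changes, and the company with ID 4 has only one change, from 3 to 5. The output table should be: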

Company_id   Date       mycolumn 
1            20121113   1 
1            20130108   2 
1            20130409   2 
1            20130815   1 
4            20091215   3 
4            20100304   5 
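
Stated as a rule, the expected output keeps a row when its mycolumn value differs from the previous row or the next row of the same company (ordered by date). The following is a minimal Python sketch of that rule, not SQL and not part of the original question; the function name rows_around_changes and the tuple layout are assumptions made for illustration.

```python
from itertools import groupby

# Sample rows from the question: (company_id, date, mycolumn)
ROWS = [
    (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2),
    (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1),
    (2, 20050611, 7), (2, 20080719, 7),
    (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5),
]


def rows_around_changes(rows):
    """Keep rows whose mycolumn differs from the previous or next row
    of the same company (hypothetical helper, for illustration only)."""
    result = []
    ordered = sorted(rows)  # sorts by company_id first, then date
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        for i, row in enumerate(group):
            prev_differs = i > 0 and group[i - 1][2] != row[2]
            next_differs = i + 1 < len(group) and group[i + 1][2] != row[2]
            if prev_differs or next_differs:
                result.append(row)
    return result


for row in rows_around_changes(ROWS):
    print(row)
```

On the sample data this yields the six rows listed in the output table above. The answers below express the same neighbor comparison in SQL.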

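I know I can do an intermediate step, such as selecting the companies that have more than one distinct value in mycolumn, and then use a join to exclude the companies with no changes. But I don't know what to do next...

Well, I did come up with something, but it is both messy and does not work properly. What I did initially was build 2 columns showing the first and last date for each company_id - mycolumn combination. Then I used several more steps to get where I wanted. This works fine for companies like the last one, where the value goes from 3 to 5, but it gets messed up for companies like the first one, where the value changes from 1 to 2 and then back to 1.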
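
The failure described here can be reproduced outside SQL. The sketch below (Python, with a hypothetical helper named first_last_per_value) computes the first and last date per company_id - mycolumn combination for company 1. It is only an illustration of why the two separate runs of value 1 collapse into one span; it is not the code the asker actually used.

```python
# Rows for company 1 from the question: (company_id, date, mycolumn)
COMPANY_1 = [
    (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2),
    (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1),
]


def first_last_per_value(rows):
    """Map (company_id, mycolumn) to (first_date, last_date).

    Hypothetical helper mirroring a GROUP BY company_id, mycolumn
    with MIN(date) and MAX(date).
    """
    spans = {}
    for company_id, date, value in rows:
        key = (company_id, value)
        first, last = spans.get(key, (date, date))
        spans[key] = (min(first, date), max(last, date))
    return spans


spans = first_last_per_value(COMPANY_1)
print(spans[(1, 1)])  # span for value 1
print(spans[(1, 2)])  # span for value 2
```

The span for value 1 runs from 20121015 to 20130815 and therefore swallows the whole span for value 2 (20130108 to 20130409). Grouping by value loses the information that value 1 occurred in two separate runs, which is why comparing each row with its neighbors is the more natural approach.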

Thank you for your help!

Try this:

select company_id, mycolumn, max(date) from 
tableName
group by company_id, mycolumn

Cheers!

It's not the prettiest thing in the world, and I'm sure it can be optimized, but this should give you the result:

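-- Number each company's rows by date so that adjacent rows can be paired up.
-- YourTable is a placeholder for the real table name (TABLE is a reserved word).
;With Cte As
(
    Select  *, Row_Number() Over (Partition By Company_Id Order By Date) RN
    From    YourTable
)
Select  *
From
(
    -- The row just before each change (C is the earlier row, X the next one)
    Select  C.company_id, C.Date, C.mycolumn
    From    Cte C
    Cross Apply
    (
        Select  *
        From    Cte X
        Where   X.RN = C.RN + 1
        And     X.company_id = C.company_id 
    ) X
    Where   X.mycolumn <> C.mycolumn
    Union All
    -- The row just after each change
    Select  X.company_id, X.Date, X.mycolumn
    From    Cte C
    Cross Apply
    (
        Select  *
        From    Cte X
        Where   X.RN = C.RN + 1
        And     X.company_id = C.company_id 
    ) X
    Where   X.mycolumn <> C.mycolumn
) R
Order By Company_Id, Date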

-- Working table: Ident is a surrogate key, [ROW] is the per-company row number
DECLARE @Tbl TABLE (
    Ident INT IDENTITY(1,1),
    [ROW] INT,
    Company_id INT,
    [Date] INT,
    mycolumn INT
)

-- Load the sample data, leaving [ROW] as NULL for now
INSERT INTO @Tbl
          SELECT NULL,1,20121015,1 
    UNION SELECT NULL,1,20121113,1 
    UNION SELECT NULL,1,20130108,2 
    UNION SELECT NULL,1,20130207,2 
    UNION SELECT NULL,1,20130409,2 
    UNION SELECT NULL,1,20130815,1 
    UNION SELECT NULL,2,20050611,7 
    UNION SELECT NULL,2,20080719,7 
    UNION SELECT NULL,4,20091114,3 
    UNION SELECT NULL,4,20091215,3 
    UNION SELECT NULL,4,20100304,5 
    UNION SELECT NULL,4,20110215,5 

-- Re-insert every row together with its per-company row number (by date) ...
INSERT INTO @Tbl
    SELECT
        ROW_NUMBER() OVER(PARTITION BY Company_id ORDER BY Company_id ASC,[Date] ASC),Company_id,[Date],mycolumn
    FROM @Tbl

-- ... then drop the original, unnumbered copies
DELETE @Tbl WHERE [ROW] IS NULL

-- delta pairs each row (t1) with the next row of the same company (t2)
-- when mycolumn differs; the outer join returns both rows of each pair
SELECT
    t.Company_id,t.[Date],t.mycolumn
FROM @Tbl t
INNER JOIN (
    select
        t1.Ident [Ident1],t2.Ident [Ident2]
    from @Tbl t1 
    INNER JOIN @Tbl t2 ON t1.Company_id=t2.Company_id
        AND t1.[ROW]=(t2.[ROW]-1)
        AND t1.mycolumn<>t2.mycolumn
) delta on t.Ident IN (delta.[Ident1],delta.Ident2)
ORDER BY t.Company_id ASC,t.[Date] ASC

While I was editing this code, 2 other answers were posted, but I think it is worthwhile enough to include anyway; otherwise I wasted all that typing :(

DECLARE @T TABLE(CompanyID INT, DateInt INT, MyCol INT);
INSERT INTO @T(CompanyID , DateInt , MyCol) 
    VALUES (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2), (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1 )
          , (2, 20050611, 7), (2, 20080719, 7), (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5);
with cteRanked as (
    SELECT CompanyID , DateInt , MyCol, ROW_NUMBER() OVER (PARTITION BY CompanyID ORDER BY DateInt) as RowNum
    FROM @T
), cteRuns as (
    SELECT T1.CompanyID , T1.DateInt as D1, T2.DateInt as D2
        , T1.MyCol as C1, T2.MyCol as C2
    FROM cteRanked as T1 
        INNER JOIN cteRanked as T2 ON T1.CompanyID = T2.CompanyID and T1.RowNum + 1 = T2.RowNum 
    WHERE T1.MyCol != T2.MyCol
), ctePaired as (
    SELECT CompanyID, D1 as DateInt, C1 as MyCol FROM cteRuns 
    UNION --or UNION ALL to get repeated rows when a run is 1 long
    SELECT CompanyID, D2 as DateInt, C2 as MyCol FROM cteRuns 
)SELECT * FROM ctePaired
ORDER BY CompanyID, DateInt

You can use lead and lag.

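with C as
(
  select *,
         -- value of mycolumn on the previous / next row of the same company;
         -- NULL on the first / last row, so those comparisons below never match
         lag(mycolumn) over(partition by company_id order by Date) as lagmycolumn,
         lead(mycolumn) over(partition by company_id order by Date) as leadmycolumn
  from YourTable
)
select company_id, Date, mycolumn
from C
where mycolumn <> lagmycolumn or
      mycolumn <> leadmycolumn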
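
To try the lag/lead idea without a SQL Server instance, the same query shape can be run against an in-memory SQLite database from Python. This is only a sketch under stated assumptions: it needs a SQLite build with window functions (3.25 or later), the table name your_table and the column name date_int are hypothetical, and an ORDER BY has been added to make the output order deterministic. SQLite is not SQL Server, so treat it as a check of the logic rather than of T-SQL syntax.

```python
import sqlite3

# Sample rows from the question: (company_id, date_int, mycolumn)
ROWS = [
    (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2),
    (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1),
    (2, 20050611, 7), (2, 20080719, 7),
    (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5),
]

# LAG/LEAD return NULL at partition edges; "mycolumn <> NULL" is not true,
# so the first and last rows only qualify through their other neighbor.
QUERY = """
WITH c AS (
    SELECT company_id, date_int, mycolumn,
           LAG(mycolumn)  OVER (PARTITION BY company_id ORDER BY date_int) AS prev_value,
           LEAD(mycolumn) OVER (PARTITION BY company_id ORDER BY date_int) AS next_value
    FROM your_table
)
SELECT company_id, date_int, mycolumn
FROM c
WHERE mycolumn <> prev_value OR mycolumn <> next_value
ORDER BY company_id, date_int
"""


def run_query(rows):
    """Load rows into a throwaway in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE your_table "
            "(company_id INTEGER, date_int INTEGER, mycolumn INTEGER)"
        )
        conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


for row in run_query(ROWS):
    print(row)
```

Companies whose value never changes (company 2 in the sample) drop out on their own, because none of their rows differs from a neighbor.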

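Since you are using SQL Server 2014, this is the most efficient way to query for the results you need. I suspect there is a better way to "ignore" the CompanyIDs that have no changes.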

DECLARE @T TABLE(CompanyID INT, DateInt INT, MyCol INT);
INSERT INTO @T(CompanyID , DateInt , MyCol) 
VALUES  (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2), (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1 ),
        (2, 20050611, 7), (2, 20080719, 7), (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5)

;WITH Stage1 AS
(
    SELECT   *
            -- 1 when the next row of the same company has a different value.
            -- Defaulting LEAD/LAG to MyCol keeps the first and last rows from
            -- being flagged just because they have no neighbor.
            ,UseThis    = IIF(LEAD(MyCol, 1, MyCol) OVER (PARTITION BY CompanyID ORDER BY DateInt) != MyCol, 1, 0)
            -- 1 when the previous row of the same company has a different value
            ,Change     = IIF(LAG(MyCol, 1, MyCol) OVER (PARTITION BY CompanyID ORDER BY DateInt) != MyCol, 1, 0)
    FROM @T
)
--  Find companies that have not changed
,Stage2 AS
(
    SELECT *
            ,Inert  = SUM(Change) OVER (PARTITION BY CompanyID)
    FROM Stage1
)
SELECT   CompanyID
        ,DateInt
        ,MyCol
FROM Stage2
-- Keep the row before each change (UseThis) and the row after it (Change)
WHERE (UseThis = 1 OR Change = 1)
AND Inert != 0
ORDER BY CompanyID, DateInt
