How do you use a loop in SQL to compare a record to a previous record?

I am looking at a data set of Emergency Room visits. For each ID, I only want to keep visits that are at least 30 days apart. As an example, say I have the data below.

If I start with ID=1:

  • Starting from Row 1, the lag between Row 1 and Row 2 is 14 days, so I will exclude, or for now flag, Row 2.
  • Then I will continue to use Row 1 to evaluate Row 3. Again this is only 16 days, so I will exclude Row 3 and look at Row 4.
  • Row 4 is 34 days after Row 1, which is at least 30 days, so I keep it and then use Row 4 to evaluate Row 5, and so on.

I have been trying to do this with the LAG function, but I can't figure out how to use it when I have to keep using the 'anchor' row to evaluate several rows.

The top table is what I have and the bottom table is what I want. Any ideas?

I am using Azure Data Studio.

HAVE

Row#  ID  DATE
 1    1   1/1/2020
 2    1   1/15/2020
 3    1   1/17/2020
 4    1   2/4/2020
 5    1   3/15/2020
 6    2   1/15/2020
 7    2   3/15/2020
 8    2   3/18/2020

WANT

Row#  ID  DATE
 1    1   1/1/2020
 4    1   2/4/2020
 5    1   3/15/2020
 6    2   1/15/2020
 7    2   3/15/2020

This tutorial page should get you started on a cursor-based solution.

You don't use a loop. You keep using LAG; you were on the right track initially.

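;WITH dateLagged AS (
    -- Days since the previous visit of the same ID (0 for the first visit)
    SELECT
        ID
     ,  Date
     ,  Diff = ISNULL(DATEDIFF(day, LAG(Date, 1) OVER (PARTITION BY ID ORDER BY Date), Date), 0)
    FROM dbo.EmergencyRoom
),
DiffCumulated AS (
    -- Running total of the gaps: days since the first visit of the same ID
    SELECT
        ID
     ,  Date
     ,  CumDiff = SUM(Diff) OVER (PARTITION BY ID ORDER BY Date)
    FROM dateLagged
),
AnchorsMarked AS (
    -- Mark the first visit of each ID, the first visit to pass the 30-day mark,
    -- and any visit more than 30 days after the visit before it
    SELECT
        ID
     ,  Date
     ,  Marker = IIF(CumDiff = 0
                  OR (CumDiff > 30 AND LAG(CumDiff, 1) OVER (PARTITION BY ID ORDER BY Date) < 30)
                  OR CumDiff - LAG(CumDiff, 1) OVER (PARTITION BY ID ORDER BY Date) > 30, 1, 0)
    FROM DiffCumulated
)
SELECT
    ID
 ,  Date
FROM AnchorsMarked
WHERE Marker = 1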

As a rule of thumb: if you want to use looping in SQL, then you've taken a wrong turn somewhere. Very few problems in SQL actually require looping, and this is not one of them.
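
One caveat about the LAG query above: it only compares each row with the row immediately before it, so it may not follow the anchor rule from the question on every data set. For example, with visits on days 0, 20, 40, 50 and 75 the anchor rule keeps days 0, 40 and 75, but the query would drop day 75, because that visit is only 25 days after the excluded visit on day 50. If the anchor rule has to hold exactly, it can still be written without a cursor as a recursive CTE. What follows is a minimal sketch, not a tested production query. It assumes that Date is of type date and that "30 days apart" means 30 days or more. The names @Visits, Numbered, Walk, rn, AnchorDate and Keep are made up for illustration, and @Visits stands in for dbo.EmergencyRoom so that the example is self-contained.

```sql
-- Sample rows copied from the HAVE table; @Visits is a hypothetical stand-in
-- for dbo.EmergencyRoom, and Date is assumed to be of type date.
DECLARE @Visits TABLE (ID int NOT NULL, Date date NOT NULL);
INSERT INTO @Visits (ID, Date) VALUES
    (1, '20200101'), (1, '20200115'), (1, '20200117'), (1, '20200204'),
    (1, '20200315'), (2, '20200115'), (2, '20200315'), (2, '20200318');

WITH Numbered AS (
    -- Number each ID's visits in date order so the walk can step row by row
    SELECT ID, Date,
           rn = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date)
    FROM @Visits
),
Walk AS (
    -- Anchor member: the first visit of every ID is always kept
    SELECT ID, Date, rn, AnchorDate = Date, Keep = 1
    FROM Numbered
    WHERE rn = 1
    UNION ALL
    -- Recursive member: compare the next visit with the last kept visit;
    -- if it is 30 or more days later, keep it and make it the new anchor
    SELECT n.ID, n.Date, n.rn,
           AnchorDate = CASE WHEN DATEDIFF(day, w.AnchorDate, n.Date) >= 30
                             THEN n.Date ELSE w.AnchorDate END,
           Keep       = CASE WHEN DATEDIFF(day, w.AnchorDate, n.Date) >= 30
                             THEN 1 ELSE 0 END
    FROM Walk AS w
    JOIN Numbered AS n
      ON n.ID = w.ID AND n.rn = w.rn + 1
)
SELECT ID, Date
FROM Walk
WHERE Keep = 1
ORDER BY ID, Date
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

On the sample rows this should return the five rows listed under WANT. The recursion walks each ID one visit at a time, so on large tables it will likely be slower than the pure window-function query; change >= to > if exactly 30 days should not count as "apart".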
