
Compare interval date by row

I am trying to group dates that fall within a 1-year interval for a given identifier, labeling the earliest and the latest date of each group. If no other dates fall within a 1-year interval of a date, that date is recorded as both the first and the last date. For example, the original data is:

id | date 
____________
a  | 1/1/2000
a  | 1/2/2001
a  | 1/6/2000
b  | 1/3/2001
b  | 1/3/2000
b  | 1/3/1999
c  | 1/1/2000
c  | 1/1/2002
c  | 1/1/2003

And the output I want is:

id  | first_date | last_date
___________________________
a   | 1/1/2000   | 1/2/2001
b   | 1/3/1999   | 1/3/2001
c   | 1/1/2000   | 1/1/2000
c   | 1/1/2002   | 1/1/2003

I have been trying to figure this out the whole day and can't. I can do it for ids with only 2 duplicates, but not for more. Any help would be great.

SELECT id
     , min(min_date) AS min_date
     , max(max_date) AS max_date
     , sum(row_ct)   AS row_ct
FROM  (
   SELECT id, year, min_date, max_date, row_ct
        , year - row_number() OVER (PARTITION BY id ORDER BY year) AS grp
   FROM  (
      SELECT id
           , extract(year FROM the_date)::int AS year
           , min(the_date) AS min_date
           , max(the_date) AS max_date
           , count(*)      AS row_ct
      FROM   tbl
      GROUP  BY id, year
      ) sub1
   ) sub2
GROUP  BY id, grp
ORDER  BY id, grp;

1) Group all rows per (id, year) in subquery sub1. Record the min and max of the date. I added a count of rows (row_ct) for demonstration.

2) Subtract row_number() from the year in the second subquery sub2. Thus, all rows with consecutive years end up in the same group (grp). A gap in the years starts a new group.

3) In the final SELECT, group a second time, this time by (id, grp), and record min, max and row count again. Voilà. Produces exactly the result you are looking for.
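To make the role of grp concrete, here is a minimal Python sketch that mirrors the three steps on the sample data from the question. It is only an illustration of the year minus row_number() idea, not a replacement for the query. The function name group_consecutive_years and the month/day/year reading of the dates are assumptions made for this example.

```python
from datetime import date

# Sample rows from the question; dates are read as month/day/year (an assumption).
rows = [
    ("a", date(2000, 1, 1)), ("a", date(2001, 1, 2)), ("a", date(2000, 1, 6)),
    ("b", date(2001, 1, 3)), ("b", date(2000, 1, 3)), ("b", date(1999, 1, 3)),
    ("c", date(2000, 1, 1)), ("c", date(2002, 1, 1)), ("c", date(2003, 1, 1)),
]

def group_consecutive_years(rows):
    """Hypothetical helper mirroring sub1, sub2 and the outer SELECT."""
    # Step 1 (sub1): min date, max date and row count per (id, year).
    per_year = {}
    for id_, d in rows:
        lo, hi, n = per_year.get((id_, d.year), (d, d, 0))
        per_year[(id_, d.year)] = (min(lo, d), max(hi, d), n + 1)

    # Step 2 (sub2): grp = year - row_number() within each id, ordered by year.
    # Consecutive years share the same grp; a gap shifts it.
    row_number = {}
    groups = {}
    for id_, year in sorted(per_year):
        row_number[id_] = row_number.get(id_, 0) + 1
        grp = year - row_number[id_]
        lo, hi, n = per_year[(id_, year)]
        # Step 3 (outer SELECT): aggregate again per (id, grp).
        if (id_, grp) in groups:
            prev_lo, prev_hi, prev_n = groups[(id_, grp)]
            lo, hi, n = min(prev_lo, lo), max(prev_hi, hi), prev_n + n
        groups[(id_, grp)] = (lo, hi, n)

    # ORDER BY id, grp
    return [(id_, lo, hi, n) for (id_, _), (lo, hi, n) in sorted(groups.items())]

for row in group_consecutive_years(rows):
    print(row)
```

For id c the years 2000, 2002, 2003 get row numbers 1, 2, 3, so grp is 1999, 2000, 2000: the year 2000 stands alone and 2002 and 2003 fall together. Note that, like the query, this groups by calendar year rather than by an exact 365-day distance.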

-> SQLfiddle demo.

Related answers:
Return array of years as year ranges
Group by repeating attribute

select id, min ([date]) first_date, max([date]) last_date
from <yourTbl> group by id

Use this (SQLFiddle Demo):

SELECT id,
    min(date) AS first_date,
    max(date) AS last_date
FROM mytable
GROUP BY 1
ORDER BY 1
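As a rough check of what a plain GROUP BY on id computes, the following Python sketch applies min and max per id to the sample data from the question. The helper name min_max_per_id and the month/day/year reading of the dates are assumptions for illustration only.

```python
from datetime import date

# Sample rows from the question; dates are read as month/day/year (an assumption).
rows = [
    ("a", date(2000, 1, 1)), ("a", date(2001, 1, 2)), ("a", date(2000, 1, 6)),
    ("b", date(2001, 1, 3)), ("b", date(2000, 1, 3)), ("b", date(1999, 1, 3)),
    ("c", date(2000, 1, 1)), ("c", date(2002, 1, 1)), ("c", date(2003, 1, 1)),
]

def min_max_per_id(rows):
    """Hypothetical helper: earliest and latest date per id, like GROUP BY id."""
    out = {}
    for id_, d in rows:
        lo, hi = out.get(id_, (d, d))
        out[id_] = (min(lo, d), max(hi, d))
    return out

for id_, (first_date, last_date) in sorted(min_max_per_id(rows).items()):
    print(id_, first_date, last_date)
```

On this data, ids a and b match the desired output, while id c comes back as a single row spanning 2000 to 2003, because grouping by id alone does not split on the missing year 2001.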
