Compare interval date by row

I am trying to group dates that fall within a 1-year interval of each other, per identifier, and label the earliest and the latest date of each group. If no other date lies within a 1-year interval of a given date, that date is recorded as both the first and the last date. For example, the original data is:

id | date 
____________
a  | 1/1/2000
a  | 1/2/2001
a  | 1/6/2000
b  | 1/3/2001
b  | 1/3/2000
b  | 1/3/1999
c  | 1/1/2000
c  | 1/1/2002
c  | 1/1/2003

And the output I want is:

id  | first_date | last_date
___________________________
a   | 1/1/2000   | 1/2/2001
b   | 1/3/1999   | 1/3/2001
c   | 1/1/2000   | 1/1/2000
c   | 1/1/2002   | 1/1/2003

I have been trying to figure this out all day without success. I can do it for ids that have only 2 rows, but not for ids with more. Any help would be great.

SELECT id
     , min(min_date) AS min_date
     , max(max_date) AS max_date
     , sum(row_ct)   AS row_ct
FROM  (
   SELECT id, year, min_date, max_date, row_ct
        , year - row_number() OVER (PARTITION BY id ORDER BY year) AS grp
   FROM  (
      SELECT id
           , extract(year FROM the_date)::int AS year
           , min(the_date) AS min_date
           , max(the_date) AS max_date
           , count(*)      AS row_ct
      FROM   tbl
      GROUP  BY id, year
      ) sub1
   ) sub2
GROUP  BY id, grp
ORDER  BY id, grp;

1) Group all rows per (id, year) in subquery sub1. Record the min and max of the date. I added a count of rows (row_ct) for demonstration.

2) Subtract the row_number() from the year in the second subquery sub2. Thus, all rows with consecutive years end up in the same group (grp). A gap in the years starts a new group.

3) In the final SELECT, group a second time, this time by (id, grp), and record min, max and row count again. Voilà. This produces exactly the result you are looking for.
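
To see why subtracting row_number() from the year isolates runs of consecutive years, the three steps can be sketched outside the database. The following is only an illustrative Python sketch, not part of the query: the helper name year_islands is hypothetical, the row count is left out, and the sample dates from the question are assumed to be in month/day/year order.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question; month/day/year order is an assumption.
rows = [
    ("a", date(2000, 1, 1)), ("a", date(2001, 1, 2)), ("a", date(2000, 1, 6)),
    ("b", date(2001, 1, 3)), ("b", date(2000, 1, 3)), ("b", date(1999, 1, 3)),
    ("c", date(2000, 1, 1)), ("c", date(2002, 1, 1)), ("c", date(2003, 1, 1)),
]


def year_islands(rows):
    """Hypothetical helper mirroring sub1, sub2 and the outer SELECT."""
    # Step 1 (sub1): min and max date per (id, year).
    per_year = {}
    for id_, d in rows:
        lo, hi = per_year.get((id_, d.year), (d, d))
        per_year[(id_, d.year)] = (min(lo, d), max(hi, d))

    # Step 2 (sub2): within each id, number the years in ascending order.
    # year - row_number stays constant while the years are consecutive
    # and jumps to a larger value after a gap.
    tagged = []
    by_id = groupby(sorted(per_year.items()), key=lambda kv: kv[0][0])
    for id_, items in by_id:
        for row_number, ((_, year), (lo, hi)) in enumerate(items, start=1):
            tagged.append((id_, year - row_number, lo, hi))

    # Step 3 (outer SELECT): min and max per (id, grp).
    result = []
    for (id_, _grp), items in groupby(tagged, key=lambda t: (t[0], t[1])):
        items = list(items)
        result.append((id_, min(t[2] for t in items), max(t[3] for t in items)))
    return result


for id_, first_date, last_date in year_islands(rows):
    print(id_, first_date, last_date)
```

For id c, the years 2000, 2002 and 2003 get row numbers 1, 2 and 3, so grp is 1999, 2000 and 2000: the year 2000 forms its own group, while 2002 and 2003 share one. Note that this groups by calendar year rather than by a rolling 365-day window, so two dates in adjacent calendar years are grouped together even when they are more than twelve months apart.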

-> SQLfiddle demo.

Related answers:
Return array of years as year ranges
Group by repeating attribute

select id, min ([date]) first_date, max([date]) last_date
from <yourTbl> group by id

Use this (SQLFiddle demo):

SELECT id,
    min(date) AS first_date,
    max(date) AS last_date
FROM mytable
GROUP BY 1
ORDER BY 1

