SQL: changing table (ID, [Datetime], [INT]) to (ID, Start_DT, End_DT, [INT])

I have some old data in this format:

ID    DT          NUM 
1     6-1-2012    2
1     6-2-2012    2
1     6-3-2012    4
1     6-4-2012    4
1     6-5-2012    8
1     6-6-2012    8
1     6-7-2012    8
1     6-8-2012    16
1     6-9-2012    2
1     6-10-2012   2

And I need it to look like this:

ID    START_DT    END_DT      NUM
1     6-1-2012    6-2-2012    2 
1     6-3-2012    6-4-2012    4
1     6-5-2012    6-7-2012    8 
1     6-8-2012    6-8-2012    16
1     6-9-2012    6-10-2012   2

This is the best example of the data that I could quickly come up with. I'm happy to clarify if I accidentally left anything ambiguous in it.

The Rules:

  • ID: this does change and will eventually be grouped on; to keep things simple it stays the same in my example
  • DT: I get one original datetime; in the real data the time part does vary
  • START_DT, END_DT: I need to derive these columns from the original DT
  • NUM: this is just an integer that changes and can reoccur per ID
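
To pin down what the rules above seem to ask for, here is a minimal sketch outside SQL, assuming that a "span" means a run of consecutive rows (ordered by DT within an ID) that share the same NUM. The names `ROWS` and `collapse_runs` are made up for illustration and are not part of the question.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question as (id, dt, num) tuples.
ROWS = [
    (1, date(2012, 6, 1), 2),
    (1, date(2012, 6, 2), 2),
    (1, date(2012, 6, 3), 4),
    (1, date(2012, 6, 4), 4),
    (1, date(2012, 6, 5), 8),
    (1, date(2012, 6, 6), 8),
    (1, date(2012, 6, 7), 8),
    (1, date(2012, 6, 8), 16),
    (1, date(2012, 6, 9), 2),
    (1, date(2012, 6, 10), 2),
]


def collapse_runs(rows):
    """Collapse consecutive rows sharing (id, num) into (id, start_dt, end_dt, num)."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))  # order by id, then dt
    spans = []
    # groupby only merges adjacent items, so a NUM that reoccurs later
    # (like 2 in the sample) starts a new span instead of extending the old one.
    for (row_id, num), run in groupby(ordered, key=lambda r: (r[0], r[2])):
        dts = [dt for _, dt, _ in run]
        spans.append((row_id, dts[0], dts[-1], num))
    return spans


SPANS = collapse_runs(ROWS)
for span in SPANS:
    print(span)
```

Under that assumption the sample collapses to five spans, with NUM 2 appearing twice, which matches the desired output in the question.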

EDIT: this is very awkward (there MUST be a better answer). I haven't tested this against a lot of conditions yet, but it looks okay so far. I had to manually find and replace all the field names (be kind).

select * from (
    -- rows in the same run share the same [z.dt]; keep the earliest row of each run
    select  *,row_number() over (partition by id, [z.dt] order by [y.dt]) as rownum

    from (
            select  y.id,
                    y.dt as [y.dt], 
                    z.dt as [z.dt],    
                    y.num

            from    #temp as y 

                    -- z = the first later row for the same id with a different num
                    outer apply (select top 1 id, dt, num

                                    from    #temp as x 

                                    where   x.id = y.id and 
                                            x.dt > y.dt and 
                                            x.num <> y.num

                                    order by x.dt asc) as z   ) as x ) as k
where rownum=1
order by [y.dt]

select id,min(dt) as start_date, max(dt) as end_date, num
from whatevertablename_helps_if_you_supply_these_when_asking_for_code
group by 1,4

It's also possible to do it with one subquery for the min and another for the max, but I don't think you need that here.

My answer is for Postgres. In T-SQL you'll need to change the GROUP BY to id, num instead, since it does not accept column positions there.

Adding:

How do you know that it is

1 6-1-2012 6-2-2012 2
1 6-9-2012 6-10-2012 2

and not

1 6-1-2012 6-10-2012 2
1 6-2-2012 6-9-2012 2

? You need more business rules to determine that.
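
To make the ambiguity concrete, here is a small sketch that runs the plain GROUP BY from this answer against the question's sample data. It uses Python's built-in sqlite3 module only so that it is runnable; the table name `t` and the ISO-formatted date strings are assumptions for the demo, not something from the question.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2012-06-01", 2), (1, "2012-06-02", 2),
        (1, "2012-06-03", 4), (1, "2012-06-04", 4),
        (1, "2012-06-05", 8), (1, "2012-06-06", 8), (1, "2012-06-07", 8),
        (1, "2012-06-08", 16),
        (1, "2012-06-09", 2), (1, "2012-06-10", 2),
    ],
)

# Plain aggregation: one output row per (id, num), regardless of adjacency.
GROUPED = conn.execute(
    """
    SELECT id, MIN(dt) AS start_date, MAX(dt) AS end_date, num
    FROM t
    GROUP BY id, num
    ORDER BY MIN(dt)
    """
).fetchall()

for row in GROUPED:
    print(row)
```

Because both runs of NUM 2 land in the same group, this returns four rows rather than five, and the row for 2 spans 2012-06-01 through 2012-06-10. A plain GROUP BY therefore only works if a NUM never reoccurs for the same ID.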

select id, [y.dt] as start_dt, [z.dt] as end_dt, num from (
        select  *,row_number() over (partition by id, [z.dt] order by id, [y.dt]) as rownum

        from (
                select  y.id,
                        y.dt as [y.dt], 
                        z.dt as [z.dt],    
                        y.num

                from    #temp as y 

                        outer apply (select top 1 id, dt, num

                                        from    #temp as x 

                                        where   x.id = y.id and 
                                                x.dt > y.dt and 
                                                x.num <> y.num

                                        order by x.dt asc) as z   ) as x ) as k
where rownum=1
order by id, [y.dt]

and that gives us... (with different data)

id     start_dt                 end_dt                         num
6      2011-10-01 00:00:00.000  2012-01-18 00:00:00.000        896
6      2012-01-18 00:00:00.000  2012-02-01 00:00:00.000        864
6      2012-02-01 00:00:00.000  NULL                           896

I posted that up at the top about an hour ago and said it was awkward (and sloppy). I was wondering if anyone has a better answer because mine sucks. But I don't understand why people keep posting that they need better business rules and need to know how to handle certain situations. This code does exactly what I want, except that end_dt is the datetime of the next num and not the last occurrence of the current num, but I can work with that. It is better than nothing. (Sorry, frustrated.)

Business rule: the data is already there, and it should show the logical span. I need the start_dt and end_dt for each num. When NUM = Y, the start date is when NUM changes from X to Y and the end date is when Y changes to Z. I can't give you more than I have myself with all of this. These rules were enough for me...??

ok, same data:

 id      start_dt   end_dt       num
 1       6-1-2012   6-3-2012    2
 1       6-3-2012   6-5-2012    4
 1       6-5-2012   6-8-2012    8
 1       6-8-2012   6-9-2012    16
 1       6-9-2012   NULL        2
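
For anyone looking for the set-based version, here is a hedged sketch of the usual "gaps and islands" approach: the difference of two ROW_NUMBER() values is constant within a run, and LEAD() then picks up the date of the next change. This is not the code from the thread. It runs through Python's sqlite3 module so it can be tried as-is, and it assumes SQLite 3.25 or newer (for window functions), a table named `t`, and ISO date strings. The same query shape should carry over to SQL Server 2012+, but that is untested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2012-06-01", 2), (1, "2012-06-02", 2),
        (1, "2012-06-03", 4), (1, "2012-06-04", 4),
        (1, "2012-06-05", 8), (1, "2012-06-06", 8), (1, "2012-06-07", 8),
        (1, "2012-06-08", 16),
        (1, "2012-06-09", 2), (1, "2012-06-10", 2),
    ],
)

QUERY = """
WITH marked AS (
    -- grp stays constant while num does not change within an id
    SELECT id, dt, num,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)
         - ROW_NUMBER() OVER (PARTITION BY id, num ORDER BY dt) AS grp
    FROM t
),
islands AS (
    -- one row per run: first and last dt on which this num was seen
    SELECT id, num, MIN(dt) AS start_dt, MAX(dt) AS last_dt
    FROM marked
    GROUP BY id, num, grp
)
SELECT id,
       start_dt,
       last_dt,                                            -- last occurrence of num
       LEAD(start_dt) OVER (PARTITION BY id
                            ORDER BY start_dt) AS next_dt,  -- when num changes next
       num
FROM islands
ORDER BY id, start_dt
"""

SPAN_ROWS = conn.execute(QUERY).fetchall()
for row in SPAN_ROWS:
    print(row)
```

The `last_dt` column corresponds to the END_DT in the originally requested output, while `next_dt` corresponds to the end_dt in the table just above (NULL for the final run), so either convention can be selected from the same query.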

