
SQL Server: Select rows where value in column changes from list of values to another list of values

Let's say I have this table in a SQL Server database, sorted by increasing hist_pp:

hist_id   hist_yr   hist_pp   hist_empl_id   hist_empl_sect_id
90619       2017       5         00018509           61
92295       2017       6         00018509           61
93991       2017       7         00018509           61
95659       2017       8         00018509           99
103993      2017       9         00018509           99
120779      2017       10        00018509           99

I want to find the rows where hist_empl_sect_id changes from any value in one group of numbers, say (60, 61, 62, 63), to any value in another group of numbers, say (98, 99, 100, etc.). This has to be evaluated per year, here for 2017. hist_pp is an increasing number within a year, and hist_id is an auto-number ID column.

For this employee it should return:

95659       2017       8         00018509           99

I've tried a few examples I have seen in other posts, including CTEs, and I can't seem to get any of them to work.

Here is an example of something I tried that didn't work. It returned multiple rows for an employee when there should be only one:

select a.hist_id, a.hist_yr, format(cast(a.hist_pp as integer), '0#') as hist_pp, a.hist_empl_id, a.hist_empl_sect_id
from temshist a
where a.hist_empl_sect_id <>
    (SELECT top 1 b.hist_empl_sect_id
    FROM temshist as b
    where a.hist_empl_id = b.hist_empl_id
    and a.hist_yr = b.hist_yr
    and a.hist_pp > b.hist_pp
    Order by b.hist_pp desc
    )
order by hist_empl_id

I suspect Lag() would be a good fit here.

Example

 ;with cte as (
    Select * 
          ,PrevValue= Lag(hist_empl_sect_id,1,hist_empl_sect_id) over (Partition by hist_empl_id Order By hist_pp)
     From  @YourTable
)
Select *
 From  cte Where PrevValue/98<>hist_empl_sect_id/98

EDIT - VamsiPrabhala Pointed Out

You could partition by YEAR as well

      ,PrevValue= Lag(hist_empl_sect_id,1,hist_empl_sect_id) over (Partition by hist_yr,hist_empl_id Order By hist_pp)
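The /98 comparison works for the example lists only because of integer division: 60 through 63 divide to 0, while 98 through 100 divide to 1. It also flags a change in either direction. For arbitrary lists, one variation is to test membership explicitly. The following is a minimal sketch, assuming the temshist table from the question and the two example lists (replace them with the real groups). The cast on hist_pp is an assumption that the column may be stored as text, and it is harmless if it is already an integer.

```sql
-- Sketch only: table and column names are taken from the question;
-- the two IN lists are the example groups, not a complete set.
;with cte as (
    select *,
           PrevValue = lag(hist_empl_sect_id) over (
                           partition by hist_yr, hist_empl_id
                           order by cast(hist_pp as int))
    from temshist
)
select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id
from cte
where PrevValue in (60, 61, 62, 63)        -- previous pay period was in the "from" group
  and hist_empl_sect_id in (98, 99, 100)   -- current pay period is in the "to" group
order by hist_empl_id, hist_yr, cast(hist_pp as int);
```

Without a default value, Lag returns NULL for the first row of each partition, and NULL never satisfies the IN test, so the first pay period of a year is never reported. For the sample data this should return only the row with hist_id 95659.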

Here is another option. I mocked up your data with a CTE, then joined it back on itself. Note that the join on hist_pp - 1 assumes the pay periods are consecutive, with no gaps.

with emp_hist as
(
    select 90619 as hist_id, 2017 as hist_yr, 5 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
    union all
    select 92295 as hist_id, 2017 as hist_yr, 6 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
    union all
    select 93991 as hist_id, 2017 as hist_yr, 7 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
    union all
    select 95659 as hist_id, 2017 as hist_yr, 8 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
    union all
    select 103993 as hist_id, 2017 as hist_yr, 9 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
    union all
    select 120779 as hist_id, 2017 as hist_yr, 10 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
)
select eh2.*
from emp_hist eh1
join emp_hist eh2
on eh1.hist_empl_id = eh2.hist_empl_id
and eh1.hist_pp = (eh2.hist_pp - 1)
and eh1.hist_yr = eh2.hist_yr
where eh2.hist_empl_sect_id in (98, 99, 100)
and eh1.hist_empl_sect_id in (60, 61, 62, 63)
;

You can use a CASE expression (to determine group membership) and the LAG window function (to compare two consecutive rows), partitioned by employee and year and ordered by hist_pp.

This assumes that (1) an employee ID can span multiple years; (2) hist_pp is unique for each employee ID and year combination; and (3) if hist_empl_sect_id does not change for an employee within a year, the result set should contain no rows for that employee ID and year combination.

Hist_pp can have gaps.

    select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id
    from 
    (

       select a.hist_id, a.hist_yr, 
              format(cast(a.hist_pp as integer), '0#') as hist_pp, 
              a.hist_empl_id, 
              -- hist_empl_sect_id of current row
              a.hist_empl_sect_id,

              -- hist_empl_sect_id of preceding row, when ordered by hist_pp for each employee year combination
              lag(a.hist_empl_sect_id, 1) 
                   OVER (
                          PARTITION BY a.hist_empl_id,a.hist_yr 
                             ORDER BY cast(a.hist_pp as integer) 
                         ) as prev_hist_empl_sect_id
            from temshist a
    ) as outr
where 
    -- group membership of hist_empl_sect_id of current row
    (case when hist_empl_sect_id IN (98, 99, 100) then 1 else 0 end) 
    <> 
    -- group membership of hist_empl_sect_id of preceding row, ordered by hist_pp for each year
    (case when prev_hist_empl_sect_id IN (98, 99, 100) then 1 else 0 end)

    AND

    -- Preceding row does not belong to a different employee or year
    prev_hist_empl_sect_id IS NOT NULL

Looking at your expected result, I assume that for every hist_yr and hist_empl_id you need the row with the smallest hist_pp among the rows that have the largest hist_empl_sect_id. The query below generates your desired output under that assumption.

SELECT t3.* from
  (SELECT t2.*, min(hist_pp) over(partition BY hist_yr, hist_empl_id) AS hist_pp_minValue
   FROM
     (SELECT hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id, r1, max(r1) over (partition BY hist_yr, hist_empl_id) AS maxRank
      FROM
        (SELECT hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id, dense_rank() over(partition BY hist_yr, hist_empl_id
                                                                                              ORDER BY hist_empl_sect_id) AS r1
         FROM table1)t1) t2
   WHERE t2.maxRank = t2.r1 )t3
WHERE t3.hist_pp_minValue = t3.hist_pp

I have tested this on the data provided in the question, and the result is below.

hist_id | hist_yr  |  hist_pp | hist_empl_id | hist_empl_sect_id
---------------------------------------------------------
95659     2017         8         18509          99

To double-check, I added some more sample data, as below.

insert into table1 values(90619  ,2018,5  ,00018508,62);
insert into table1 values(92295  ,2018,6  ,00018508,62);
insert into table1 values(93991  ,2018,7  ,00018508,62);
insert into table1 values(95659  ,2018,8  ,00018508,91);
insert into table1 values(103993 ,2018,9  ,00018508,91);
insert into table1 values(120779 ,2018,10 ,00018508,91);

Below is the generated result.

hist_id | hist_yr  |  hist_pp | hist_empl_id | hist_empl_sect_id
---------------------------------------------------------
95659     2017         8         18509          99
95659     2018         8         18508          91


Hope this will help.

I think you'll want to use a CTE to solve this problem. This is similar to what John Cappelletti appears to be doing, but it doesn't require SQL Server 2012 or later.

declare @temshist table
(
    hist_id int,
    hist_yr int,
    hist_pp int,
    hist_empl_id varchar(max),
    hist_empl_sect_id int
)

insert into @temshist ( hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id )
values
    ( 90619, 2017, 5, '00018509', 61 ),
    ( 92295, 2017, 6, '00018509', 61 ),
    ( 93991, 2017, 7, '00018509', 61 ),
    ( 95659, 2017, 8, '00018509', 99 ),
    ( 103993, 2017, 9, '00018509', 99 ),
    ( 120779, 2017, 10, '00018509', 99 )

;with empl_cte as
(
    select
        row_number() over (partition by hist_empl_id, hist_yr order by hist_pp) as [rn],
        hist_id,
        hist_yr,
        hist_pp,
        hist_empl_id,
        hist_empl_sect_id
    from @temshist
)
select 
    nxt.hist_id,
    nxt.hist_yr,
    nxt.hist_pp,
    nxt.hist_empl_id,
    nxt.hist_empl_sect_id
from empl_cte prv
    inner join empl_cte nxt on 
        prv.hist_empl_id = nxt.hist_empl_id and 
        prv.hist_yr = nxt.hist_yr and 
        prv.rn = nxt.rn - 1
where prv.hist_empl_sect_id in (60, 61, 62, 63/*, ...*/) and nxt.hist_empl_sect_id in (98, 99, 100/*, ...*/)
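
Another option that also avoids LAG stays close to the query in the question: look up the previous pay period's section with a correlated TOP 1 subquery, then require the previous value to be in the first list and the current value in the second. That attempt returned extra rows because it reported every change of section, not only changes between the two groups. This is a minimal sketch, assuming the temshist table from the question and the example lists. The casts assume hist_pp may be stored as text.

```sql
-- Sketch only: names come from the question; the IN lists are the example groups.
select a.hist_id, a.hist_yr, a.hist_pp, a.hist_empl_id, a.hist_empl_sect_id
from temshist a
where a.hist_empl_sect_id in (98, 99, 100)             -- current row is in the "to" group
  and (select top 1 b.hist_empl_sect_id                -- section of the closest earlier pay period
       from temshist b
       where b.hist_empl_id = a.hist_empl_id
         and b.hist_yr = a.hist_yr
         and cast(b.hist_pp as int) < cast(a.hist_pp as int)
       order by cast(b.hist_pp as int) desc) in (60, 61, 62, 63)   -- and it was in the "from" group
order by a.hist_empl_id, a.hist_yr;
```

Because the subquery picks the closest earlier pay period rather than hist_pp - 1, gaps in hist_pp should not matter. On large tables the correlated lookup may be slower than the row_number or Lag versions.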


 