Let's say I have this table in a SQL Server database, sorted by increasing hist_pp:
hist_id  hist_yr  hist_pp  hist_empl_id  hist_empl_sect_id
90619    2017     5        00018509      61
92295    2017     6        00018509      61
93991    2017     7        00018509      61
95659    2017     8        00018509      99
103993   2017     9        00018509      99
120779   2017     10       00018509      99
I want to find the rows where hist_empl_sect_id changes from any value in one group of numbers, say (60, 61, 62, 63), to any value in another group of numbers, say (98, 99, 100, etc.). It has to be per year, so here for values in 2017. hist_pp is an increasing number within a year. hist_id is an autonumber ID column.
For this employee, it should return:
95659 2017 8 00018509 99
I've tried a few examples I have seen in other posts, including CTEs, and I can't seem to get it to work.
Here is an example of something I tried that didn't work. It returned multiple rows for an employee when there should only be one:
select a.hist_id, a.hist_yr, format(cast(a.hist_pp as integer), '0#') as hist_pp, a.hist_empl_id, a.hist_empl_sect_id
from temshist a
where a.hist_empl_sect_id <>
(SELECT top 1 b.hist_empl_sect_id
FROM temshist as b
where a.hist_empl_id = b.hist_empl_id
and a.hist_yr = b.hist_yr
and a.hist_pp > b.hist_pp
Order by b.hist_pp desc
)
order by hist_empl_id
I suspect Lag() would be a good fit here.
Example
;with cte as (
Select *
,PrevValue= Lag(hist_empl_sect_id,1,hist_empl_sect_id) over (Partition by hist_empl_id Order By hist_pp)
From @YourTable
)
Select *
From cte Where PrevValue/98<>hist_empl_sect_id/98
EDIT: As VamsiPrabhala pointed out, you can partition by year as well:
,PrevValue= Lag(hist_empl_sect_id,1,hist_empl_sect_id) over (Partition by hist_yr,hist_empl_id Order By hist_pp)
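The integer-division comparison above works on the sample data because 61/98 is 0 and 99/98 is 1, but it fires in both directions and depends on where the ID ranges happen to fall. Here is a minimal sketch of a variant that spells out the two groups and only reports a move from the first group into the second. The table variable @YourTable and the two ID lists are assumptions taken from the question's sample, so adjust them to your data.

```sql
-- Sample data from the question, loaded into a table variable (assumed name).
declare @YourTable table (
    hist_id int, hist_yr int, hist_pp int,
    hist_empl_id varchar(20), hist_empl_sect_id int
);
insert into @YourTable (hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id)
values
    (90619,  2017, 5,  '00018509', 61),
    (92295,  2017, 6,  '00018509', 61),
    (93991,  2017, 7,  '00018509', 61),
    (95659,  2017, 8,  '00018509', 99),
    (103993, 2017, 9,  '00018509', 99),
    (120779, 2017, 10, '00018509', 99);

with cte as (
    select *,
           -- No default here: the first row of each employee/year gets NULL,
           -- and NULL never satisfies the IN test below.
           PrevValue = lag(hist_empl_sect_id) over (
               partition by hist_yr, hist_empl_id order by hist_pp)
    from @YourTable
)
select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id
from cte
where PrevValue in (60, 61, 62, 63)        -- previous row was in the "from" group
  and hist_empl_sect_id in (98, 99, 100);  -- current row is in the "to" group
```

On the sample data this returns only the row with hist_id 95659. Like the original, it relies on LAG, so it needs SQL Server 2012 or later.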
Here is another option. I mocked it up with a CTE to simulate your data, then joined the CTE back to itself. Note that the join on hist_pp - 1 assumes hist_pp has no gaps within a year.
with emp_hist as
(
-- SQL Server has no DUAL table (that is Oracle); a SELECT without FROM works here
select 90619 as hist_id, 2017 as hist_yr, 5 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
union all
select 92295 as hist_id, 2017 as hist_yr, 6 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
union all
select 93991 as hist_id, 2017 as hist_yr, 7 as hist_pp, '00018509' as hist_empl_id, 61 as hist_empl_sect_id
union all
select 95659 as hist_id, 2017 as hist_yr, 8 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
union all
select 103993 as hist_id, 2017 as hist_yr, 9 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
union all
select 120779 as hist_id, 2017 as hist_yr, 10 as hist_pp, '00018509' as hist_empl_id, 99 as hist_empl_sect_id
)
select eh2.*
from emp_hist eh1
join emp_hist eh2
on eh1.hist_empl_id = eh2.hist_empl_id
and eh1.hist_pp = (eh2.hist_pp - 1)
and eh1.hist_yr = eh2.hist_yr
where eh2.hist_empl_sect_id in (98, 99, 100)
and eh1.hist_empl_sect_id in (60, 61, 62, 63)
;
You can use a CASE expression (to determine group membership) and the LAG window function (to compare two consecutive rows), partitioned by employee and year and ordered by hist_pp.
This assumes that (1) an employee ID can span multiple years; (2) hist_pp is unique for each employee ID and year combination; and (3) if hist_empl_sect_id does not change for an employee within a year, the result set should contain no rows for that employee ID and year combination.
hist_pp can have gaps.
select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id
from
(
select a.hist_id, a.hist_yr,
format(cast(a.hist_pp as integer), '0#') as hist_pp,
a.hist_empl_id,
-- hist_empl_sect_id of current row
a.hist_empl_sect_id,
-- hist_empl_sect_id of preceding row, when ordered by hist_pp for each employee year combination
lag(a.hist_empl_sect_id, 1)
OVER (
PARTITION BY a.hist_empl_id,a.hist_yr
-- order numerically; ordering by the formatted string would sort '100' before '11'
ORDER BY cast(a.hist_pp as integer)
) as prev_hist_empl_sect_id
from temshist a
) as outr
where
-- group membership of hist_empl_sect_id of current row
(case when hist_empl_sect_id IN (98, 99, 100) then 1 else 0 end)
<>
-- group membership of hist_empl_sect_id of preceding row, ordered by hist_pp for each year
(case when prev_hist_empl_sect_id IN (98, 99, 100) then 1 else 0 end)
AND
-- Preceding row does not belong to a different employee or year
prev_hist_empl_sect_id IS NOT NULL
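The query above reports every change in membership of the (98, 99, 100) group, in either direction, and can return several rows per employee and year. The question asks for a single row per employee, so here is a hedged sketch of one way to keep only the first move from the (60, 61, 62, 63) group into the (98, 99, 100) group. The CTE names flagged and changes and the column change_no are made up for illustration, and both ID lists are assumptions from the question.

```sql
with flagged as (
    select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id,
           -- section ID of the preceding row for the same employee and year
           lag(hist_empl_sect_id) over (
               partition by hist_empl_id, hist_yr
               order by cast(hist_pp as integer)) as prev_hist_empl_sect_id
    from temshist
),
changes as (
    select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id,
           -- numbers only the rows that survive the WHERE clause below
           row_number() over (
               partition by hist_empl_id, hist_yr
               order by cast(hist_pp as integer)) as change_no
    from flagged
    where prev_hist_empl_sect_id in (60, 61, 62, 63)
      and hist_empl_sect_id in (98, 99, 100)
)
select hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id
from changes
where change_no = 1;
```

Because window functions are evaluated after the WHERE clause, change_no counts only qualifying transitions, so change_no = 1 is the earliest one in each employee and year.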
Looking at your result data, I assume that for every hist_yr and hist_empl_id you need the row where hist_pp is the minimum among the rows with the maximum hist_empl_sect_id. The query below generates your desired output.
SELECT t3.*
FROM (
    SELECT t2.*,
           min(hist_pp) OVER (PARTITION BY hist_yr, hist_empl_id) AS hist_pp_minValue
    FROM (
        SELECT hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id, r1,
               max(r1) OVER (PARTITION BY hist_yr, hist_empl_id) AS maxRank
        FROM (
            SELECT hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id,
                   dense_rank() OVER (PARTITION BY hist_yr, hist_empl_id
                                      ORDER BY hist_empl_sect_id) AS r1
            FROM table1
        ) t1
    ) t2
    WHERE t2.maxRank = t2.r1
) t3
WHERE t3.hist_pp_minValue = t3.hist_pp
I tested it on the data provided in the question, and the result is below.
hist_id | hist_yr | hist_pp | hist_empl_id | hist_empl_sect_id
---------------------------------------------------------
95659 2017 8 18509 99
To double-check, I added some more sample data:
insert into table1 values(90619 ,2018,5 ,00018508,62);
insert into table1 values(92295 ,2018,6 ,00018508,62);
insert into table1 values(93991 ,2018,7 ,00018508,62);
insert into table1 values(95659 ,2018,8 ,00018508,91);
insert into table1 values(103993 ,2018,9 ,00018508,91);
insert into table1 values(120779 ,2018,10 ,00018508,91);
Below is the generated result.
hist_id | hist_yr | hist_pp | hist_empl_id | hist_empl_sect_id
---------------------------------------------------------
95659 2017 8 18509 99
95659 2018 8 18508 91
You can check demo here
Hope this will help.
I think you'll want to use a CTE to solve this problem. This is similar to what John Cappelletti appears to be doing, but it uses ROW_NUMBER() instead of LAG(), so it does not require SQL Server 2012 or later.
declare @temshist table
(
hist_id int,
hist_yr int,
hist_pp int,
hist_empl_id varchar(max),
hist_empl_sect_id int
)
insert into @temshist ( hist_id, hist_yr, hist_pp, hist_empl_id, hist_empl_sect_id )
values
( 90619, 2017, 5, '00018509', 61 ),
( 92295, 2017, 6, '00018509', 61 ),
( 93991, 2017, 7, '00018509', 61 ),
( 95659, 2017, 8, '00018509', 99 ),
( 103993, 2017, 9, '00018509', 99 ),
( 120779, 2017, 10, '00018509', 99 )
;with empl_cte as
(
select
row_number() over (partition by hist_empl_id, hist_yr order by hist_pp) as [rn],
hist_id,
hist_yr,
hist_pp,
hist_empl_id,
hist_empl_sect_id
from @temshist
)
select
nxt.hist_id,
nxt.hist_yr,
nxt.hist_pp,
nxt.hist_empl_id,
nxt.hist_empl_sect_id
from empl_cte prv
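-- inner join: the filter on nxt in the WHERE clause discards unmatched rows anyway
join empl_cte nxt on
prv.hist_empl_id = nxt.hist_empl_id and
prv.hist_yr = nxt.hist_yr and -- rn restarts every year, so pair rows within the same year only
prv.rn = nxt.rn - 1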
where prv.hist_empl_sect_id in (60, 61, 62, 63/*, ...*/) and nxt.hist_empl_sect_id in (98, 99, 100/*, ...*/)