Consider the following SAS code:
data test;
format dt date9.
ctry_cd $2.
sn $2.;
input ctry_cd sn dt;
datalines;
US 1 20000
US 1 20001
US 1 20002
CA 1 20003
CA 1 20004
US 1 20005
US 1 20006
US 1 20007
ES 2 20001
ES 2 20002
;
run;
proc sql;
create table check as
select
sn,
ctry_cd,
min(dt) as begin_dt format date9.,
max(dt) as end_dt format date9.
from test
group by sn, ctry_cd;
quit;
This returns:
1 CA 07OCT2014 08OCT2014
1 US 04OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014
I would like the proc sql to distinguish between the country moves; that is, return
1 US 04OCT2014 06OCT2014
1 CA 07OCT2014 08OCT2014
1 US 09OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014
So it still groups the instances by sn and ctry_cd, but pays attention to the date so that I get a timeline.
You need to create another grouping variable, one that increments every time the country changes from one row to the next:
data test;
set test;
prev_ctry_cd=lag(ctry_cd);
if prev_ctry_cd ^= ctry_cd then group+1;
run;
proc sql;
create table check as
select
sn,
ctry_cd,
min(dt) as begin_dt format date9.,
max(dt) as end_dt format date9.
from test
group by group, sn, ctry_cd
order by group;
quit;
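As a variant of the data step above, the counter can also be built with BY-group processing instead of lag(). This is only a sketch, not part of the original answer: the output data set name test_grouped is an assumption (chosen so that test is not overwritten), and it assumes the rows are already in timeline order within each sn, as in the example.

```sas
/* Sketch: number each consecutive run of (sn, ctry_cd). */
data test_grouped;                  /* hypothetical name; keeps test unchanged */
  set test;
  by sn ctry_cd notsorted;          /* notsorted: ctry_cd values are not in sorted order */
  if first.ctry_cd then group + 1;  /* sum statement: group is retained and starts at 0 */
run;
```

With the sample data, group should be 1 for the first three US rows, 2 for the CA rows, 3 for the second US run, and 4 for the ES rows. The proc sql step would then read from test_grouped. Unlike the lag() comparison, first.ctry_cd is also set when sn changes, so a new run starts even if two consecutive serial numbers share a country.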
If the data is sorted as per your example, then you can achieve your goal in a data step without creating an extra variable.
data want;
keep sn ctry_cd begin_dt end_dt; /* keeps required variables and sets variable order */
set test;
by sn ctry_cd notsorted; /* notsorted option needed as ctry_cd is not in order */
retain begin_dt; /* retains value until needed */
if first.ctry_cd then begin_dt=dt; /* store first date for each new ctry_cd */
if last.ctry_cd then do;
end_dt=dt; /* store last date for each new ctry_cd */
output; /* output result */
end;
format begin_dt end_dt date9.;
run;