SAS: select groups of observations based on values in two columns and multiple rows

In SAS, I need to select subjects and their data rows based on values in two variables across several rows. In the data below, ID is the relevant BY group. I need to output all rows for any ID that has X in (0,1,9) on at least one row and Y missing on every row. So no rows would be output for ID=01: it has X=1 on one row, but Y is non-missing on its two other rows. Both rows must be output for ID=02 and for ID=03, and the single row for ID=04 must be output. Thanks.

ID  X Y  
01  1 .  
01  . 1  
01  . 1  
02  0 .  
02  . .  
03  9 .  
03  . .  
04  1 .  

Try this:

data have;
input ID $ X Y;
cards;
01 1 .
01 . 1
01 . 1
02 0 .
02 . .
03 9 .
03 . .
04 1 .
;
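proc sql;
  /* Keep every row of an ID that has X in (0,1,9) on at least one row
     and Y missing on all rows. Testing x directly in HAVING would drop
     the rows where X is missing, e.g. the second row of ID=02. */
  select * from have
  group by id
  having max(x in (0,1,9)) = 1 and sum(y) is null;
quit;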

Another approach: flag the qualifying IDs in a DATA step, then merge that list back onto the data:

data have;
input ID $ X Y;
cards;
01 1 .
01 . 1
01 . 1
02 0 .
02 . .
03 9 .
03 . .
04 1 .
;
run;

proc sort data=have;
  by id;
run;

data list;
set have;
by id;
retain keepit;
if first.id then keepit = .;

if missing(keepit) or keepit=1 then do;
    if missing(y) then do;
        if x in (0,1,9) then keepit = 1;
    end;
    else keepit = 0;
end;

if last.id and keepit then output;
keep id;
run;

data want;
    merge
        have (in=a)
        list (in=b)
    ;
    by id;
    if a and b;
run;
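
The flag-then-merge logic above can also be written as a single DATA step with two DO loops over each BY group (often called a double DOW loop). This is only a sketch of an alternative, not part of the original answers: the names want_dow, has_x and has_y are made up for illustration, and it assumes have is already sorted by id.

```sas
data want_dow;
    /* Pass 1: scan one ID group and record what it contains.
       has_x and has_y are reset to missing at the start of each
       DATA step iteration, i.e. once per ID group. */
    do until (last.id);
        set have;
        by id;
        if x in (0, 1, 9) then has_x = 1;
        if not missing(y) then has_y = 1;
    end;

    /* Pass 2: re-read the same group and output its rows only if
       some row had X in (0,1,9) and no row had a non-missing Y. */
    do until (last.id);
        set have;
        by id;
        if has_x and not has_y then output;
    end;

    drop has_x has_y;
run;
```

With the sample data this should keep both rows for ID=02 and ID=03 and the single row for ID=04, and nothing for ID=01.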
