How do I add rows with a missing score for absent attempts in a single DATA step?

Here is a simple example I came up with. There are 3 players here (id is 1,2,3) and each player gets 3 attempts at the game (attempt is 1,2,3).

data have;
  infile datalines delimiter=",";
  input id attempt score;
datalines;
1,1,100
1,2,200
2,1,150
3,1,60
;
run;    

I would like to add a row with a missing score for each attempt (2 or 3) that a player did not play.

data want;
set have;
by id attempt;

* ??? ;

run;
proc print data=want;
run;

The output would look something like this.

1   1   100
1   2   200
1   3   .
2   1   150
2   2   .
2   3   .
3   1   60
3   2   .
3   3   .

How do I go about doing this?

You could solve this by first creating a table with the structure you want to see: three attempts for each ID. You can then left join this structure to your 'have' table to pick up the actual scores where they exist and a missing value where they don't.

/* Create table with all ids for which the structure needs to be created */ 

proc sql;
    create table ids as
    select distinct id from have;
quit;

/* Create table structure with 3 attempts per ID */
data ids (drop = i);
  set ids;

  do i = 1 to 3;
    attempt = i;
    output;
  end;
run;

/* Join the table structure to the actual scores in the have table */
proc sql;
  create table want as
  select a.*,
         b.score
  from ids a left join have b on a.id = b.id and a.attempt = b.attempt;
quit;

A table of possible attempts cross joined with the distinct ids left joined to the data will produce the desired result set.

Example:

data have;
  infile datalines delimiter=",";
  input id attempt score;
datalines;
1,1,100
1,2,200
2,1,150
3,1,60
;
data attempts;
do attempt = 1 to 3; output; end;
run;

proc sql;
  create table want as
  select 
    each_id.id, 
    each_attempt.attempt,
    have.score
  from 
    (select distinct id from have) each_id
  cross join 
    attempts each_attempt
  left join 
    have
  on
    each_id.id = have.id
    and each_attempt.attempt = have.attempt
  order by
    id, attempt
  ;
quit;

Update: I figured it out.

proc sort data=have;
  by id attempt;
run;

data want;
  set have (rename=(attempt=orig_attempt score=orig_score));
  by id;
  ** Previous attempt number **;
  retain prev;

  if first.id then prev = 0;

  ** If there is a gap between previous attempt and current attempt, output a blank record for each intervening attempt **;
  if orig_attempt > prev + 1 then do attempt = prev + 1 to orig_attempt - 1;
    score = .;
    output;
  end;

  ** Output current attempt **;
  attempt = orig_attempt;
  score = orig_score;
  output;

  ** If this is the last record and there are more attempts that should be included, output dummy records for them **;
  ** (Assumes that you know the maximum number of attempts) **;
  if last.id & attempt < 3 then do attempt = attempt + 1 to 3;
    score = .;
    output;
  end;

  ** Update last attempt used in this iteration **;
  prev = attempt;

  ** Keep only id, attempt and score in the output **;
  drop orig_attempt orig_score prev;

run;

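Here is an alternative DATA step that uses a DOW loop. It fills only trailing gaps: once the last record of an id has been read, it outputs the remaining attempts up to 3 with a missing score.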

data want;
  do until (last.id);
    set have;
    by id;
    output;
  end;
  call missing(score);
  do attempt = attempt+1 to 3;
    output;
  end;
run;
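
If gaps can also occur before or between the recorded attempts, one possible variant of the DOW loop buffers the scores of each id in a temporary array and then writes all three attempts. This is only a sketch: the names want_any, s and i are made up for illustration, and it assumes that have is sorted by id and that attempt is always 1, 2 or 3.

```sas
/* Sketch: assumes have is sorted by id and attempt is always 1, 2 or 3 */
data want_any;
  array s[3] _temporary_;   /* one slot per possible attempt */

  /* Reset the buffer; temporary arrays keep their values across iterations */
  do i = 1 to 3;
    s[i] = .;
  end;

  /* DOW loop: read every record of the current id and store its score */
  do until (last.id);
    set have;
    by id;
    s[attempt] = score;
  end;

  /* Write all three attempts; slots that were never filled stay missing */
  do attempt = 1 to 3;
    score = s[attempt];
    output;
  end;

  drop i;
run;
```

Because every id is written from the buffer, the position of the gap does not matter, but an attempt value outside 1 to 3 would cause an array subscript error.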

If the absent observations are only at the end of each id, a couple of OUTPUT statements and a DO loop are enough: write each observation as it is read, and if the last one for the id is not attempt 3, add observations until you reach attempt 3.

data want1;
  set have ;
  by id;
  output;
  score=.;
  if last.id then do attempt=attempt+1 to 3;
    output;
  end;
run;

If the absent attempts can also appear between recorded attempts, then you need to "look ahead" to see whether the next observation skips any attempts.

data want2;
  set have end=eof;
  by id ;
  if not eof then set have (firstobs=2 keep=attempt rename=(attempt=next));
  if last.id then next=3+1;
  output;
  score=.;
  do attempt=attempt+1 to next-1;
    output;
  end;
  drop next;
run;
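
To see how the look-ahead step treats a gap in the middle, it can be run against a small test table. The table names have_gaps and want_gaps and the values in them are invented for this illustration; they are not part of the original question.

```sas
/* Hypothetical test data: id 1 skips attempt 2, id 2 skips attempts 1 and 3 */
data have_gaps;
  infile datalines delimiter=",";
  input id attempt score;
datalines;
1,1,100
1,3,300
2,2,150
;
run;

/* Same look-ahead logic as want2, pointed at have_gaps */
data want_gaps;
  set have_gaps end=eof;
  by id;
  /* Peek at the attempt number of the following observation */
  if not eof then set have_gaps (firstobs=2 keep=attempt rename=(attempt=next));
  /* At the end of an id, pretend the next attempt is one past the maximum */
  if last.id then next = 3 + 1;
  output;
  score = .;
  /* Fill every attempt between the current one and the next one */
  do attempt = attempt + 1 to next - 1;
    output;
  end;
  drop next;
run;

proc print data=want_gaps noobs;
run;
```

This fills attempt 2 for id 1 and attempt 3 for id 2, but not attempt 1 for id 2: the step only looks forward, so a gap before an id's first recorded attempt is left unfilled.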
