
SAS loop through 56 character string extract every two characters

I have a couple million records, each with a string like
"00 00 01 00 00 01 00 01 00 00 00 00 01 01 00 01 00 00 00 00 01"
The string has a length of 56. All positions are filled with either a 0 or a 1.
My job is to parse the string of each record every two positions
(there are no spaces in the real data; they are shown here only for clarity).

If there is a 1 in position two, that means increment var1 by 1.
If there is ALSO a 1 in position four (I don't care about the leading "0"s
in positions 1/3/5/7...55), increment var2 by 1, and so on, up to 28 variables.

The entire 56-character string must be parsed every two characters. Potentially
there could be 28 variables that have to be incremented (not realistic;
most likely there are only five or six), and the 1s could be found in any part of the
string, beginning to end, as long as they are in positions 2/4/6/8 and so on up to 56.

This is what my boss gave me:
if substr(BigString,2,1)='1' then var1+1;

OK. Fine.
A) There are 27 more places to evaluate in the string.
B) There are a couple million records.

28 nested IF-THEN-DO blocks (all I could think of) don't sound like an answer. At least not to me.
Thanx.

If I understood the problem correctly, this could be a solution (edited, second version):

/* example with same row*/
data test;
a="00000100000100010000000001010001000000000100000000011110";output;
a="10000100000100010000000001010001000000000100011100011101";output;
a="01000100000100010000000001010001000000000100000001000000";output;
a="10100100000100010000000001010001000000000111111111111110";output;
a="01100100000100010000000001010001000000000101010101010101";output;
a="00000100000100010000000001010001000000000100001100101010";output;
run;

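/* work row by row: the macro loop generates one assignment per position */
/* note: this creates a 0/1 flag for every position (var1-var56),        */
/* not only for the even positions asked about in the question           */
%macro x;
data test_output;
 set test;
    %do i=1 %to 56;
        var&i. = input(substr(a,&i,1), 1.);
    %end;
run;
%mend;
%x;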

/* results:
a                                                         var1 var2 var3 var4 var5 var6 var7 ...
00000100000100010000000001010001000000000100000000011110     0    0    0    0    0    1    0 ...
10000100000100010000000001010001000000000100011100011101     1    0    0    0    0    1    0 ...
01000100000100010000000001010001000000000100000001000000     0    1    0    0    0    1    0 ...
10100100000100010000000001010001000000000111111111111110     1    0    1    0    0    1    0 ...
01100100000100010000000001010001000000000101010101010101     0    1    1    0    0    1    0 ...
00000100000100010000000001010001000000000100001100101010     0    0    0    0    0    1    0 ...
*/

I think the author is looking for a DO-loop method. So my suggestion is a macro %DO loop or an ARRAY statement in a data step.

data _null_;
    text = '000001000001000100000000010100010000000001';

    y = length(text);
    array Var[28];
    do i = 1 to dim(Var);
        Var[i] + (substrn(text,i*2,1)='1');
        put i = Var[i]=;
    end;
run;

Kind of easy, isn't it?

Put the variables that may need incrementing into an array. A DO loop can then examine each part of the string and conditionally apply the needed increment.

The SUM statement <variable>+<expression> means the variable's value is automatically retained from row to row.

Because the variables are retained, you might want only the final var1-var28 values, taken at the last row of the data. The question does not give enough information about what is to be done with the var<n> variables.
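As a small illustration of that retention, here is a sketch that puts a SUM statement next to its explicit RETAIN equivalent. The data set name demo and the variables flag, total and total2 are made up for this example and do not come from the question.

```sas
data demo;
  input flag;
  total + flag;                 /* SUM statement: total starts at 0 and is retained */
  retain total2 0;              /* explicit equivalent: retain plus SUM function    */
  total2 = sum(total2, flag);
  datalines;
1
0
1
;
run;

proc print data=demo;
run;
```

Both total and total2 should read 1, 1, 2 across the three rows, because each row's value is added to the running value carried over from the previous row.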

Example:

Presume the string is named op_string (op for operation). The code relies on a logical evaluation resulting in 1 for true and 0 for false.

data want(keep=var1-var28); 
  set have end=done;
  array var var1-var28;
  do index = 1 to 28;
    * Third argument 1 takes a single character; without it SUBSTR returns the rest of the string;
    var(index) + (substr(op_string, 2 * index, 1) = '1');  * Add 0 or 1 according to logic eval;
  end;
  if done;  * output one row at the end of the data set;
run;
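To check the logic on something small, here is a scaled-down sketch of the same approach. It assumes 8-character strings and four counters instead of 56 characters and 28 counters. The three sample rows and the array name v are invented for illustration.

```sas
/* hypothetical sample data: 8-character strings, so 4 even positions */
data have;
  length op_string $8;
  input op_string $;
  datalines;
01000100
01010000
00000101
;
run;

data want(keep=var1-var4);
  set have end=done;
  array v[4] var1-var4;
  do index = 1 to dim(v);
    /* add 1 when the character at the even position 2*index is '1' */
    v[index] + (substr(op_string, 2 * index, 1) = '1');
  end;
  if done;   /* keep only the final totals */
run;

proc print data=want;
run;
```

For these three rows the single output row should be var1=2, var2=1, var3=2, var4=1. Removing the `if done;` line would instead write the running totals for every input row.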

Then use COUNTC() to count the number of 1's in the string.

data want;
set have;
value = countc(op_string, '1');
run;
