
SAS: Collapsing and weighted averages calculations

I have a SAS programming problem which I can't solve on my own, and I'm thankful for any input.

I want to collapse a dataset by a grouping variable, average two variables using the weights given by a third variable, and then subtract one average from the other:

Example data

number   flag     volume   measure1  measure2
1         A         1         2         2        
2         B         2         4         5
3         A         5         8         20
4         B         10        4         1
5         A         9         10        11
6         B         5         2         9
7         A         4         11        23
8         B         3         1         8

Now I want the volume-weighted averages of measure1 and measure2, and then their difference (measure1 - measure2), all grouped by the flags A and B:

Number   Flag   Volume   VolWeightMeasure1               VolWeightMeasure2   FinalMeasure
1        A      19       ((1/19)*2)+((5/19)*8)+...       ...                 (VolWeightMeasure1-VolWeightMeasure2)
2        B      20       ((2/20)*4)+((10/20)*4)+...      ...                 (VolWeightMeasure1-VolWeightMeasure2)

So basically collapsing, but with volume-weighted measures, and then subtracting one from the other. Thank you for any input!

Best

This can be done in a single DATA step using two consecutive DO loops, each with its own SET statement (a technique often called the double DOW loop, for Do-Loop of Whitlock).

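The first loop totals VOLUME for the current flag group. The second loop rereads the same group and accumulates the weighted measures. Because the implicit OUTPUT runs only once per DATA step iteration, after both loops, one row per group goes to the output.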

data have;
input  flag $ volume measure1 measure2;
datalines;
        A         1         2         2
        B         2         4         5
        A         5         8         20
        B         10        4         1
        A         9         10        11
        B         5         2         9
        A         4         11        23
        B         3         1         8
;
run;

proc sort data = have; by flag; run;

data want;

  /* Pass 1: total VOLUME within the current flag group */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    sum_vol = sum(sum_vol,volume);
  end;

  /* Pass 2: reread the same group and accumulate the weighted measures */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    VolWeightMeasure1 = sum(VolWeightMeasure1,(volume/sum_vol)*measure1);
    VolWeightMeasure2 = sum(VolWeightMeasure2,(volume/sum_vol)*measure2);
  end;

  FinalMeasure = VolWeightMeasure1 - VolWeightMeasure2;

  /* The implicit OUTPUT at the end of the step writes one row per flag group */
  drop volume measure1 measure2;
  rename sum_vol = Volume;
run;

The same result can be obtained with PROC SQL, aggregating in a subquery and dividing in the outer query:

proc sql;
   select flag, sum_volume,
          sum1/sum_volume as volweightmeasure1,
          sum2/sum_volume as volweightmeasure2,
          calculated volweightmeasure1 - calculated volweightmeasure2 as finalmeasure
   from (select flag, sum(volume) as sum_volume,
                sum(volume*measure1) as sum1, sum(volume*measure2) as sum2
         from have group by flag);
quit;
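
The query above only prints its result. If the collapsed data is needed as a dataset, a single-level variant along these lines might work. This is a sketch rather than part of the original answer: the output name want_sql is an assumption, and the subquery is folded into one SELECT.

```sas
/* Sketch: store the grouped result in a table instead of only printing it. */
/* want_sql is a hypothetical output name, not taken from the original.     */
proc sql;
  create table want_sql as
  select flag,
         sum(volume)                        as sum_volume,
         sum(volume*measure1) / sum(volume) as VolWeightMeasure1,
         sum(volume*measure2) / sum(volume) as VolWeightMeasure2,
         /* CALCULATED reuses the two columns computed just above */
         calculated VolWeightMeasure1
           - calculated VolWeightMeasure2   as FinalMeasure
  from have
  group by flag;
quit;
```

Dividing the summed products by the summed volume is algebraically the same as summing (volume/sum_vol)*measure row by row, so no second pass over the data is needed.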

If you're comfortable with proc summary/means you can do most of the legwork with it:

proc summary data=have nway;
  class flag;
  var measure1 measure2;
  wgt volume;
  output out=wantcomp(drop=_:) sumwgt=Volume mean=VolWeightMeasure1 VolWeightMeasure2;
run;

data want;
  set wantcomp;  /* read the PROC SUMMARY output created above */
  FinalMeasure = VolWeightMeasure1-VolWeightMeasure2;
run;
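
As a check on any of the approaches above, the expected values can be worked out by hand from the sample data. The notation is introduced here and is not in the original posts: v_i is the volume of a row and m_i is one of its measures.

```latex
\begin{align*}
\bar m &= \frac{\sum_i v_i m_i}{\sum_i v_i} \\[4pt]
\text{A: } \bar m_1 &= \frac{1\cdot 2 + 5\cdot 8 + 9\cdot 10 + 4\cdot 11}{19}
                     = \frac{176}{19} \approx 9.263 \\
\text{A: } \bar m_2 &= \frac{1\cdot 2 + 5\cdot 20 + 9\cdot 11 + 4\cdot 23}{19}
                     = \frac{293}{19} \approx 15.421 \\
\text{A: } \bar m_1 - \bar m_2 &= -\frac{117}{19} \approx -6.158 \\[4pt]
\text{B: } \bar m_1 &= \frac{2\cdot 4 + 10\cdot 4 + 5\cdot 2 + 3\cdot 1}{20}
                     = \frac{61}{20} = 3.05 \\
\text{B: } \bar m_2 &= \frac{2\cdot 5 + 10\cdot 1 + 5\cdot 9 + 3\cdot 8}{20}
                     = \frac{89}{20} = 4.45 \\
\text{B: } \bar m_1 - \bar m_2 &= -1.40
\end{align*}
```

Whichever method is used, the output should therefore have two rows: flag A with Volume 19 and FinalMeasure near -6.158, and flag B with Volume 20 and FinalMeasure -1.40.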
