SAS: Collapsing and weighted averages calculations

I have a SAS programming problem that I can't solve on my own, and I'd be thankful for any input.

I want to collapse a dataset by a grouping variable, compute averages of two variables weighted by a third variable, and subtract one average from the other:

Example data:

number   flag     volume   measure1  measure2
1         A         1         2         2        
2         B         2         4         5
3         A         5         8         20
4         B         10        4         1
5         A         9         10        11
6         B         5         2         9
7         A         4         11        23
8         B         3         1         8

Now I want the volume-weighted averages of measure1 and measure2, and then measure1 - measure2 computed from those averages, all grouped by the flags A and B:

Number Flag      Volume       VolWeightMeasure1      VolWeightMeasure2      FinalMeasure
1        A        19        ((1/19)*2)+((5/19)*8)+...     ...            (VolWeightMeasure1-VolWeightMeasure2)
2        B        20        ((2/20)*4)+((10/20)*4)+...    ...            (VolWeightMeasure1-VolWeightMeasure2)

So basically collapsing, but with volume-weighted measures, and then subtracting one from the other. Thank you for any input!

Best

This can be done in a single DATA step using two DO loops, each with its own SET statement (often referred to as a double DOW loop, short for Do-Loop-of-Whitlock).

The first loop aggregates the value of VOLUME. In the second loop the formulas are calculated. Only one observation per group goes to the output.

data have;
input  flag $ volume measure1 measure2;
datalines;
        A         1         2         2        
        B         2         4         5
        A         5         8         20
        B         10        4         1
        A         9         10        11
        B         5         2         9
        A         4         11        23
        B         3         1         8
;
run;

proc sort data = have; by flag; run;

data want;

  /* First pass over the BY group: accumulate the total volume. */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    sum_vol = sum(sum_vol,volume);
  end;

  /* Second pass over the same BY group: accumulate the volume-weighted means. */
  do _n_ = 1 by 1 until (last.flag);
    set have;
    by flag;

    VolWeightMeasure1 = sum(VolWeightMeasure1,(volume/sum_vol)*measure1);
    VolWeightMeasure2 = sum(VolWeightMeasure2,(volume/sum_vol)*measure2);
  end;

  FinalMeasure = VolWeightMeasure1 - VolWeightMeasure2;

  /* The implicit OUTPUT at the end of the step writes one row per flag. */
  drop volume measure1 measure2;
  rename sum_vol = Volume;
run;

The same result can be obtained with PROC SQL, aggregating in a subquery:

proc sql;
   select flag, sum_volume,
          sum1/sum_volume as volweightmeasure1,
          sum2/sum_volume as volweightmeasure2,
          calculated volweightmeasure1 - calculated volweightmeasure2 as finalmeasure
   from (select flag, sum(volume) as sum_volume,
                sum(volume*measure1) as sum1, sum(volume*measure2) as sum2
         from have group by flag);
quit;

If you're comfortable with proc summary/means, you can do most of the legwork with it:

proc summary data=have nway;
  class flag;
  var measure1 measure2;
  wgt volume;
  output out=wantcomp(drop=_:) sumwgt=Volume mean=VolWeightMeasure1 VolWeightMeasure2;
run;

data want;
  set wantcomp;
  FinalMeasure = VolWeightMeasure1-VolWeightMeasure2;
run;
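
As a sanity check on the approaches above, the quantity being computed can be written out and evaluated by hand for the sample data. The notation here is my own and not from the original posts: v_i is volume, m_i is either measure, and V_g is the total volume of flag group g. The first identity shows why accumulating (volume/sum_vol)*measure, as in the DATA step, and dividing sum(volume*measure) by sum(volume), as in PROC SQL, should agree.

```latex
\bar{m}_g \;=\; \sum_{i \in g} \frac{v_i}{V_g}\, m_i \;=\; \frac{\sum_{i \in g} v_i m_i}{V_g},
\qquad V_g = \sum_{i \in g} v_i

% Flag A: V_A = 1 + 5 + 9 + 4 = 19
\bar{m}_{1,A} = \frac{1\cdot 2 + 5\cdot 8 + 9\cdot 10 + 4\cdot 11}{19} = \frac{176}{19} \approx 9.263
\qquad
\bar{m}_{2,A} = \frac{1\cdot 2 + 5\cdot 20 + 9\cdot 11 + 4\cdot 23}{19} = \frac{293}{19} \approx 15.421

% Flag B: V_B = 2 + 10 + 5 + 3 = 20
\bar{m}_{1,B} = \frac{2\cdot 4 + 10\cdot 4 + 5\cdot 2 + 3\cdot 1}{20} = \frac{61}{20} = 3.05
\qquad
\bar{m}_{2,B} = \frac{2\cdot 5 + 10\cdot 1 + 5\cdot 9 + 3\cdot 8}{20} = \frac{89}{20} = 4.45

% FinalMeasure = weighted measure1 minus weighted measure2
\text{FinalMeasure}_A = \frac{176 - 293}{19} \approx -6.158
\qquad
\text{FinalMeasure}_B = 3.05 - 4.45 = -1.40
```

Whichever approach is used, the output should have one row per flag with values close to these, up to floating-point rounding.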
