SAS Macro Proc Logistic put P-value in a dataset

I've googled lots of papers on the subject but can't seem to find what I want. I'm a beginner with SAS macros and hope to get some help here. Here is what I want:

I have a dataset with 1200 variables. I want a macro that runs 1199 of those variables as the OUTCOME, one at a time, and stores the p-values of the logistic regressions in a dataset. The independent variable "gender" is character, and so are the outcome variables, but I don't know how to put the class statement in the macro. Here is an example of how I run it as a single procedure.

 proc logistic data=Baseline_gender ;
 class gender(ref="Male") / param=ref;
 model N284(event='1')=gender ; 
 ods output ParameterEstimates=ok;
 run;

My idea was to create the ODS output, drop every variable other than the p-value, and merge the results into one dataset keyed by the OUTCOME variable names in the model, e.g.

 Variable P-value
 A1       0.005
 A2       0.018
 ..       ....

I tried to play with some macros but I just can't get it to work! I really need help on this. Thank you very much.

SRSwift might be onto something (I don't know enough about his method to tell), but here's a way to do it using a macro.

First, count the number of variables in your dataset. Do this by selecting your table's rows from the dictionary.columns table. This puts the number of variables into &sqlobs. Now read the variable names from the dictionary table into the macro variables var1-var&sqlobs.

%macro logitall;
proc sql;
create table count as
select name from dictionary.columns
where upcase(libname) = 'WORK'
  and upcase(memname) = 'BASELINE_GENDER'
  and upcase(name) ne 'GENDER'
;

select name into :var1 - :var&sqlobs
from dictionary.columns
where upcase(libname) = 'WORK'
  and upcase(memname) = 'BASELINE_GENDER'
  and upcase(name) ne 'GENDER'
;
quit;

Then run proc logistic for each dependent variable, each time outputting a dataset named after the dependent variable.

%do I = 1 %to &sqlobs;
  proc logistic data=Baseline_gender ;
    class gender(ref="Male") / param=ref;
    model &&var&I.(event='1')=gender ; 
    ods output ParameterEstimates=&&var&I.;
  run;
%end;

Now put all the output datasets together, creating a new variable that holds the dataset name by using indsname= in the set statement.

data allvars;
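  /* SAS names are at most 32 characters long */
  length varname $32;
  set
  %do I = 1 %to &sqlobs;
    &&var&I.
  %end;
  indsname=dsname;
  /* dsname holds LIBREF.MEMBER (e.g. WORK.N284); keep only the member name */
  varname=scan(dsname, 2, '.');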
  keep varname ProbChiSq;
  where variable ne 'Intercept';
run;
%mend logitall;

%logitall;
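
To see what the dictionary.columns lookup produces before wiring it into the macro, a minimal sketch along these lines can be run on its own. It uses the shipped SASHELP.CLASS table as a stand-in for Baseline_gender, the macro variable names name1, name2, ... are made up for illustration, and the open-ended range in the INTO clause assumes SAS 9.3 or later, where the separate counting query should not be needed.

```sas
/* Sketch: list the columns of a table and load them into macro variables. */
/* SASHELP.CLASS stands in for Baseline_gender; SEX plays the role of gender. */
proc sql noprint;
  /* ":name1 -" is an open-ended range (assumes SAS 9.3 or later) */
  select name into :name1 -
  from dictionary.columns
  where libname = 'SASHELP'
    and memname = 'CLASS'
    and upcase(name) ne 'SEX';
quit;

/* SQLOBS holds the number of rows the last query returned */
%put NOTE: &sqlobs columns found, the first is &name1;
```

Checking the log note first makes it easier to spot a misspelled libname or memname, which would otherwise just give an empty loop.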

Here is a macro-free approach. It restructures the data in advance and uses SAS's by-group processing. The data is stored in a deep (long) format, where all the outcome variable values are stored in one new variable.

Create some sample data:

data have;
   input 
        outcome1 
        outcome2 
        outcome3 
        gender $;
   datalines;
1 1 1 Male
0 1 1 Male
1 0 1 Female
0 1 0 Male
1 1 0 Female
0 0 0 Female
;
run;

Next, transpose the data into a deep format using an array:

data trans;
    set have;
    /* Create an array of all the outcome variables */
    array o{*} outcome:;
    /* Loop over the outcome variables */
    do i = 1 to dim(o);
        /* Store the variable name for grouping */
        _NAME_ = vname(o[i]);
        /* Store the outcome value in the new variable */
        outcome = o[i];
        output;
    end;
    keep _NAME_ outcome gender;
run;
proc sort data = trans;
    by _NAME_;
run;

Reuse your logistic procedure, but with an additional by statement:

proc logistic data = trans;
    /* Use the grouping variable to select multiple analyses  */
    by _NAME_;
    class gender(ref = "Male") / param = ref;
    /* Use the new variable for the dependent variable */
    model outcome(event = '1') = gender;
    ods output ParameterEstimates = ok;
run;
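
The ok dataset produced by this step has one block of rows per value of _NAME_. A minimal follow-up sketch, assuming the default ParameterEstimates columns Variable and ProbChiSq and using a made-up output name pvalues and column name Outcome, reduces it to the two-column layout asked for in the question:

```sas
/* Sketch: keep one p-value row per outcome from the BY-group output. */
data pvalues;
  set ok;
  /* Drop intercept rows, if the model has one */
  where Variable ne 'Intercept';
  /* _NAME_ carries the original outcome variable name */
  rename _NAME_ = Outcome;
  /* KEEP refers to the name as it is before the rename */
  keep _NAME_ ProbChiSq;
run;

proc print data = pvalues noobs;
run;
```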

Here is another way to do it using a macro. First define all the variables to be used as outcomes in a global macro variable, and then write the macro.

%let var = var1 var2 var3 ..... var1199;

%macro log_regression;
  /* countw counts the blank-separated names, even with repeated blanks */
  %do i=1 %to %sysfunc(countw(&var., %str( )));
    %let outcome_var = %scan(&var., &i, %str( ));
    %put &outcome_var.;

    proc logistic data = baseline_gender desc;
    class gender (ref = "Male") / param = ref;
    model &outcome_var. = gender;
    ods output ParameterEstimates = ParEst_&outcome_var.;
    run;

    %if %sysfunc(exist(univar_result)) %then %do;
      data univar_result;
      set univar_result ParEst_&outcome_var.;
      run;
    %end;
    %else %do;
      data univar_result;
      set ParEst_&outcome_var.;
      run;
    %end;

  %end;
%mend log_regression;

%log_regression;
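
Because the stacked ParameterEstimates tables do not record which outcome each row came from, one option is to recover that from the dataset names afterwards. The following is a minimal sketch, assuming the macro has been run, that the ParEst_ datasets it created are still in WORK, and that no other dataset there starts with ParEst_. The names univar_pvalues, src and outcome are made up for illustration.

```sas
/* Sketch: stack the per-outcome tables and label each row with its outcome. */
data univar_pvalues;
  length outcome $32;
  /* "ParEst_:" reads every dataset whose name starts with ParEst_ */
  /* src receives the source dataset name, e.g. WORK.PAREST_VAR1 */
  set ParEst_: indsname = src;
  where Variable ne 'Intercept';
  /* Strip the libref and the 7-character PAREST_ prefix */
  outcome = substr(scan(src, 2, '.'), 8);
  keep outcome ProbChiSq;
run;
```

The outcome names come back in upper case, since that is how SAS reports dataset names through indsname=.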
