How do I use a macro variable in a WHERE statement to subset data by a string? (SAS 9.3)

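I want to loop a PROC SQL step over a list of variables in a dataset and, within the SQL code, use each variable from the list in a WHERE clause to subset observations by a character value. Specifically, I want to count the observations in which each variable in the list is coded as "Unknown".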

Setting WHERE MISSING(&VAL)=1 works without any problem, but I run into trouble when I try to reference a character value.

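Here is my code. Since I apparently can't bold the area that is giving me trouble, I have marked it with <-- PROBLEM AREA (near the bottom). Besides a solution, any other tips for making my code more efficient would be appreciated.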

    %MACRO PERCENTMISSING(LIST);
    PROC SQL NOPRINT;
       %LET N=%SYSFUNC(COUNTW(&LIST));
       %DO I=1 %TO &N;
       %LET VAL = %SCAN(&LIST,&I);
    CREATE TABLE WORK.SALM_&VAL AS
        SELECT DISTINCT "Salmonella" as PATHOGEN,
                            A.YEAR,
                            X.Missing&VAL,
                            Y.Total&VAL,
                            (X.Missing&VAL/Y.Total&VAL) as PropMiss&VAL,
                            C.Unknown&VAL,
                            (C.Unknown&Val/Y.Total&VAL) as PropUnk&VAL
        FROM allsalm as A
        INNER JOIN (
                    SELECT  YEAR,
                            COUNT(*) AS Missing&VAL
                    FROM allsalm
                    WHERE MISSING(&VAL)=1
                    GROUP BY Year) X
        ON A.Year=X.Year
        INNER JOIN (
                    SELECT  YEAR,
                            COUNT(*) AS Total&VAL
                    FROM allsalm
                    GROUP BY Year) Y
        ON A.Year=Y.Year
        INNER JOIN (
                    SELECT  YEAR,
                            COUNT(*) AS Unknown&VAL
                    FROM allsalm
                    WHERE &VAL IN ("Unknown") /* <-- PROBLEM AREA */
                    GROUP BY Year) C
        ON A.Year=C.Year
        ;
    %END;
    QUIT;
    %MEND;

The error message I get is:

ERROR: Column UnknownCity could not be found in the table/view identified with the correlation name C.
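
One way to investigate an error like this is to have SAS write the generated code and the resolved macro variables to the log. The following is a minimal sketch, not a diagnosis of this particular error: `MPRINT` and `SYMBOLGEN` are standard SAS system options, while the variable name `City` is assumed from the error message above.

```sas
/* Log the statements a macro generates and how each macro variable resolves */
options mprint symbolgen;

/* Check how the text would resolve for one variable (City is assumed here) */
%let VAL = City;
%put Generated clause: WHERE &VAL IN ("Unknown");
%put Generated column: C.Unknown&VAL;

/* With the macro compiled and allsalm available, rerun it and read the log: */
/* %PERCENTMISSING(City);                                                     */

/* Switch the extra logging off again */
options nomprint nosymbolgen;
```

Comparing the column names printed in the log with the aliases created in each subquery shows whether the generated SQL matches what was intended.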

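I figured it out myself, and added another DO loop so that the PROC SQL step runs over a list of variables for each dataset in a list of datasets. This may be a decent template for anyone trying to calculate the proportion of missing values (and/or "Unknown" values, if your data also happens to encode missingness that way) for any number of variables in any number of datasets.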

    %MACRO PERCENTMISSING(LIST1,LIST2);
        /* LIST1 = dataset names, LIST2 = variable names (both space-separated) */
        %LOCAL I J N1 N2 VAL1 VAL2;
        %LET N1=%SYSFUNC(COUNTW(&LIST1));
        %LET N2=%SYSFUNC(COUNTW(&LIST2));
        %DO I=1 %TO &N1;
            %LET VAL1 = %SCAN(&LIST1,&I);
            %DO J=1 %TO &N2;
                %LET VAL2 = %SCAN(&LIST2,&J);

                /* One output table per dataset/variable pair */
                PROC SQL NOPRINT;
                CREATE TABLE &VAL1&VAL2 AS
                    SELECT DISTINCT "&VAL1" as PATHOGEN,
                                    A.YEAR,
                                    X.Missing&VAL2,
                                    Y.Total&VAL2,
                                    (X.Missing&VAL2/Y.Total&VAL2) as PropMiss&VAL2,
                                    C.Unknown&VAL2,
                                    (C.Unknown&VAL2/Y.Total&VAL2) as PropUnk&VAL2
                    FROM &VAL1 as A
                    /* Missing values per year; MISSING() already covers blank character values */
                    LEFT JOIN (
                                SELECT  YEAR,
                                        COUNT(*) AS Missing&VAL2
                                FROM &VAL1
                                WHERE (MISSING(&VAL2)=1) OR (&VAL2=" ")
                                GROUP BY Year) X
                    ON A.Year=X.Year
                    /* Total observations per year */
                    LEFT JOIN (
                                SELECT  YEAR,
                                        COUNT(*) AS Total&VAL2
                                FROM &VAL1
                                GROUP BY Year) Y
                    ON A.Year=Y.Year
                    /* Observations coded as unknown per year */
                    LEFT JOIN (
                                SELECT  YEAR,
                                        COUNT(*) AS Unknown&VAL2
                                FROM &VAL1
                                WHERE &VAL2 IN ("U","Unknown")
                                GROUP BY Year) C
                    ON A.Year=C.Year;
                QUIT;
            %END;
        %END;
    %MEND;

Then just call the macro, passing the table names as LIST1 and the variable names as LIST2. For example:

    %PERCENTMISSING(Table1 Table2 Table3 Table4,Var1 Var2 Var3 Var4 Var5);
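
On the request for efficiency tips: the three joined subqueries can probably be collapsed into a single grouped query, because PROC SQL evaluates a comparison to 1 or 0 and `MISSING()` returns 1 or 0, so `SUM` and `MEAN` of those expressions give counts and proportions directly. The following is only a sketch under assumptions: the macro name `pctmissing_onepass`, the dataset `work.demo` and its values are hypothetical, and the listed variables are assumed to be character.

```sas
/* Hypothetical sample data: Year plus one character variable */
data work.demo;
    length City $10;
    input Year City $;
    datalines;
2010 Boston
2010 Unknown
2010 .
2011 U
2011 Denver
;
run;

/* Hypothetical one-pass variant: a single grouped query per variable */
%macro pctmissing_onepass(ds, list);
    %local i n var;
    %let n = %sysfunc(countw(&list));
    %do i = 1 %to &n;
        %let var = %scan(&list, &i);
        proc sql;
            create table work.&ds._&var as
            select "&ds" as Pathogen,
                   Year,
                   sum(missing(&var))              as Missing&var,  /* count of missing values */
                   count(*)                        as Total&var,    /* all rows in the year    */
                   mean(missing(&var))             as PropMiss&var, /* proportion missing      */
                   sum(&var in ("U", "Unknown"))   as Unknown&var,  /* count coded as unknown  */
                   mean(&var in ("U", "Unknown"))  as PropUnk&var   /* proportion unknown      */
            from &ds
            group by Year;
        quit;
    %end;
%mend pctmissing_onepass;

%pctmissing_onepass(demo, City);
```

Unlike the LEFT JOIN version, this reports 0 rather than a missing value for years with no missing or unknown observations. Generated names such as `PropMiss&var` still have to fit within the 32-character limit for SAS names.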
