Defining a filter with a variable in a data step using a do loop - SAS

Question

Good morning, I have this problem.

I have 2 datasets.

The dataset I have, "ID customers":

id       |  Customer Name   |
-----------------------------
123456   | Michael One      |
123123   | George Two       |
123789   | James Three      |

The second dataset is named "Transactions":

id       |  Transaction | Date
-----------------------------------
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018
123123   | Fuel         | 10NOV2018
123456   | Fuel         | 25NOV2018
123123   | Fuel         | 13NOV2018
123456   | Fuel         | 10DEC2018
123789   | Fuel         | 1NOV2018
123123   | Fuel         | 30NOV2018
123789   | Fuel         | 15DEC2018

The result I want is to create 3 datasets, one for each of the 3 customer IDs in the first dataset, named:

_01NOV2018_15NOV_123456_F
_01NOV2018_15NOV_123123_F
_01NOV2018_15NOV_123789_F

containing:

For  _01NOV2018_15NOV_123456_F :
id       |  Transaction | Date
-----------------------------------
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018

For _01NOV2018_15NOV_123123_F :

id       |  Transaction | Date
-----------------------------------
123123   | Fuel         | 10NOV2018
123123   | Fuel         | 13NOV2018

For _01NOV2018_15NOV_123789_F

empty

I need to create a variable for the where clause in the data step... how can I do that?

Thanks for the help! :)

Answer 1

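The hash OUTPUT method is the only way to create dynamically named output datasets while a DATA step is running. Judging by the question comments, you most likely do not want to break the original dataset up into many pieces named after their content. For what it's worth, that process is called splitting in SAS.

You would be better served by learning how WHERE statements and BY-group processing apply in DATA steps and PROC steps.

The desired output appears to be a separation or categorization by half of the month. Computing a new semimonth variable that holds the appropriate categorical value, and then using that variable downstream, for example in PROC PRINT, will probably serve you best.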
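For completeness, here is a minimal sketch of what such a split could look like with the hash OUTPUT method. It assumes a transactions dataset like the one in the question already exists; the name txn_sorted and the hard-coded date window are placeholders of mine, not something from the question. An id with no rows inside the window never reaches the OUTPUT call, so no dataset is created for it.

```sas
/* Sketch only: one output dataset per id, named at run time. */
/* txn_sorted is a made-up name for a copy sorted by id. */
proc sort data=transactions out=txn_sorted;
  by id date;
run;

data _null_;
  /* a fresh hash instance is created on every DATA step iteration */
  declare hash h(multidata: 'y');
  h.defineKey('id');
  h.defineData('id', 'transaction', 'date');
  h.defineDone();

  /* load all rows of the current id that fall inside the date window */
  do until (last.id);
    set txn_sorted(where=('01NOV2018'd <= date <= '15NOV2018'd));
    by id;
    h.add();
  end;

  /* the dataset name is built from the data, e.g. _01NOV2018_15NOV_123456_F */
  h.output(dataset: cats('_01NOV2018_15NOV_', id, '_F'));
  h.delete();
run;
```

Each iteration of the step handles one id: the DO UNTIL loop reads the whole BY group, and the OUTPUT method writes it to a dataset whose name is only known at that moment.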

data customers;
infile cards dlm='|';
attrib
  id length=8
  name length=$20
;
input id name ;
datalines;
123456   | Michael One      |
123123   | George Two       |
123789   | James Three      |
run;

data transactions;
infile cards dlm='|';
attrib
  id length=8
  transaction length=$10
  date length=8 format=date9. informat=date9.
;
input id transaction date;
datalines;
123456   | Fuel         | 01NOV2018
123456   | Fuel         | 03NOV2018
123123   | Fuel         | 10NOV2018
123456   | Fuel         | 25NOV2018
123123   | Fuel         | 13NOV2018
123456   | Fuel         | 10DEC2018
123789   | Fuel         | 1NOV2018
123123   | Fuel         | 30NOV2018
123789   | Fuel         | 15DEC2018
run;

proc sort data=customers;
  by id;
run;

proc sort data=transactions;
  by id date;
run;

* merge datasets and compute semimonth;

data want;
  merge transactions customers;
  by id;

  * first half of the month maps to the 1st, second half to the 16th;
  semimonth = intnx('month',date,0) + 15 * (day(date) > 15);

  attrib semimonth
    format=date9.
    label="Semi-month"
  ;
run;


* process data by semimonth and id, restricting with where;

proc print data=want;
  by semimonth id;
  where semimonth = '01NOV2018'D;
run;
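If you would rather not derive a new variable at all, the same WHERE plus BY idea can be applied straight to the transaction dates. This is only a sketch: it relies on transactions having been sorted by id and date as above, and the date window is simply the one from the question.

```sas
/* Sketch: filter by date window and report per customer, no new datasets. */
/* Assumes transactions is already sorted by id (see the PROC SORT above). */
proc print data=transactions noobs;
  where '01NOV2018'd <= date <= '15NOV2018'd;
  by id;
run;
```

PROC PRINT starts a new BY group for each id that has rows inside the window, which gives one listing per customer without writing any extra datasets.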

Answer 2

You can do this with a small macro, or simply with a quick filter in the proc export code.

proc export data=sashelp.class (where=(sex='F')) outfile='/folders/myfolders/females.xlsx' dbms=xlsx replace; run;

proc export data=sashelp.class (where=(sex='M')) outfile='/folders/myfolders/males.xlsx' dbms=xlsx replace; run;

Alternatively, you can turn it into a small macro:

%macro exportData(group=);

proc export data=sashelp.class (where=(sex="&group."))
outfile="C:\_localdata\&group..xlsx" 
dbms=xlsx 
replace; 
run;

%mend;

*create list of unique elements to call the macro;
proc sort data=sashelp.class nodupkey out=class;
by sex;
run;

*call the macro once for each group;  

data test;
   set class;
    str = catt('%exportData(group=', sex, ');');
    call execute(str);
run;
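The same pattern could be adapted to the datasets in the question. What follows is a rough sketch under my own assumptions: the macro name splitById and its parameters are hypothetical, and the generated names use the full end date (for example _01NOV2018_15NOV2018_123456_F), so they differ slightly from the names asked for. Wrapping the call in %nrstr is an optional habit that delays macro execution until after the driving DATA step has finished.

```sas
/* Hypothetical macro: one filtered dataset per customer id. */
%macro splitById(id=, from=01NOV2018, to=15NOV2018);
  data _&from._&to._&id._F;
    set transactions;
    where id = &id and "&from"d <= date <= "&to"d;
  run;
%mend splitById;

/* call the macro once for each customer in the customers dataset */
data _null_;
  set customers;
  call execute(cats('%nrstr(%splitById(id=', id, '))'));
run;
```

Because the calls are driven by customers rather than by transactions, a customer with no transactions in the window still gets a dataset, just an empty one.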
