
How to output a random set of observations from a SAS data set

I have a data step that generates random numbers from a uniform distribution. How do I output only the rows at those indices? Essentially, I want to select a random set of rows from a SAS data set.

data Unif(keep=u x k n m);
call streaminit(123);
a = -1; b = 1;
Min = 1; Max = 28000000;
do i = 1 to &NObs;
   u = rand("Uniform");    /* U[0,1] */
   x = a + (b-a)*u;        /* U[a,b] */
   k = ceil( Max*u );      /* uniform integer in 1..Max */
   n = floor( (1+Max)*u ); /* uniform integer in 0..Max */
   m = min + floor((1+Max-Min)*u); /* uniform integer in Min..Max */
   output;
end;
keep k;
run;

*not sure about this part;
data final;
   set final;
   where obs in (k);
run;

The best way to do this is to use PROC SURVEYSELECT.

proc surveyselect data=final out=selected seed=123 n=10;
run;

Or something along those lines, depending on how you want to run it. The documentation has a lot of detail on the various options for how to perform the sampling.
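As a rough illustration of a few of those options, here is a minimal sketch. The stand-in `final` data set, its `id` variable, and the output names `selected_rate` and `selected_urs` are assumptions made for the example; `METHOD=`, `SAMPRATE=`, `N=`, `SEED=`, `OUTHITS` and `NOPRINT` are standard PROC SURVEYSELECT options.

```sas
/* Stand-in for the FINAL data set (assumed: 1000 rows, one id column). */
/* Skip this step if FINAL already exists.                              */
data final;
   do id = 1 to 1000;
      output;
   end;
run;

/* Simple random sampling without replacement: keep about 10% of rows. */
proc surveyselect data=final out=selected_rate
                  method=srs samprate=0.1 seed=123 noprint;
run;

/* Sampling with replacement: 10 draws, one output row per draw. */
proc surveyselect data=final out=selected_urs
                  method=urs n=10 outhits seed=123 noprint;
run;
```

With `METHOD=SRS` no row can appear twice in the output; with `METHOD=URS` and `OUTHITS` the same row may be drawn more than once.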

If you want to do it in a data step, you need to run the code from Unif inside the second data step in some fashion. I don't entirely follow what it's trying to do. If it's a form of k/n sampling, search for 'SAS k/n sampling' and you'll find plenty of material, as it's a common question. The general approach is:

data final_selected;
   set final;
   /* ... code to determine if it should be selected ... */
   if (condition); * subsetting IF: only rows where the condition is true are output;
run;
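One way to fill in that template is the classic k/n (selection sampling) pattern. The sketch below is only an illustration under stated assumptions: the sample size of 10, the seed, and the stand-in `final` data set are made up for the example, and it is not necessarily what the original Unif code was aiming for.

```sas
/* Stand-in for the FINAL data set (assumed: 1000 rows, one id column). */
/* Skip this step if FINAL already exists.                              */
data final;
   do id = 1 to 1000;
      output;
   end;
run;

data final_selected(drop=k n);
   retain k 10 n;               /* k = rows still needed (assumed sample size) */
   if _n_ = 1 then do;
      call streaminit(123);     /* seed the generator once */
      n = total;                /* n = rows not yet examined */
   end;
   set final nobs=total;        /* NOBS= is set at compile time */
   /* Select the current row with probability k/n. */
   if rand("Uniform") < k/n then do;
      output;
      k = k - 1;
   end;
   n = n - 1;
   if k = 0 then stop;          /* sample is complete; stop reading */
run;
```

Because the selection probability is updated after every row, this reads the data once and returns exactly `k` rows without replacement, provided the input has at least that many rows.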
