How to output a random set of observations from a SAS data set
I have a data set that selects random numbers from a uniform distribution. How do I output only the rows at those indices? I basically want to select a random set of rows from a SAS data set.
%let NObs = 10; /* assumed sample size; &NObs is not defined in the original post */
data Unif(keep=k); /* keep only the random row index */
call streaminit(123);
a = -1; b = 1;
Min = 1; Max = 28000000;
do i = 1 to &NObs;
u = rand("Uniform"); /* U(0,1) */
x = a + (b-a)*u; /* U(a,b) */
k = ceil( Max*u ); /* uniform integer in 1..Max */
n = floor( (1+Max)*u ); /* uniform integer in 0..Max */
m = Min + floor((1+Max-Min)*u); /* uniform integer in Min..Max */
output;
end;
run;
*not sure about this part;
data final;
set final;
where obs in (k);
run;
The best way to do this is to use PROC SURVEYSELECT.
proc surveyselect data=final out=selected seed=123 n=10;
run;
Or something along those lines, depending on how you want to run it; the documentation has a lot of detail on the various options for how to perform the sampling.
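To make a few of those options concrete, here is a rough sketch of three common variants. The input data set `demo`, its `id` column, and the output names are made up for illustration; substitute your own data set (for example `final`) and pick whichever sampling method matches what you need.

```sas
/* Hypothetical input: 1,000 rows with an id column */
data demo;
  do id = 1 to 1000;
    output;
  end;
run;

/* Simple random sampling without replacement: exactly 10 rows */
proc surveyselect data=demo out=selected_srs method=srs n=10 seed=123;
run;

/* Same method, but sample 1% of the rows instead of a fixed count */
proc surveyselect data=demo out=selected_rate method=srs samprate=0.01 seed=123;
run;

/* Sampling with replacement; OUTHITS writes one row per selection */
proc surveyselect data=demo out=selected_urs method=urs n=10 outhits seed=123;
run;
```

Fixing `seed=` makes each sample reproducible from run to run.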
If you want to do it in the data step, you need to run the code from Unif inside the second data step, in some fashion. I don't entirely follow what it's trying to do; if that's a form of k/n sampling, search for 'SAS k/n sampling' and you'll find plenty out there, as it's a common question. The general approach is:
data final_selected;
set final;
/* ... code to determine if it should be selected ... */
if (condition); *subsetting if;
run;
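As one possible way to fill in that skeleton, here is a minimal sketch of k/n sampling without replacement, assuming you want exactly 10 rows from `final`. The sample size of 10, the seed, and the helper variable `k_left` are my assumptions, not something from the question. Each row is kept with probability (rows still needed) / (rows not yet examined), which gives every row the same overall chance of being selected.

```sas
data final_selected(drop=k_left);
  retain k_left 10;                      /* rows still needed (assumed sample size) */
  set final nobs=total;                  /* total = number of rows in final */
  if _n_ = 1 then call streaminit(123);  /* seed the generator once */
  /* subsetting IF: keep this row with probability k_left / rows remaining */
  if rand("Uniform") < k_left / (total - _n_ + 1);
  k_left = k_left - 1;                   /* reached only when the row is kept */
run;
```

Once `k_left` reaches 0 the probability is 0, so no further rows are kept; and when the rows remaining equal the rows still needed, the probability is 1, so the step always ends with exactly 10 rows (provided `final` has at least 10).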