Different set.seed each run in R
I want to "measure" which regression method is more robust to outliers.
To do this, I sum the variances of the model coefficients. On each run, I generate data from a t-distribution, calling set.seed ten times to get ten specific data sets.
However, I also want ten different seeds on each run, so that in total I end up with 10 sums of the variances. The code below gives me only one sum, for the first set of ten seeds.
How can I do this?
#######################################
p <- 5
n <- 50
#######################################
FX <- function(seed, data) {
  # for loop over a seed #
  for (i in seed) {
    set.seed(seed)
    # generating data from t-distribution #
    x <- matrix(rt(n*p,1), ncol = p)
    y <- rt(n,1)
    dat <- cbind(x,y)
    data <- as.data.frame(dat)
    # performing a regression model on the data #
    lm1 <- lm(y ~ ., data=data)
    lm.coefs <- coef(lm1)
    lad1 <- lad(y ~ ., data=data, method="BR")
    lad.coefs <- coef(lad1)
  }
  # calculate variance of the coefficients #
  return(`attr<-`(cbind(lmm=var(lm.coefs), lad=var(lad.coefs)), "seed", seed))
}
#######################################
seeds <- 1:10 ## 10 seeds to get different data sets from the t-distribution #
res <- lapply(seeds, FX, data=data) # 10 different variances from 10 data sets/models
sov <- t(sapply(res, colSums)) # put them in a matrix
colSums(sov) # sum of the 10 variances for each model
Here is something closer to your intended results. The code below fixes some key issues in your original code. It was not clear what data the function was intended to return.
1. This creates a vector of seed numbers inside the function.
2. This also creates a vector inside the function to store the variance of the coefficients for each iteration of the loop (not sure if this is what you want).
3. I needed to comment out the lad function since I do not know which package it is from (you would need to follow 2 from above to add it back in).
4. Some general cleanup of the code.
p <- 5
n <- 50
FX <- function(seed, data) {
  # Fixes the starting seed issue: run 1 uses seeds 1 to 10, run 2 uses 11 to 20, ...
  startingSeed <- (seed-1)*10 +1
  seeds <- seq(startingSeed, startingSeed+9)
  # create vector to store results from each loop iteration
  lm.coefs <- vector(mode="numeric", length=10)
  index <- 1
  for (i in seeds) {
    set.seed(i)
    # generating data from t-distribution #
    x <- matrix(rt(n*p,1), ncol = p)
    y <- rt(n,1)
    data <- data.frame(x, y)
    # performing a regression model on the data #
    lm1 <- lm(y ~., data=data)
    lm.coefs[index] <- var(coef(lm1))
    # lad1 <- lad(y ~., data=data, method="BR")
    # lad.coefs <- coef(lad1)
    index <- index +1
  }
  # return the variance of the coefficients for each of the 10 data sets #
  return(`attr<-`(cbind(lmm=lm.coefs), "seed", seed))
}
seeds <- 1:10 ## 10 runs, each with its own block of 10 seeds #
res <- lapply(seeds, FX, data=data) # 10 variances (one per data set) for each run
sov <- t(sapply(res, colSums)) # put the per-run sums in a matrix
colSums(sov) # sum of the 10 variances for each run
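As an alternative, here is a minimal sketch of the same idea without the explicit loop counter, using only base R. The names `fitters`, `seeds_for_run`, `coef_var_for_seed` and `sum_for_run` are hypothetical helpers introduced for illustration; they are not part of the original code. It assumes the same seed layout as above (run 1 uses seeds 1 to 10, run 2 uses 11 to 20, and so on) and includes only `lm`.

```r
p <- 5   # number of predictors, as in the question
n <- 50  # observations per data set

# Hypothetical helper list: each entry takes a data frame and returns coefficients.
# Only lm() is included because it ships with base R.
fitters <- list(lm = function(d) coef(lm(y ~ ., data = d)))

# Seeds used by a given run: run 1 -> 1..10, run 2 -> 11..20, and so on.
seeds_for_run <- function(run, per_run = 10) {
  (run - 1) * per_run + seq_len(per_run)
}

# Variance of the coefficients for one seed, one value per fitter.
coef_var_for_seed <- function(seed, fitters) {
  set.seed(seed)
  x <- matrix(rt(n * p, df = 1), ncol = p)
  y <- rt(n, df = 1)
  d <- data.frame(x, y)
  vapply(fitters, function(fit) var(fit(d)), numeric(1))
}

# Sum of the coefficient variances over the ten data sets of one run.
sum_for_run <- function(run, fitters) {
  vars <- do.call(rbind, lapply(seeds_for_run(run), coef_var_for_seed,
                                fitters = fitters))
  colSums(vars)  # one sum per fitter
}

runs <- 1:10
# One row per run, one column per fitter.
sov <- do.call(rbind, lapply(runs, sum_for_run, fitters = fitters))
sov
```

Each row of `sov` is one run and each column is one method, so the ten sums for `lm` are in `sov[, "lm"]`. To compare against `lad`, one option is to add an entry such as `lad = function(d) coef(lad(y ~ ., data = d, method = "BR"))` to `fitters`, once whichever package supplies `lad` in your setup is loaded.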
Hope this provides the answer or at least guidance to solve your problem.