Creating 10 categorical and 10 continuous random variables and saving them as a data frame
I want to create a data frame containing 10 categorical and 10 continuous random variables. I can do this with the following loop.
library(truncnorm)

p_val = rbeta(10, 1, 1)  # 10 success probabilities
n = 20
# one mean per variable, drawn from a normal truncated to [0, Inf)
mu_val = rtruncnorm(length(p_val), 0, Inf, mean = 100, sd = 5)
d_mat_cat = matrix(NA, nrow = n, ncol = length(p_val))
d_mat_cont = matrix(NA, nrow = n, ncol = length(p_val))
for (j in 1:length(p_val)) {
  d_mat_cat[, j] = rbinom(n, 1, p_val[j])  # binary RV
  d_mat_cont[, j] = rnorm(n, mu_val[j])    # continuous RV
}
d_mat = cbind(d_mat_cat, d_mat_cont)
Any alternative approaches would be appreciated.
rbinom is vectorized over prob, and rnorm is vectorized over mean, so you can use this:
cbind(
  matrix(rbinom(n * length(p_val), size = 1, prob = p_val),
         ncol = length(p_val), byrow = TRUE),
  matrix(rnorm(n * length(mu_val), mean = mu_val),
         ncol = length(mu_val), byrow = TRUE)
)
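The call above relies on prob being recycled across the draws while byrow = TRUE lays the draws out row by row, so each column ends up with its own probability. A minimal sketch to check that mapping, using hypothetical probabilities of 0 and 1 (p_demo and n_demo are illustrative names, not from the question) so the result is deterministic:

```r
# Hypothetical inputs: probabilities of 0 and 1 make every draw predictable.
p_demo <- c(0, 1, 0)
n_demo <- 5

# prob is recycled as 0, 1, 0, 0, 1, 0, ... and byrow = TRUE fills one row
# at a time, so draw i lands in the column whose probability generated it.
m_demo <- matrix(rbinom(n_demo * length(p_demo), size = 1, prob = p_demo),
                 ncol = length(p_demo), byrow = TRUE)

colMeans(m_demo)
# [1] 0 1 0
```

Without byrow = TRUE the matrix would be filled column by column, and the recycled probabilities would no longer line up with the columns.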
With a slightly clever use of rep, the call becomes cleaner:
p_val = c(0, 0.5, 1)
mu_val = c(1, 10, 100)
n = 4
## repeat each parameter n times so the default column-wise fill works
matrix(
  c(
    rbinom(n * length(p_val), size = 1, prob = rep(p_val, each = n)),
    rnorm(n * length(mu_val), mean = rep(mu_val, each = n))
  ),
  nrow = n
)
#      [,1] [,2] [,3]       [,4]     [,5]     [,6]
# [1,]    0    1    1  1.1962718 9.373595 100.1739
# [2,]    0    0    1 -0.1854631 9.574706 100.0725
# [3,]    0    1    1  3.4873697 9.447363 100.1345
# [4,]    0    1    1  2.8467450 9.700975 101.3178
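To see why this version needs no byrow argument, it may help to look at what rep(..., each = n) produces on its own. A small sketch, with p_demo and n_demo as illustrative names:

```r
p_demo <- c(0, 0.5, 1)
n_demo <- 2

# each = n_demo repeats every parameter n_demo times in a row
prob_long <- rep(p_demo, each = n_demo)
prob_long
# [1] 0.0 0.0 0.5 0.5 1.0 1.0

# matrix() fills column by column by default, so each block of n_demo
# values becomes one column, i.e. one parameter per column
prob_mat <- matrix(prob_long, nrow = n_demo)
```

Every row of prob_mat equals p_demo, which is the layout the random draws inherit.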
You can also use sapply to run rbinom and rnorm once per parameter and then cbind the results.
cbind(sapply(p_val, rbinom, n = n, size = 1), sapply(mu_val, rnorm, n = n))
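All of the approaches above return a numeric matrix, whereas the question asks for a data frame. One possible way to get there, keeping the binary variables as factors, is sketched below. The column names cat1, cont1, and so on, the seed, and the parameter values are assumptions made for illustration only.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
p_val  <- c(0.2, 0.5, 0.8)   # illustrative probabilities
mu_val <- c(1, 10, 100)      # illustrative means
n <- 20

# One factor column per probability; fixed levels keep 0/1 even if a
# column happens to contain only one of the two values.
cat_cols <- lapply(p_val, function(p) factor(rbinom(n, size = 1, prob = p), levels = 0:1))
names(cat_cols) <- paste0("cat", seq_along(p_val))

# One numeric column per mean
cont_cols <- lapply(mu_val, function(m) rnorm(n, mean = m))
names(cont_cols) <- paste0("cont", seq_along(mu_val))

d_df <- data.frame(cat_cols, cont_cols)
str(d_df)
```

Building the columns as lists rather than with cbind avoids coercing everything to a single numeric type, so the categorical columns stay categorical.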