What is the best way to simulate a variable that takes integer values ranging from 0 to 40 (in R)? Stats newbie

Question

I am sorry if this question is confusing, but I am a stats newbie. I am trying to simulate a composite variable which takes values ranging from 0 to 40. The composite variable is the sum of 8 questions, each of which can take values between 0 and 5. I am aware that I cannot use rnorm, since it would also give me negative values, and the original data is right-skewed.

I have the probabilities of each score (0 to 5) occurring for each of the component variables, so I have considered generating each of them separately using the sample function and then summing them to create my sum variable. However, I am afraid that they would probably be correlated, and I haven't been able to find a way to simulate them simultaneously while also accounting for correlation.

To make it easier to imagine: the paper contrasts the use of 2 languages in different scenarios, so the same questions are asked twice of each participant, once for each language. Therefore, the variables might also be correlated between conditions.

Is there a way to deal with that, or would it be best to simulate the total score variable directly? From what I understand, although I am not sure if I am correct, I could use the rpois function to do this? Another solution I could think of is to simulate the data using rnorm and then square it to produce a right skew. Any opinion would be very useful! Thank you in advance!

I have tried using rpois and simulating each component variable separately.

Answer 1

This is easy enough with the base R sample function, using its prob argument to simulate the total score directly. Example:

# First generate dummy probabilities (replace with the observed ones)
x <- runif(41)            # 41 possible outcomes, 0 to 40
probs <- x / sum(x)       # normalize the probability vector to sum to 1
samples <- sample(0:40, 1000, replace = TRUE, prob = probs)  # 1000 probability-weighted draws
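The example above draws the total score directly, so it does not touch the correlation between items that the question worries about. One possible way to handle that, sketched here and not taken from the original answer, is to draw correlated latent normal values and cut them at thresholds derived from each item's probabilities, then sum the items. The probabilities in item_probs and the common correlation rho = 0.3 are made-up placeholders, and all variable names are illustrative.

```r
set.seed(1)  # reproducible draws

n_people <- 1000
n_items  <- 8

# ASSUMPTION: hypothetical right-skewed probabilities for scores 0..5,
# reused for every item; replace each row with the observed probabilities.
item_probs <- matrix(rep(c(0.40, 0.25, 0.15, 0.10, 0.06, 0.04), n_items),
                     nrow = n_items, byrow = TRUE)

# ASSUMPTION: every pair of items has latent correlation 0.3.
rho   <- 0.3
sigma <- matrix(rho, n_items, n_items)
diag(sigma) <- 1

# Correlated standard normals: rows are participants, columns are items.
z <- matrix(rnorm(n_people * n_items), nrow = n_people) %*% chol(sigma)

# Cut each latent column at the normal quantiles of the cumulative
# probabilities, so each item keeps its own distribution on 0..5.
items <- sapply(seq_len(n_items), function(j) {
  cuts <- qnorm(cumsum(item_probs[j, ])[1:5])
  findInterval(z[, j], cuts)
})

total <- rowSums(items)  # composite score, integers between 0 and 40
```

The correlation between the resulting integer items will generally be somewhat lower than the latent rho, so rho may need tuning against the observed data. The same idea should extend to the two language conditions by using 16 columns (8 items per language) and a 16 x 16 correlation matrix.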
