How to assign weights to a sample in R

Before performing some statistical analysis, I would like to add weights to my sample as a function of a variable (the population size of each areal unit), so that the larger the population within a unit, the greater its weight, and vice versa. Do you have any suggestions on how to do this in R? Thanks in advance.

You can do this with weighted.mean(), providing the weights as the second argument.

Here is a quick example, using population as weights.

dat <- data.frame(
    country = c("UK", "US", "France", "Zimbabwe"),
    pop = c(6.7e4, 3.31e8, 6.8e4, 1.5e4),
    love_of_british_royal_family = c(5, 9, 2, 1)
)

mean(dat$love_of_british_royal_family) # 4.25

weighted.mean(
    dat$love_of_british_royal_family, 
    w = dat$pop
) # 8.997391
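
To see what the weights are doing, the following sketch recomputes the weighted mean by hand as sum(x * w) / sum(w) and checks that rescaling the weights so they sum to 1 leaves the result unchanged. The names x, w, w_norm, manual, rescaled, and builtin are illustrative and not part of the original answer.

```r
# Same toy data as in the example above
dat <- data.frame(
    country = c("UK", "US", "France", "Zimbabwe"),
    pop = c(6.7e4, 3.31e8, 6.8e4, 1.5e4),
    love_of_british_royal_family = c(5, 9, 2, 1)
)

x <- dat$love_of_british_royal_family
w <- dat$pop

# Weighted mean computed by hand
manual <- sum(x * w) / sum(w)

# Weights rescaled so they sum to 1 (illustrative only; only relative sizes matter)
w_norm <- w / sum(w)
rescaled <- weighted.mean(x, w = w_norm)

builtin <- weighted.mean(x, w = w)
all.equal(manual, builtin)    # TRUE
all.equal(rescaled, builtin)  # TRUE
```

Because only the relative sizes of the weights matter, raw population counts can be passed directly without normalising them first.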

SamR's weighted.mean() approach requires a weight for each element of your vector. If you have a population vector with many elements and want to weight by categories of population size, you could use the base R cut() function. Here is a toy example:

population <- sample(200:200000, 100)
df <- data.frame(population)
breaks <- c(200, 10000, 50000, 100000, 200000)
labels <- c(0.1, 0.2, 0.3, 0.4)
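# include.lowest = TRUE so that a population equal to the lowest break (200) is not turned into NA
cuts <- cut(df$population, breaks = breaks, labels = labels, include.lowest = TRUE)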
df$weights <- as.numeric(as.character(cuts))
head(df)
  population weights
1      25087     0.2
2      92652     0.3
3      99051     0.3
4     136376     0.4
5     184573     0.4
6     147675     0.4

Note that cuts is a factor. Calling as.numeric() on it directly would return the integer level codes (1, 2, 3, 4) rather than the labels, so the as.character(cuts) conversion is required to recover the intended fractional weights.
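
The difference is easy to check on a small factor. This is a minimal sketch using a hypothetical factor f with the same labels as above; the names f, codes, weights, and weights_alt are assumptions made for illustration.

```r
# Hypothetical factor with numeric-looking labels, like the one cut(..., labels = labels) returns
f <- factor(c("0.2", "0.4", "0.1"), levels = c("0.1", "0.2", "0.3", "0.4"))

# Direct conversion gives the internal level codes, not the labels
codes <- as.numeric(f)                  # 2 4 1

# Going through character first gives the intended weights
weights <- as.numeric(as.character(f))  # 0.2 0.4 0.1

# Equivalent idiom: convert the levels once, then index them by the factor
weights_alt <- as.numeric(levels(f))[f]
```

The levels(f)[f] form converts each distinct label only once, which can matter for long vectors, but both conversions give the same weights.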
