How to assign weights to a sample in R

Question

Before performing some statistical analysis, I would like to add weights to my sample as a function of a variable (the population size of each areal unit), so that the larger the population within a unit, the greater the weight it gets, and vice versa. Do you have any suggestions on how to do this in R? Thanks in advance.

Answer 1

You can do this with weighted.mean(), providing the weights as the second argument.

Here is a quick example, using population as weights.

dat <- data.frame(
    country = c("UK", "US", "France", "Zimbabwe"),
    pop = c(6.7e4, 3.31e8, 6.8e4, 1.5e4),
    love_of_british_royal_family = c(5, 9, 2, 1)
)

mean(dat$love_of_british_royal_family) # 4.25

weighted.mean(
    dat$love_of_british_royal_family, 
    w = dat$pop
) # 8.997391
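
As a cross-check on the example above, the weighted mean can also be computed by hand as sum(x * w) / sum(w). The sketch below reuses the same numbers; the names x, w, manual and w_norm are only illustrative. It also shows that only the relative size of the weights matters, so rescaling them to sum to 1 gives the same answer.

```r
# Same values as the example above; `x` and `w` are illustrative names.
x <- c(5, 9, 2, 1)
w <- c(6.7e4, 3.31e8, 6.8e4, 1.5e4)

# Weighted mean by hand: sum of value * weight, divided by the sum of weights.
manual <- sum(x * w) / sum(w)

# Rescaling the weights so they sum to 1 leaves the result unchanged.
w_norm <- w / sum(w)
normalised <- weighted.mean(x, w_norm)

all.equal(manual, weighted.mean(x, w))  # TRUE
all.equal(manual, normalised)           # TRUE
```

Weights of the form pop / sum(pop) are one simple way to make the weight grow in proportion to population size, which is what the question asks for.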

Answer 2

SamR's weighted.mean approach requires a weight for each element of your vector. If you have a population vector with many elements and want to weight by categories of population size, you could use the base R cut() function. Here is a toy example:

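population <- sample(200:200000, 100)  # 100 random population sizes
df <- data.frame(population)
breaks <- c(200, 10000, 50000, 100000, 200000)  # bin boundaries
labels <- c(0.1, 0.2, 0.3, 0.4)                 # one weight per bin
# include.lowest = TRUE keeps a population of exactly 200 in the first bin
cuts <- cut(df$population, breaks = breaks, labels = labels, include.lowest = TRUE)
df$weights <- as.numeric(as.character(cuts))
head(df)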
  population weights
1      25087     0.2
2      92652     0.3
3      99051     0.3
4     136376     0.4
5     184573     0.4
6     147675     0.4

Note that cuts is a factor. Calling as.numeric() on a factor directly returns its integer level codes (1, 2, 3, 4) rather than the labels, so the as.character(cuts) conversion is required to keep the intended fractional weights.
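
To make the difference concrete, here is a minimal sketch with a small hand-built factor. The name f and its values are hypothetical and are not taken from the example above.

```r
# A small factor with numeric-looking labels (hypothetical values).
f <- factor(c("0.2", "0.4", "0.1"), levels = c("0.1", "0.2", "0.3", "0.4"))

codes  <- as.numeric(f)                # integer level codes: 2 4 1
values <- as.numeric(as.character(f))  # the labels as numbers: 0.2 0.4 0.1

print(codes)
print(values)
```

Used as weights, the codes would still increase with population size, but they would not be the fractions that were set in labels.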
