Applying code to each element with the map() function

Question

I have this sample dataset from a survey:

dt<- data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  no_respond = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE))

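The real dataset is larger and has more questions. The survey questions are multiple choice, with answer levels from 1 to 5.

I need to run some statistical analysis on the data, so I created this new data table and included a "weight" variable, because I need to weight my data. As you can see, this code only considers question 1 (q1_1).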

dt[, .(ID, education_code, age, sex, item = q1_1)]
dt[, no_respond := is.na(item)]
dt[, weight := 1/(sum(no_respond==0)/.N), keyby = .(sex, education_code, age)]

I need to apply the above to each item, with the help of the map() function.

How can I do that?

Answer 1

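As mentioned in the comments, you missed a dt <- in dt[, .(ID, education_code, age, sex, item = q1_1)], which makes the column item unavailable in the following line, dt[, no_respond := is.na(item)].

Your weighting scheme is not entirely clear to me, though. Assuming you want to do what your code does here, I would go with a dplyr solution to iterate over the columns.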

# your data without no_respond column and correcting missing value in q2_3
dt <- data.table::data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1))


library(dplyr)

dt %>%
  group_by(sex, education_code, age) %>%    #groups the df by sex, education_code, age
  add_count() %>%                           #add a column with number of rows in each group
  mutate(across(starts_with("q"),           #for each column starting with "q"
                ~ 1/(sum(!is.na(.))/n),     #create a new column following your weight calculation
                .names = '{.col}_wgt')) %>% #naming the new column with suffix "_wgt" to original name
  ungroup()
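
To make the weight formula concrete, here is a small worked example for a single group. The vector item below is made up for illustration and is not taken from the survey data; the point is only that 1/(sum(!is.na(x))/n) is the group size divided by the number of respondents.

```r
# Hypothetical responses for one group of five people (NA = no response).
item <- c(NA, 1, 5, NA, 3)

n <- length(item)              # group size, the role played by n / .N above
responded <- sum(!is.na(item)) # number of non-missing answers

# Same expression as in the answers; algebraically equal to n / responded.
weight <- 1 / (responded / n)
print(weight)
```

Note that a group in which nobody answered the item has responded equal to 0, so the weight comes out as Inf; whether that is acceptable depends on the analysis.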

Answer 2

As dt is of class data.table, you can make a vector of the columns of interest (i.e. your items; below I use grepl on the names), and then apply your weighting function to each of those columns using .SD and .SDcols, with by:

qs = names(dt)[grepl("^q", names(dt))]

dt[, (paste0(qs,"wt")):=lapply(.SD, \(q) 1/(sum(!is.na(q))/.N)),
   .(sex, education_code, age), .SDcols = qs]
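
Since the question asks specifically about map(), here is a minimal sketch of how purrr::map() could drive the same per-item steps and return one weighted table per item. The table toy, the helper weight_one_item, and grouping by sex alone are assumptions made for illustration; they are not part of the original post.

```r
library(data.table)
library(purrr)

# Small hypothetical table (not the asker's data) with two item columns.
toy <- data.table(
  ID   = 1:6,
  sex  = c("F", "F", "F", "M", "M", "M"),
  q1_1 = c(NA, 1, 5, 3, 1, 2),
  q1_2 = c(2, NA, NA, 4, 5, 1)
)

# Names of the item columns to iterate over.
items <- grep("^q", names(toy), value = TRUE)

# Hypothetical helper: repeat the steps from the question for one item.
# Grouping is by sex only, to keep the toy example small.
weight_one_item <- function(col) {
  out <- data.table(ID = toy$ID, sex = toy$sex, item = toy[[col]])
  out[, no_respond := is.na(item)]
  out[, weight := 1 / (sum(!no_respond) / .N), by = sex]
  out
}

# map() calls the helper once per item; set_names() names the list elements.
weighted <- map(set_names(items), weight_one_item)
```

Each element, for example weighted$q1_1, is then a separate data.table with its own item, no_respond and weight columns, which is closer to the per-item layout in the question than the wide result produced by the approaches above.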

Answer 1: 0 2022-09-01 14:02:06

Answer 2: 0 2022-09-01 14:42:35
