Using map() function to apply for each element

I have this sample data set from a survey:

dt<- data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  no_respond = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE))

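The real data set is larger and has more questions. The survey questions are multiple choice, with answer levels from 1 to 5.

I need to do some statistical analysis on the data, so I built this new data table and included a "weight" variable, because I need to weight my data. As you can see, this code only considers question 1 (q1_1).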

dt[, .(ID, education_code, age, sex, item = q1_1)]
dt[, no_respond := is.na(item)]
dt[, weight := 1/(sum(no_respond==0)/.N), keyby = .(sex, education_code, age)]

I need to apply the above to each element (each question column) with the help of the map() function.

How can I do that?

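As mentioned in the comments, you missed a dt <- in dt[, .(ID, education_code, age, sex, item = q1_1)], which makes the column item unavailable in the following line, dt[, no_respond := is.na(item)].

Your weighting scheme is not entirely clear to me, however. Assuming you want to do what your code does here, I would go with a dplyr solution to iterate over the columns.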
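To make the fix concrete, here is a minimal sketch of the single-item version with the subset assigned to an object. The four-row table and the name d1 are made up for illustration (they are not the asker's data), and by is used in place of keyby so that the row order stays unchanged.

library(data.table)

# Made-up toy data: two respondents per sex, one missing answer
dt <- data.table(
  ID   = 1:4,
  sex  = c("F", "F", "M", "M"),
  q1_1 = c(NA, 2, 3, 4)
)

# Assign the subset, so that the column item exists in the lines below
d1 <- dt[, .(ID, sex, item = q1_1)]

# Flag non-response, then weight = group size / number of respondents
d1[, no_respond := is.na(item)]
d1[, weight := 1 / (sum(!no_respond) / .N), by = sex]

d1$weight   # 2 2 1 1

In the F group one of two people answered, so both rows get weight 2; in the M group everyone answered, so the weight is 1.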

# your data without no_respond column and correcting missing value in q2_3
dt <- data.table::data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1))


library(dplyr)

dt %>%
  group_by(sex, education_code, age) %>%    #groups the df by sex, education_code, age
  add_count() %>%                           #add a column with number of rows in each group
  mutate(across(starts_with("q"),           #for each column starting with "q"
                ~ 1/(sum(!is.na(.))/n),     #create a new column following your weight calculation
                .names = '{.col}_wgt')) %>% #naming the new column with suffix "_wgt" to original name
  ungroup()
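One thing that may be worth checking before relying on these weights: in the ten sample rows, every combination of sex, education_code and age seems to occur exactly once. The sketch below copies only the three grouping columns from the sample data and counts the group sizes; the object names groups and group_sizes are made up for illustration.

library(data.table)

# Grouping columns copied from the sample data
groups <- data.table(
  sex            = c("F", "M", "M", "M", "F", "M", "M", "M", "M", "M"),
  education_code = c(20, 50, 20, 60, 20, 10, 5, 12, 12, 12),
  age            = c(87, 67, 56, 52, 34, 56, 67, 78, 23, 34)
)

# Number of rows in each sex / education_code / age cell
group_sizes <- groups[, .N, by = .(sex, education_code, age)]

max(group_sizes$N)   # 1: every cell holds a single respondent

# With one row per cell the weight can only be 1 (answered) or Inf (missing)
1 / (1 / 1)   # 1
1 / (0 / 1)   # Inf

On the full survey the cells are presumably larger, but any cell in which nobody answered an item would still produce Inf, so coarser groups (for example age bands) might be needed. That is an assumption about the real data, not something the sample can confirm.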

As dt is of class data.table, you can make a vector of the columns of interest (i.e. your items; below I use grepl on the names), and then apply your weighting function to each of those columns using .SD and .SDcols, with by.

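# names of all item columns (those starting with "q")
qs = names(dt)[grepl("^q", names(dt))]

# add one weight column per item, e.g. q1_1wt; the \(q) shorthand needs R >= 4.1
dt[, (paste0(qs,"wt")):=lapply(.SD, \(q) 1/(sum(!is.na(q))/.N)),
   .(sex, education_code, age), .SDcols = qs]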
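Since the question asks about map() specifically, here is a minimal sketch with purrr::map that repeats the asker's per-item steps for every question column and returns one table per item. It assumes the purrr package is installed; the toy data, the helper weight_one_item() and the list per_item are made-up names for illustration, not part of either answer.

library(data.table)
library(purrr)

# Made-up toy data with two items and a single grouping column
dt <- data.table(
  ID   = 1:4,
  sex  = c("F", "F", "M", "M"),
  q1_1 = c(NA, 2, 3, 4),
  q1_2 = c(1, 2, NA, NA)
)

# Item columns, selected by name
items <- grep("^q", names(dt), value = TRUE)

# Hypothetical helper: build the weighted table for one item column
weight_one_item <- function(col) {
  d <- dt[, c("ID", "sex", col), with = FALSE]   # copy of the needed columns
  setnames(d, col, "item")                       # rename the item column
  d[, no_respond := is.na(item)]
  d[, weight := 1 / (sum(!no_respond) / .N), by = sex]
  d[]
}

# Named list with one data.table per item
per_item <- map(set_names(items), weight_one_item)

per_item$q1_1$weight   # 2 2 1 1
per_item$q1_2$weight   # 1 1 Inf Inf

The Inf values for q1_2 come from the M group, where nobody answered that item. Compared with the .SD approach above, map() yields a separate table per item, which may be handier if each item is analysed on its own.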
