Using map() function to apply for each element

I have this sample dataset from a survey:

library(data.table)

dt <- data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  no_respond = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE))

The real dataset is larger and has more questions. The survey questions are multiple choice, with answer levels from 1 to 5.

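I need to run some statistical analysis on the data, so I created this new data table and included a "weight" variable, because I need to weight my data. As you can see, this code only considers question 1 (q1_1).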

dt[, .(ID, education_code, age, sex, item = q1_1)]
dt[, no_respond := is.na(item)]
dt[, weight := 1/(sum(no_respond==0)/.N), keyby = .(sex, education_code, age)]

I need to apply the above to each element with the help of the map() function.

How can I do that?

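As noted in the comments, you are missing a dt <- in dt[, .(ID, education_code, age, sex, item = q1_1)], so the result is never stored and the column item is not available in the following line, dt[, no_respond := is.na(item)].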
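To make the fix concrete, here is a minimal sketch of the single-item code with the assignment in place. The toy table and the name item_dt are made up for illustration, and grouping is by sex only to keep the example small. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data (not the full survey): two groups of two rows each
toy <- data.table(
  ID   = 1:4,
  sex  = c("F", "F", "M", "M"),
  q1_1 = c(NA, 1, 5, 3)
)

# Store the result so that the new `item` column exists in the later steps
item_dt <- toy[, .(ID, sex, item = q1_1)]

# Flag missing answers, then weight each group by
# (rows in the group) / (rows with an answer in the group)
item_dt[, no_respond := is.na(item)]
item_dt[, weight := 1 / (sum(!no_respond) / .N), by = sex]
```

With this toy data, the group with one answer out of two rows gets a weight of 2 and the fully answered group gets a weight of 1.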

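That said, your weighting scheme is not entirely clear to me. Assuming you want to do what your code does here, I would go with a dplyr solution to iterate over the columns.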

# your data without no_respond column and correcting missing value in q2_3
dt <- data.table::data.table(
  ID = c(1,2,3,4, 5, 6, 7, 8, 9, 10),
  education_code = c(20,50,20,60, 20, 10,5, 12, 12, 12),
  age = c(87,67,56,52, 34, 56, 67, 78, 23, 34),
  sex = c("F","M","M","M", "F","M","M","M", "M","M"),
  q1_1 = c(NA,1,5,3, 1, NA, 3, 4, 5,1),
  q1_2 = c(NA,1,5,3, 1, 2, NA, 4, 5,1),
  q1_3 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q1_text = c(NA,1,5,3, 1, 2, 3, 4, 5,1), 
  q2_1 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_2 = c(NA,1,5,3, 1, 2, 3, 4, 5,1),
  q2_3 = c(NA,1,5,3, 1, NA, NA, 4, 5,1),
  q2_text = c(NA,1,5,3, 1, NA, 3, 4, 5,1))


library(dplyr)

dt %>%
  group_by(sex, education_code, age) %>%    # group the data by sex, education_code, age
  add_count() %>%                           # add a column n with the number of rows in each group
  mutate(across(starts_with("q"),           # for each column starting with "q"
                ~ 1/(sum(!is.na(.))/n),     # compute the weight following your calculation
                .names = '{.col}_wgt')) %>% # name each new column as the original name plus "_wgt"
  ungroup()

As dt is of class data.table, you can make a vector of the columns of interest (i.e., your items; below I use grepl on the names), and then apply your weighting function to each of those columns using .SD and .SDcols, grouped with by:

qs = names(dt)[grepl("^q", names(dt))]

dt[, (paste0(qs,"wt")):=lapply(.SD, \(q) 1/(sum(!is.na(q))/.N)),
   .(sex, education_code, age), .SDcols = qs]
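Since the question asks for map() specifically, the per-item steps can also be wrapped in a function and mapped over the question columns. The following is only a sketch under assumptions: weight_item is a hypothetical helper, the toy table stands in for the real survey, grouping is by sex only, and both data.table and purrr are assumed to be installed.

```r
library(data.table)
library(purrr)

# Hypothetical toy data: two groups (by sex) and two question columns
toy <- data.table(
  ID   = 1:4,
  sex  = c("F", "F", "M", "M"),
  q1_1 = c(NA, 1, 5, 3),
  q1_2 = c(2, 4, NA, NA)
)

# Hypothetical helper: build the per-item table and its weights for one column
weight_item <- function(data, col, groups) {
  # Keep only the ID, the grouping columns and the chosen question column
  out <- data[, c("ID", groups, col), with = FALSE]
  setnames(out, col, "item")
  out[, no_respond := is.na(item)]
  # Weight = rows in the group / rows with an answer in the group
  out[, weight := 1 / (sum(!no_respond) / .N), by = groups]
  out[]
}

# Names of all question columns (those starting with "q")
qs <- grep("^q", names(toy), value = TRUE)

# One weighted table per question column, named after the column
weighted <- map(set_names(qs), function(col) weight_item(toy, col, groups = "sex"))
```

The result is a named list, so weighted$q1_1 holds the table for the first item. Note that a group in which nobody answered (the "M" rows for q1_2 here) gets a weight of Inf, because the formula divides by zero respondents. The same happens with the formula in the answers above, so such groups may need separate handling.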

