
Using purrr to create several new variables based on values of existing variables

EDIT: added sample df

I have a 3-item checklist (options a, b, c) from which participants can choose as many responses as apply to them. In my data, these responses are stored as three binary variables (q4___a, q4___b, q4___c). I have the same data at four different time points (1, 2, 3, 4), so my variables are coded like this:

q4_1___a
q4_1___b
q4_1___c
q4_2___a
q4_2___b

etc., where q4 is the stem, the integer is the time point at which the data were collected, and the letter is the response option. Here is a sample dataframe:

df <- data.frame(
 q4_1___a = rbinom(10, 1, .5),
 q4_1___b = rbinom(10, 1, .5),
 q4_1___c = rbinom(10, 1, .5),
 q4_2___a = rbinom(10, 1, .5),
 q4_2___b = rbinom(10, 1, .5),
 q4_2___c = rbinom(10, 1, .5),
 q4_3___a = rbinom(10, 1, .5),
 q4_3___b = rbinom(10, 1, .5),
 q4_3___c = rbinom(10, 1, .5),
 q4_4___a = rbinom(10, 1, .5),
 q4_4___b = rbinom(10, 1, .5),
 q4_4___c = rbinom(10, 1, .5)
)
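To make the naming scheme concrete, here is a minimal sketch that splits a few variable names into stem, time point, and response option using base R regular expressions. The `vars` vector and the `parts` data frame are illustrative names only, not part of the original data.

```r
# Illustrative subset of the variable names described above
vars <- c("q4_1___a", "q4_1___b", "q4_2___c")

parts <- data.frame(
  variable = vars,
  # Everything before the first underscore is the stem
  stem = sub("_.*$", "", vars),
  # The digits between the stem and the triple underscore are the time point
  time = as.integer(sub("^[^_]+_(\\d+)___.*$", "\\1", vars)),
  # Everything after the triple underscore is the response option
  option = sub("^.*___", "", vars),
  stringsAsFactors = FALSE
)

print(parts)
```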

I need to create "group" variables that combine the results of the three binary response variables at each time point. I can do this for time point 1 using the following code:

library(dplyr)

# Rows where a, b and c are all 0 match no condition and get NA
df %>%
 mutate(q4_1_group = case_when(
  q4_1___a == 1 & q4_1___b == 0 & q4_1___c == 0 ~ "a",
  q4_1___a == 0 & q4_1___b == 1 & q4_1___c == 0 ~ "b",
  q4_1___a == 0 & q4_1___b == 0 & q4_1___c == 1 ~ "c",
  q4_1___a == 1 & q4_1___b == 1 & q4_1___c == 0 ~ "ab",
  q4_1___a == 1 & q4_1___b == 0 & q4_1___c == 1 ~ "ac",
  q4_1___a == 0 & q4_1___b == 1 & q4_1___c == 1 ~ "bc",
  q4_1___a == 1 & q4_1___b == 1 & q4_1___c == 1 ~ "abc"
 ))

I'm having trouble figuring out how to iterate this over all four time points. Essentially, I need to change the 1's in all of the variable names to 2's, 3's, and 4's, so that:

df%>%
 mutate(q4_[i]_group = case_when(
  q4_[i]___a == 1 & q4_[i]___b == 0 & q4_[i]___c == 0 ~ "a",
  q4_[i]___a == 0 & q4_[i]___b == 1 & q4_[i]___c == 0 ~ "b",
  q4_[i]___a == 0 & q4_[i]___b == 0 & q4_[i]___c == 1 ~ "c",
  q4_[i]___a == 1 & q4_[i]___b == 1 & q4_[i]___c == 0 ~ "ab",
  q4_[i]___a == 1 & q4_[i]___b == 0 & q4_[i]___c == 1 ~ "ac",
  q4_[i]___a == 0 & q4_[i]___b == 1 & q4_[i]___c == 1 ~ "bc",
  q4_[i]___a == 1 & q4_[i]___b == 1 & q4_[i]___c == 1 ~ "abc"
 ))

where [i] corresponds to something like c(1:4). I feel like there must be a straightforward way to do this using purrr, but I'm struggling to figure it out. Any help would be greatly appreciated!

We can create a key/value lookup dataset and then join each time point's columns against it:

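library(tidyverse)

# Lookup table: one row per non-empty combination of a/b/c and its group label
keydat <- data.frame(a = c(1, 0, 0, 1, 1, 0, 1),
                     b = c(0, 1, 0, 1, 0, 1, 1),
                     c = c(0, 0, 1, 0, 1, 1, 1),
                     group = c("a", "b", "c", "ab", "ac", "bc", "abc"),
                     stringsAsFactors = FALSE)

# Stem plus time point for each block of columns: "q4_1", "q4_2", ...
nm1 <- unique(sub("__.*", "", names(df)))

# Split the columns by time point, join each block to the lookup table,
# then bind the blocks back together and rename the group columns
split.default(df, as.numeric(gsub("^q\\d+_|__.*$", "", names(df)))) %>%
  map(~ .x %>%
        left_join(keydat, by = setNames(letters[1:3], names(.x)))) %>%
  bind_cols() %>%
  rename_at(vars(matches("group")), ~ paste0(nm1, "_group"))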
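As an alternative to the join, here is a minimal purrr sketch that builds each label by pasting together the letters of the selected options. This is a different technique from the lookup-table join above, not a restatement of it. The helper `make_group()`, the seed, and the `df_grouped` name are assumptions introduced for illustration; rows with no option selected are set to NA to mirror what the `case_when()` call in the question would return.

```r
library(purrr)

set.seed(1)  # assumption: seed chosen only so the illustration is reproducible

# Rebuild a sample data frame with the same column names as in the question
df <- as.data.frame(
  matrix(rbinom(10 * 12, 1, 0.5), nrow = 10,
         dimnames = list(NULL, paste0("q4_", rep(1:4, each = 3),
                                      "___", letters[1:3])))
)

# Hypothetical helper: compute the group label for one time point `t`
make_group <- function(data, t, options = c("a", "b", "c")) {
  cols <- paste0("q4_", t, "___", options)
  # For each option, keep its letter where the flag is 1, otherwise ""
  pieces <- map2(data[cols], options, ~ ifelse(.x == 1, .y, ""))
  # Concatenate the pieces row by row, e.g. "a" + "" + "c" gives "ac"
  label <- do.call(paste0, unname(pieces))
  # Rows with nothing selected become NA
  ifelse(label == "", NA_character_, label)
}

times <- 1:4
groups <- set_names(map(times, ~ make_group(df, .x)),
                    paste0("q4_", times, "_group"))

# Attach the four new columns to a copy of the data
df_grouped <- df
df_grouped[names(groups)] <- groups
```

Because the label is assembled from the option letters, this sketch would extend to more response options by changing `options`, whereas the lookup table would need one row per combination.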
