R purrr: row-wise lookup from two lists

Question

This is a simplified version of a problem that involves larger, more complex inputs. First, I create the data:

library(tidyverse)

input <- tibble(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

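These data are three people's responses to a 12-item test. input is a multilevel table in which the test items are nested within each person. The columns of input are:

person: ID numbers for persons 101, 102 and 103 (12 rows per person)
item: test items 1-12 for each person. Note how the items are nested within each person
response: the score on each item

The test is divided into four subscales, each consisting of three items.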

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

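scale_assign is a four-element list holding four sets of items (expressed as four numeric ranges): items 1-3 (subscale 1), items 4-6 (subscale 2), items 7-9 (subscale 3) and items 10-12 (subscale 4). scale_num is a four-element numeric vector holding the numbers (1-4) that label the four subscales.

What I want R to do is process input row by row, creating a new column scale and filling it with the correct scale_num value for each item (that is, each item's subscale assignment). In each row, R needs to check the value of item against the ranges in scale_assign and fill scale with the scale_num value that corresponds to the matching element of scale_assign.

The desired output looks like this: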

# A tibble: 36 x 4
#      person  item response scale
#  1    101     1        4     1
#  2    101     2        2     1
#  3    101     3        4     1
#  4    101     4        4     2
#  5    101     5        4     2
#  6    101     6        4     2
#  7    101     7        3     3
#  8    101     8        2     3
#  9    101     9        4     3
# 10    101    10        1     4
# 11    101    11        1     4
# 12    101    12        4     4
# 13    102     1        1     1
# 14    102     2        3     1
# 15    102     3        1     1
# 16    102     4        1     2
# 17    102     5        3     2
# 18    102     6        3     2
# 19    102     7        4     3
# 20    102     8        1     3
# 21    102     9        3     3
# 22    102    10        4     4
# 23    102    11        3     4
# 24    102    12        3     4
# 25    103     1        4     1
# 26    103     2        1     1
# 27    103     3        2     1
# 28    103     4        2     2
# 29    103     5        4     2
# 30    103     6        1     2
# 31    103     7        4     3
# 32    103     8        4     3
# 33    103     9        1     3
# 34    103    10        4     4
# 35    103    11        1     4
# 36    103    12        2     4

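I would prefer a tidyverse solution, and I think this may be a job for purrr::map2(), because it seems to involve iterating simultaneously over the four-element list scale_assign and the four-element vector scale_num. I tried to implement the scale coding inside a map2() call, using mutate() and case_when() for the coding itself, but could not get it to work.

Thanks in advance for your help!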

Answer 1

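If you change scale_assign to a named list, convert it to a data frame, and right_join it with the input data frame, there is no need to go row by row and check each value; a single join does the work.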

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
names(scale_assign) <- 1:4

library(tidyverse)

enframe(scale_assign) %>%
   unnest(cols = value) %>%
   mutate_all(as.integer) %>%
   right_join(input, by = c("value" = "item"))


# A tibble: 36 x 4
#    name value person response
#   <int> <int>  <int>    <int>
# 1     1     1    101        4
# 2     1     2    101        4
# 3     1     3    101        2
# 4     2     4    101        2
# 5     2     5    101        1
# 6     2     6    101        4
# 7     3     7    101        3
# 8     3     8    101        1
# 9     3     9    101        1
#10     4    10    101        2
# … with 26 more rows
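
The join above returns the lookup columns as name and value rather than the scale and item columns shown in the question's desired output. As a minimal sketch of one way to get those column names directly (the lookup and result objects are names introduced here for illustration, not part of the original answer), the lookup table can be built from scale_num and scale_assign and attached with left_join():

```r
library(tibble)
library(dplyr)
library(tidyr)

set.seed(1234)
input <- tibble(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

# Hypothetical helper table: one row per item, with the subscale it belongs to.
# `item` starts as a list column and is expanded to one row per item by unnest().
lookup <- tibble(scale = scale_num, item = scale_assign) %>%
  unnest(cols = item)

# left_join() keeps every row of `input` and appends the matching `scale`.
result <- input %>%
  left_join(lookup, by = "item")
```

Because input is on the left side of the join, its rows and column order are kept and scale is simply appended as the last column.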

In base R, this can be done with stack and merge:

merge(input, stack(scale_assign), all.x = TRUE, by.x = "item", by.y = "values")
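
Two details of the base R route are worth keeping in mind: stack() returns the list names in a factor column called ind, and merge() sorts its result by the key column by default, so the rows come back ordered by item rather than by person. If the original row order and an integer scale column matter, a match()-based lookup is one alternative. This is a sketch only; lookup is a name introduced here for illustration:

```r
scale_assign <- list(1:3, 4:6, 7:9, 10:12)
names(scale_assign) <- 1:4   # stack() uses the list names as labels

input <- data.frame(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

# stack() gives a two-column data frame:
#   values = the item numbers, ind = the list names as a factor
lookup <- stack(scale_assign)

# match() returns, for each row of `input`, the position of its item in
# the lookup table; indexing `ind` by that position picks the subscale.
# The factor is converted to integer via character to recover 1-4.
input$scale <- as.integer(as.character(lookup$ind[match(input$item, lookup$values)]))
```

This leaves input in its original order and adds scale in place, without creating a merged copy.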

Data

set.seed(1234)
input <- tibble(
   person = rep(101:103, each = 12),
   item = rep(1:12, 3),
   response = sample(1:4, 36, replace = TRUE))
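
For the purrr::map2() idea raised in the question, one possible sketch is to let map2() walk scale_assign and scale_num in parallel to build the lookup table, and then join it once instead of testing every row. The object names lookup and result are assumptions made for this illustration:

```r
library(tibble)
library(dplyr)
library(purrr)

set.seed(1234)
input <- tibble(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

# map2() pairs each item range (.x) with its subscale number (.y) and
# returns a list of four small tibbles; bind_rows() stacks them into
# a single 12-row lookup table.
lookup <- map2(scale_assign, scale_num, ~ tibble(item = .x, scale = .y)) %>%
  bind_rows()

# One join then assigns the subscale to every row of `input`.
result <- left_join(input, lookup, by = "item")
```

The iteration happens over the four subscales only, so the cost of map2() does not grow with the number of rows in input.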

Answer 2

Here is a data.table solution using an update join. It is essentially @Ronak Shah's base R answer, but using data.table (i.e. fast performance on large data sets).

library(data.table)
#1. set input as data.table
#2. create a lookup-table using `stack( scale_assign )`, 
#  and make that also a data.table (using setDT() )
#3. left update join on item
setDT(input)[ setDT( stack( scale_assign ) ), 
                     scale := i.ind,
                     on = .( item = values ) ][]

output

#     person item response scale
#  1:    101    1        3     1
#  2:    101    2        4     1
#  3:    101    3        3     1
#  4:    101    4        2     2
#  5:    101    5        3     2
#  6:    101    6        4     2
#  7:    101    7        1     3
#  8:    101    8        3     3
#  9:    101    9        4     3
# 10:    101   10        2     4
# 11:    101   11        3     4
# 12:    101   12        4     4
# 13:    102    1        4     1
# 14:    102    2        2     1
# 15:    102    3        3     1
# 16:    102    4        2     2
# 17:    102    5        1     2
# 18:    102    6        4     2
# 19:    102    7        1     3
# 20:    102    8        3     3
# 21:    102    9        2     3
# 22:    102   10        1     4
# 23:    102   11        4     4
# 24:    102   12        3     4
# 25:    103    1        1     1
# 26:    103    2        1     1
# 27:    103    3        2     1
# 28:    103    4        1     2
# 29:    103    5        2     2
# 30:    103    6        4     2
# 31:    103    7        4     3
# 32:    103    8        2     3
# 33:    103    9        3     3
# 34:    103   10        2     4
# 35:    103   11        2     4
# 36:    103   12        2     4
#     person item response scale
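
Since stack() returns ind as a factor, the scale column created by the update join above is a factor as well, which the printed output does not show. If an integer column is wanted, one option is to convert the label in the lookup table before joining. This is a sketch under that assumption; lookup and its scale column are names introduced here:

```r
library(data.table)

set.seed(1234)
input <- data.table(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
names(scale_assign) <- 1:4

# Lookup table from stack(): `values` holds the items, `ind` the labels.
lookup <- as.data.table(stack(scale_assign))

# Convert the factor labels to integers (via character, so the label
# text is used rather than the internal factor codes).
lookup[, scale := as.integer(as.character(ind))]

# Update join: add `scale` to `input` by reference, matching
# input$item against lookup$values.
input[lookup, scale := i.scale, on = .(item = values)]
```

Going through as.character() matters whenever the list names are not simply 1, 2, 3, ... in order, because as.integer() on a factor alone returns the level positions.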

Answer 1
3 (accepted) 2019-11-11 05:37:11

Answer 2
1 2019-11-11 07:40:02
