
R purrr row-wise lookups from two lists

Here's a simplified version of a problem that involves larger, more complex inputs. First, I create data:

library(tibble)

input <- tibble(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)

These data are responses from three persons on a 12-item test. input is a multilevel table in which the test items are nested within each person. The columns of input are:

  • person : ID numbers for persons 101 , 102 , and 103 (12 rows for each person)
  • item : test items 1-12 for each person. Note how the items are nested within each person
  • response : score for each item

The test is divided into four subscales consisting of three items each.

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

scale_assign is a four-element list containing four item sets (expressed as four numerical ranges): items 1-3 (subscale 1), items 4-6 (subscale 2), items 7-9 (subscale 3), and items 10-12 (subscale 4). scale_num is a four-element numerical vector containing the numbers (1-4) that label the four subscales.

What I want R to do is process input row-wise, creating a new column scale , and filling it with the correct value of scale_num for each item (that is, each item's subscale assignment). In each row, R needs to check the value of item against the ranges in scale_assign and fill in scale with the value of scale_num that corresponds to the scale_assign range for that item.

The desired output looks like this:

# A tibble: 36 x 4
#      person  item response scale
#  1    101     1        4     1
#  2    101     2        2     1
#  3    101     3        4     1
#  4    101     4        4     2
#  5    101     5        4     2
#  6    101     6        4     2
#  7    101     7        3     3
#  8    101     8        2     3
#  9    101     9        4     3
# 10    101    10        1     4
# 11    101    11        1     4
# 12    101    12        4     4
# 13    102     1        1     1
# 14    102     2        3     1
# 15    102     3        1     1
# 16    102     4        1     2
# 17    102     5        3     2
# 18    102     6        3     2
# 19    102     7        4     3
# 20    102     8        1     3
# 21    102     9        3     3
# 22    102    10        4     4
# 23    102    11        3     4
# 24    102    12        3     4
# 25    103     1        4     1
# 26    103     2        1     1
# 27    103     3        2     1
# 28    103     4        2     2
# 29    103     5        4     2
# 30    103     6        1     2
# 31    103     7        4     3
# 32    103     8        4     3
# 33    103     9        1     3
# 34    103    10        4     4
# 35    103    11        1     4
# 36    103    12        2     4

Preferring a tidyverse solution, I thought this might be a job for purrr::map2() , because it seems to involve simultaneous iteration over a four-element list scale_assign and a four-element vector scale_num . I tried to implement the coding of scale within a map2() call, using mutate() and case_when() to do the coding, but could not get it to work.

Thanks in advance for any help!

Rather than processing input row-wise and checking each value against every range, you can treat this as a join: name the elements of scale_assign, convert the list into a data frame with one row per item, and right_join it with input. In the result, name holds the subscale and value holds the item.

scale_assign <- list(1:3, 4:6, 7:9, 10:12)
names(scale_assign) <- 1:4

library(tidyverse)

enframe(scale_assign) %>%
   unnest(cols = value) %>%
   mutate_all(as.integer) %>%
   right_join(input, by = c("value" = "item"))


# A tibble: 36 x 4
#    name value person response
#   <int> <int>  <int>    <int>
# 1     1     1    101        4
# 2     1     2    101        4
# 3     1     3    101        2
# 4     2     4    101        2
# 5     2     5    101        1
# 6     2     6    101        4
# 7     3     7    101        3
# 8     3     8    101        1
# 9     3     9    101        1
#10     4    10    101        2
# … with 26 more rows
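Since the question asked about purrr::map2(), here is a minimal sketch of a variant that uses it to build the lookup table from scale_assign and scale_num directly, then joins with left_join() so the columns come out as person, item, response, scale. The names lookup and result are only illustrative, and the data are recreated with the same seed as in the data section so the block runs on its own.

```r
library(dplyr)
library(purrr)
library(tibble)

set.seed(1234)
input <- tibble(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)
scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

# One small tibble per subscale (its items paired with its label),
# then stacked into a single item -> scale lookup table
lookup <- map2(scale_assign, scale_num, ~ tibble(item = .x, scale = .y)) %>%
  bind_rows()

# left_join() keeps every row of input, in its original order
result <- left_join(input, lookup, by = "item")
result
```

Here map2() iterates over the list and the vector in parallel, which is the simultaneous iteration described in the question; the join then replaces the row-wise check.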

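In base R, the same can be done with stack() and merge(). stack() turns the named list into a data frame with columns values (the items) and ind (the list names, as a factor). Note that merge() sorts the result by item by default, so the original row order is not preserved.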

merge(input, stack(scale_assign), all.x = TRUE, by.x = "item", by.y = "values")
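If the original row order matters, a base R alternative is a plain vector lookup with match(). This is only a sketch under the question's setup; items and labels are hypothetical helper names, and input is rebuilt as a plain data.frame so the block is self-contained.

```r
set.seed(1234)
input <- data.frame(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)
scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

# Flatten the list into two parallel vectors:
# every item, and the label of the subscale it belongs to
items  <- unlist(scale_assign)
labels <- rep(scale_num, times = lengths(scale_assign))

# match() gives the position of each row's item in the flattened vector;
# indexing labels with it assigns the subscale without reordering rows
input$scale <- labels[match(input$item, items)]
head(input, 12)
```

An item that appears in none of the ranges would get NA in scale, which makes unassigned items easy to spot.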

data

set.seed(1234)
input <- tibble(
   person = rep(101:103, each = 12),
   item = rep(1:12, 3),
   response = sample(1:4, 36, replace = TRUE))

Here is a data.table solution using an update join. It is essentially @Ronak Shah's base R answer, but done with the data.table package, which performs well on large data sets.

library(data.table)
# Relies on the names set in the answer above: names(scale_assign) <- 1:4
# 1. convert input to a data.table (by reference)
# 2. create a lookup table using stack(scale_assign),
#    and make that a data.table as well (using setDT())
# 3. update join on item: adds scale to input by reference
setDT(input)[ setDT( stack( scale_assign ) ), 
                     scale := i.ind,
                     on = .( item = values ) ][]

output

#     person item response scale
#  1:    101    1        3     1
#  2:    101    2        4     1
#  3:    101    3        3     1
#  4:    101    4        2     2
#  5:    101    5        3     2
#  6:    101    6        4     2
#  7:    101    7        1     3
#  8:    101    8        3     3
#  9:    101    9        4     3
# 10:    101   10        2     4
# 11:    101   11        3     4
# 12:    101   12        4     4
# 13:    102    1        4     1
# 14:    102    2        2     1
# 15:    102    3        3     1
# 16:    102    4        2     2
# 17:    102    5        1     2
# 18:    102    6        4     2
# 19:    102    7        1     3
# 20:    102    8        3     3
# 21:    102    9        2     3
# 22:    102   10        1     4
# 23:    102   11        4     4
# 24:    102   12        3     4
# 25:    103    1        1     1
# 26:    103    2        1     1
# 27:    103    3        2     1
# 28:    103    4        1     2
# 29:    103    5        2     2
# 30:    103    6        4     2
# 31:    103    7        4     3
# 32:    103    8        2     3
# 33:    103    9        3     3
# 34:    103   10        2     4
# 35:    103   11        2     4
# 36:    103   12        2     4
#     person item response scale
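The update join above takes the labels from the list names, so scale comes out as a factor. A possible variant, sketched here under the question's original definitions, builds the lookup table from scale_assign and scale_num explicitly, which works on the unnamed list and keeps scale as an integer. The name lookup is illustrative.

```r
library(data.table)

set.seed(1234)
input <- data.table(
  person = rep(101:103, each = 12),
  item = rep(1:12, 3),
  response = sample(1:4, 36, replace = TRUE)
)
scale_assign <- list(1:3, 4:6, 7:9, 10:12)
scale_num <- 1:4

# Lookup table: one row per item with the label of its subscale
lookup <- data.table(
  item = unlist(scale_assign),
  scale = rep(scale_num, times = lengths(scale_assign))
)

# Update join: adds scale to input by reference, matching rows on item
input[lookup, scale := i.scale, on = "item"]
input[]
```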

