R - How to convert this nested for loop into an lapply function that can mutate a list

Question

I have data that looks like this

aList <- list(a1 = c("apple", "banana", "orange", "strawberry", "cherry"),
              a2 = c("banana", "cherry", "apple"),
              a3 = c("apple", "strawberry", "pineapple"),
              a4 = c("raspberry", "strawberry", "apple"),
              a5 = c("pineapple", "lemon", "orange", "banana", "apple"),
              a6 = c("lemon", "apple", "blueberry"),
              a7 = c("watermelon", "apple", "banana", "mango"),
              a8 = c("mango", "cherry", "apple", "lemon"),
              a9 = c("orange", "banana", "strawberry"),
              a10 = c("mango", "strawberry"))

I'd like to get it into a vertical format, like what happens when you run this code:

vertical_data <- list()
for (x in names(aList)) {
  for (y in aList[[x]]) {
    if (is.null(vertical_data[[y]])) {
      vertical_data[[y]] <- x
    } else {
      vertical_data[[y]] <- c(x, vertical_data[[y]])
    }
  }
}
vertical_data

I'd like each entry to tell me where the particular fruit occurs.

This was easy enough to do with a double for loop. But when I do the same thing with a nested lapply call, it doesn't seem to modify the list (i.e. vertical_data) at all. Why is that? I'd like to do this with an apply function because I expect it to be faster. My actual dataset will have thousands of items and "fruits", and it will take way too long with for loops.

I'd really appreciate the help.

Thanks

Answer 1

We can use split on the unlisted data

split(rep(names(aList), lengths(aList)), unlist(aList))
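To see why this one-liner works, it may help to look at the intermediate values on a smaller list. This is only an illustrative sketch; small, keys, vals and res are hypothetical names that do not appear in the question.

```r
# A cut-down list in the same shape as aList (hypothetical data)
small <- list(a1 = c("apple", "banana"), a2 = c("banana"))

# Repeat each list name once per element it holds
keys <- rep(names(small), lengths(small))   # "a1" "a1" "a2"

# Flatten the fruits into one vector, in the same order as keys
vals <- unlist(small, use.names = FALSE)    # "apple" "banana" "banana"

# Group the list names by the fruit they sit next to
res <- split(keys, vals)
res
# $apple
# [1] "a1"
#
# $banana
# [1] "a1" "a2"
```

Because keys and vals line up position by position, split only has to group one vector by the other, and no explicit loop is needed.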

Or another option would be to stack to a two column 'data.frame' and then do the split

with(stack(aList), split(as.character(ind), values))
#$apple
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8"

#$banana
#[1] "a1" "a2" "a5" "a7" "a9"

#$blueberry
#[1] "a6"

#$cherry
#[1] "a1" "a2" "a8"

#$lemon
#[1] "a5" "a6" "a8"

#$mango
#[1] "a7"  "a8"  "a10"

#$orange
#[1] "a1" "a5" "a9"

#$pineapple
#[1] "a3" "a5"

#$raspberry
#[1] "a4"

#$strawberry
#[1] "a1"  "a3"  "a4"  "a9"  "a10"

#$watermelon
#[1] "a7"

Or as @rawr mentioned

unstack(stack(aList)[2:1])
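One difference from the loop in the question is worth noting: the loop prepends each new name, so its vectors come out in reverse order, and its elements appear in first-seen order rather than alphabetically. A minimal check on a cut-down list (small, loop_res and split_res are hypothetical names) could look like this:

```r
# Hypothetical cut-down data in the same shape as aList
small <- list(a1 = c("apple", "banana"),
              a2 = c("banana", "cherry"),
              a3 = "apple")

# The loop from the question, condensed: c(x, NULL) is just x,
# so the is.null() branch is not needed here
loop_res <- list()
for (x in names(small)) {
  for (y in small[[x]]) {
    loop_res[[y]] <- c(x, loop_res[[y]])
  }
}

# The split-based version
split_res <- split(rep(names(small), lengths(small)),
                   unlist(small, use.names = FALSE))

# Reorder the loop result by fruit name and reverse each vector
same <- identical(lapply(loop_res[names(split_res)], rev), split_res)
same
# [1] TRUE
```

If the order within each vector matters downstream, that is the point to decide which convention to keep.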

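The difference between assigning inside lapply and inside the for loop comes down to environments. A for loop runs in the calling environment, so the assignment modifies vertical_data in the global env. The function passed to lapply gets its own environment on every call, so <- only changes a local copy that is discarded when the function returns. To change the outer object we have to use <<- (not advisable) or explicitly assign into the global env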

vertical_data <- list()
# invisible() only suppresses printing of the nested list that lapply returns
invisible(lapply(names(aList), function(x) {
  lapply(aList[[x]], function(y) {
    if (is.null(vertical_data[[y]])) {
      vertical_data[[y]] <<- x
    } else {
      vertical_data[[y]] <<- c(x, vertical_data[[y]])
    }
  })
}))
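The scoping behaviour can be seen in isolation with a much smaller sketch. The names counter, k, n_local and n_super are hypothetical and used only for illustration.

```r
counter <- list()

# `<-` inside the function builds a modified local copy of counter;
# the outer list is left untouched
invisible(lapply(c("a", "b"), function(k) counter[[k]] <- 1))
n_local <- length(counter)
n_local
# [1] 0

# `<<-` assigns into the enclosing environment where counter lives,
# so the outer list does change
invisible(lapply(c("a", "b"), function(k) counter[[k]] <<- 1))
n_super <- length(counter)
n_super
# [1] 2
```

This is also why the vectorised split approach is usually preferable: it returns the result as a value instead of relying on side effects.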

Answer 2

We can use enframe to convert the named list to a data frame, unnest it, and then split name by value.

library(magrittr)  # provides %>%
tibble::enframe(aList) %>% tidyr::unnest(value) %>% {split(.$name, .$value)}

#$apple
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8"

#$banana
#[1] "a1" "a2" "a5" "a7" "a9"

#$blueberry
#[1] "a6"

#$cherry
#[1] "a1" "a2" "a8"

#$lemon
#[1] "a5" "a6" "a8"

#$mango
#[1] "a7"  "a8"  "a10"

#$orange
#[1] "a1" "a5" "a9"

#$pineapple
#[1] "a3" "a5"

#$raspberry
#[1] "a4"

#$strawberry
#[1] "a1"  "a3"  "a4"  "a9"  "a10"

#$watermelon
#[1] "a7"
