R - Using map to apply a function over a list to a dataframe column and create new columns from the list elements

Question

I have a dataframe with an id column and an eats column, and a separate food list. I want to process the dataframe so that a column is added for each food in the food list, populated with 1 if the food is present in eats and 0 otherwise.

library(dplyr)

txt <- tibble(id = c(1, 2, 3),
          eats = c("apple, oats, banana, milk, sugar",
                   "oats, banana, sugar",
                   "chocolate, milk, sugar"))

food_list <- c("apple", "oats", "chocolate")

for (i in food_list){
  print(i)
  txt <- txt %>% 
    mutate(!!i := if_else(stringr::str_detect(eats, i), 1, 0))
}

I can do this with a for loop, but I am struggling to do it without one. I would be very grateful if someone could show me how to do this with the purrr map functions instead of a for loop.

Thanks!

Answer 1

We could use map (here map_dfc) as follows:

library(purrr)
library(dplyr)
library(stringr)
txt <- map_dfc(food_list, ~ txt %>%
      transmute(!! .x := +(stringr::str_detect(eats, .x)))) %>% 
    bind_cols(txt, .)

Output:

txt
# A tibble: 3 x 5
     id eats                             apple  oats chocolate
  <dbl> <chr>                            <int> <int>     <int>
1     1 apple, oats, banana, milk, sugar     1     1         0
2     2 oats, banana, sugar                  0     1         0
3     3 chocolate, milk, sugar               0     0         1

In base R, this can be done in a one-liner:

txt[food_list] <- +(sapply(food_list, grepl, x = txt$eats))

Answer 2

You can use cbind and str_detect with map_dfc:

library(dplyr)
library(purrr)
library(stringr)

cbind(txt, map_dfc(food_list, ~+str_detect(txt$eats, .x))%>%set_names(food_list))

  id                             eats apple oats chocolate
1  1 apple, oats, banana, milk, sugar     1    1         0
2  2              oats, banana, sugar     0    1         0
3  3           chocolate, milk, sugar     0    0         1
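
As an illustration that is not part of the original answer, the renaming step can be avoided by naming the vector first: calling set_names() on food_list makes map() return a named list, so the new columns arrive already named. The names flags and result below are placeholders chosen for this sketch. It also uses plain map() rather than map_dfc(), which is marked as superseded in purrr 1.0.0 and later.

```r
library(dplyr)
library(purrr)
library(stringr)

txt <- tibble(id = c(1, 2, 3),
              eats = c("apple, oats, banana, milk, sugar",
                       "oats, banana, sugar",
                       "chocolate, milk, sugar"))

food_list <- c("apple", "oats", "chocolate")

# set_names() with no second argument names each element after itself,
# so map() returns a list whose names are the food names
flags <- food_list %>%
  set_names() %>%
  map(~ as.integer(str_detect(txt$eats, .x)))

# Turn the named list into columns and attach them to the original data
result <- bind_cols(txt, as_tibble(flags))
result
```

Because the list is named before mapping, no separate set_names(food_list) call is needed after binding.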

Answer 3

Here is an alternative solution:

library(dplyr)
library(tidyr)

txt %>%
  separate_rows(eats, sep = ", ") %>%
  rowwise() %>%
  mutate(ext = match(eats, food_list)) %>%
  drop_na() %>%
  pivot_wider(names_from = eats, values_from = ext, values_fn = length, values_fill = 0) %>%
  right_join(txt, by = "id") %>%
  relocate(id, eats)

# A tibble: 3 x 5
     id eats                             apple  oats chocolate
  <dbl> <chr>                            <int> <int>     <int>
1     1 apple, oats, banana, milk, sugar     1     1         0
2     2 oats, banana, sugar                  0     1         0
3     3 chocolate, milk, sugar               0     0         1

Answer 4

You may use base R's Reduce like this:

Reduce(function(a, b) {
  a[[b]] <- +(grepl(b, a[["eats"]]))
  a
}, init = txt, food_list)

# A tibble: 3 x 5
     id eats                             apple  oats chocolate
  <dbl> <chr>                            <int> <int>     <int>
1     1 apple, oats, banana, milk, sugar     1     1         0
2     2 oats, banana, sugar                  0     1         0
3     3 chocolate, milk, sugar               0     0         1

You may also use purrr::reduce similarly, where you can use (i) the walrus operator (:=) and (ii) the bang-bang operator (!!) instead of subsetting:

library(tidyverse)
txt <- tibble(id = c(1, 2, 3),
              eats = c("apple, oats, banana, milk, sugar",
                       "oats, banana, sugar",
                       "chocolate, milk, sugar"))

food_list <- c("apple", "oats", "chocolate")

reduce(food_list, .init = txt, ~ .x %>% 
         mutate(!!.y := +str_detect(eats, .y))
         )
#> # A tibble: 3 x 5
#>      id eats                             apple  oats chocolate
#>   <dbl> <chr>                            <int> <int>     <int>
#> 1     1 apple, oats, banana, milk, sugar     1     1         0
#> 2     2 oats, banana, sugar                  0     1         0
#> 3     3 chocolate, milk, sugar               0     0         1

Created on 2021-07-29 by the reprex package (v2.0.0)

Answer 5

Add word boundaries (\\b) to the values in food_list so that only whole words are matched.

For example, compare the outputs in the following case:

library(stringr)
x <- c('apple', 'pineapple')

str_detect(x, 'apple')
#[1] TRUE TRUE

str_detect(x, '\\bapple\\b')
#[1]  TRUE FALSE

The same goes for grepl in base R:

food_list <- c("apple", "oats", "chocolate")
food_pat <- sprintf('\\b%s\\b', food_list)
txt[food_list] <- lapply(food_pat, function(x) as.integer(grepl(x, txt$eats)))
txt

# A tibble: 3 x 5
#     id eats                             apple  oats chocolate
#  <dbl> <chr>                            <int> <int>     <int>
#1     1 apple, oats, banana, milk, sugar     1     1         0
#2     2 oats, banana, sugar                  0     1         0
#3     3 chocolate, milk, sugar               0     0         1
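
Since the question asked for a purrr approach, the word-boundary idea can also be combined with purrr::reduce. This is only a sketch: the two-row txt below is made-up data chosen so that "pineapple" appears, and df, food, pattern and result are placeholder names. It assumes the food names contain no regex metacharacters.

```r
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical data: row 1 contains "pineapple" but not "apple"
txt <- tibble(id = c(1, 2),
              eats = c("pineapple, oats", "apple, milk"))

food_list <- c("apple", "oats")

result <- reduce(food_list, .init = txt, function(df, food) {
  # Wrap the food name in word boundaries so only whole words match
  pattern <- paste0("\\b", food, "\\b")
  mutate(df, !!food := as.integer(str_detect(eats, pattern)))
})
result
```

With the boundaries in place, row 1 gets 0 for apple even though "pineapple" contains the substring "apple".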

5 answers

Answer 1
4 2021-07-29 00:02:19

Answer 2
3 accepted 2021-07-29 00:16:16

Answer 3
3 2021-07-29 00:18:41

Answer 4
2 2021-07-29 05:29:18

Answer 5
1 2021-07-29 06:13:07
