R - using map to apply a list function to a dataframe column and create new columns with elements of the list
I have a dataframe with an id column and an eats column, and a separate food list. I want to process the dataframe so that a column is added for each food in the food list, populated with 1 if the food is present in eats and 0 otherwise.
library(dplyr)
library(stringr)

txt <- tibble(id = c(1, 2, 3),
              eats = c("apple, oats, banana, milk, sugar",
                       "oats, banana, sugar",
                       "chocolate, milk, sugar"))
food_list <- c("apple", "oats", "chocolate")

# Add one 0/1 column per food, named after the food
for (i in food_list) {
  print(i)
  txt <- txt %>%
    mutate(!!i := if_else(stringr::str_detect(eats, i), 1, 0))
}
I can do this with a for loop, but I am struggling to do it without one. I would be very grateful if someone could point me to how this can be done using the purrr library's map functions instead of a for loop.
Thanks!
We could use map (here, map_dfc) as follows:
library(purrr)
library(dplyr)
library(stringr)
txt <- map_dfc(food_list, ~ txt %>%
                 transmute(!! .x := +(stringr::str_detect(eats, .x)))) %>%
  bind_cols(txt, .)
Output:
txt
# A tibble: 3 x 5
id eats apple oats chocolate
<dbl> <chr> <int> <int> <int>
1 1 apple, oats, banana, milk, sugar 1 1 0
2 2 oats, banana, sugar 0 1 0
3 3 chocolate, milk, sugar 0 0 1
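The unary + in the answer above simply converts the logical result of str_detect to integer 0/1. As a hedged variation (not the answerer's code), the same columns can be built from a named list with plain map(), which may be useful because map_dfc() is, as far as I know, marked as superseded in recent purrr releases. The names flags and result are only illustrative.

```r
library(purrr)
library(dplyr)
library(stringr)

txt <- tibble(id = c(1, 2, 3),
              eats = c("apple, oats, banana, milk, sugar",
                       "oats, banana, sugar",
                       "chocolate, milk, sugar"))
food_list <- c("apple", "oats", "chocolate")

# set_names() names the vector after its own values, so map() returns
# a list whose element names become the new column names.
flags <- map(set_names(food_list),
             ~ as.integer(str_detect(txt$eats, .x)))

# Turn the named list into a tibble and attach it to the original data.
result <- bind_cols(txt, as_tibble(flags))
result
```

Because the list is named up front, no tidy evaluation (!! or :=) is needed here.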
In base R, this can be done in a one-liner:
txt[food_list] <- +(sapply(food_list, grepl, x = txt$eats))
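To make the one-liner above easier to follow, here is a small sketch that splits it into steps. It assumes a plain data.frame (called df here, an illustrative name) instead of the tibble from the question, and it adds fixed = TRUE so that each food name is treated as a literal string rather than a regular expression; both choices are my assumptions, not part of the original answer.

```r
df <- data.frame(id = c(1, 2, 3),
                 eats = c("apple, oats, banana, milk, sugar",
                          "oats, banana, sugar",
                          "chocolate, milk, sugar"))
food_list <- c("apple", "oats", "chocolate")

# sapply() calls grepl(<food>, x = df$eats, fixed = TRUE) once per food
# and binds the logical vectors into a matrix with one column per food.
hits <- sapply(food_list, grepl, x = df$eats, fixed = TRUE)

# Unary + converts the logical matrix to integer 0/1 while keeping
# the column names, so cbind() can attach it as new columns.
result <- cbind(df, +hits)
result
```

Since food_list is a character vector, sapply() uses its values as column names, which is why no explicit naming step is required.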
You can use cbind and str_detect with map_dfc:
library(dplyr)
library(purrr)
library(stringr)
cbind(txt, map_dfc(food_list, ~ +str_detect(txt$eats, .x)) %>% set_names(food_list))
id eats apple oats chocolate
1 1 apple, oats, banana, milk, sugar 1 1 0
2 2 oats, banana, sugar 0 1 0
3 3 chocolate, milk, sugar 0 0 1
Here is an alternative solution:
library(dplyr)
library(tidyr)
txt %>%
  separate_rows(eats, sep = ", ") %>%
  rowwise() %>%
  mutate(ext = match(eats, food_list)) %>%
  drop_na() %>%
  pivot_wider(names_from = eats, values_from = ext, values_fn = length, values_fill = 0) %>%
  right_join(txt, by = "id") %>%
  relocate(id, eats)
# A tibble: 3 x 5
id eats apple oats chocolate
<dbl> <chr> <int> <int> <int>
1 1 apple, oats, banana, milk, sugar 1 1 0
2 2 oats, banana, sugar 0 1 0
3 3 chocolate, milk, sugar 0 0 1
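The approach above compares whole comma-separated items against food_list, rather than searching for substrings. A rough base R sketch of the same idea follows; it assumes the items are always separated by exactly ", " and uses a plain data.frame named df, both of which are my assumptions for illustration.

```r
df <- data.frame(id = c(1, 2, 3),
                 eats = c("apple, oats, banana, milk, sugar",
                          "oats, banana, sugar",
                          "chocolate, milk, sugar"))
food_list <- c("apple", "oats", "chocolate")

# Split each eats string into its individual items (one character
# vector per row).
tokens <- strsplit(df$eats, ", ", fixed = TRUE)

# For each food, check row by row whether it appears among the items,
# and store the result as an integer 0/1 column named after the food.
df[food_list] <- lapply(food_list, function(food) {
  as.integer(vapply(tokens, function(items) food %in% items, logical(1)))
})
df
```

With exact item matching, a food such as "apple" would not be flagged for a row that only contains "pineapple".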
You may use base R's Reduce like this:
Reduce(function(a, b) {
  a[[b]] <- +(grepl(b, a[["eats"]]))
  a
}, init = txt, food_list)
# A tibble: 3 x 5
id eats apple oats chocolate
<dbl> <chr> <int> <int> <int>
1 1 apple, oats, banana, milk, sugar 1 1 0
2 2 oats, banana, sugar 0 1 0
3 3 chocolate, milk, sugar 0 0 1
You may also use purrr::reduce similarly, where you can use (i) the walrus operator (:=) and (ii) the bang-bang operator (!!) instead of subsetting:
library(tidyverse)
txt <- tibble(id = c(1, 2, 3),
              eats = c("apple, oats, banana, milk, sugar",
                       "oats, banana, sugar",
                       "chocolate, milk, sugar"))
food_list <- c("apple", "oats", "chocolate")
reduce(food_list, .init = txt, ~ .x %>%
         mutate(!!.y := +str_detect(eats, .y))
)
#> # A tibble: 3 x 5
#> id eats apple oats chocolate
#> <dbl> <chr> <int> <int> <int>
#> 1 1 apple, oats, banana, milk, sugar 1 1 0
#> 2 2 oats, banana, sugar 0 1 0
#> 3 3 chocolate, milk, sugar 0 0 1
Created on 2021-07-29 by the reprex package (v2.0.0)
Add word boundaries (\\b) to the values in food_list so that words are matched completely.
For example, see the difference in outputs in the following case:
library(stringr)
x <- c('apple', 'pineapple')
str_detect(x, 'apple')
#[1] TRUE TRUE
str_detect(x, '\\bapple\\b')
#[1] TRUE FALSE
The same goes for grepl in base R:
food_list <- c("apple", "oats", "chocolate")
food_pat <- sprintf('\\b%s\\b', food_list)
txt[food_list] <- lapply(food_pat, function(x) as.integer(grepl(x, txt$eats)))
txt
# A tibble: 3 x 5
# id eats apple oats chocolate
# <dbl> <chr> <int> <int> <int>
#1 1 apple, oats, banana, milk, sugar 1 1 0
#2 2 oats, banana, sugar 0 1 0
#3 3 chocolate, milk, sugar 0 0 1
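The word-boundary idea can also be combined with the purrr approach the question asked for. The following is only a sketch under my own assumptions: the sample data (txt2) is made up to include "pineapple", and the names flags and result are illustrative.

```r
library(purrr)
library(dplyr)
library(stringr)

# Hypothetical data: row 1 contains "pineapple" but not "apple".
txt2 <- tibble(id = c(1, 2),
               eats = c("pineapple, milk", "apple, oats"))
food_list <- c("apple", "oats", "chocolate")

# Wrap each food in \\b ... \\b so that only whole words match,
# then detect it in every row and convert TRUE/FALSE to 1/0.
flags <- map(set_names(food_list),
             ~ as.integer(str_detect(txt2$eats, str_c("\\b", .x, "\\b"))))

result <- bind_cols(txt2, as_tibble(flags))
result
```

Without the boundaries, the apple column would also be 1 for the first row, because "apple" is a substring of "pineapple".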