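Dataframe column looping and string concatenation based on conditional in R (pref dplyr)
I have a two-column dataframe. The first column, prev_veg, holds a single entry for one class of item (vegetables, in this example). The second column, new_item, holds incoming items from different grocery categories (meat, fruit, vegetables, and so on).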
library(tidyverse)
current <- tibble::tribble(
~prev_veg, ~new_item,
"cabbage", "lettuce",
NA, "apple",
NA, "beef",
NA, "spinach",
NA, "broccoli",
NA, "mango"
)
current
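I want to iterate over the new_item column and add only the vegetables to prev_veg. Any new vegetable needs to be appended to the existing list. Importantly, I have a vector containing every vegetable that can appear in that list. The desired dataframe is shown below.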
target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")
desired <- tibble::tribble(
~prev_veg, ~new_item,
"cabbage", "lettuce",
"cabbage, lettuce", "apple",
"cabbage, lettuce", "beef",
"cabbage, lettuce", "spinach",
"cabbage, lettuce, spinach", "broccoli",
"cabbage, lettuce, spinach, broccoli", "mango"
)
desired
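Finally, this dataframe has several other data columns that I have not included here (only the relevant columns are shown). Ideally I am looking for a dplyr solution.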
current <- tibble::tribble(
~prev_veg, ~new_item,
"cabbage", "lettuce",
NA, "apple",
NA, "beef",
NA, "spinach",
NA, "broccoli",
NA, "mango"
)
target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")
library(dplyr, warn.conflicts = FALSE)
library(purrr)
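current %>%
  mutate(
    # Fold over new_item: .x is the running list, .y is the incoming item.
    # head(new_item, -1) drops the last item because .init already supplies row 1.
    prev_veg = accumulate(
      head(new_item, -1),
      ~ if_else(.y %in% target_veg, paste(.x, .y, sep = ", "), .x),
      .init = prev_veg[1]
    )
  )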
#> # A tibble: 6 × 2
#>   prev_veg                            new_item
#>   <chr>                               <chr>
#> 1 cabbage                             lettuce
#> 2 cabbage, lettuce                    apple
#> 3 cabbage, lettuce                    beef
#> 4 cabbage, lettuce                    spinach
#> 5 cabbage, lettuce, spinach           broccoli
#> 6 cabbage, lettuce, spinach, broccoli mango
Created on 2022-02-24 by the reprex package (v2.0.1)
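To make the accumulate() step in the answer above easier to follow, here is a minimal sketch that runs the same fold on plain vectors, outside of mutate(). The names steps, acc and item are illustrative and not part of the original answer; the starting value "cabbage" stands in for prev_veg[1].

```r
library(purrr)

target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")
new_item <- c("lettuce", "apple", "beef", "spinach", "broccoli", "mango")

# acc is the running comma-separated list, item is the incoming grocery item.
# Only items found in target_veg are appended; everything else leaves acc unchanged.
steps <- accumulate(
  head(new_item, -1),  # the last item never feeds a later row
  function(acc, item) {
    if (item %in% target_veg) paste(acc, item, sep = ", ") else acc
  },
  .init = "cabbage"    # stands in for prev_veg[1]
)

steps
```

With .init supplied, accumulate() returns one more element than its input, so the five items passed in produce six values, one per row of the dataframe.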
This can also be done by finding the index of each item with match() and then pasting row by row with rowwise().
library(dplyr)
library(tidyr)
current %>%
  mutate(ind = lag(match(new_item, target_veg))) %>%
  fill(ind, .direction = "downup") %>%
  rowwise() %>%
  mutate(ind = toString(target_veg[seq(ind)])) %>%
  ungroup() %>%
  mutate(prev_veg = coalesce(prev_veg, ind), .keep = "unused")
-output
# A tibble: 6 × 2
  prev_veg                            new_item
  <chr>                               <chr>
1 cabbage                             lettuce
2 cabbage, lettuce                    apple
3 cabbage, lettuce                    beef
4 cabbage, lettuce                    spinach
5 cabbage, lettuce, spinach           broccoli
6 cabbage, lettuce, spinach, broccoli mango
Note: compared with @IceCreamToucan's accumulate() approach, rowwise() may be slow.
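If the rowwise() overhead matters, one possible variation is to build the strings with vapply() over the index column instead of grouping by row. This is a sketch and not part of the original answer, and it has not been benchmarked. The name result is illustrative. Like the match() approach it is based on, it assumes vegetables arrive in the same order as they are listed in target_veg, because it pastes the first ind elements of that vector.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

current <- tibble::tribble(
  ~prev_veg, ~new_item,
  "cabbage", "lettuce",
  NA, "apple",
  NA, "beef",
  NA, "spinach",
  NA, "broccoli",
  NA, "mango"
)
target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")

result <- current %>%
  # Position of the previous row's item in target_veg (NA for non-vegetables)
  mutate(ind = lag(match(new_item, target_veg))) %>%
  # Carry the last seen position forward, then backfill the first row
  fill(ind, .direction = "downup") %>%
  # Paste the first `ind` vegetables for every row without rowwise()
  mutate(ind = vapply(ind, function(i) toString(target_veg[seq_len(i)]), character(1))) %>%
  # Keep the existing prev_veg where present, otherwise use the pasted string
  mutate(prev_veg = coalesce(prev_veg, ind)) %>%
  select(-ind)

result
```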