For example:
dataframe 1 has:
Keyword <- c("dog", "cat", "tiger", "cheetah", "man")
Category <- c("walk", "house", "jungle", "fast", "office")
and I have a second data frame, dataframe 2, with a description column:
description <- c("dog is barking", "cat is purring", "tiger is hunting",
                 "cheetah is running", "man is working")
I want to write a function that searches the description column of dataframe 2 for the keywords in dataframe 1 and returns the matching category. How do I do this using the tidyverse? Thanks!
This may be helpful to you:
library(dplyr)

df2 %>%
  rowwise() %>%
  mutate(keyword = first(unlist(strsplit(des, "\\s+", perl = TRUE)))) %>%
  left_join(df, by = c("keyword" = "Keyword"))
# A tibble: 5 x 3
# Rowwise:
  des                keyword Category
  <chr>              <chr>   <chr>
1 dog is barking     dog     walk
2 cat is purring     cat     house
3 tiger is hunting   tiger   jungle
4 cheetah is running cheetah fast
5 man is working     man     office
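The rowwise() step above is only needed because strsplit() returns a list. As a rough sketch of an alternative (not part of the original answer), the first word can be taken with base R's vectorised sub(), which avoids row-by-row evaluation. The data frames are rebuilt here from the Data section so the snippet runs on its own; the name res is just an illustrative choice.

```r
library(dplyr)

# Same data as in the Data section, rebuilt so the snippet is self-contained
df <- data.frame(
  Category = c("walk", "house", "jungle", "fast", "office"),
  Keyword  = c("dog", "cat", "tiger", "cheetah", "man"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  des = c("dog is barking", "cat is purring", "tiger is hunting",
          "cheetah is running", "man is working"),
  stringsAsFactors = FALSE
)

# sub() is vectorised: remove everything from the first whitespace onward,
# which leaves the first word of each description
res <- df2 %>%
  mutate(keyword = sub("\\s.*$", "", des)) %>%
  left_join(df, by = c("keyword" = "Keyword"))
```

Like the answer's code, this assumes the keyword is always the first word of the description.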
Or we can use the match function instead of left_join and set the nomatch argument to NA_character_ in case there is no match. I prefer this solution:
df2 %>%
  rowwise() %>%
  mutate(keyword = first(unlist(strsplit(des, "\\s+", perl = TRUE))),
         cat = df$Category[match(keyword, df$Keyword, nomatch = NA_character_)])
# A tibble: 5 x 3
# Rowwise:
  des                keyword cat
  <chr>              <chr>   <chr>
1 dog is barking     dog     walk
2 cat is purring     cat     house
3 tiger is hunting   tiger   jungle
4 cheetah is running cheetah fast
5 man is working     man     office
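Both solutions above rely on the keyword being the first word of the description. If the keyword can appear anywhere in the text, one possible approach (a sketch, not part of the original answer) is to build a single regular expression from the keywords and pull out the first hit with stringr::str_extract(). The last two rows of df2 below are hypothetical additions for illustration, and the names pattern and res2 are likewise assumptions.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  Category = c("walk", "house", "jungle", "fast", "office"),
  Keyword  = c("dog", "cat", "tiger", "cheetah", "man"),
  stringsAsFactors = FALSE
)
# The first five rows come from the question; the last two are hypothetical:
# one has the keyword mid-sentence, the other contains no keyword at all
df2 <- data.frame(
  des = c("dog is barking", "cat is purring", "tiger is hunting",
          "cheetah is running", "man is working",
          "the old man is working", "a bird is singing"),
  stringsAsFactors = FALSE
)

# One alternation pattern with word boundaries: \b(dog|cat|tiger|cheetah|man)\b
pattern <- paste0("\\b(", paste(df$Keyword, collapse = "|"), ")\\b")

res2 <- df2 %>%
  mutate(keyword  = str_extract(des, pattern),   # first keyword found, else NA
         Category = df$Category[match(keyword, df$Keyword)])
```

Descriptions without any keyword get NA in both new columns. The word boundaries keep "cat" from matching inside a longer word such as "category". Keywords containing regex metacharacters would need escaping before being pasted into the pattern.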
Data
> dput(df2)
structure(list(des = c("dog is barking", "cat is purring", "tiger is hunting",
"cheetah is running", "man is working")), row.names = c(NA, -5L
), class = "data.frame")
> dput(df)
structure(list(Category = c("walk", "house", "jungle", "fast",
"office"), Keyword = c("dog", "cat", "tiger", "cheetah", "man"
)), class = "data.frame", row.names = c(NA, -5L))