
R - If column contains a string from vector, append flag into another column

My data

I have a vector of words, as shown below. This is an oversimplification; my real vector has over 600 words:

myvec <- c("cat", "dog", "bird")

I have a dataframe with the following structure:

df <- structure(list(id = c(1, 2, 3), onetext= c("cat furry pink british", 
"dog cat fight", "bird cat issues"), cop= c("Little Grey Cat is the nickname given to a kitten of the British Shorthair breed that rose to viral fame on Tumblr through a variety of musical tributes and photoshopped parodies in late September 2014", 
"Dogs have soft fur and tails so do cats Do cats like to chase their tails", 
"A cat and bird can coexist in a home but you will have to take certain measures to ensure that a cat cannot physically get to the bird at any point"
), text3 = c("On October 4th the first single topic blog devoted to the little grey cat was launched On October 20th Tumblr blogger Torridgristle shared a cutout exploitable image of the cat, which accumulated over 21000 notes in just over three months.", 
"there are many fights going on and this is just an example text", 
"Some cats will not care about a pet bird at all while others will make it its life mission to get at a bird You will need to assess the personalities of your pets and always remain on guard if you allow your bird and cat to interact"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L))

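It looks like the image below:

Sample dataset

My question

For each keyword in my vector myvec, I need to go through the dataset and check the columns onetext, cop, and text3. If I find the keyword in any of these 3 columns, I need to append the keyword to a new column. The result would look like the image below:

Expected result

My original dataset is very large (the last column is the longest), so running multiple nested loops (which is what I tried) is not ideal.

EDIT: Note that it is enough for a word to appear once in the row for it to be listed. All keywords found should be listed.

How can I do this? I am using the tidyverse, so my dataset is actually a tibble.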

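Similar posts (but not quite)

The following posts are somewhat similar, but not quite:

Here is how you could achieve the result:

  1. Create a pattern from the vector
  2. Use mutate with across to check the needed columns
  3. If the desired string is detected, extract it into a new column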
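library(dplyr)
library(tidyr)
library(stringr)  # provides str_detect() and str_extract()

myvec <- c("cat", "dog", "bird")

# Combine the keywords into a single regex alternation: "cat|dog|bird"
pattern <- paste(myvec, collapse = "|")

df %>% 
  # For every column except id, extract the first keyword matched (NA if none)
  mutate(across(-id, ~case_when(str_detect(., pattern) ~ str_extract(., pattern)), .names = "new_col{col}")) %>% 
  # Merge the helper columns into one comma-separated Match column, skipping NAs
  unite(Match, starts_with('new'), na.rm = TRUE, sep = ',')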
     id onetext                cop                                                                                       text3                                                                                                 Match   
  <dbl> <chr>                  <chr>                                                                                     <chr>                                                                                                 <chr>   
1     1 cat furry pink british Little Grey Cat is the nickname given to a kitten of the British Shorthair breed that ro~ On October 4th the first single topic blog devoted to the little grey cat was launched On October 20~ cat,cat 
2     2 dog cat fight          Dogs have soft fur and tails so do cats Do cats like to chase their tails                 there are many fights going on and this is just an example text                                       dog,cat 
3     3 bird cat issues        A cat and bird can coexist in a home but you will have to take certain measures to ensur~ Some cats will not care about a pet bird at all while others will make it its life mission to get at~ bird,ca~
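
Note that the output above records only the first keyword matched in each column, so a keyword can repeat (cat,cat) and a second keyword in the same column is not listed. If every distinct keyword should appear once per row, as the question's edit asks, one possible variation is sketched below. It is not part of the original answer. It assumes the same case-sensitive substring pattern, uses a shortened copy of the sample data, and introduces a hypothetical helper column all_text that is dropped at the end.

```r
library(dplyr)
library(stringr)

myvec <- c("cat", "dog", "bird")
pattern <- paste(myvec, collapse = "|")

# Shortened stand-in for the sample data (texts abbreviated for illustration)
df <- tibble(
  id = c(1, 2, 3),
  onetext = c("cat furry pink british", "dog cat fight", "bird cat issues"),
  cop = c("Little Grey Cat is the nickname",
          "Dogs have soft fur so do cats",
          "A cat and bird can coexist"),
  text3 = c("devoted to the little grey cat",
            "there are many fights going on",
            "Some cats will not care about a pet bird")
)

result <- df %>%
  mutate(
    # all_text is a hypothetical helper: the three text columns joined together.
    # Assumes no missing values; str_c() returns NA if any input is NA.
    all_text = str_c(onetext, cop, text3, sep = " "),
    # Extract every match per row, keep each keyword once, join with commas
    Match = vapply(
      str_extract_all(all_text, pattern),
      function(m) paste(unique(m), collapse = ","),
      character(1)
    )
  ) %>%
  select(-all_text)

result$Match
```

Keywords are listed in order of first appearance in the row. Matching stays case-sensitive, so "Cat" and "Dogs" are not counted; wrapping the pattern in regex(pattern, ignore_case = TRUE) and lower-casing the matches would be one way to change that. If the real 600-word vector contains regex metacharacters, those would need escaping before being pasted into the pattern.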
