How to use boundary with str_detect (stringr package)

Question

Here is some data (sentences is the character vector that ships with stringr).

library(stringr)
library(dplyr)

df <- tibble(sentences)

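I want to identify all sentences that contain the word "her". But of course a plain match also returns sentences containing words such as "there" and "here".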

df %>% filter(str_detect(sentences, "her"))
# A tibble: 43 x 1
   sentences                                    
   <chr>                                        
 1 The boy was there when the sun rose.         
 2 Help the woman get back to her feet.         
 3 What joy there is in living.                 
 4 There are more than two factors here.        
 5 Cats and dogs each hate the other.           
 6 The wharf could be seen at the farther shore.
 7 The tiny girl took off her hat.              
 8 Write a fond note to the friend you cherish. 
 9 There was a sound of dry leaves outside.     
10 Add the column and put the sum here.

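The documentation for stringr::str_detect says, "Match character, word, line and sentence boundaries with boundary()." I cannot figure out how to do this, and I cannot find an example anywhere. All of the documentation examples involve the str_split or str_count functions.

My question is related to this question, but I specifically want to understand how to use the stringr::boundary function.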

Answer 1

We can specify a word boundary (\\b) at the start and at the end of the pattern to avoid any partial matches:

library(stringr)
library(dplyr)
df %>% 
    filter(str_detect(sentences, "\\bher\\b"))
#                             sentences
#1 Help the woman get back to her feet.
#2      The tiny girl took off her hat.
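
To see the effect of \\b on a small input, here is a minimal sketch. The vector x is made up for illustration and is not part of the question's data; the case-insensitive variant with regex(..., ignore_case = TRUE) is an extra not mentioned above, shown only because a capitalized "Her" would otherwise be missed.

```r
library(stringr)

# Hypothetical example vector, not taken from the question's data
x <- c("Help her up.", "We were there.", "Her hat is here.", "The other one.")

# Plain substring match: every element contains the letters "her"
plain <- str_detect(x, "her")

# Word boundary on both sides: only the standalone, lowercase word "her"
whole <- str_detect(x, "\\bher\\b")

# Case-insensitive variant, which also catches "Her"
whole_ci <- str_detect(x, regex("\\bher\\b", ignore_case = TRUE))

print(plain)     # TRUE TRUE TRUE TRUE
print(whole)     # TRUE FALSE FALSE FALSE
print(whole_ci)  # TRUE FALSE TRUE FALSE
```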

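Alternatively, use boundary(). Note that boundary() takes a boundary type such as "word" rather than a search pattern, so boundary("her") is rejected. One option is to split each sentence into words and keep the rows where "her" is one of them:

df %>%
    # split each sentence at word boundaries, then test for the exact word
    filter(sapply(str_split(sentences, boundary("word")),
                  function(words) "her" %in% words))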
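
For context on what boundary() does, here is a rough sketch on a made-up vector x. Its argument is a boundary type ("character", "line_break", "sentence" or "word"), which is why it pairs naturally with str_count and str_split. The helper has_word is hypothetical, written only for this illustration; it is not a stringr function.

```r
library(stringr)

# Hypothetical example vector, not taken from the question's data
x <- c("Help her up.", "We were there.", "The other one.")

# boundary() expects a boundary type, so a search string is rejected
rejected <- inherits(try(boundary("her"), silent = TRUE), "try-error")

# Count the words in each string
n_words <- str_count(x, boundary("word"))

# Hypothetical helper: TRUE where `word` is one of the words in the string
has_word <- function(strings, word) {
  vapply(str_split(strings, boundary("word")),
         function(w) word %in% w,
         logical(1))
}

found <- has_word(x, "her")
print(found)  # TRUE FALSE FALSE
```

Unlike the \\b regex, this compares whole tokens, so it needs no escaping, but it is case-sensitive in the same way.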

Answer 1
2 · accepted · 2020-02-11 16:18:55