Extracting individual words from a character vector in R

Question

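Suppose I have a character vector like the one below, and I want to create a logical vector that is TRUE if a string contains the word "white", "bull", or "tiger" (note: not "whitetip") and FALSE if it does not. How can I do this in R? I tried stringr's str_detect(), but it returned TRUE for "whitetip" (and I don't know how to use str_detect() for each category, i.e. I would have to create multiple logical vectors, one for each of my three species: white, bull, and tiger). Any help would be great, thanks!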

string<-c("tiger?", "thought to involve a 2.7 m [9'], 400-kb bull",
    "4 m to 5 m [13' to 16.5'] white", "oceanic whitetip shark, 2.5 to 3m", 
    "white","white","bull","white","oceanic whitetip shark, 2.5m","tiger",
    "white, >6'","bull, 6'")

Answer 1

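Here is one way to match each word against every string, using \\b word boundaries so that "whitetip" does not count as "white":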

sapply(c("white","bull","tiger"), function(x) {
    grepl(paste0("\\b",x,"\\b"), string)
})

This gives:

      white  bull tiger
 [1,] FALSE FALSE  TRUE  # tiger?
 [2,] FALSE  TRUE FALSE  # thought to involve a 2.7 m [9'], 400-kb bull
 [3,]  TRUE FALSE FALSE  # 4 m to 5 m [13' to 16.5'] white
 [4,] FALSE FALSE FALSE  # oceanic whitetip shark, 2.5 to 3m
 [5,]  TRUE FALSE FALSE  # white
 [6,]  TRUE FALSE FALSE  # white
 [7,] FALSE  TRUE FALSE  # bull
 [8,]  TRUE FALSE FALSE  # white
 [9,] FALSE FALSE FALSE  # oceanic whitetip shark, 2.5m
[10,] FALSE FALSE  TRUE  # tiger
[11,]  TRUE FALSE FALSE  # white, >6'
[12,] FALSE  TRUE FALSE  # bull, 6'
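
The question asks for a single logical vector rather than one column per word. A minimal sketch of that, assuming the same sample data: either put all three words into one alternation pattern, or collapse the matrix above row by row. The names `words`, `pattern`, `any_match`, `per_word`, and `any_match_alt` are illustrative and not part of the original answer.

```r
# Same sample data as in the question.
string <- c("tiger?", "thought to involve a 2.7 m [9'], 400-kb bull",
    "4 m to 5 m [13' to 16.5'] white", "oceanic whitetip shark, 2.5 to 3m",
    "white", "white", "bull", "white", "oceanic whitetip shark, 2.5m", "tiger",
    "white, >6'", "bull, 6'")

words <- c("white", "bull", "tiger")

# Build one alternation pattern with word boundaries: \b(white|bull|tiger)\b
pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")

# TRUE where any of the three words occurs as a whole word.
any_match <- grepl(pattern, string)

# Alternative: collapse the per-word matrix from sapply() row by row.
per_word <- sapply(words, function(x) grepl(paste0("\\b", x, "\\b"), string))
any_match_alt <- rowSums(per_word) > 0
```

Both vectors should be FALSE only for the two "oceanic whitetip shark" entries (elements 4 and 9).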

Answer 2

If you need to extract the matching word itself, you can use stringr::str_extract:

library(stringr)
str_extract(string, "\\b(bull|tiger|white)\\b")

# [1] "tiger" "bull"  "white" NA      "white" "white" "bull"  "white" NA     
#[10] "tiger" "white" "bull"
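
If stringr is not installed, roughly the same result can be obtained in base R with regexpr() and regmatches(). This is only a sketch under the assumption that the first match per string is wanted, which is what str_extract returns; the names `m` and `species` are illustrative.

```r
# Same sample data as in the question.
string <- c("tiger?", "thought to involve a 2.7 m [9'], 400-kb bull",
    "4 m to 5 m [13' to 16.5'] white", "oceanic whitetip shark, 2.5 to 3m",
    "white", "white", "bull", "white", "oceanic whitetip shark, 2.5m", "tiger",
    "white, >6'", "bull, 6'")

# regexpr() returns the position of the first match, or -1 when there is none.
m <- regexpr("\\b(bull|tiger|white)\\b", string, perl = TRUE)

# Start with NA everywhere, then fill in the strings that matched.
# regmatches() returns one substring per matching element, in order.
species <- rep(NA_character_, length(string))
species[m != -1] <- regmatches(string, m)

# A logical vector follows directly from the extraction.
has_species <- !is.na(species)
```

Here `species` holds the matched word or NA for each string, and `has_species` is the single logical vector the question asks about.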

Answer 1
4 2015-07-28 21:51:07

Answer 2
1 2015-07-28 23:19:16
