How do I find the match position of a text string in each element of a vector in R?

Question

I have a character vector, say month.name:

> month.name
 [1] "January"   "February"  "March"     "April"     "May"       "June"      "July"     
 [8] "August"    "September" "October"   "November"  "December"

What R function should I use to find the position of "ber" in each element, so that it returns a numeric vector of the form c(-1,-1,-1,-1,-1,-1,-1,-1,7,5,6,6), i.e., -1 for no match and 5 when the match starts at the fifth character?

Answer 1

You could use stringr::str_locate. It returns a matrix with the start and end position of the first match in each element:

library(stringr)
str_locate(month.name, "ber")

      start end
 [1,]    NA  NA
 [2,]    NA  NA
 [3,]    NA  NA
 [4,]    NA  NA
 [5,]    NA  NA
 [6,]    NA  NA
 [7,]    NA  NA
 [8,]    NA  NA
 [9,]     7   9
[10,]     5   7
[11,]     6   8
[12,]     6   8

So str_locate(month.name, "ber")[, 'start'] returns a vector:

 [1] NA NA NA NA NA NA NA NA  7  5  6  6

Personally, I think NA is a better choice for "no match" than -1. You can always substitute -1 later if you really want to. For example:

pos <- str_locate(month.name, "ber")[, 'start']
ifelse(is.na(pos), -1, pos)

 [1] -1 -1 -1 -1 -1 -1 -1 -1  7  5  6  6
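
The same substitution can also be done with plain indexed assignment instead of ifelse. This is only a minimal sketch: the start positions printed above are typed in by hand (an assumption made so the snippet runs without stringr installed), and only the NA entries are overwritten.

```r
# Start positions as printed above, typed in by hand (NA = no match).
pos <- c(rep(NA_integer_, 8), 7L, 5L, 6L, 6L)

# Overwrite only the NA entries; the vector stays an integer vector.
pos[is.na(pos)] <- -1L
pos
```

Unlike ifelse, this modifies pos in place rather than building a new vector, which may or may not be what you want.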

Answer 2

This is exactly what regexpr returns (see ?regexpr), along with some other helpful attributes:

regexpr("ber", month.name)
# [1] -1 -1 -1 -1 -1 -1 -1 -1  7  5  6  6
#attr(,"match.length")
# [1] -1 -1 -1 -1 -1 -1 -1 -1  3  3  3  3
#attr(,"index.type")
#[1] "chars"
#attr(,"useBytes")
#[1] TRUE
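
If only the positions are needed, the attributes can be dropped. A minimal sketch, assuming "ber" is meant as a literal substring rather than a regular expression (hence fixed = TRUE, which is not part of the original answer):

```r
# Position of the literal substring "ber" in each month name; -1 means no match.
# as.integer() strips the match.length, index.type and useBytes attributes.
pos <- as.integer(regexpr("ber", month.name, fixed = TRUE))
pos
```

For this pattern the result is the same with or without fixed = TRUE; the argument only matters when the search string contains regex metacharacters such as "." or "(".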

Answer 1
1 2020-09-25 00:36:43

Answer 2
1 Accepted 2020-09-25 00:55:19
