How to find the match position of a text string for each element in a vector in R?
I have a vector of text characters, say month.name:
> month.name
[1] "January" "February" "March" "April" "May" "June" "July"
[8] "August" "September" "October" "November" "December"
What R function should I use to find the position of "ber" so that it returns a numeric vector of the form c(-1,-1,-1,-1,-1,-1,-1,-1,7,5,6,6), i.e., -1 for no match and 5 when the match starts at the fifth character?
You could use stringr::str_locate. It returns a matrix:
library(stringr)
str_locate(month.name, "ber")
start end
[1,] NA NA
[2,] NA NA
[3,] NA NA
[4,] NA NA
[5,] NA NA
[6,] NA NA
[7,] NA NA
[8,] NA NA
[9,] 7 9
[10,] 5 7
[11,] 6 8
[12,] 6 8
So str_locate(month.name, "ber")[, 'start'] returns a vector:
[1] NA NA NA NA NA NA NA NA 7 5 6 6
Personally I think NA is a better choice for "no match" than -1. You could always substitute -1 later if you really want to do so. For example:
pos <- str_locate(month.name, "ber")[, 'start']
ifelse(is.na(pos), -1, pos)
[1] -1 -1 -1 -1 -1 -1 -1 -1 7 5 6 6
This is the exact output of regexpr (see ?regexpr), along with some other helpful attributes:
regexpr("ber", month.name)
# [1] -1 -1 -1 -1 -1 -1 -1 -1 7 5 6 6
#attr(,"match.length")
# [1] -1 -1 -1 -1 -1 -1 -1 -1 3 3 3 3
#attr(,"index.type")
#[1] "chars"
#attr(,"useBytes")
#[1] TRUE
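If only the plain vector of positions is wanted, the attributes can be dropped. The following is a minimal base R sketch, not part of the original answer; the variable names m, pos and pos_na are illustrative, and fixed = TRUE is an assumption that "ber" should be matched as a literal string rather than as a regular expression.

```r
# Base R only; fixed = TRUE treats "ber" as a literal string, not a regex.
m <- regexpr("ber", month.name, fixed = TRUE)

# as.integer() drops match.length and the other attributes,
# leaving -1 for "no match" and the start position otherwise.
pos <- as.integer(m)
pos

# Optional: switch to the NA convention for "no match".
pos_na <- replace(pos, pos == -1L, NA_integer_)
pos_na
```

Note that regexpr reports only the first match in each element; gregexpr would be the function to look at if an element could contain the pattern more than once.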