Find whole word that starts at character position in R
Str <- "I love chocolate pudding"
pos <- 8
I need to return the word that starts with the letter c at pos 8, which is chocolate. How can I do that?
You can use substring to get everything from the 8th character onward, then remove everything after the first space using gsub:
gsub(" .*", "", substring(Str, pos))
In case you need to check for the "c":
Str <- "I love dogs"
ifelse(
substr(Str, pos, pos) == "c",
gsub(" .*", "", substring(Str, pos)),
""
)
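The check above can also be wrapped in a small function. The following is only a sketch: word_at is a hypothetical helper name that does not appear in the answers, and it assumes a different check, namely that a word "starts" at pos when pos is 1 or the preceding character is a space.

```r
# Hypothetical helper (not part of the original answers).
# Returns the word that starts at `pos`, or "" if `pos` is not a word start.
word_at <- function(x, pos) {
  rest <- substring(x, pos)
  # Assumption: a word starts at pos if pos is 1 or the previous character is a space
  starts_word <- pos == 1 | substr(x, pos - 1, pos - 1) == " "
  # Also reject the case where pos itself points at a space
  ifelse(starts_word & !startsWith(rest, " "), sub(" .*", "", rest), "")
}

word_at("I love chocolate pudding", 8)  # "chocolate"
word_at("I love chocolate pudding", 9)  # "" (position 9 is mid-word)
```

Since only the first space matters, sub is enough here; gsub would give the same result.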
library(stringr)
str_extract(Str, "(?<=[\\w\\s]{7})\\bc\\w+\\b")
[1] "chocolate"
This solution uses str_extract and a positive lookbehind, (?<=[\\w\\s]{7}), which can be glossed along these lines: "if you see seven characters consisting of alphanumeric characters (\\w) or white space (\\s) to the left, match the immediately following word, delimited by word boundaries (\\b) on either side and beginning with the letter c." Note that the lookbehind is not anchored to the start of the string, so it matches the first such word preceded by at least seven of these characters, not strictly the word at position 8.
Alternatively, use sub and a backreference:
sub(".{7}(\\bc\\w+\\b).*", "\\1", Str)
[1] "chocolate"
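One thing to keep in mind with the sub approach: sub returns its input unchanged when the pattern does not match, and the pattern above is not anchored to the start of the string. A possible variant, sketched here under the assumption that the word must begin exactly at pos, builds the pattern from pos and tests for a match first. The names pattern and result are illustrative only.

```r
Str <- "I love chocolate pudding"
pos <- 8

# Build the pattern from pos: skip exactly pos - 1 characters from the start,
# then capture a word beginning with "c".
pattern <- sprintf("^.{%d}(\\bc\\w+\\b).*", pos - 1)

# sub() leaves non-matching input untouched, so test for a match first
# and fall back to an empty string otherwise.
result <- if (grepl(pattern, Str, perl = TRUE)) {
  sub(pattern, "\\1", Str, perl = TRUE)
} else {
  ""
}
result
```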
Using stringr and ignoring the "starting with c" condition:
library(stringr)
Str %>%
str_sub(pos) %>%
word(1)