Extract all words from a sentence ending in an expression using R
Suppose I have the following string:
"palavras a serem encontradas fazer-se encontrar-se, enganar-se"
How can I extract the words "fazer-se", "encontrar-se" and "enganar-se"?
I'm trying to use stringr like this:
library(stringr)
sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"
str_extract_all(sentence, "se$")
I'd like this output:
[1] "fazer-se" "encontrar-se" "enganar-se"
We can specify the word boundary ( \\b ) instead of the end of the string ( $ ), because $ allows only one match, at the very end of the string. We also need the non-whitespace characters that come before the se substring, so we use \\S+ , i.e. one or more non-whitespace characters.
library(stringr)
str_extract_all(sentence, "\\S+se\\b")[[1]]
#[1] "fazer-se" "encontrar-se" "enganar-se"
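To make the difference between the two patterns concrete, here is a minimal sketch that runs both on the same input. It assumes stringr is installed; the variable names only_end and whole_words are illustrative and not part of the original post.

```r
library(stringr)

sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"

# "se$" is anchored to the end of the string, so it can match only the final "se"
only_end <- str_extract_all(sentence, "se$")[[1]]

# "\\S+se\\b" takes the run of non-whitespace characters before each "se"
# that is followed by a word boundary, so each matching word comes back whole
whole_words <- str_extract_all(sentence, "\\S+se\\b")[[1]]

print(only_end)
print(whole_words)
```

Note that this pattern does not require the hyphen, so it would also pick up any other word that simply ends in "se".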
In base R, we can use gregexpr and regmatches:
regmatches(sentence, gregexpr('\\w+-se', sentence))[[1]]
#[1] "fazer-se" "encontrar-se" "enganar-se"
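As an alternative that avoids extraction by regex, the sentence can be split into tokens and filtered by suffix. This is only a sketch under the assumption that words are separated by whitespace and commas; the names tokens and se_words are made up for illustration, and endsWith requires R 3.3.0 or later.

```r
sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"

# Split on runs of whitespace and/or commas to get the individual tokens
tokens <- strsplit(sentence, "[[:space:],]+")[[1]]

# Keep only the tokens whose last three characters are "-se"
se_words <- tokens[endsWith(tokens, "-se")]

print(se_words)
```

Because the suffix test is applied to whole tokens, a word such as "fazer-sempre" would not be returned, whereas the unanchored pattern '\\w+-se' would match its leading "fazer-se" part.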