Regular expression in R

Question

I am having some trouble with regular expressions in R. I use str_extract from the stringr library, and my problem is:

library(stringr)
test="word1 something word2 something word3 something word3"
temp = str_extract(test,'word2.+word3')
print(temp)
## [1] "word2 something word3 something word3"

The problem is that I want the match to stop at the first word3; I don't want the last part of the string. Any ideas? Thank you very much.

And what if I have

test="word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

and I want to get a vector of length 2 like this: "word2 something1 word3", "word2 something4 word3"? Thanks again.

Answer 1

Change your regex line to:

temp = str_extract(test,'word2.+?word3')
                                ^

Notice that I added ?, which makes the .+ non-greedy (lazy): it matches as few characters as possible, so the match ends at the first word3. Without the ?, .+ is greedy and matches as many characters as possible, so the match runs on to the last word3.
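
As a quick check of the difference, here is a minimal sketch that runs both patterns on the string from the question. The variable names greedy and lazy are only illustrative and do not appear in the original answer.

```r
library(stringr)

test <- "word1 something word2 something word3 something word3"

# Greedy: .+ consumes as much as it can, so the match ends at the LAST word3.
greedy <- str_extract(test, "word2.+word3")

# Lazy: .+? consumes as little as it can, so the match ends at the FIRST word3.
lazy <- str_extract(test, "word2.+?word3")

print(greedy)
print(lazy)
```

The lazy result should be the shorter string that the question asks for.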

To extract all the occurrences, use:

temp = str_extract_all(test,'word2.+?word3')
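
One detail worth keeping in mind: str_extract_all returns a list with one element per input string, not a plain character vector. A small sketch of getting the length-2 vector the question asks for, assuming the longer test string from the question (the name matches is illustrative):

```r
library(stringr)

test <- "word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

# str_extract_all() returns a list; the first element holds the matches
# found in the first (here, the only) input string.
temp <- str_extract_all(test, "word2.+?word3")
matches <- temp[[1]]

print(length(matches))
print(matches)
```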

Answer 2

I think you're trying to extract every occurrence between two points in a string; my apologies if I'm wrong. This can be accomplished with qdap's genXtract, setting with = TRUE so that the left and right boundary words are included in the result. Note that this is not a stringr answer:

test="word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

library(qdap)
genXtract(test, left = "word2", right = "word3", with=TRUE)

## > genXtract(test, "word2", "word3", with=TRUE)
##         word2  :  word31         word2  :  word32 
## "word2 something1 word3" "word2 something4 word3"

Answer 3

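Using base R: we can drop everything after the first word3 by capturing it in a group and referring back to that group in the replacement. Note that this keeps the text before word2 as well:

sub("(word3).*","\\1",test)
## [1] "word1 something word2 something word3"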
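
If the leading text before word2 should also be dropped, a base R variant along the following lines may work. This is a sketch rather than part of the original answer: it assumes the PCRE engine (perl = TRUE), and the names hits and first are illustrative.

```r
test <- "word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

# gregexpr() locates every non-overlapping match and regmatches() extracts them.
# perl = TRUE selects the PCRE engine, which supports the lazy quantifier +?.
hits <- regmatches(test, gregexpr("word2.+?word3", test, perl = TRUE))[[1]]

# First match only, via sub(): the lazy prefix .*? skips up to the first word2,
# the group captures up to the first word3, and the trailing .* discards the rest.
first <- sub(".*?(word2.+?word3).*", "\\1", test, perl = TRUE)

print(hits)
print(first)
```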

3 answers

solution1
12 ACCEPTED 2013-05-01 17:48:33

solution2
3 2013-05-01 18:43:11

solution3
0 2017-12-24 10:09:37
