Regular expression in R
I am having some trouble with regular expressions in R. I use str_extract from the stringr library, and my problem is:
library(stringr)
test="word1 something word2 something word3 something word3"
temp = str_extract(test,'word2.+word3')
print(temp)
## [1] "word2 something word3 something word3"
The problem is that I want it to stop at the first word3; I don't want the last part of the string. Any ideas, please? Thank you very much.
And what if I have:
test="word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"
and I want to get a vector of length 2 like this: "word2 something1 word3", "word2 something4 word3"? Thanks again.
Change your regex line to:
temp = str_extract(test,'word2.+?word3')
                                ^
Notice that I added ?, which makes the .+ non-greedy (i.e. it matches as few characters as possible and stops at the first word3, whereas the greedy .+ matches as many as possible and only stops at the last word3).
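To make the difference concrete, here is a minimal sketch comparing the two quantifiers on the string from the question. It uses base R's regexpr and regmatches instead of stringr so that it runs without extra packages; perl = TRUE and the variable names greedy and lazy are my own choices for illustration, not part of the original answer.

```r
# Same input as in the question
test <- "word1 something word2 something word3 something word3"

# Greedy: .+ grabs as much as it can, so the match ends at the LAST word3
greedy <- regmatches(test, regexpr("word2.+word3", test, perl = TRUE))

# Lazy: .+? grabs as little as it can, so the match ends at the FIRST word3
lazy <- regmatches(test, regexpr("word2.+?word3", test, perl = TRUE))

print(greedy)
## [1] "word2 something word3 something word3"
print(lazy)
## [1] "word2 something word3"
```

The same two patterns should behave the same way when passed to str_extract.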
To extract all the occurrences, use:
temp = str_extract_all(test,'word2.+?word3')
I think that you're trying to extract every occurrence between two points in a string. If I'm wrong, my apologies. This can be accomplished with qdap's genXtract by setting with = TRUE. Note that this is not a stringr answer:
test="word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"
library(qdap)
genXtract(test, left = "word2", right = "word3", with=TRUE)
## > genXtract(test, "word2", "word3", with=TRUE)
## word2 : word31 word2 : word32
## "word2 something1 word3" "word2 something4 word3"
Using base R: we can keep everything up to and including the first word3 by capturing it in a group and referring back to it with a backreference, while the trailing .* consumes the rest of the string:
sub("(word3).*","\\1",test)
## [1] "word1 something word2 something word3"
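If the goal is the length-2 vector from the follow-up question, a base R sketch along the same lines could combine gregexpr with regmatches and the non-greedy pattern. This is one possible approach rather than the answerer's method; perl = TRUE and the name matches are assumptions made for this example.

```r
# Longer input from the follow-up question
test <- "word1 something word2 something1 word3 something2 word3 something3 word2 something4 word3"

# gregexpr finds every non-overlapping match; regmatches returns a list
# with one element per input string, so [[1]] picks the character vector
matches <- regmatches(test, gregexpr("word2.+?word3", test, perl = TRUE))[[1]]

print(matches)
## [1] "word2 something1 word3" "word2 something4 word3"
```

Each match starts at a word2 and ends at the nearest following word3, so the stretch "something2 word3 something3" between the two matches is skipped.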