My strings look like this: UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870
and I am trying to extract everything after the second-to-last +, i.e. 180101:0050+10870.
So far, I have managed to match the second-to-last block, 180101:0050, with this expression: (?<=\+)[^\+]+(?=\+[^\+]*$)
but I fail to include the last block together with the last +. Here is my sample: regex101
The expression is meant for R, so I still need to escape the backslashes later on. This format is just for testing purposes in Regex101.
We could use a capture group based on the occurrences of + counted from the end ($) of the string.
# data
str1 <- "UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870"
# greedy .* consumes everything up to the second-to-last +; the group keeps the rest
sub(".*\\+([^+]+\\+[^+]+$)", "\\1", str1)
#[1] "180101:0050+10870"
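One caveat with the sub() approach is that sub() returns its input unchanged when the pattern does not match, so a string with fewer than two + signs comes back whole. The sketch below guards against that with grepl(); the extra test strings and the variable names are made up for illustration.

```r
# Hypothetical inputs: one well-formed string and two without two "+" signs.
x <- c("UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870",
       "a+b",
       "no-plus-here")
pat <- ".*\\+([^+]+\\+[^+]+$)"

# Without a guard, non-matching strings are returned as they are.
unguarded <- sub(pat, "\\1", x)

# With a guard, non-matching strings become NA instead.
res <- ifelse(grepl(pat, x), sub(pat, "\\1", x), NA_character_)
print(unguarded)
print(res)
```

Whether NA or the untouched string is the better result for malformed input depends on what happens to the values downstream.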
You may use
\+\K[^+]+\+[^+]*$
Or, if you would like to use it with stringr::str_extract:
(?<=\+)[^+]+\+[^+]*$
See the regex demo. Details:
\+ - a + char
\K - match reset operator: discards the text matched so far from the returned match (first pattern)
(?<=\+) - a location right after a + symbol (second pattern)
[^+]+ - one or more chars other than +
\+ - a +
[^+]* - zero or more chars other than +
$ - end of string.
See the R demo online:
x <- "UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870"
regmatches(x, regexpr("\\+\\K[^+]+\\+[^+]*$", x, perl=TRUE))
## => [1] "180101:0050+10870"
library(stringr)
str_extract(x, "(?<=\\+)[^+]+\\+[^+]*$")
## => [1] "180101:0050+10870"
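To see what \K contributes, it may help to run the same pattern with and without it. This is a small base R sketch; the variable names with_k and without_k are only for illustration.

```r
x <- "UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870"

# With \K, the leading "+" is consumed but dropped from the returned match.
with_k <- regmatches(x, regexpr("\\+\\K[^+]+\\+[^+]*$", x, perl = TRUE))

# Without \K, the leading "+" stays in the match.
without_k <- regmatches(x, regexpr("\\+[^+]+\\+[^+]*$", x, perl = TRUE))

print(with_k)
print(without_k)
```

Both calls need perl = TRUE: \K is a PCRE feature and is not available in R's default regex engine.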
Another way to do it in this case, relying on the digits:digits+digits shape of the target:
library(stringr)
str_extract("UNB+UNOC:3+4399945681577+_GLN_Company__+180101:0050+10870", "\\d+:\\d+\\+\\d+")
#"180101:0050+10870"
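Note that the digit-based pattern matches by shape rather than by position, so the two approaches can disagree when the digit block is not at the end. A minimal base R sketch on a made-up string (the input y and the helper first_match are hypothetical):

```r
# Made-up input where the digits:digits+digits block is not at the end.
y <- "A+12:34+56+x+y"

# Hypothetical helper: return the first match of a PCRE pattern in s.
first_match <- function(pattern, s) {
  regmatches(s, regexpr(pattern, s, perl = TRUE))
}

by_shape    <- first_match("\\d+:\\d+\\+\\d+", y)       # finds the digit block
by_position <- first_match("\\+\\K[^+]+\\+[^+]*$", y)   # finds the last two fields

print(by_shape)
print(by_position)
```

If the last two fields always have the digit shape, as in the question's example, both patterns give the same result.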