
How do you extract values between two characters in R?

I am trying to extract the server name (server101) from this string in R using a regular expression, that is, the value between @ and the first period (.) that follows it:

t<-c("Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com")

I've tried this:

gsub('.*\\@(\\d+),(\\d+).*', '\\1', t)

This does not seem to work. Any ideas?

Since you only expect one match, you may use a simple sub here:

t <- "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com"
sub(".*@([^.]+)\\..*", "\\1", t)
##  => [1] "server101"


Details

  • .* - any 0+ chars, as many as possible
  • @ - a @ char
  • ([^.]+) - Group 1 (referred to as "\\1" in the replacement): one or more chars other than a dot
  • \\. - a dot (other chars you need to escape are $, ^, *, (, ), +, [, \, ?)
  • .* - any 0+ chars, as many as possible
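
To make the effect of the greedy .* concrete, here is a minimal sketch. The inputs with two @ characters and with no @ at all are hypothetical, and the [^@]* variant is not part of the answer above; it is only there for comparison.

```r
pattern <- ".*@([^.]+)\\..*"

# Hypothetical input with two "@": the greedy .* backtracks from the end,
# so the capture starts after the last "@" that still has a dot after it.
last_at <- sub(pattern, "\\1", "a@b@server101.example.com")
print(last_at)
# [1] "server101"

# Comparison variant (an assumption, not from the answer): [^@]* cannot
# cross an "@", so the capture starts right after the first "@".
first_at <- sub("^[^@]*@([^.]+)\\..*", "\\1", "a@b@server101.example.com")
print(first_at)
# [1] "b@server101"

# No "@" at all: sub() finds no match and returns the input unchanged.
no_match <- sub(pattern, "\\1", "no host here")
print(no_match)
# [1] "no host here"
```

Because sub returns the original string when nothing matches, it may be worth checking with grepl first if some inputs might lack a host part.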

Here are some alternatives.

You may use the following base R code to extract one or more characters other than . ([^.]+) after the first @:

> t <- "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com"
> pattern="@([^.]+)"
> m <- regmatches(t,regexec(pattern,t))
> result = unlist(m)[2]
> result
[1] "server101"

With regexec, you can access submatches (capturing group contents).
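
The same regexec call also works on a vector of strings. The sketch below assumes a hypothetical vector x in which one element has no @; the vapply helper is an illustration, not part of the answer.

```r
# Hypothetical vector: two strings with a host part, one without "@".
x <- c("job[a]@server101.example.com",
       "job[b]@server202.example.org",
       "no host here")

# regmatches() returns a list with one element per input string.
m <- regmatches(x, regexec("@([^.]+)", x))

# Each element is c(full match, group 1), or character(0) when there is
# no match, so pick the second item and fall back to NA otherwise.
hosts <- vapply(m,
                function(g) if (length(g) >= 2) g[2] else NA_character_,
                character(1))
print(hosts)
# [1] "server101" "server202" NA
```

Using vapply rather than unlist keeps the result aligned with the input, since unlist would silently drop the empty element.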


Another way is to use regmatches / regexpr with a PCRE regex containing a (?<=@) lookbehind, which only checks that the character is present but does not include it in the match:

> result2 <- regmatches(t, regexpr("(?<=@)[^.]+", t, perl=TRUE))
> result2
[1] "server101"
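
One thing to keep in mind with regmatches / regexpr on a vector is that non-matching elements are dropped from the result. A minimal sketch, assuming a hypothetical two-element vector x, of how the result can be kept aligned with the input:

```r
# Hypothetical vector: the second element has no "@".
x <- c("job[a]@server101.example.com", "no host here")

# regexpr() returns -1 for elements without a match.
m <- regexpr("(?<=@)[^.]+", x, perl = TRUE)

# regmatches() only returns the matched pieces, so this has length 1.
found <- regmatches(x, m)
print(found)
# [1] "server101"

# Fill an NA vector at the matching positions to keep the input length.
hosts <- rep(NA_character_, length(x))
hosts[m > 0] <- found
print(hosts)
# [1] "server101" NA
```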

A clean stringr approach is to pass the same pattern to str_extract. stringr uses the ICU regex flavor, which also supports lookarounds, so the pattern works unchanged:

> library(stringr)
> t<-c("Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com")
> str_extract(t, "(?<=@)[^.]+")
[1] "server101"

With stringr, using str_match and a capturing group:

library(stringr)
str_match(t, ".*@([^\\.]*)\\..*")[2]
#[1] "server101"
