How do you extract a value between two characters in R?

Question

I am trying to extract the server name (server101) from this string in R using a regular expression:

The value I want is the text between @ and the first period (.) that follows it.

t<-c("Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com")

I've tried this:

gsub('.*\\@(\\d+),(\\d+).*', '\\1', t)

This does not seem to be working. Any ideas?

Answer 1

Since you only expect one match, you may use a simple sub here:

t <- "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com"
sub(".*@([^.]+)\\..*", "\\1", t)
##  => [1] "server101"

See the R demo online.

Details

.* - any 0+ chars, as many as possible
@ - a @ char
([^.]+) - Group 1 (referenced as "\\1" in the replacement): 1+ chars other than a dot
\\. - a dot (other chars you need to escape are $ , ^ , * , ( , ) , + , [ , \ , ?)
.* - any 0+ chars, as many as possible
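
As a rough sketch of how the pattern described above behaves on more than one input: sub is vectorized, and it returns a string unchanged when the pattern does not match. The second input and the names x, res and res_checked are made up for illustration; they are not from the question.

```r
# Only the first string comes from the question; the second is a
# hypothetical input that contains neither "@" nor ".".
x <- c(
  "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com",
  "no at-sign or dot here"
)

pattern <- ".*@([^.]+)\\..*"

# sub() processes each element independently and keeps Group 1 only.
res <- sub(pattern, "\\1", x)
print(res)

# A non-matching element comes back unchanged, so one option is to
# check with grepl() and mark such elements as NA.
res_checked <- ifelse(grepl(pattern, x), res, NA_character_)
print(res_checked)
```

Whether NA is the right marker for a non-match depends on what the result is used for; the point is only that sub alone does not signal a failed match.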

Here are some alternatives.

You may use the following base R code to extract 1+ characters other than . ([^.]+) after the first @:

> t <- "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com"
> pattern="@([^.]+)"
> m <- regmatches(t,regexec(pattern,t))
> result = unlist(m)[2]
> result
[1] "server101"

With regexec, you can access submatches (capturing group contents).

See the online R demo.
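
To make the submatch access a little more concrete, here is a small sketch of what regmatches returns when combined with regexec. The names full_match and group1 are only illustrative.

```r
t <- "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com"

# regexec() records the position of the whole match and of each capturing group.
m <- regmatches(t, regexec("@([^.]+)", t))

# m is a list with one element per input string. Each element is a
# character vector: the whole match first, then one entry per group.
full_match <- m[[1]][1]
group1 <- m[[1]][2]

print(full_match)  # "@server101"
print(group1)      # "server101"
```

This is why unlist(m)[2] picks out the server name: index 1 is the whole match including the @, and index 2 is the first capturing group.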

Another way is to use regmatches / regexpr with a PCRE regex containing a (?<=@) lookbehind, which only checks that the character is present and does not include it in the match:

> result2 <- regmatches(t, regexpr("(?<=@)[^.]+", t, perl=TRUE))
> result2
[1] "server101"
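
One thing to watch for when the input is a vector: regmatches with regexpr returns only the matches, so the result can be shorter than the input. A minimal sketch, assuming some made-up inputs (x, hits and out are illustrative names):

```r
# Hypothetical inputs; the middle one has no "@".
x <- c("user@server101.example.com", "no match here", "ops@db7.internal")

# regexpr() returns -1 for elements without a match.
hits <- regexpr("(?<=@)[^.]+", x, perl = TRUE)

# Only the two matching elements are returned here.
matched <- regmatches(x, hits)
print(matched)

# To keep the result aligned with the input, fill a vector of NAs
# at the positions where a match was found.
out <- rep(NA_character_, length(x))
out[hits != -1] <- matched
print(out)
```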

A clean stringr approach would be to use the same regex with str_extract (stringr uses the ICU regex flavor, which is similar to PCRE in that it also supports lookarounds):

> library(stringr)
> t<-c("Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com")
> str_extract(t, "(?<=@)[^.]+")
[1] "server101"

Answer 2

With stringr:

library(stringr)
str_match(t, ".*@([^\\.]*)\\..*")[2]
#[1] "server101"
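
For context on the [2] index above: str_match returns a character matrix with the whole match in the first column and the capturing groups in the following columns. A small sketch, assuming stringr is installed and using a made-up second input, of how this might look for more than one string:

```r
library(stringr)

# The first string is from the question; the second is hypothetical.
x <- c(
  "Current CPU load - jvm machine[example network-app_svc_group_mem4]@server101.example.com",
  "no match here"
)

m <- str_match(x, ".*@([^\\.]*)\\..*")

# One row per input string; column 1 is the whole match, column 2 is Group 1.
# Rows without a match are filled with NA.
groups <- m[, 2]
print(groups)
```

With a single input string, [2] and [, 2] give the same value; with several strings, [, 2] selects the whole Group 1 column.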

Answer 1 (6): 2016-12-16 20:15:36
Answer 2 (2, accepted): 2016-12-16 20:03:03
