R: regex lookaround to grab what's between two patterns

Question

I have a vector with strings like:

x <- c('kjsdf_class-X1(z)20_sample-318TT1X.3', 'kjjwer_class-Z3(z)29_sample-318TT2X.4')

I wanted to use regular expressions to get what is between the substrings 'class-' and '_sample' (such as 'X1(z)20' and 'Z3(z)29' in x), and thought lookaround regex ((?=...), (?!...), and so on) would do it. I cannot get it to work, though!

Sorry if this is similar to other SO questions (e.g. here or here).

Answer 1

This is a bit different from what you had in mind, but it will do the job.

gsub("(.*class-)|(.)|(_sample.*)", "\\2", x)

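The logic is the following: the pattern is an alternation of 3 "sets" of strings:

1) any characters .* ending in class-

2) any single character .

3) _sample followed by any characters .*

Every match is replaced by the second "set", \\2. That group is empty unless the single-character alternative matched, so the prefix and the suffix are deleted while the characters in between are copied through unchanged. Note that this appears to depend on R's default regex engine preferring the longest alternative at _sample; with perl = TRUE the alternatives are tried in order, the . matches first, and the suffix is kept.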
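As a variation on the idea above (not the answerer's code), the three pieces can also be written as consecutive capture groups in a single pattern, so that the whole string matches once and is replaced by the middle group. This is a minimal sketch that assumes each string contains exactly one class- ... _sample pair; the name middle is only illustrative.

```r
x <- c('kjsdf_class-X1(z)20_sample-318TT1X.3',
       'kjjwer_class-Z3(z)29_sample-318TT2X.4')

# Group 1: everything up to and including "class-"
# Group 2: the part to keep
# Group 3: "_sample" and everything after it
middle <- sub("(.*class-)(.*)(_sample.*)", "\\2", x)
middle
#[1] "X1(z)20" "Z3(z)29"
```

Because the groups are concatenated rather than alternated, the result does not rely on how the engine chooses between alternatives, so it should behave the same with perl = TRUE.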

Or another approach that may be easier to understand:

gsub("(.*class-)|(_sample.*)", "", x)

Match any number of characters ending in class-, and the string _sample followed by any number of characters, and replace both with the empty string "".
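To see what this second pattern actually removes, the matched pieces can be listed before deleting them. This is only an illustrative sketch using base R's regmatches and gregexpr; the name removed is hypothetical.

```r
x <- c('kjsdf_class-X1(z)20_sample-318TT1X.3',
       'kjjwer_class-Z3(z)29_sample-318TT2X.4')

pattern <- "(.*class-)|(_sample.*)"

# Each string yields two matches: the prefix up to "class-"
# and the suffix starting at "_sample"
removed <- regmatches(x, gregexpr(pattern, x))
removed[[1]]
#[1] "kjsdf_class-"      "_sample-318TT1X.3"

# Deleting both matches leaves only the middle part
kept <- gsub(pattern, "", x)
kept
#[1] "X1(z)20" "Z3(z)29"
```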

Answer 2

We could use str_extract_all from library(stringr):

 library(stringr)
 unlist(str_extract_all(x, '(?<=class-)[^_]+(?=_sample)'))
 #[1] "X1(z)20" "Z3(z)29"

This should also work if there are multiple instances of the pattern within a string:

 x1 <- paste(x, x)
 str_extract_all(x1, '(?<=class-)[^_]+(?=_sample)')
 #[[1]]
 #[1] "X1(z)20" "X1(z)20"

 #[[2]]
 #[1] "Z3(z)29" "Z3(z)29"

Basically, we are matching the characters that are between the two lookarounds ( (?<=class-) and (?=_sample) ). We extract one or more characters that are not _ (based on the example), preceded by class- and followed by _sample .
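The same lookaround pattern can also be used without stringr. The following is a minimal base R sketch, assuming the pattern from the answer above; the names pattern, first and hits are only illustrative. Lookarounds are Perl-style syntax, so perl = TRUE is required.

```r
x <- c('kjsdf_class-X1(z)20_sample-318TT1X.3',
       'kjjwer_class-Z3(z)29_sample-318TT2X.4')

pattern <- "(?<=class-)[^_]+(?=_sample)"

# regexpr finds the first match per string; regmatches extracts it
first <- regmatches(x, regexpr(pattern, x, perl = TRUE))
first
#[1] "X1(z)20" "Z3(z)29"

# gregexpr finds every match per string and returns a list
x1 <- paste(x, x)
hits <- regmatches(x1, gregexpr(pattern, x1, perl = TRUE))
hits[[1]]
#[1] "X1(z)20" "X1(z)20"
```

Note that regmatches with regexpr silently drops strings that have no match, so the result can be shorter than the input vector.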

Answer 3

gsub('.*-([^-]+)_.*','\\1',x)
#[1] "X1(z)20" "Z3(z)29"

3 answers

Answer 1: score 3, accepted, 2015-08-06 09:13:22

Answer 2: score 1, 2015-08-06 09:37:50

Answer 3: score 0, 2015-08-06 11:10:23
