Regex look ahead assertion

Question

I need help from a regex expert. The problem comes from an SO question I can no longer find, with the following data:

x = c("IID:WE:G12D/V/A", "GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

I need to split each element of x to isolate the end part, which consists of letter - slash - letter - ... - slash - letter. The output I want is these two vectors:

o1 = c("IID:WE:G12", "GH:SQ:p.R172", "HH:WG:p.S122")
o2 = c("D/V/A", "W/G", "F/H")

I have this solution for o1 :

gsub('[A-Z]/.+','',x)
#[1] "IID:WE:G12"   "GH:SQ:p.R172" "HH:WG:p.S122"

Good. For o2, I tried a look-ahead assertion:

gsub('.+(?=[A-Z]/.+)','',x, perl=T)
#[1] "V/A" "W/G" "F/H"

But this is not the result I want: the first element should be "D/V/A", not "V/A".

Any idea what is going wrong with the second regex?

Answer 1

As a possible solution, you can use the following replacement:

gsub('.*?([^/](?:/[^/])+)$','\\1',x, perl=T)

Or (if there must be a letter):

gsub('.*?([A-Z](?:/[A-Z])+)$','\\1',x, perl=T)

See IDEONE demo

.*? - matches as few characters as possible (other than a newline) from the start of the string
([^/](?:/[^/])+) - a capturing group matching:
- [^/] - a character other than / (or, with [A-Z], any English uppercase letter)
- (?:/[^/])+ - one or more sequences of / followed by a character other than / (or, with [A-Z], an uppercase letter)
$ - end of string
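The same anchored pattern can return both halves if the lazy prefix is captured too. This is only a sketch: the variable name pat and the extra first group are additions for illustration, and it assumes every element really ends in letter(/letter)+.

```r
x <- c("IID:WE:G12D/V/A", "GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

# Assumed pattern: lazy prefix (group 1), then the trailing
# letter(/letter)+ run anchored at the end of the string (group 2)
pat <- '^(.*?)([A-Z](?:/[A-Z])+)$'

o1 <- sub(pat, '\\1', x, perl = TRUE)  # everything before the suffix
o2 <- sub(pat, '\\2', x, perl = TRUE)  # the letter/slash suffix itself

print(o1)
print(o2)
```

An element that does not match the pattern is returned unchanged by sub(), so it is worth checking the input first if the format is not guaranteed.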

Answer 2

The following, very near to what you came up with, will work:

gsub('[^/]+(?=[A-Z]/.+)','',x, perl=T)

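(Your line didn't work because ".+" asks for "any character", which includes "/". Being greedy, it runs past the first slash and stops only at the last letter-slash pair, which is why you got "V/A" instead of "D/V/A".)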
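To see the difference side by side, here is a small sketch that runs the greedy pattern from the question next to the slash-excluding one. The third variant, a lazy quantifier anchored at the start, is not from the thread; it is an assumption about another way to stop at the first letter-slash pair.

```r
x <- c("IID:WE:G12D/V/A", "GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

# Greedy '.+' can cross slashes, so it stops at the LAST letter-slash pair
greedy <- gsub('.+(?=[A-Z]/.+)', '', x, perl = TRUE)

# '[^/]+' cannot cross a slash, so it stops at the FIRST letter-slash pair
bounded <- gsub('[^/]+(?=[A-Z]/.+)', '', x, perl = TRUE)

# Alternative (assumption): a lazy '.+?' anchored at the start also stops
# at the first letter-slash pair; sub() replaces only that first match
lazy <- sub('^.+?(?=[A-Z]/)', '', x, perl = TRUE)

print(greedy)
print(bounded)
print(lazy)
```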

Answer 3

Try this:

gsub('\\w\\/.*(\\/.*)?','',x)

With a look-ahead:

gsub('\\w(?=\\/).*','',x,perl=T)  # for o1

gsub('.*\\d(?=\\w\\/)','',x, perl=T)  # for o2
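The look-ahead can also be used to locate the split point once and cut the string there, instead of running two separate replacements. This is a rough sketch rather than part of the answer: it assumes every element contains at least one word character followed by a slash, since regexpr() returns -1 when nothing matches.

```r
x <- c("IID:WE:G12D/V/A", "GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

# Position of the first word character that is followed by a slash
pos <- as.integer(regexpr('\\w(?=/)', x, perl = TRUE))

o1 <- substr(x, 1, pos - 1)  # text before the split point
o2 <- substring(x, pos)      # from the split point to the end

print(o1)
print(o2)
```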

3 answers

solution1
3 2015-07-21 14:20:15

solution2
3 ACCEPTED 2015-07-21 14:26:40

solution3
1 2015-07-21 14:25:53
