R Returning all characters after the first underscore

Sample DATA

x=c("AG.av08_binloop_v6","TL.av1_binloopv2")

Sample ATTEMPT

y=gsub(".*_","",x)

Sample DESIRED

WANT=c("binloop_v6","binloopv2")

Basically, I want to extract all the characters after the first underscore.

In the pattern, replace "zero or more of any character" ( .* , where . is a metacharacter that matches any character) with "zero or more characters that are not _" ( [^_]* ), anchored at the start ( ^ ) of the string. Since only one replacement per string is needed, sub is enough; gsub is not required.

sub("^[^_]*_", "", x)
#[1] "binloop_v6" "binloopv2" 

Without this restriction, .* is greedy and matches up to the last _ in the string, so everything up to and including that underscore is removed, returning 'v6' and 'binloopv2'.
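
To compare the behaviours side by side, here is a small sketch on the sample data. It runs the greedy pattern from the question, the anchored pattern above, and a lazy variant ( .*? ). The lazy variant and the use of perl = TRUE are additions for illustration, not part of the original answer.

```r
x <- c("AG.av08_binloop_v6", "TL.av1_binloopv2")

# Greedy: .* runs to the last underscore, so only the final piece survives
greedy <- sub(".*_", "", x)

# Anchored negated class: [^_]* cannot cross an underscore, so it stops at the first one
anchored <- sub("^[^_]*_", "", x)

# Lazy quantifier (illustrative alternative): .*? matches as little as possible
lazy <- sub("^.*?_", "", x, perl = TRUE)

print(greedy)    # "v6"         "binloopv2"
print(anchored)  # "binloop_v6" "binloopv2"
print(lazy)      # "binloop_v6" "binloopv2"
```

The negated character class is usually the more portable choice, since it does not depend on the regex engine supporting lazy quantifiers.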


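Another option is word from stringr. Pass both a start and an end position: with only a start, word returns a single word ("binloop" for the first element), while an end of -1 extends the result through the last word.

library(stringr)
word(x, 2, -1, sep = "_")
#[1] "binloop_v6" "binloopv2"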

regexpr gives the position of the first match (here, the first _ ). substring can then extract the part of x from the character after that position to the end of the string ( nchar(x) ).

substring(x, regexpr("_", x) + 1, nchar(x))
#[1] "binloop_v6" "binloopv2" 
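
One edge case to keep in mind: when a string contains no underscore, regexpr returns -1, so the substring call above returns the string unchanged. If a missing value is preferable in that case, a small wrapper can handle it. The sketch below is only an illustration; the function name after_first_underscore is hypothetical, and fixed = TRUE is used so that "_" is treated as a literal rather than a regex.

```r
# Hypothetical helper: text after the first "_", or NA when there is none
after_first_underscore <- function(x) {
  pos <- as.integer(regexpr("_", x, fixed = TRUE))  # -1 where no underscore is found
  out <- substring(x, pos + 1, nchar(x))
  out[pos == -1] <- NA_character_                   # flag strings without an underscore
  out
}

res <- after_first_underscore(c("AG.av08_binloop_v6", "TL.av1_binloopv2", "nounderscore"))
print(res)  # "binloop_v6" "binloopv2" NA
```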

