
Extracting a string between two string patterns in R

If I have a character vector:

 links <- c("http://fdsfdsfdsfsdaaa.com/t5/this/bd-p/fdsfsdfdsfscshdad/dasd",
            "http://ffdsfdddddfdf.com/t5/that/bd-p/fdsfdsfsddfjfsd")

I want to extract "this" and "that", knowing that they sit between "t5" and "bd-p". I'm totally lost on this one.

Using sub:

sub(".*t5/(.*)/bd-p.*","\\1",links)
[1] "this" "that"

Try this:

lapply(regmatches(links, regexec("t5/(.*)/bd-p", links)), '[', 2)
[[1]]
[1] "this"

[[2]]
[1] "that"

regexec combined with regmatches is good for getting subexpressions (i.e. the parts of the pattern in parentheses). For each input string, regmatches returns the full matched text followed by each captured subexpression, which is why I extract only the second element: that is the subexpression.
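
To make that concrete, here is a small sketch that inspects what regmatches returns for the first link and then collapses the result into a plain character vector. It uses sapply in place of lapply, and the names m and between are only illustrative, not part of the original answer.

```r
links <- c("http://fdsfdsfdsfsdaaa.com/t5/this/bd-p/fdsfsdfdsfscshdad/dasd",
           "http://ffdsfdddddfdf.com/t5/that/bd-p/fdsfdsfsddfjfsd")

# One list element per input string: the full match first, then the captured group
m <- regmatches(links, regexec("t5/(.*)/bd-p", links))
m[[1]]
# [1] "t5/this/bd-p" "this"

# Taking element 2 of each entry with sapply() gives a character vector, not a list
between <- sapply(m, `[`, 2)
between
# [1] "this" "that"
```

If a string does not match at all, its list entry is empty, so taking the second element yields NA for that position.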
