Use gsub in R to extract the text between two slashes

Question

I have a file name captured by R like the following:

"0097_abcdef/0097_0/0097_0_04_bed.dbf"

I need to extract the part between the two slashes (i.e. 0097_0). I tried gsub(".*/","",dbf.files[1]), but it returns "0097_0_04_bed.dbf", which is not what I want.

Can anyone help? Thanks.

Answer 1

You can try the pattern

 .*/(.*)/.*

and replace the whole match with the first capture group, i.e. \\1:

> x = "0097_abcdef/0097_0/0097_0_04_bed.dbf"
> sub(".*/(.*)/.*","\\1",x)
[1] "0097_0"
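
The attempt in the question returned the file name because .* is greedy, so .*/ consumes everything up to the last slash. Below is a minimal sketch of a slightly stricter variant, assuming the wanted part is always the directory immediately before the file name. The [^/] character class and the paths vector are illustrative additions, not part of the answer above.

```r
# Hypothetical input vector; the second path has an extra leading directory.
paths <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
           "data/0098_abcdef/0098_1/0098_1_04_bed.dbf")

# Greedy ".*/" consumes everything up to the LAST slash, leaving the file name.
file_names <- gsub(".*/", "", paths)

# [^/]+ cannot cross a slash, so the capture group is pinned between the
# final two slashes regardless of how many directories come before it.
last_dirs <- sub("^.*/([^/]+)/[^/]*$", "\\1", paths)

print(file_names)
print(last_dirs)
```

If a string contains fewer than two slashes the pattern does not match, and sub returns that string unchanged, so it may be worth checking the input first.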

Answer 2

A different approach is to use the file path manipulation functions. In my opinion, it is a bit clearer than a regular expression, and it handles Windows paths correctly as well (backslashes are treated as path separators only when R itself is running on Windows):

# On a Linux path
x <- "0097_abcdef/0097_0/0097_0_04_bed.dbf"
basename( dirname(x) )
# [1] "0097_0"

# On a Windows path
y <- "c:\\0097_abcdef\\0097_0\\0097_0_04_bed.dbf"
basename( dirname(y) )
# [1] "0097_0"

Both functions are vectorized, so you can pass them a vector of paths. For completeness, there is also file.path to stitch the parts together again.
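
As a rough illustration of the vectorized behaviour and of file.path, here is a small sketch. The paths vector is made up for the example and assumes relative paths with forward slashes.

```r
# Hypothetical vector of relative paths using forward slashes.
paths <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
           "0098_abcdef/0098_1/0098_1_04_bed.dbf")

# dirname() drops the file name; basename() then keeps the last directory.
parent_dirs <- basename(dirname(paths))

# file.path() joins components with "/" and is vectorized as well,
# so the original paths can be rebuilt from their three parts.
rebuilt <- file.path(dirname(dirname(paths)), parent_dirs, basename(paths))

print(parent_dirs)
print(identical(rebuilt, paths))
```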

Answer 3

You can easily use strsplit instead. For example,

R> x = "0097_abcdef/0097_0/0097_0_04_bed.dbf"
R> strsplit(x, "/")
[[1]]
[1] "0097_abcdef"       "0097_0"            "0097_0_04_bed.dbf"

R> strsplit(x, "/")[[1]][2]
[1] "0097_0"
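
For more than one file name, strsplit returns a list with one element per input, so the index has to be applied to each element. A possible sketch, assuming the wanted piece is always the second-to-last one; the paths vector and the use of vapply are illustrative choices, not part of the answer above:

```r
# Hypothetical input vector; the second path has an extra leading directory.
paths <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
           "data/0098_abcdef/0098_1/0098_1_04_bed.dbf")

# strsplit() returns a list with one character vector per input string;
# fixed = TRUE treats "/" as a literal string rather than a regex.
parts <- strsplit(paths, "/", fixed = TRUE)

# Take the second-to-last piece, i.e. the directory that holds the file.
last_dirs <- vapply(parts, function(p) p[length(p) - 1L], character(1))

print(last_dirs)
```

Indexing from the end rather than with a fixed [2] keeps this working when the paths have different depths. A path with no slash at all would make vapply fail, since there is no second-to-last piece to return.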

Answer 4

You can use read.table:

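# dbf.files is the character vector of file names from the question
tc <- textConnection(dbf.files)
# split each name on "/" and keep the second field (a one-column data frame)
y <- read.table(tc,sep="/",as.is=TRUE)[2]
close(tc)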
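
A self-contained variant of the same idea, as a sketch: it uses the text argument of read.table instead of an explicit textConnection, and [[2]] to get a plain character vector rather than a data frame. The dbf_files vector is made up for the example, and the approach assumes every path has the same number of components.

```r
# Hypothetical file names; every path must have the same number of parts,
# otherwise read.table() will complain about unequal line lengths.
dbf_files <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
               "0098_abcdef/0098_1/0098_1_04_bed.dbf")

# text = reads from the character vector directly; colClasses keeps every
# field as character so nothing is converted to a number or factor.
fields <- read.table(text = dbf_files, sep = "/", colClasses = "character")

# [[2]] extracts the second column as a character vector.
middle <- fields[[2]]
print(middle)
```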

4 answers

solution1 8 ACCEPTED 2011-09-28 10:39:59

solution2 6 2011-09-28 14:24:01

solution3 2 2011-09-28 10:34:35

solution4 0 2011-09-28 12:31:31