I have a file name captured by R like the following:
"0097_abcdef/0097_0/0097_0_04_bed.dbf"
I need to pick up the term between the two slashes (i.e. 0097_0). I have tried gsub(".*/","",dbf.files[1]), but it gives me "0097_0_04_bed.dbf", which is not quite what I want.
Can anyone help? Thanks.
You can try the pattern .*/(.*)/.* and refer to the first capture group with \\1:
> x = "0097_abcdef/0097_0/0097_0_04_bed.dbf"
> sub(".*/(.*)/.*","\\1",x)
[1] "0097_0"
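As a hedged variation on the pattern above, a negated character class makes it explicit that the captured part and the file name must not contain a slash. This is only a sketch; the variable names pattern and middle are illustrative.

```r
x <- "0097_abcdef/0097_0/0097_0_04_bed.dbf"

# [^/]* matches a run of characters that are not slashes, so the capture
# group is always the last directory component before the file name.
pattern <- "^.*/([^/]*)/[^/]*$"
middle <- sub(pattern, "\\1", x)
middle
# [1] "0097_0"
```

If the string contains fewer than two slashes the pattern does not match and sub returns the input unchanged, so it may be worth checking for that case.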
A different approach is to use the file path manipulation functions. In my opinion, it is a bit clearer than a regular expression, and it handles Windows paths correctly as well (backslashes are treated as path separators only when R itself is running on Windows):
# On a Linux path
x <- "0097_abcdef/0097_0/0097_0_04_bed.dbf"
basename( dirname(x) )
# [1] "0097_0"
# On a Windows path
y <- "c:\\0097_abcdef\\0097_0\\0097_0_04_bed.dbf"
basename( dirname(y) )
# [1] "0097_0"
They are vectorized, so you can give them a vector of paths. For completeness, there is also file.path to stitch the parts together again.
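A minimal sketch of both points, assuming forward-slash paths; the second path in the vector is made up for illustration:

```r
# Hypothetical vector of paths; only the first comes from the question.
paths <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
           "0098_abcdef/0098_1/0098_1_04_bed.dbf")

# dirname() and basename() work element-wise on the whole vector.
middle <- basename(dirname(paths))
middle
# [1] "0097_0" "0098_1"

# file.path() joins components with "/", rebuilding the original paths.
rebuilt <- file.path(dirname(paths), basename(paths))
identical(rebuilt, paths)
# [1] TRUE
```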
You can easily use strsplit instead. For example,
R> x = "0097_abcdef/0097_0/0097_0_04_bed.dbf"
R> strsplit(x, "/")
[[1]]
[1] "0097_abcdef" "0097_0" "0097_0_04_bed.dbf"
R> strsplit(x, "/")[[1]][2]
[1] "0097_0"
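Since strsplit returns a list with one element per input string, the same idea can be applied to several paths at once. The following is a sketch that takes the second-to-last component rather than a fixed position, assuming every path has at least two components; the second path is made up for illustration.

```r
# Hypothetical vector of paths; only the first comes from the question.
paths <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
           "0098_abcdef/0098_1/0098_1_04_bed.dbf")

# fixed = TRUE treats "/" as a literal string instead of a regex.
parts <- strsplit(paths, "/", fixed = TRUE)

# Pick the component just before the file name from each split path.
middle <- vapply(parts, function(p) p[length(p) - 1], character(1))
middle
# [1] "0097_0" "0098_1"
```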
You can use read.table:
tc <- textConnection(dbf.files)
y <- read.table(tc,sep="/",as.is=TRUE)[2]
close(tc)
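A possible variant of the same approach uses the text argument of read.table instead of a text connection. This sketch assumes every path has the same number of components, and it sets colClasses so that a purely numeric component such as "0097" would not be converted to a number and lose its leading zeros. The contents of dbf.files here are made up for illustration.

```r
# Hypothetical contents of dbf.files; only the first path is from the question.
dbf.files <- c("0097_abcdef/0097_0/0097_0_04_bed.dbf",
               "0098_abcdef/0098_1/0098_1_04_bed.dbf")

# Each path becomes one row, each component one column (V1, V2, V3).
fields <- read.table(text = dbf.files, sep = "/", colClasses = "character")

# [[2]] extracts the second column as a plain character vector.
y <- fields[[2]]
y
# [1] "0097_0" "0098_1"
```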