I want to split this genomic coordinate: chr1:713625-714625
to keep only the start coordinate: 713625
I tried this command:
data.table(unlist(lapply(data$gene, function(x) unlist(strsplit(x, "[:]"))[2])))$V1
but it gives me this: 713625-714625
Do you have any suggestions?
You are almost there with strsplit, but you should split on [:-] or :|- so that both the colon and the hyphen act as delimiters:
> unlist(strsplit("chr1:713625-714625", "[:-]"))[2]
[1] "713625"
> unlist(strsplit("chr1:713625-714625", ":|-"))[2]
[1] "713625"
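Since strsplit is vectorized over its input, the same split can be applied to a whole column at once. The following is only a sketch: the data frame and its values are made up for illustration, and the column name gene is taken from the question.

```r
# Hypothetical example data; only the column name `gene` comes from the question.
data <- data.frame(gene = c("chr1:713625-714625", "chr2:1000-2000"),
                   stringsAsFactors = FALSE)

# Split every string on ":" or "-"; strsplit returns one character vector per row.
parts <- strsplit(data$gene, "[:-]")

# Keep the second piece of each split (the start coordinate).
start <- vapply(parts, function(p) p[2], character(1))

# Convert to numeric in case the coordinates are needed for arithmetic.
data$start <- as.numeric(start)
data$start
```

vapply is used here instead of lapply plus unlist so that the result is guaranteed to be a character vector with one element per row.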
The following code extracts everything between the : and the - in the string:
string <- c("chr1:713625-714625")
gsub(".*[:]([^.]+)[-].*", "\\1", string)
Output:
[1] "713625"
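Note that [^.]+ in the pattern above matches any run of characters other than a period, not specifically digits. If the start coordinate is always a run of digits, a stricter pattern may be easier to read. This is an alternative sketch, not the pattern from the answer, and the second input string is a made-up example:

```r
# The first value is from the question; the second is a hypothetical extra input.
string <- c("chr1:713625-714625", "chrX:5-10")

# Anchor the match: everything up to the first ":", then capture the digits
# that come before the "-". sub() is enough because each string is replaced once.
start <- sub("^[^:]*:([0-9]+)-.*$", "\\1", string)
start
```

One caveat: if a string does not match the pattern at all, sub returns it unchanged, so unexpected inputs are passed through rather than flagged.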
I tried these two commands and both of them give me the same result:
gsub(".*[:]([^.]+)[-].*", "\\1", string) by Quinten
data.table(unlist(lapply(data$gene,function(x)unlist(strsplit(x, "[:-]"))[2])))$V1
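One way to confirm that the two approaches agree on more than a single string is to compare their outputs with identical. This is a small sketch on made-up coordinates; the vector gene below is hypothetical and stands in for data$gene:

```r
# Hypothetical coordinates standing in for the data$gene column.
gene <- c("chr1:713625-714625", "chrX:5-10")

# Approach 1: regular-expression capture with gsub.
by_gsub <- gsub(".*[:]([^.]+)[-].*", "\\1", gene)

# Approach 2: split on ":" or "-" and take the second piece.
by_split <- vapply(strsplit(gene, "[:-]"), function(p) p[2], character(1))

# TRUE when both approaches return exactly the same character vector.
identical(by_gsub, by_split)
```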