Substring content between quotation marks

Question

In a data frame (DF) I have column entries of different lengths, such as the following:

tmp_ezg.\"dr_HE_10691\" , tmp_ezg.\"dr_MV_0110200016\" , tmp_ezg.\"dr_MV_0111290017\" etc.

How can I best extract what is between the quotation marks?

My idea:

substring(DF$name, 10)

Since the content between the quotation marks varies in length, I cannot give substring() a fixed position at which to stop.

Is it possible to take a substring only between certain symbols (i.e. quotation marks)?

Answer 1

For example

x <- c('tmp_ezg.\"dr_HE_10691\"' , 
       'tmp_ezg.\"dr_MV_0110200016\"' , 
       'tmp_ezg.\"dr_MV_0111290017\"')
# capture the text between the first pair of double quotes and keep only that
res <- sub('.*?"([^"]+)"', "\\1", x)
print(res, quote = FALSE)
# [1] dr_HE_10691      dr_MV_0110200016 dr_MV_0111290017

... if I'm not mistaken.
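The pattern above relies on the closing quotation mark being the last character of each entry. Below is a minimal sketch of a variant that also discards anything after the closing quote, applied to a data frame column. The data frame DF, the new column inner, and the trailing " AS x" in the third entry are made up for illustration; only the column name follows the question.

```r
# Hypothetical data frame; the trailing " AS x" in the last entry is invented
# to show what happens when text follows the closing quote.
DF <- data.frame(
  name = c('tmp_ezg."dr_HE_10691"',
           'tmp_ezg."dr_MV_0110200016"',
           'tmp_ezg."dr_MV_0111290017" AS x'),
  stringsAsFactors = FALSE
)

# Match everything up to the first quote, capture the quoted content,
# then match the closing quote and whatever follows it.
DF$inner <- sub('^[^"]*"([^"]*)".*$', "\\1", DF$name)

# Alternative with regmatches(): pull out the quoted part, then strip the quotes.
quoted <- regmatches(DF$name, regexpr('"[^"]*"', DF$name))
inner2 <- gsub('"', "", quoted, fixed = TRUE)
```

Note that sub() returns an entry unchanged when the pattern does not match, while regmatches() with regexpr() silently drops non-matching entries, so the first form is probably the safer one for filling a column.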

Answer 2

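To separate the content between the quotation marks (assuming there are exactly two in each entry), you can split the string on the quotation mark. The split pattern "\\\"" is the R string \" (an escaped backslash followed by an escaped quotation mark), which as a regular expression matches a literal quotation mark: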

y <- strsplit(x, split = "\\\"")

If all entries end with a quotation mark, this gives you a list with one element per entry. Each element holds two values, and the second value is the string you are after.

[[1]]
[1] "tmp_ezg."         "dr_HE_10691"
[[2]]
[1] "tmp_ezg."         "dr_MV_0110200016"
[[3]]
[1] "tmp_ezg."         "dr_MV_0111290017"
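To get from that list to a plain character vector, one option is to pick the second piece of each element. This is a small sketch that redefines x as in Answer 1 and assumes every entry really contains a quoted part; the name res is only illustrative.

```r
x <- c('tmp_ezg."dr_HE_10691"',
       'tmp_ezg."dr_MV_0110200016"',
       'tmp_ezg."dr_MV_0111290017"')

# fixed = TRUE treats the quotation mark literally, so no escaping is needed
y <- strsplit(x, split = '"', fixed = TRUE)

# take the second piece of each element; vapply() checks that every result
# is a single character string
res <- vapply(y, function(parts) parts[2], character(1))
```

An entry without any quotation mark would yield NA here rather than an error, so it may be worth checking anyNA(res) afterwards.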

solution1
2 ACCEPTED 2016-06-03 08:18:06

solution2
2 2016-06-03 08:34:11
