gsub and remove all characters between < and > in R

Question

I have a string:

a="<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"

and want to gsub away everything between the < and >, to no avail so far. I want only the numbers to remain (i.e. 7 -5 -3 56 -3 58...) so that I can take every even/odd element to process.

I tried the approach from "Remove all text between two brackets", to no avail:

> gsub('<^|*>','',a[[1]],perl=TRUE)
Error in gsub("<^|*>", "", a[[1]], perl = TRUE) : 
  invalid regular expression '<^|*>'
In addition: Warning message:
In gsub("<^|*>", "", a[[1]], perl = TRUE) : PCRE pattern compilation error
    'nothing to repeat'
    at '*>'

and

gsub('<gml.+>\\d','',a[[1]])

which also removes the first digit.

I am sure I am missing something obvious, as '<' is not a special character.

Here are some other attempts (and failures):

> gsub('<.+>','',a[[1]])
[1] ""
> gsub('<.+>.+<.+>','',a[[1]])
[1] ""
> gsub('<gml.+>','',a[[1]])
[1] ""

Answer 1

You can use

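> gsub("<[^>]+>", "", a)
[1] "7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33"

"<" and ">" are literals, "[^>]" matches any character that is not ">", and "+" allows one or more such characters. gsub replaces every match of the pattern in the string (here, both the opening and the closing tag) with the empty string "". Your attempts with "<.+>" return "" because ".+" is greedy: it matches through to the last ">" in the string, so a single match swallows both tags and the numbers between them.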

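To make the difference concrete, here is a minimal sketch in base R. It contrasts the greedy pattern from the question with the negated character class, then splits the result into numbers and picks the odd and even positions the question asks about. The variable names (greedy, stripped, lazy, nums, odd, even) are illustrative and not from the original posts, and the lazy-quantifier variant is an alternative added only for comparison.

```r
a <- "<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"

# Greedy: ".+" runs to the LAST ">" in the string, so one match
# spans both tags and everything between them.
greedy <- gsub("<.+>", "", a)

# Negated class: "[^>]+" cannot cross a ">", so each tag is matched on its own.
stripped <- gsub("<[^>]+>", "", a)

# Lazy quantifier: ".+?" stops at the first ">" it can reach; same result here.
lazy <- gsub("<.+?>", "", a, perl = TRUE)

# Split on runs of whitespace and convert the pieces to numbers.
nums <- as.numeric(strsplit(trimws(stripped), "[[:space:]]+")[[1]])

# Logical index vectors are recycled, which picks alternating elements.
odd  <- nums[c(TRUE, FALSE)]   # positions 1, 3, 5, ...
even <- nums[c(FALSE, TRUE)]   # positions 2, 4, 6, ...
```

If the values are coordinate pairs, which of odd and even holds which axis depends on the data source, so that mapping is left open here.
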
Answer 2

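library(qdapRegex)
a <- "<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"
# extract = TRUE returns the text found between the markers;
# the default (extract = FALSE) removes that text instead
rm_between(a, "<", ">", extract = TRUE)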
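
For comparison, here is a base R sketch that needs no extra package and shows both directions: pulling out the text between "<" and ">", and dropping it. It is an illustration under the assumption that either result may be wanted, not a reimplementation of the package function; the names inside and outside are hypothetical.

```r
a <- "<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"

# Extract the text between "<" and ">" (markers excluded) using
# PCRE lookbehind and lookahead, so perl = TRUE is required.
inside <- regmatches(a, gregexpr("(?<=<)[^>]+(?=>)", a, perl = TRUE))[[1]]

# Remove the markers together with everything between them.
outside <- gsub("<[^>]+>", "", a)
```

Here inside holds the contents of the two tags, while outside holds the numeric text the question is after.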

2 answers

solution1
10 ACCEPTED 2017-07-31 11:42:26

solution2
0 2017-07-31 11:51:05
