I have a string:
a="<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"
and want to gsub everything between the < and >, to no avail so far. I want only the numbers to remain (i.e. 7 -5 -3 56 -3 58...) so that I can take every even/odd element to process.
I tried the approach from "Remove all text between two brackets", to no avail:
> gsub('<^|*>','',a[[1]],perl=TRUE)
Error in gsub("<^|*>", "", a[[1]], perl = TRUE) :
invalid regular expression '<^|*>'
In addition: Warning message:
In gsub("<^|*>", "", a[[1]], perl = TRUE) : PCRE pattern compilation error
'nothing to repeat'
at '*>'
and
gsub('<gml.+>\\d','',a[[1]])
which also removes the first digit.
I am sure I am missing something obvious, as '<' is not a special character.
Here are some other attempts (all of which failed):
> gsub('<.+>','',a[[1]])
[1] ""
> gsub('<.+>.+<.+>','',a[[1]])
[1] ""
> gsub('<gml.+>','',a[[1]])
[1] ""
You can use
gsub("<[^>]+>", "",a)
[1] "7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33"
"<" and ">" are literals, "[^>]" matches any character that is not ">" and "+" allows for one or more matches. Using gsub
replaces every occurrence of the pattern in the string (sub would replace only the first). Each match is replaced by the empty string "".
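To see why the attempts in the question returned "", and to get from the cleaned string to the even/odd elements, here is a minimal sketch. It assumes the values are always separated by whitespace; the variable names (greedy, stripped, lazy, nums, odd, even) are only illustrative.

```r
a <- "<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"

# Greedy ".+" runs from the first "<" to the LAST ">", so the whole
# string is one match and nothing is left.
greedy <- gsub("<.+>", "", a)

# The negated class stops each match at the first ">", so only the
# two tags are removed.
stripped <- gsub("<[^>]+>", "", a)

# A lazy quantifier gives the same result on this input.
lazy <- gsub("<.+?>", "", a, perl = TRUE)

# Split on whitespace and convert to numeric.
nums <- as.numeric(strsplit(trimws(stripped), "\\s+")[[1]])

# A logical index is recycled along the vector, which picks
# every other element.
odd  <- nums[c(TRUE, FALSE)]   # positions 1, 3, 5, ...
even <- nums[c(FALSE, TRUE)]   # positions 2, 4, 6, ...
```

Note that this particular string contains nine numbers, so odd ends up one element longer than even.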
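Alternatively, with the qdapRegex package:
library(qdapRegex)
a="<gml:posList srsDimension=\"2\" count=\"5\">7 -5.067 -3 56.7 -3.3 58.3 -5.65 57 -8.33</gml:posList>"
# With the default extract = FALSE, the markers and the text between them are removed;
# extract = TRUE would instead return the text found between "<" and ">".
rm_between(a, "<", ">")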