R regex - extract strings between two characters for multiple instances

I am trying to extract some keywords from a string in R as follows.

For each "[", I want the substring that runs from the first ":" after it up to the next ", " or "\\b".

string <- c("[G1]3451:GHEIN, [G2]FR343:4453, [G05]RT3342:34:GR", "[L1]TTG4:4532, [L3]EK445:GHR[1C]", "[RT1]JGR:45,RE")

gsub('\\[\\S+:', '', string)
"GHEIN, 4453, GR" "4532, GHR[1C]"   "45,RE"

The problem arises when a token contains two ":" characters. For the last token of the first string I should get 34:GR instead of GR.

out <- c("GHEIN, 4453, 34:GR", "4532, GHR[1C]", "45,RE")

How can I get the desired result using a regex in R?

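Make the quantifier non-greedy by placing ? directly after the +, so that each match stops at the first ":" instead of the last:

gsub('\\[\\S+?:', '', string)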
[1] "GHEIN, 4453, 34:GR" "4532, GHR[1C]"      "45,RE"      
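
As a rough sketch of why the greedy version over-matches, and of one possible alternative, the snippet below first lists what the original pattern consumes from the first element, then uses a negated character class in place of a lazy quantifier. The names greedy_hits and result are made up for illustration, and perl = TRUE is an assumption made here so that \\s is valid inside the bracket expression; neither comes from the original post.

```r
string <- c("[G1]3451:GHEIN, [G2]FR343:4453, [G05]RT3342:34:GR",
            "[L1]TTG4:4532, [L3]EK445:GHR[1C]",
            "[RT1]JGR:45,RE")

# Greedy \S+ runs to the LAST ":" of each whitespace-free token,
# so the third hit swallows "34:" as well.
greedy_hits <- regmatches(string[1],
                          gregexpr('\\[\\S+:', string[1], perl = TRUE))[[1]]

# Alternative to a lazy quantifier: [^\s:]+ can never step over a ":"
# (or whitespace), so each match necessarily ends at the first ":".
result <- gsub('\\[[^\\s:]+:', '', string, perl = TRUE)
```

On the sample data this leaves "[1C]" untouched, because no ":" follows it within the same token. Whether the negated class or the lazy quantifier reads better is largely a matter of taste.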
