
How to extract everything after a ] using regex in R

I have a column with many rows. Each value in the column looks like this:

[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT

I want to use str_extract to extract everything after the ], so the output should be 99 *2-5PLT.

Thanks for your help.

This will work:

library(stringr)

a <- "[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT"
str_extract(a, "(?<=\\] )(.*)")

[1] "99 *2-5PLT"

Here the lookbehind (?<=\\] ) requires the closing bracket and the space after it to come immediately before the match, without including them in the result; (.*) then matches everything that follows:

https://regex101.com/r/Aq9D1p/1
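
Because the question is about a whole column, it may help to see how the lookbehind behaves on more than one value. This is a minimal sketch, assuming stringr is installed; the vector x and its second element are made up for illustration.

```r
library(stringr)

# Hypothetical input vector; the second element contains no "] ".
x <- c("[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT", "no bracket here")

# (?<=\\] ) is a lookbehind: it requires "] " immediately before the match
# but does not include it in the extracted text.
res <- str_extract(x, "(?<=\\] ).*")
res
```

str_extract is vectorised, so it returns one result per element, and an element without "] " comes back as NA rather than being dropped.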

Edit: you could also split the string on "] " and keep the second piece:

a %>% str_split_fixed(., "] ", n = 2)

     [,1]                                 [,2]        
[1,] "[Testing Data 123-INDEPENDENCE, MO" "99 *2-5PLT"

Also a base R solution:

regmatches(a, regexpr("\\[[^[]*\\]\\s+\\K.*", a, perl = TRUE))

[1] "99 *2-5PLT"

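One thing to watch with the regmatches approach on a column: it returns only the elements that matched, so the result can be shorter than the input. A minimal sketch of one way to keep the lengths aligned, using a made-up vector x and illustrative names m and out:

```r
# Hypothetical input vector; the second element has no bracketed prefix.
x <- c("[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT", "no bracket here")

# regexpr() returns -1 for elements that do not match.
m <- regexpr("\\[[^[]*\\]\\s+\\K.*", x, perl = TRUE)

# regmatches() silently drops the non-matching elements.
matched <- regmatches(x, m)

# Put the matches back in their original positions, leaving NA elsewhere.
out <- rep(NA_character_, length(x))
out[m != -1] <- matched
out
```

This keeps one result per input row, which is what you need when assigning back into a data frame column.
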
You can remove everything up to and including the ] and the whitespace that follows it.

Using sub in base R -

x <- "[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT"
sub('.*\\]\\s+', '', x)
#[1] "99 *2-5PLT"
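
Since sub is vectorised, the same call can be applied directly to a data frame column. A minimal sketch, assuming a hypothetical data frame df with a column raw (both names, and the extra rows, are made up for illustration):

```r
# Hypothetical data frame; `df`, `raw` and `after_bracket` are illustrative names.
df <- data.frame(
  raw = c("[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT",
          "[Another Label] 12 *1-3PLT",
          "no bracket here"),
  stringsAsFactors = FALSE
)

# Strip everything up to the "]" plus the whitespace after it.
# Rows with no "]" are returned unchanged, not as NA.
df$after_bracket <- sub(".*\\]\\s+", "", df$raw)
df$after_bracket
```

Note the difference from str_extract: a row that has no ] keeps its original value here instead of becoming NA.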

Similarly, with stringr::str_remove -

stringr::str_remove(x, '.*\\]\\s+')
#[1] "99 *2-5PLT"
