
Using a regular expression, how can I insert characters after a match in R?

I have a column of digit strings, each either 10 or 11 characters long. Here is an example of some of the values in the column:

column=c("5699420001","00409226602")

How can I place a hyphen after the first four digits (in the 10-character strings) or after the first five digits (in the 11-character strings), and another after the next four digits in both cases? The desired output is shown below. I would like to use stringr for this.

column_standard=c("5699-4200-01","00409-2266-02")

Try using this as your expression:

\b(\d{4,5})(\d{4})(\d{2}\b)

It sets up three capture groups that you can later use in your replacement to easily add hyphens between them.
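To see what each capture group picks up, here is a small sketch using stringr's str_match(). It assumes the same sample values as the question and uses ^ and $ anchors in place of \b, which is equivalent here because each element is a single number.

```r
library(stringr)

column <- c("5699420001", "00409226602")

# str_match() returns a character matrix:
# column 1 is the full match, columns 2-4 are the three capture groups.
groups <- str_match(column, "^(\\d{4,5})(\\d{4})(\\d{2})$")

groups[, 2]  # first group:  "5699" "00409"
groups[, 3]  # second group: "4200" "2266"
groups[, 4]  # third group:  "01"   "02"
```

The first group is greedy, so it tries five digits first and falls back to four only when the rest of the pattern cannot match otherwise. That is what lets one expression handle both lengths.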

Then you just replace with:

\1-\2-\3

Thanks to @Dunois for pointing out how it would look in code:

column_standard <- sapply(column, function(x) stringr::str_replace(x, "^(\\d{4,5})(\\d{4})(\\d{2})", "\\1-\\2-\\3"))

Here is a live example.
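Since str_replace() is vectorized over its input, the sapply() wrapper is not strictly needed. A minimal sketch, assuming you also want strings of any other length left untouched (hence the added $ anchor); the value "12345" is a made-up example of such a string:

```r
library(stringr)

# "12345" is a hypothetical value that should not be reformatted.
column <- c("5699420001", "00409226602", "12345")

# ^ and $ force the pattern to cover the whole string, so only
# 10- and 11-digit values match.
pattern <- "^(\\d{4,5})(\\d{4})(\\d{2})$"

# When the pattern does not match, str_replace() returns the input unchanged.
column_standard <- str_replace(column, pattern, "\\1-\\2-\\3")
column_standard
# "5699-4200-01" "00409-2266-02" "12345"
```

Unlike the sapply() version, this returns an unnamed character vector, which is usually more convenient when assigning back into a data frame column.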

Here's a solution using capture groups with stringr's str_replace() function:

library(stringr)

column <- c("5699420001","00409226602")

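# Pick the pattern by string length, then rebuild the value from the
# three capture groups with hyphens between them.
column_standard <- sapply(column, function(x){
  ifelse(nchar(x) == 11, 
         # 11 characters: 5 digits, 4 digits, remainder
         stringr::str_replace(x, "^([0-9]{5})([0-9]{4})(.*)", "\\1-\\2-\\3"),
         # otherwise (10 characters): 4 digits, 4 digits, remainder
         stringr::str_replace(x, "^([0-9]{4})([0-9]{4})(.*)", "\\1-\\2-\\3"))
})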

column_standard

#     5699420001     00409226602 
# "5699-4200-01" "00409-2266-02"

The code should be fairly self-explanatory. I can provide a detailed explanation upon request.
