Pattern matching and replacement in R

I am not familiar at all with regular expressions, and would like to do pattern matching and replacement in R.

I would like to replace the patterns #1 and #2 in the vector original = c("#1", "#2", "#10", "#11") with the corresponding values of the vector vec = c(1,2) .

The result I am looking for is the following vector: c("1", "2", "#10", "#11"). I am not sure how to do that. I tried doing:

for(i in 1:2) {
    pattern = paste("#", i, sep = "")
    original = gsub(pattern, vec[i], original, fixed = TRUE)
}

but I get:

#> original
#[1] "1"  "2"  "10" "11"

instead of: "1" "2" "#10" "#11"

I would appreciate any help I can get! Thank you!

Specify that you are matching the entire string from start ( ^ ) to end ( $ ).

Here, I've matched exactly the conditions you are looking at in this example, but I'm guessing you'll need to extend it:

> gsub("^#([1-2])$", "\\1", original)
[1] "1"   "2"   "#10" "#11"

So, that's basically: "from the start of the string, look for a hash symbol followed by a single digit that is either one or two. It has to be exactly one digit (that's why we don't use * or + or something), and it also has to end the string. Oh, and capture that one or two in parentheses because we want to 'backreference' it as \\1 in the replacement."
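If the replacement values really have to come from vec, one possible extension is to keep the loop from the question but anchor each pattern with ^ and $ instead of using fixed = TRUE. This is only a sketch: it assumes that the i-th element of vec replaces "#i", and the name result is introduced here for illustration.

```r
original <- c("#1", "#2", "#10", "#11")
vec <- c(1, 2)

result <- original
for (i in seq_along(vec)) {
  # "^#1$" matches only the whole string "#1", so "#10" and "#11" are left alone
  pattern <- paste0("^#", i, "$")
  result <- sub(pattern, as.character(vec[i]), result)
}
result
# [1] "1"   "2"   "#10" "#11"
```

The original attempt failed because, with fixed = TRUE, "#1" is matched as a plain substring, and "#10" and "#11" both contain it.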

Here's a slightly different take that uses a zero-width negative lookahead assertion (what a mouthful!). This is the (?!...) construct: the pattern matches # at the start of a string as long as it is not followed by whatever is in ... , in this case two digits (longer numbers are excluded as well, since their first two digits already trigger the assertion). The matched # is replaced with nothing. Lookaheads are only available with perl = TRUE.

gsub( "^#(?![0-9]{2})" , "" , original , perl = TRUE )
[1] "1"   "2"   "#10" "#11"
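One thing worth checking before adopting the lookahead version is how it behaves on inputs beyond the four in the question. The sketch below uses a made-up vector, extra (not from the question), to compare it with the anchored pattern: the lookahead strips the # from anything that does not start with two digits, not just from #1 and #2.

```r
# Hypothetical inputs, chosen only to show the difference between the two patterns
extra <- c("#1", "#5", "#10", "#100", "#1a")

# Lookahead version: removes a leading "#" unless two digits follow it
lookahead <- gsub("^#(?![0-9]{2})", "", extra, perl = TRUE)
lookahead
# [1] "1"    "5"    "#10"  "#100" "1a"

# Anchored version: only the exact strings "#1" and "#2" are changed
anchored <- gsub("^#([1-2])$", "\\1", extra)
anchored
# [1] "1"    "#5"   "#10"  "#100" "#1a"
```

Which behaviour is wanted depends on what the real data looks like.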

Another option using gsubfn :

library(gsubfn)
gsubfn("^#([1-2])$",  I, original)   ## Function substituting
[1] "1"   "2"   "#10" "#11"

Or, if you want to use the values of your vector vec explicitly, pass a named list that maps each match to its replacement:

gsubfn("^#[1-2]$",  as.list(setNames(vec,c("#1", "#2"))), original) 

Or, using the formula notation, which is equivalent to the function notation:

gsubfn("^#([1-2])$",  ~ x, original)   ## formula substituting
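For comparison, the same match-to-replacement mapping can be sketched in base R without any regular expression, by treating the problem as an exact lookup. This assumes whole-string matches are all that is needed; lookup, hit and result are names made up for this example.

```r
original <- c("#1", "#2", "#10", "#11")
vec <- c(1, 2)

# Named character vector: names are the strings to find, values the replacements
lookup <- setNames(as.character(vec), c("#1", "#2"))

# Only elements that are exactly equal to one of the names are replaced
hit <- original %in% names(lookup)
result <- original
result[hit] <- unname(lookup[original[hit]])
result
# [1] "1"   "2"   "#10" "#11"
```

Because %in% compares whole strings, "#10" and "#11" can never be hit by the entry for "#1".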
