Retrieving a partial string from a larger string in R

I have a data set where every observation has an ID value that encodes several things. For example, AE1 indicates site A, type E, observation 1. I am trying to generate a column for type alone, so in the example above I want to extract the E and discard the rest of the ID.

I have looked into using gsub, but each new type pattern seems to overwrite the previous one. The approach that gets me closest uses gsubfn, as shown below:

library(gsubfn)

x <- c("AE1", "AE2", "AD1", "AD2", "BE1", "BE2", "BD1", "BD2")
y <- gsubfn(".", list("E" = "easy", "D" = "difficult"), x)

y

[1] "Aeasy1"      "Aeasy2"      "Adifficult1" "Adifficult2" "Beasy1"      "Beasy2"      "Bdifficult1" "Bdifficult2"

The issue with the result is that I still need to remove the initial letter and the final number. In reality I have four type categories, not just "E" and "D".

Thanks in advance.

1) gsubfn Your code is already very close. Instead of "." use ".(.)." as the regular expression. It matches three characters, and because the pattern contains a capture group, only the captured middle character is looked up in the list. The entire three-character match is then replaced with the result of that lookup.

library(gsubfn)

gsubfn(".(.).", list("E" = "easy", "D" = "difficult"), x)
## [1] "easy"      "easy"      "difficult" "difficult" "easy"      "easy"     
## [7] "difficult" "difficult"
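The pattern ".(.)." relies on every ID being exactly three characters long. As a minimal sketch in base R, assuming observation numbers could run past 9 (the question only shows single digits, so the IDs below are hypothetical), an anchored pattern with sub() pulls out the second character regardless of what follows, and a named vector serves as the lookup table:

```r
# Hypothetical IDs with longer observation numbers (not from the question).
ids <- c("AE1", "AD12", "BE103")

# Keep only the second character; "^" and ".*$" consume the rest of the string.
type_code <- sub("^.(.).*$", "\\1", ids)

# Named vector as a lookup table; add the remaining type categories here.
labels <- c(E = "easy", D = "difficult")

type <- unname(labels[type_code])
type
## [1] "easy"      "difficult" "easy"
```

Any type code missing from the lookup vector comes back as NA, which makes unmapped categories easy to spot.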

2) strapply strapply, from the same package, would also work. Like other *apply functions, it takes the object to work on first, then a qualifier (in this case the regular expression), and finally the list (or function or proto object). Unlike gsubfn, it does not substitute the result back into the input string; it just returns the result of the processing.

strapply(x, ".(.).", list("E" = "easy", "D" = "difficult"), simplify = TRUE)
## [1] "easy"      "easy"      "difficult" "difficult" "easy"      "easy"     
## [7] "difficult" "difficult"
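Since the goal in the question is a new column, here is a rough sketch of how the same lookup could be attached to a data frame without gsubfn. It assumes the type code is always the second character of the ID; the data frame df and its column names are made up for illustration.

```r
x <- c("AE1", "AE2", "AD1", "AD2", "BE1", "BE2", "BD1", "BD2")

# Hypothetical data frame holding the IDs from the question.
df <- data.frame(id = x, stringsAsFactors = FALSE)

# Named vector as a lookup table; extend it to cover all four type categories.
labels <- c(E = "easy", D = "difficult")

# substr(id, 2, 2) takes the second character, which is then looked up by name.
df$type <- unname(labels[substr(df$id, 2, 2)])

df$type
## [1] "easy"      "easy"      "difficult" "difficult" "easy"      "easy"     
## [7] "difficult" "difficult"
```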
