
Replace multiple numbers at once in data.frame R

I have a data frame that looks like this:

   C1 C2
   0   1
   2  -1
   1   1
   -1  2
   0   0

I want to replace every -1 with 'minus', 0 with 'nc', 1 with 'plus1', and 2 with 'plus2'. I know how to replace the numbers one at a time with gsub, but I don't know how to replace them all at once. For example, this is my code for 0 and -1:

  gsub(df, '0', 'nc');gsub(df, '-1', 'minus')

Thanks in advance,

Something like this, maybe? Here I create a lookup table (a "legend") once and then use match over the whole data frame to replace the values in every column:

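# lookup table: A holds the original codes, B the replacement labels
# (stringsAsFactors = FALSE keeps B as character on R versions before 4.0)
temp <- data.frame(A = -1:2, B = c('minus', 'nc', 'plus1', 'plus2'), stringsAsFactors = FALSE)
# for each column, find the row of temp matching each value and take its label
df[] <- lapply(df, function(x) temp[match(x, temp$A), "B"])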
df
#      C1    C2
# 1    nc plus1
# 2 plus2 minus
# 3 plus1 plus1
# 4 minus plus2
# 5    nc    nc
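
The same lookup idea can also be written with a named character vector instead of a two-column data frame. This is only a minimal sketch: the name lookup is an illustrative choice, not part of the answer above, and any value without an entry in the vector would come back as NA.

```r
# sample data from the question
df <- data.frame(C1 = c(0L, 2L, 1L, -1L, 0L), C2 = c(1L, -1L, 1L, 2L, 0L))

# names are the original codes (as strings), values are the replacements
lookup <- c("-1" = "minus", "0" = "nc", "1" = "plus1", "2" = "plus2")

# index the named vector by each column's values converted to character;
# unname() drops the names that the indexing carries over
df[] <- lapply(df, function(x) unname(lookup[as.character(x)]))
df
```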

There is no need for regular expressions here. You can do this with matrix-style logical subsetting and replacement inside a simple loop. Note that for in-place replacement a for loop is generally a better fit than the *apply family of functions.

from <-  -1:2 
to <- c('minus', 'nc', 'plus1', 'plus2')
for (i in seq_along(from)) df[df == from[i]] <- to[i]
df
#      C1    C2
# 1    nc plus1
# 2 plus2 minus
# 3 plus1 plus1
# 4 minus plus2
# 5    nc    nc
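
Because the loop replaces one value per pass, its result could depend on the order of the passes if a replacement ever coincided with a code that is handled later. That does not happen with the labels used here, but a single-pass variant avoids the question altogether. A minimal sketch, assuming the same from and to vectors and the sample data from the question (the intermediate matrix m is an illustrative name):

```r
# sample data from the question
df <- data.frame(C1 = c(0L, 2L, 1L, -1L, 0L), C2 = c(1L, -1L, 1L, 2L, 0L))

from <- -1:2
to <- c('minus', 'nc', 'plus1', 'plus2')

m <- as.matrix(df)          # integer matrix with the same shape as df
m[] <- to[match(m, from)]   # look up every cell once; m[] keeps the dimensions
df[] <- m                   # write the character matrix back column by column
df
```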

If every column contains all four of the values to be converted and nothing else, this also works:

 lvls <- c('minus', 'nc', 'plus1', 'plus2') # labels for the factor levels, in the sorted order of the codes -1, 0, 1, 2

Convert each column to a factor with lvls as the labels, then convert it back to character if you want character columns:

 df[] <- lapply(df, function(x) as.character(factor(x, labels=lvls)))

 df
 #     C1    C2
 #1    nc plus1
 #2 plus2 minus
 #3 plus1 plus1
 #4 minus plus2
 #5    nc    nc
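
The factor call above takes its levels from the values present in each column, so it relies on every column containing all four codes. A minimal sketch of a variant that passes the levels explicitly, assuming the codes are exactly -1 to 2; the vectors x and y are hypothetical columns made up for illustration:

```r
lvls <- c('minus', 'nc', 'plus1', 'plus2')

# hypothetical column in which the codes -1 and 1 never occur
x <- c(0L, 2L, 2L, 0L)
res <- as.character(factor(x, levels = -1:2, labels = lvls))
res
# "nc" "plus2" "plus2" "nc"

# hypothetical column with a value outside the known codes: it becomes NA
y <- c(1L, 5L)
res_y <- as.character(factor(y, levels = -1:2, labels = lvls))
res_y
# "plus1" NA
```

With levels given, the mapping no longer depends on which codes happen to appear in a column, and unexpected values show up as NA rather than shifting the labels.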

Update

Also, if you want a gsub-style option, mgsub in the qdap package accepts vectors of search terms and replacements.

library(qdap)
pat <- -1:2
replacer <- c('minus', 'nc', 'plus1', 'plus2')
v1 <- mgsub(pat, replacer, as.matrix(df)) #on the original dataset
dim(v1) <- dim(df)
df[] <- v1
 df
 #    C1    C2
 #1    nc plus1
 #2 plus2 minus
 #3 plus1 plus1
 #4 minus plus2
 #5    nc    nc

data

df <- structure(list(C1 = c(0L, 2L, 1L, -1L, 0L), C2 = c(1L, -1L, 1L, 
2L, 0L)), .Names = c("C1", "C2"), class = "data.frame", row.names = c(NA, 
-5L))


 