Replace every string in dataframe containing character(s)

In the test dataframe below, I am attempting to change every string in the dataframe containing "NA" to "" (so as to make NAs blank).

dat <- as.data.frame(matrix(ncol=2, nrow=2))
dat$V1 <- c("  NA", "foo")
dat$V2 <- c("bar", "NA   ")

dat
   V1   V2
1  NA  bar
2 foo NA 

However, the following command returns a completely blank dataframe, as if all strings contained "NA". Why does this happen and what would be the correct solution?

value <- "NA"

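dat[grepl(value, dat)] <- ""

One option is to apply gsub to each column with lapply and then rebuild the data frame:

# gsub removes the "NA" substring from each element; surrounding whitespace is left in place
dat <- lapply(dat, function(x) gsub("NA", "", x))
# stringsAsFactors = FALSE keeps the columns as character in R versions before 4.0
dat <- data.frame(dat, stringsAsFactors = FALSE)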

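As for why the original command blanks everything: grepl expects a character vector, so the data frame is coerced with as.character, which yields one deparsed string per column rather than one per cell. Both of those strings contain "NA", so the logical index selects every column. The sketch below rebuilds the question's data to show this, and then matches column by column instead. The names col_strings, hits and fixed_dat are only illustrative, and fixed = TRUE is an assumption that "NA" should be matched literally.

```r
# Rebuild the question's data (column names V1/V2 as in the question)
dat <- data.frame(V1 = c("  NA", "foo"),
                  V2 = c("bar", "NA   "),
                  stringsAsFactors = FALSE)

# What grepl() actually sees: one deparsed string per COLUMN, not per cell
col_strings <- as.character(dat)
hits <- grepl("NA", col_strings, fixed = TRUE)
# 'hits' has one entry per column, and both column strings contain "NA",
# so dat[hits] <- "" overwrites every column.

# Per-column matching: blank only the cells that contain "NA"
fixed_dat <- dat
fixed_dat[] <- lapply(fixed_dat, function(col) {
  col[grepl("NA", col, fixed = TRUE)] <- ""
  col
})
```

Assigning into fixed_dat[] keeps the object a data frame, so no call to data.frame() is needed afterwards.
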
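Just using gsub in a loop over the columns:

value <- "NA"

for (i in seq_len(ncol(dat))) {
  # removes the "NA" substring from each element; any surrounding whitespace remains
  dat[, i] <- gsub(value, "", dat[, i])
}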
dat

Using data.table:

library(data.table)
setDT(dat)

# for each column j, blank the rows whose value contains "NA"
for (j in seq_along(dat)) {
  set(dat, i = which(dat[[j]] %like% "NA"), j = j, value = "")
}
dat
#     V1  V2
# 1:     bar
# 2: foo

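Note that any substring match on "NA" also hits strings such as "BANANA". If the intent is only to blank cells that are literally NA apart from padding (an assumption about the goal, not something stated in the question), a minimal sketch is to trim whitespace and compare for equality. The "BANANA" row and the name dat2 are added here purely for illustration.

```r
# Hypothetical data: same shape as the question, plus a string that merely contains "NA"
dat2 <- data.frame(V1 = c("  NA", "BANANA"),
                   V2 = c("bar", "NA   "),
                   stringsAsFactors = FALSE)

dat2[] <- lapply(dat2, function(col) {
  # which() drops any NA comparison results, so real missing values are left untouched
  col[which(trimws(col) == "NA")] <- ""
  col
})
```
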
Maybe in your case you are better off with a matrix.

datm <- as.matrix(dat)

Now your proposed solution works:

datm[grepl(value, datm)] <- ""

or using gsub:

datm <- gsub("\\s*NA\\s*", "", datm)

You can convert it to a dataframe after data cleansing.
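The matrix route works because a character matrix is already a plain character vector with dimensions, so grepl tests each cell. A rough end-to-end sketch, assuming the same data as in the question (the name clean is only illustrative):

```r
dat <- data.frame(V1 = c("  NA", "foo"),
                  V2 = c("bar", "NA   "),
                  stringsAsFactors = FALSE)

datm <- as.matrix(dat)                        # character matrix, one element per cell
datm[grepl("NA", datm, fixed = TRUE)] <- ""   # logical index has one entry per cell

# Back to a data frame once the cleaning is done
clean <- as.data.frame(datm, stringsAsFactors = FALSE)
```

This is only suitable when all columns can share one type, since as.matrix converts every column to character if any column is character.
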
