set.seed(1)
data <- data.frame(a = sample(-5:5, 20, replace = TRUE),
                   b = sample(-5:5, 20, replace = TRUE),
                   c = sample(-5:5, 20, replace = TRUE))
What is the most effective way to set the values -5, -3 and 4 to NA across columns 'a', 'b' and 'c', using the actual column names?
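One option in base R (without any packages) is to replicate the values so that each cell is paired with the value for its column, compare, and assign NA to the matches. Note that this replaces one value per column: -5 in 'a', -3 in 'b' and 4 in 'c'.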
data[data == c(-5, -3, 4)[col(data)]] <- NA
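To see why this works, it may help to look at what col() returns. The sketch below uses a small made-up data frame, toy (not the data from the question), so the intermediate steps are easy to follow.

```r
# Hypothetical toy data, only to illustrate the indexing trick
toy <- data.frame(a = c(-5L, 1L, 4L),
                  b = c(-3L, -5L, 2L),
                  c = c(4L, 4L, -3L))

# col() labels every cell with its column number (1, 2 or 3)
col(toy)

# Indexing the value vector by those numbers yields one target per cell:
# -5 for each cell of 'a', -3 for 'b', 4 for 'c'
targets <- c(-5, -3, 4)[col(toy)]

# Compare cell by cell and set the matches to NA
toy[toy == targets] <- NA
toy
```

Only the value paired with each column is replaced: the 4 in 'a' and the -5 in 'b' are left alone.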
If only selected columns are needed, subset by name first (here -5 is replaced in 'a' and 4 in 'c'):
nm1 <- c('a', 'c')
data[nm1][data[nm1] == c(-5, 4)[col(data[nm1])]] <- NA
To replace several values in each of the selected columns, use lapply with replace and %in%:
data[nm1] <- lapply(data[nm1], function(x) replace(x, x %in% c(-5, -3, 4), NA))
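The same idea can be wrapped in a small function that takes the column names and the values to blank out. This is only a sketch: set_na_values is a hypothetical helper name, not a function from any package, and toy is made-up data.

```r
# Hypothetical helper: set every value listed in `values` to NA,
# but only within the columns named in `cols`
set_na_values <- function(df, cols, values) {
  df[cols] <- lapply(df[cols], function(x) replace(x, x %in% values, NA))
  df
}

toy <- data.frame(a = c(-5L, 1L, 4L),
                  b = c(-3L, -5L, 2L),
                  c = c(4L, 4L, -3L))

# Only 'a' and 'c' are touched; 'b' keeps its -3 and -5
res <- set_na_values(toy, cols = c("a", "c"), values = c(-5, -3, 4))
res
```

Unlike the col() comparison, %in% checks every listed value in every selected column, so all three values of 'c' become NA here.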
In the tidyverse, we can use case_when; values that match no condition become NA:
library(dplyr)
data %>%
  mutate_at(vars(all_of(nm1)), ~ case_when(!. %in% c(-5, -3, 4) ~ .))
To recode some values and blank out others (here 4 becomes 99, while -5 and -3 become NA):
data %>%
  mutate_at(vars(all_of(nm1)), ~ case_when(. %in% 4 ~ 99L, !. %in% c(-5, -3) ~ .))
#    a  b  c
#1   3  3  3
#2  -2  3  2
#3   1 -1  3
#4  NA -1  1
#5  -4 -4  2
#6   1  4  0
#7   5  3 99
#8  -4 -5  1
#9   5 -2 NA
#10 NA -3 99
#11 NA  0  0
#12 -1  4  2
#13 -1  4 -4
#14 99  0 -4
#15  0 -2  0
#16 99 -2  0
#17  1  4 NA
#18  3  3 NA
#19 -1  1 NA
#20 -1  0  2
With data.table, we can use fcase, which takes alternating condition and value arguments rather than formulas:
library(data.table)
setDT(data)[, (nm1) := lapply(.SD, function(x) fcase(x %in% 4, 99L, !x %in% c(-5, -3), x)), .SDcols = nm1]
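As a minimal sketch of how fcase() evaluates its arguments, assuming data.table (1.13.0 or later) is installed: conditions and values alternate, the first condition that is TRUE wins, and elements that match no condition get the default, NA. The vector x is made up for illustration.

```r
library(data.table)

# Made-up input covering each case
x <- c(-5L, -3L, 4L, 2L)

# 4 is recoded to 99; -5 and -3 match neither condition and fall
# through to the default (NA); everything else keeps its value
res <- fcase(x %in% 4, 99L,
             !x %in% c(-5, -3), x)
res
```

All value arguments must share one type, which is why the replacement is written as the integer 99L to match the integer column.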
Using dplyr, you can try:
data %>%
  mutate_at(vars(a, b, c), ~ replace(., . %in% c(-5, -3, 4), NA))
    a  b  c
1  NA  5 NA
2  -1 NA  2
3   1  2  3
4  NA -4  1
5  NA NA  0
6  NA -1  3
7   5 NA NA
8   2 -1  0
9   1 NA  3
10 NA -2  2
A data.table version, which returns the modified columns as a new table rather than updating data by reference:
library(data.table)
setDT(data)[, lapply(.SD, function(x) replace(x, x %in% c(-5, -3, 4), NA)), .SDcols = c('a', 'b', 'c')]
Output:
     a  b  c
 1:  3  3  3
 2: -2  3  2
 3:  1 -1  3
 4: NA -1  1
 5: -4 -4  2
 6:  1 NA  0
 7:  5  3 NA
 8: -4 NA  1
 9:  5 -2 NA
10: NA NA NA
11: NA  0  0
12: -1 NA  2
13: -1 NA -4
14: NA  0 -4
15:  0 -2  0
16: NA -2  0
17:  1 NA NA
18:  3  3 NA
19: -1  1 NA
20: -1  0  2