How to substitute NA by 0 in 20 columns?
I want to substitute NA by 0 in 20 columns. I found this approach for 2 columns, but I guess it's not optimal when the number of columns is 20. Is there an alternative, more compact solution?
mydata[,c("a", "c")] <-
apply(mydata[,c("a","c")], 2, function(x){replace(x, is.na(x), 0)})
UPDATE: For simplicity, let's take this data with 8 columns and substitute NAs in columns b, c, e, f and d.
a b c d e f g d
1 NA NA 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 NA t 5 5
The result must be this one:
a b c d e f g d
1 0 0 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 0 t 5 5
The replace_na function from tidyr can be applied to a vector as well as a data frame (http://tidyr.tidyverse.org/reference/replace_na.html).
Use it with a mutate_at variant from dplyr to apply it to multiple columns at the same time:
my_data %>% mutate_at(vars(b,c,e,f), replace_na, 0)
or
my_data %>% mutate_at(c('b','c','e','f'), replace_na, 0)
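For readers on dplyr 1.0.0 or later, where across() supersedes the mutate_at family, the same idea can be sketched as below. This is a minimal illustration, not the answer author's code: the sample data frame is rebuilt by hand from the question, and base replace() is used instead of replace_na because newer tidyr releases appear to be stricter about the replacement matching the column type, which would matter for the character columns b and f.

```r
library(dplyr)

# Sample data rebuilt from the question; the duplicated column name d becomes d.1
my_data <- data.frame(
  a = c(1, 2, 3),
  b = c(NA, "g", "r"),
  c = c(NA, 3, 4),
  d = c(2, NA, 4),
  e = c(3, 4, NA),
  f = c("4", "5", "t"),
  g = c(7, 4, 5),
  d.1 = c("6", "Y", "5"),
  stringsAsFactors = FALSE
)

# Columns in which NA should become 0
cols <- c("b", "c", "e", "f")

# replace() coerces 0 to "0" in character columns and leaves numeric columns numeric
result <- my_data %>%
  mutate(across(all_of(cols), ~ replace(.x, is.na(.x), 0)))

result
```

Column d is not in cols, so its NA is left alone, as in the expected result.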
Another option:
library(tidyr)
v <- c('b', 'c', 'e', 'f')
replace_na(df, as.list(setNames(rep(0, length(v)), v)))
Which gives:
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
Here is a tidyverse way to replace NA with different values based on the data type of the column.
library(tidyverse)
dataset %>% mutate_if(is.numeric, replace_na, 0) %>%
mutate_if(is.character, replace_na, "")
We can use NAer from qdap to convert the NA to 0. If there are multiple columns, loop over them using lapply.
library(qdap)
nm1 <- c('b', 'c', 'e', 'f')
mydata[nm1] <- lapply(mydata[nm1], NAer)
mydata
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
Or using dplyr
library(dplyr)
mydata %>%
mutate_each_(funs(replace(., which(is.na(.)), 0)), nm1)
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
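If no extra packages are wanted, the same replacement can also be done with base R subsetting. The sketch below assumes the same mydata and nm1 as above (rebuilt here by hand so it runs on its own) and that the columns are character or numeric, not factors; with a factor column, assigning 0 would produce a warning and NA unless "0" is already a level.

```r
# Sample data rebuilt from the question; the duplicated column name d becomes d.1
mydata <- data.frame(
  a = c(1, 2, 3),
  b = c(NA, "g", "r"),
  c = c(NA, 3, 4),
  d = c(2, NA, 4),
  e = c(3, 4, NA),
  f = c("4", "5", "t"),
  g = c(7, 4, 5),
  d.1 = c("6", "Y", "5"),
  stringsAsFactors = FALSE
)

nm1 <- c("b", "c", "e", "f")

# is.na() on the selected columns gives a logical matrix,
# which is then used to index and overwrite only the NA cells
mydata[nm1][is.na(mydata[nm1])] <- 0

mydata
```

This scales to 20 columns by extending nm1, without an explicit loop.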
Another strategy using tidyr::replace_na()
library(tidyverse)
df <- read.table(header = T, text = 'a b c d e f g h
1 NA NA 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 NA t 5 5')
df %>%
mutate(across(everything(), ~replace_na(., 0)))
#> a b c d e f g h
#> 1 1 0 0 2 3 4 7 6
#> 2 2 g 3 0 4 5 4 Y
#> 3 3 r 4 4 0 t 5 5
Created on 2021-08-22 by the reprex package (v2.0.0)
Knowing that replace_na() accepts a named list for the replace argument, using purrr::map() is a good option here to reduce the amount of code. It is also possible to use a different replacement value for each column with map2().
code:
library(tidyverse)
tbl <-read_table("a b c d e f g d
1 NA NA 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 NA t 5 5")
#> Warning: Duplicated column names deduplicated: 'd' => 'd_1' [8]
replace_na(tbl, map(tbl, ~0))
#> # A tibble: 3 × 8
#> a b c d e f g d_1
#> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <chr>
#> 1 1 0 0 2 3 4 7 6
#> 2 2 g 3 0 4 5 4 Y
#> 3 3 r 4 4 0 t 5 5
#alternative
tbl %>%
replace_na(map(., ~0))
#> # A tibble: 3 × 8
#> a b c d e f g d_1
#> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <chr>
#> 1 1 0 0 2 3 4 7 6
#> 2 2 g 3 0 4 5 4 Y
#> 3 3 r 4 4 0 t 5 5
Created on 2021-09-11 by the reprex package (v2.0.0)
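To illustrate the per-column case mentioned above, here is a small sketch that passes a hand-written named list straight to replace_na(); map2() would only be needed to build such a list programmatically. The three-column data frame and the replacement values ("none", 0, -1) are made up for illustration, and each value is chosen to match its column's type.

```r
library(tidyr)

# Reduced, hypothetical version of the question's data
tbl <- data.frame(
  b = c(NA, "g", "r"),
  c = c(NA, 3, 4),
  e = c(3, 4, NA),
  stringsAsFactors = FALSE
)

# One replacement per column, named after the column it applies to
filled <- replace_na(tbl, list(b = "none", c = 0, e = -1))

filled
```

Columns that are not named in the list are left untouched.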