Sum of two Columns of Data Frame with NA Values

I have a data frame with some NA values. I need the sum of two of the columns. If a value is NA, I need to treat it as zero.

a  b c d
1  2 3 4
5 NA 7 8

Column e should be the sum of b and c:

e
5
7

I have tried a lot of things and done two dozen searches with no luck. It seems like a simple problem. Any help would be appreciated!

dat$e <- rowSums(dat[,c("b", "c")], na.rm=TRUE)
dat
#   a  b c d e
# 1 1  2 3 4 5
# 2 5 NA 7 8 7
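
To see why `na.rm = TRUE` matters, here is a minimal sketch that compares plain `+` with `rowSums()`. The data frame `dat` is rebuilt from the sample in the question, and the variable `plain` is only an illustrative name.

```r
# Rebuild the sample data from the question
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# Plain addition propagates NA, so row 2 comes out as NA
plain <- dat$b + dat$c

# rowSums() with na.rm = TRUE drops NA values before summing,
# which has the same effect as treating them as zero
dat$e <- rowSums(dat[, c("b", "c")], na.rm = TRUE)

print(plain)
print(dat$e)
```

Note that a row in which both b and c are NA would get a sum of 0 with this approach, which matches the "treat NA as zero" requirement.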

A dplyr solution:

library(dplyr)
dat %>% 
    rowwise() %>% 
    mutate(e = sum(b, c, na.rm = TRUE))
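
If `rowwise()` turns out to be slow on a large data frame, one possible alternative is to stay vectorized by combining `across()` with `rowSums()`. This is only a sketch, assuming dplyr 1.0.0 or later (where `across()` is available); `res` is an illustrative name and `dat` is rebuilt from the question.

```r
library(dplyr)

# Rebuild the sample data from the question
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# across() returns the selected columns as a data frame,
# so rowSums() can sum them in a single vectorized call
res <- dat %>%
  mutate(e = rowSums(across(c(b, c)), na.rm = TRUE))

print(res)
```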

Here is another solution, with nested ifelse():

 # Handle each NA case explicitly so that an NA in either column counts as zero
 dat$e <- ifelse(is.na(dat$b) & is.na(dat$c), 0,
          ifelse(is.na(dat$b), dat$c,
          ifelse(is.na(dat$c), dat$b, dat$b + dat$c)))
 dat
 #  a  b c d e
 #1 1  2 3 4 5
 #2 5 NA 7 8 7

Edit: here is another solution that uses with(), as suggested by @kasterma in the comments. It is much more readable and straightforward:

 dat$e <- with(dat, ifelse(is.na(b) & is.na(c), 0,
                    ifelse(is.na(b), c,
                    ifelse(is.na(c), b, b + c))))
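
The nested conditions can also be avoided by replacing NA with zero in each column before adding. The sketch below uses base R only; `na_to_zero` is a hypothetical helper defined here for illustration, not an existing base R function.

```r
# Rebuild the sample data from the question
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# Hypothetical helper (defined here, not part of base R):
# returns x with every NA swapped for 0
na_to_zero <- function(x) replace(x, is.na(x), 0)

# Each column is cleaned independently, then added as usual
dat$e <- na_to_zero(dat$b) + na_to_zero(dat$c)

print(dat)
```

This keeps the original b and c columns untouched, since the replacement happens only on the temporary copies used in the sum.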

If you want to keep NA when both columns are NA, you can use:

Sample data:

library(data.table)
dt <- data.table(x = sample(c(NA, 1, 2, 3), 100, replace = TRUE), y = sample(c(NA, 1, 2, 3), 100, replace = TRUE))

Solution:

dt[, z := ifelse(is.na(x) & is.na(y), NA_real_, rowSums(.SD, na.rm = TRUE)), .SDcols = c("x", "y")]

(the data.table way)
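
The same "keep NA only when every value is NA" rule can be sketched in base R without data.table. The column names b and c follow the question; the extra third row and the names `cols` and `all_na` are assumptions added here for illustration.

```r
# Sample data with an extra row in which both columns are NA
dat <- data.frame(b = c(2, NA, NA), c = c(3, 7, NA))

cols <- c("b", "c")

# TRUE for rows in which every selected column is NA
all_na <- rowSums(!is.na(dat[, cols])) == 0

# Keep NA for all-NA rows; otherwise sum while ignoring NA
dat$e <- ifelse(all_na, NA_real_, rowSums(dat[, cols], na.rm = TRUE))

print(dat)
```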

I hope this helps.

In some cases you have a few columns that are not numeric. This approach works in that situation as well, because c_across(a:d) selects only columns a through d. Note that c_across() requires dplyr version 1.0.0 or later.

library(dplyr)

df <- data.frame(
  TEXT = c("text1", "text2"), a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# Sum columns a through d row by row, ignoring NA; TEXT is left out of the range
df2 <- df %>% 
  rowwise() %>% 
  mutate(e = sum(c_across(a:d), na.rm = TRUE))
df2
# A tibble: 2 x 6
# Rowwise: 
#   TEXT      a     b     c     d     e
#   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 text1     1     2     3     4    10
# 2 text2     5    NA     7     8    20
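
When the numeric columns are not contiguous, or their names are not known in advance, one option is to pick them by type instead of by range. This is a base R sketch rather than the approach above; `num_cols` is an illustrative name.

```r
df <- data.frame(
  TEXT = c("text1", "text2"),
  a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8)
)

# Logical vector marking which columns are numeric
num_cols <- vapply(df, is.numeric, logical(1))

# Sum only those columns; the character column TEXT is skipped
df$e <- rowSums(df[num_cols], na.rm = TRUE)

print(df)
```

Indexing with `df[num_cols]` rather than `df[, num_cols]` keeps the result a data frame even if only one column is numeric, so `rowSums()` still works.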
