dplyr: error with rowwise mutate with NA
I am getting strange errors with row-wise mutate in dplyr. Here is an example:
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2,'b'] <- NA
There is no trouble with sum, but the other summary functions are problematic:
mutate(rowwise(df), sum(a, b, na.rm = T)) # works
mutate(rowwise(df), mean(a, b, na.rm = T))
#! Error: missing value where TRUE/FALSE needed
mutate(rowwise(df), median(a, b, na.rm = T))
#! Error: unused argument (-0.820468384118015)
Now we can try putting the NA in the first column instead:
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2,'a'] <- NA
mutate(rowwise(df), sum(a, b, na.rm = T)) # works
mutate(rowwise(df), mean(a, b, na.rm = T))
#! no error, but returns `NaN`
mutate(rowwise(df), median(a, b, na.rm = T))
#! Error: unused argument (-0.820468384118015)
I am not sure if I am doing something wrong here. I think the expected behavior should be the same as:
as.data.frame(apply(df, 1, mean, na.rm = T))
Thanks!
Your error is that you are calling mean and median incorrectly.
While sum can take any number of arguments and simply adds them all, mean and median take only ONE data argument, x, to compute the mean/median of. Anything else you pass positionally is matched to their other parameters, not added to the data.
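A minimal base-R sketch of that argument matching, assuming the default method signature mean(x, trim = 0, na.rm = FALSE, ...). The values of a and b are made up for illustration and are not taken from the question's data. It also suggests why the question saw an error when the NA was in b (the NA ends up in trim) but NaN when the NA was in a (na.rm removes the only data value).

```r
# Illustrative values (not from the question's data).
a <- 1
b <- 0.25

# The second positional argument is matched to trim, not treated as data.
wrong <- mean(a, b)       # same call as mean(x = a, trim = b)
right <- mean(c(a, b))    # mean of the combined vector

# NA in b: trim is NA, so mean() stops with an error.
na_in_b <- try(mean(a, NA_real_, na.rm = TRUE), silent = TRUE)

# NA in a: na.rm = TRUE drops the only value, leaving an empty vector.
na_in_a <- mean(NA_real_, b, na.rm = TRUE)

# median() likewise has a single data argument x, so combine first.
med <- median(c(a, b), na.rm = TRUE)
```

How median(a, b, na.rm = TRUE) itself behaves seems to depend on the R version: the question shows it rejecting b as an unused argument, while other versions may accept and ignore it. Either way, b is never treated as data.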
Just as, if a and b were vectors and you wanted the mean of the combined vector, you'd use mean(c(a, b)) rather than mean(a,b), you do the same here:
mutate(rowwise(df), mean=mean(c(a, b), na.rm = T), med=median(c(a, b), na.rm=T))
(side note: you are only calculating the mean and median of 2 values at a time here, so the mean equals the median anyway...)
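As a rough check that combining the values with c() gives the behavior the question expected, here is a base-R sketch that avoids dplyr entirely. The use of mapply and the names row_mean, row_med, expected and via_rowmeans are assumptions made for this illustration, not part of the original answer.

```r
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2, "b"] <- NA

# Row-wise mean and median of the combined values c(a, b).
row_mean <- mapply(function(a, b) mean(c(a, b), na.rm = TRUE), df$a, df$b)
row_med  <- mapply(function(a, b) median(c(a, b), na.rm = TRUE), df$a, df$b)

# The apply() call from the question, with names dropped for comparison.
expected <- unname(apply(df, 1, mean, na.rm = TRUE))

# rowMeans() is a vectorised base-R alternative for the mean.
via_rowmeans <- unname(rowMeans(df, na.rm = TRUE))
```

Here row_mean should agree with both expected and via_rowmeans, and with row_med as well, since each row contributes at most two values.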