How to check if NA within replace() function in R?
In my dataset, the duration of an activity is given either in hours (column duration_hours) or in minutes (column duration_minutes). If it is given in hours, the duration_minutes column is empty (NA) and vice versa.
I now want to convert the values given in minutes into hours by dividing them by 60.
To do so I tried this command:
df <- df %>% mutate(duration_recoded = replace(duration_minutes, !is.na(duration_minutes), duration_minutes / 60))
However, the command produces incorrect results and this warning message is shown:
Warning message:
In x[list] <- values :
number of items to replace is not a multiple of replacement length
Can anybody tell me where my mistake is?
Here's some sample data:
df <- structure(list(duration_hours = c(1, NA, 2, NA, 1), duration_minutes = c(NA, 25, NA, 30, NA)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
We can make use of the coalesce() function from the dplyr package here:
library(dplyr)
df <- df %>% mutate(duration_recoded = coalesce(duration_hours, duration_minutes / 60))
This should work because if duration_hours is not NA, coalesce() simply takes it and assigns it to duration_recoded. If duration_hours is NA, coalesce() skips it and takes duration_minutes divided by 60 instead.
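As a minimal sketch of that element-by-element behaviour, the same call can be run on plain vectors holding the question's sample values (the standalone vectors here are only for illustration; in the question they are columns of df):

```r
library(dplyr)

# Sample values from the question: each row has hours OR minutes, never both
duration_hours   <- c(1, NA, 2, NA, 1)
duration_minutes <- c(NA, 25, NA, 30, NA)

# coalesce() works position by position: for each row it returns the value
# from the first argument that is not NA at that position
duration_recoded <- coalesce(duration_hours, duration_minutes / 60)

print(duration_recoded)
```

Rows 1, 3 and 5 keep their hour values, while rows 2 and 4 become 25/60 and 30/60 hours.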
The problem in your code is that duration_minutes is a vector, so dividing it by 60 is a vector operation that returns a result as long as the whole column. Let's use an example df:
# A tibble: 7 x 1
duration_minutes
<dbl>
1 10
2 20
3 30
4 NA
5 50
6 NA
7 60
In this case, df$duration_minutes / 60 results in:
0.1666667 0.3333333 0.5000000 NA 0.8333333 NA 1.0000000
That means you are trying to fill only the selected positions (the 5 non-NA values) with a replacement vector that has one value per row (7 values). The two lengths do not match, which is why your warning message says "number of items to replace is not a multiple of replacement length". R then uses only the first values of the replacement vector, so the results end up in the wrong rows.
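The mismatch can be reproduced in base R with the example values above. This is only a sketch; the names keep, values, msg and wrong are made up for illustration:

```r
# Example values from the tibble above
duration_minutes <- c(10, 20, 30, NA, 50, NA, 60)

keep   <- !is.na(duration_minutes)  # positions that replace() will overwrite
values <- duration_minutes / 60     # full-length replacement vector

sum(keep)       # 5 positions to replace
length(values)  # 7 replacement values supplied

# Capture the warning text instead of letting it print
msg <- tryCatch(replace(duration_minutes, keep, values),
                warning = function(w) conditionMessage(w))

# Run the call again with the warning silenced to inspect the result:
# the 5 selected positions receive the FIRST 5 elements of `values`,
# so the NA in values[4] lands in row 5 and row 7 receives 50 / 60
wrong <- suppressWarnings(replace(duration_minutes, keep, values))
print(wrong)
```

Row 5 should hold 50/60 and row 7 should hold 1, but because the replacement values are consumed in order, they are shifted relative to the rows they belong to.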
You either have to use some function that aggregates multiple values to a single value (such as sum(), mean(), first(), etc.), select a single value to act as the replacement, or supply exactly one replacement value per position being replaced. The coalesce() function avoids the issue because, at each position, it simply takes the first non-missing element among its arguments.
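If you would rather keep replace(), one possible fix is to subset the replacement with the same condition so both sides have the same length. This is a sketch on the question's sample values; has_minutes is a made-up helper name, and it starts from duration_hours rather than duration_minutes so that the hour values are kept as well:

```r
# Sample values from the question
duration_hours   <- c(1, NA, 2, NA, 1)
duration_minutes <- c(NA, 25, NA, 30, NA)

# Rows where the duration was recorded in minutes
has_minutes <- !is.na(duration_minutes)

# Overwrite only those rows, supplying exactly one value per overwritten row
duration_recoded <- replace(duration_hours,
                            has_minutes,
                            duration_minutes[has_minutes] / 60)

print(duration_recoded)
```

Because duration_minutes[has_minutes] has as many elements as there are TRUE values in has_minutes, no warning is raised and each value lands in its own row.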