r: replace_na() not replacing NAs

I have a variable with NAs called df$salesContribution.

Using dplyr I've created the statement below, but can't figure out why my df$salesContribution is still returning NAs:

df<- df %>% 
  mutate(salesContribution = as.numeric(salesContribution)) %>% 
  replace_na(0)

Is the 0 not registering?

replace_na() will not work if the variable is a factor and the replacement is not already a level of that factor. If this is the issue, you can add a level for 0 to your factor variable before running replace_na(), or you can convert the variable to numeric or character first.
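A minimal base R sketch of both options, using a made-up factor `f` (not the asker's data). It also shows a related pitfall: calling as.numeric() directly on a factor returns the internal level codes rather than the labels, so going through as.character() first is usually what you want.

```r
# Hypothetical factor standing in for salesContribution
f <- factor(c("10", NA, "30"))

# Pitfall: as.numeric() on a factor gives the level codes, not the labels
codes <- as.numeric(f)

# Option 1: add "0" as a level, then fill the NAs with it
f_filled <- f
levels(f_filled) <- c(levels(f_filled), "0")
f_filled[is.na(f_filled)] <- "0"

# Option 2: convert via character to numeric, then fill the NAs
num <- as.numeric(as.character(f))
num[is.na(num)] <- 0
```

Here `codes` holds 1, NA, 2 (the codes), while `num` holds 10, 0, 30 (the actual values).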

It looks like you want:

df$salesContribution <- df$salesContribution %>% as.numeric() %>% replace_na(0)
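For context, a sketch of why the original pipe does not do what was intended: when replace_na() receives a whole data frame, it expects a named list of per-column replacements rather than a bare 0. The small `df` below is invented for illustration. Either of these forms should work, assuming dplyr and tidyr are installed:

```r
library(dplyr)
library(tidyr)

# Hypothetical data standing in for the asker's df
df <- tibble(salesContribution = c("1.5", NA, "3"))

# Data-frame form: pass a named list, one entry per column to fill
by_column <- df %>%
  mutate(salesContribution = as.numeric(salesContribution)) %>%
  replace_na(list(salesContribution = 0))

# Vector form: call replace_na() on the column inside mutate()
by_vector <- df %>%
  mutate(salesContribution = replace_na(as.numeric(salesContribution), 0))
```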

I found a blog response elsewhere in which Hadley himself said they wanted to move away from replace_na() toward a more SQL-like function, coalesce(). The solution involves both across() and coalesce(). In my case, I used a different variable from the data frame to supply the missing values ad hoc. You can also specify a fixed value.

Here's an example of what I just did in my work:

  Varname1 Varname2
1 Yes      Yes
2 NA       No
3 NA       Yes
4 No       No

df %>%
  mutate(across(Varname1, coalesce, Varname2))

  Varname1 Varname2
1 Yes      Yes
2 No       No
3 Yes      Yes
4 No       No
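For a single column, the same result can be sketched without across() by calling coalesce() directly inside mutate(). This may be preferable on recent dplyr versions, which, as far as I know, deprecate passing extra arguments through across()'s `...`. The data frame below re-creates the table above, and the "Unknown" fill value is just an illustrative choice:

```r
library(dplyr)

# Re-creation of the example table
df <- tibble(
  Varname1 = c("Yes", NA, NA, "No"),
  Varname2 = c("Yes", "No", "Yes", "No")
)

# Fill NAs in Varname1 from Varname2
filled <- df %>%
  mutate(Varname1 = coalesce(Varname1, Varname2))

# Fill NAs in Varname1 with a fixed value instead
fixed <- df %>%
  mutate(Varname1 = coalesce(Varname1, "Unknown"))
```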

I don't have enough reputation points to comment on someone's answer, but above it was said:

> replace_na() will not work if the variable is a factor, and the replacement is not already a level for your factor. If this is the issue, you can add another level to your factor variable for 0 before running replace_na(), or you can convert the variable to numeric or character first.

This is true. However, you can work around this with fct_explicit_na() from the forcats package.
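A small sketch of that workaround, assuming forcats is installed and using a made-up factor `f`. Note that, to my knowledge, newer forcats releases deprecate fct_explicit_na() in favor of fct_na_value_to_level(), so this call may emit a deprecation warning depending on your version.

```r
library(forcats)

# Hypothetical factor with a missing value
f <- factor(c("a", NA, "b"))

# Turn NA into an explicit level named "0"
f2 <- fct_explicit_na(f, na_level = "0")
```

After this, `f2` has no NAs and "0" is one of its levels, so no separate step to add the level is needed.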

You can do this using base replace():

df <- df %>%
  mutate(salesContribution = as.numeric(salesContribution),  # convert first so is.na() also sees NAs from coercion
         salesContribution = replace(salesContribution, is.na(salesContribution), 0))

If you are using the data.table package, you could try something like: x[is.na(field_name), field_name := replacement_value]. Note that the filter and the assignment go in the same pair of brackets; chaining them as x[is.na(field_name)][, field_name := replacement_value] assigns into a filtered copy and leaves x unchanged.
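A quick sketch of the data.table approach, assuming data.table is installed. The table `x`, the column `field_name`, and the replacement value 0 are placeholders. It contrasts a chained form, where the assignment lands on a filtered copy, with the single-bracket form that updates `x` by reference:

```r
library(data.table)

# Hypothetical table with one missing value
x <- data.table(field_name = c(1, NA, 3))

# Chained form: the subset is a new table, so := modifies that copy, not x
invisible(x[is.na(field_name)][, field_name := 0])
still_has_na <- anyNA(x$field_name)

# Single-bracket form: filter in i, assign in j, x is updated by reference
x[is.na(field_name), field_name := 0]
```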

There should be similar syntax for data.frame as well.
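For a plain data.frame, one way to sketch it in base R is logical indexing with is.na(). The `df` here is a made-up example:

```r
# Hypothetical data.frame with missing values
df <- data.frame(salesContribution = c(2.5, NA, 4, NA))

# Assign 0 to the positions where the column is NA
df$salesContribution[is.na(df$salesContribution)] <- 0
```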
