Randomly replace 1000 NA values in a dataframe column with 0s, without overwriting 1s

I am trying to randomly replace 1000 NA values in a dataframe column with 0s. The column is composed only of NAs and 1s and it looks like this:

  Column
1 NA
2 1    
3 NA    
4 NA    
5 NA    
6 1    
7 NA
...

I want it to look something like this:

  Column
1 0
2 1    
3 NA    
4 0    
5 NA    
6 1    
7 NA
...

The column I am working with has more than 1000 rows, so it will still contain both 0s and NAs afterwards.

I tried something like this:

is.na(df_col[sample(seq(nrow(is.na(df_col))), 1000), "Column"]) <- 0

This, however, does not work. No NA values are replaced. If I take out the is.na() calls it works, but some of the 1s might get replaced and I do not want that. Do you know how to solve this?

I am assuming that you want to replace exactly 1,000 NA values, rather than choosing 1,000 indices and replacing only those that happen to be NA. The following code finds the indices of the NA values, then sets the values at a random sample of 1,000 of those indices to 0.

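library(tibble)  # provides tibble()
set.seed(123)    # make the random sample reproducible
df <- tibble(x = rep(c(1, NA), times = 2000))  # example data: 2,000 ones and 2,000 NAs
indices <- which(is.na(df$x))                  # row positions of the NA values
df[sample(indices, 1000, replace = FALSE), "x"] <- 0  # set 1,000 of those rows to 0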
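The same idea can be sketched in base R and checked afterwards. This is only an illustration under stated assumptions: `replace_random_na` is a hypothetical helper name, not part of the original answer, and the toy `df_col` data stands in for the asker's real column. As far as I can tell, the attempt in the question changes nothing because `is.na(x) <- 0` is the replacement form of `is.na()`, which sets the elements indexed by the right-hand side to NA, and index 0 selects no elements.

```r
# Hypothetical helper (not from the original answer): replace n randomly
# chosen NA entries of a vector with `value`, leaving non-NA entries alone.
replace_random_na <- function(x, n, value = 0) {
  na_idx <- which(is.na(x))       # positions of the NA entries
  stopifnot(n <= length(na_idx))  # fail early if there are too few NAs
  # Sample positions within na_idx instead of calling sample(na_idx, n),
  # because sample() treats a single number k as the range 1:k.
  chosen <- na_idx[sample.int(length(na_idx), n)]
  x[chosen] <- value
  x
}

set.seed(123)
# Toy data standing in for the real column: 2,000 ones and 2,000 NAs.
df_col <- data.frame(Column = rep(c(1, NA), times = 2000))
df_col$Column <- replace_random_na(df_col$Column, 1000)

# Count the 0s, 1s and NAs after the replacement.
table(df_col$Column, useNA = "ifany")
```

With this toy data the table should show 1,000 zeros, 2,000 ones and 1,000 remaining NAs, which confirms that no 1 was overwritten.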
