
Replacing NAs with random decimals in a particular column in R

I am trying to replace NAs with random decimals in a particular column in R. However, R generates random decimals that all share the same fractional part; only the part before the decimal point changes. These are the methods I tried:

df_LT$ATC[is.na(df_LT$ATC)]  <- sample(seq(10.2354897,23.78954214), size=sum(is.na(df_LT$ATC)), replace=T)

With dplyr:

df_LT <-  df_LT %>%mutate_at(vars(df_LT$ATC), ~replace_na(., sample(10.2354897:23.78954214, size=sum(is.na(ATC)), replace=T)))

The data looks like this:

    A        ATC
    1        11.2356879
    2        42.58974164
    3            NA
    4        34.25382343
    5             NA 

Wherever there is an NA in the ATC column, I want to insert a decimal like the others, but in the range 10:23. I hope this explanation helps. I may be missing something very obvious. Thanks in advance for the help.

You are using seq or the colon operator : to create the values you sample from, which means you are sampling from the following sequence:

seq(10.2354897, 23.78954214)
# [1] 10.23549 11.23549 12.23549 13.23549 14.23549 ....

So the starting value is increased by 1 at each step, which leaves the digits after the decimal point fixed.
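To make this concrete, here is a small sketch that inspects the candidate values directly. The variable names (candidates, fractional_parts, same_as_colon) are only illustrative, and the rounding to 6 digits is an assumption made to hide floating-point noise.

```r
# seq() with only a start and an end uses the default step by = 1,
# so every candidate keeps the fractional part of the starting value.
candidates <- seq(10.2354897, 23.78954214)

# Fractional part of each candidate, rounded to hide floating-point noise
fractional_parts <- unique(round(candidates %% 1, 6))

# The colon operator builds the same sequence
same_as_colon <- isTRUE(all.equal(candidates, 10.2354897:23.78954214))

print(candidates)
print(fractional_parts)  # a single distinct value
print(same_as_colon)
```

Because sample() can only pick from these few values, every replacement ends up with the same digits after the decimal point.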

If you want to sample random numbers within the range of these two limits, you can use runif, which draws from a continuous uniform distribution:

runif(n = 1, min = 10.2354897, max = 23.78954214)

So for your example this translates to:

df_LT$ATC[is.na(df_LT$ATC)] <- 
  runif(n = sum(is.na(df_LT$ATC)), 10.2354897, 23.78954214)
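As a self-contained check, the replacement can be run on the sample data from the question. This is a minimal sketch: the data frame is rebuilt by hand from the five rows shown, and set.seed(42) and the helper variable missing are assumptions added only to make the run reproducible and readable.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

# Sample data rebuilt from the question
df_LT <- data.frame(
  A   = 1:5,
  ATC = c(11.2356879, 42.58974164, NA, 34.25382343, NA)
)

# Logical index of the rows to fill (illustrative helper variable)
missing <- is.na(df_LT$ATC)

# Draw exactly one uniform random number per missing value
df_LT$ATC[missing] <- runif(n = sum(missing), min = 10.2354897, max = 23.78954214)

print(df_LT)
```

Rows 3 and 5 now hold independent draws between the two limits, while the existing values are left unchanged.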

If you want to add a condition, you can do:

df_LT$ATC <- 
  ifelse(is.na(df_LT$ATC) & df_LT$A == 3, 
         runif(n = nrow(df_LT), 10.2354897, 23.78954214), 
         df_LT$ATC)

This checks whether ATC is missing and whether A is equal to 3. If both conditions are fulfilled, the missing value is replaced with a random number; otherwise the original value (missing or not) is returned. runif is called with n = nrow(df_LT) because ifelse picks values element by element, so the replacement vector has to be as long as the column.
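The same condition can also be written with logical indexing, so that only as many random numbers are drawn as are needed. This is an alternative sketch, not the approach shown above: the data frame is rebuilt from the question's sample rows, and replace_here is a hypothetical helper name.

```r
# Sample data rebuilt from the question
df_LT <- data.frame(
  A   = 1:5,
  ATC = c(11.2356879, 42.58974164, NA, 34.25382343, NA)
)

# TRUE only where ATC is missing AND A equals 3 (illustrative helper variable)
replace_here <- is.na(df_LT$ATC) & df_LT$A == 3

# Draw one random number per matching row and assign it in place
df_LT$ATC[replace_here] <- runif(sum(replace_here), 10.2354897, 23.78954214)

print(df_LT)
```

With this data only row 3 is filled; row 5 stays NA because its value of A is not 3.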
