How to select n random values from each row of a dataframe in R?
I have a dataframe:
df= data.frame(a=c(56,23,15,10),
b=c(43,NA,90.7,30.5),
c=c(12,7,10,2),
d=c(1,2,3,4),
e=c(NA,45,2,NA))
I want to select two random non-NA values from each row and convert the rest to NA.
Required output (will differ because of randomness):
df= data.frame(
a=c(56,NA,15,NA),
b=c(43,NA,NA,NA),
c=c(NA,7,NA,2),
d=c(NA,NA,3,4),
e=c(NA,45,NA,NA))
Code used
I know how to select random non-NA values from a specific row:
set.seed(2)
sample(which(!is.na(df[1,])),2)
But I have no idea how to apply it to the whole dataframe and get the required output.
You may write a function to keep n random values in a row.
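keep_n_value <- function(x, n) {
  # positions of the non-NA values in this row
  x1 <- which(!is.na(x))
  # set everything except n randomly sampled non-NA positions to NA
  x[-sample(x1, n)] <- NA
  x
}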
Apply the function by row using base R:
set.seed(123)
df[] <- t(apply(df, 1, keep_n_value, 2))
df
# a b c d e
#1 NA NA 12 1 NA
#2 NA NA 7 2 NA
#3 NA 90.7 10 NA NA
#4 NA 30.5 NA 4 NA
Or, if you prefer tidyverse:
purrr::pmap_df(df, ~keep_n_value(c(...), 2))
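Both calls above assume every row has at least n non-NA values. When a row has only one, sample(x1, n) receives a single number and samples from 1:x1 instead, and with fewer than n values it raises an error. A minimal sketch of a more defensive variant follows; keep_n_value_safe and df2 are illustrative names, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): keeps up to n
# non-NA values per row and sets everything else to NA.
keep_n_value_safe <- function(x, n) {
  idx <- which(!is.na(x))
  # Never ask for more values than the row actually has.
  k <- min(n, length(idx))
  if (k == 0) return(x)
  # Sample positions within idx rather than idx itself, so a single
  # index such as 3 is not expanded to 1:3 by sample().
  keep <- idx[sample.int(length(idx), k)]
  x[setdiff(seq_along(x), keep)] <- NA
  x
}

# Illustrative data: rows 2 and 3 have only one non-NA value each.
df2 <- data.frame(a = c(56, NA, 15),
                  b = c(43, NA, NA),
                  c = c(12, 7, NA))
set.seed(1)
df2[] <- t(apply(df2, 1, keep_n_value_safe, 2))
rowSums(!is.na(df2))  # non-NA counts per row: 2, 1, 1
```

Rows that start with fewer than n values simply keep what they have, which may or may not be the behaviour you want; raising an error instead would be an equally reasonable choice.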
Base R:
You could apply column-wise (sapply) and randomly replace two non-NA values in each column with NA, like:
as.data.frame(sapply(df, function(x) replace(x, sample(which(!is.na(x)), 2), NA)))
Example output:
a b c d e
1 56 NA 12 NA NA
2 23 NA NA 2 NA
3 NA NA 10 3 NA
4 NA 30.5 NA NA NA
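The sapply() call above works column by column, so it blanks two values in each column rather than keeping two in each row. A minimal sketch of how the same replace() idea might be applied by row, assuming every row has at least two non-NA values (keep_two and res are illustrative names, not from the original answer):

```r
df <- data.frame(a = c(56, 23, 15, 10),
                 b = c(43, NA, 90.7, 30.5),
                 c = c(12, 7, 10, 2),
                 d = c(1, 2, 3, 4),
                 e = c(NA, 45, 2, NA))

# Illustrative helper: keep two randomly chosen non-NA entries of a row
# and replace every other position with NA.
keep_two <- function(x) {
  keep <- sample(which(!is.na(x)), 2)
  replace(x, -keep, NA)
}

set.seed(42)
# apply() returns one column per input row, so transpose back.
res <- as.data.frame(t(apply(df, 1, keep_two)))
res
```

Which values survive depends on the seed, but each row of res should end up with exactly two non-NA entries.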
One option using dplyr and purrr could be:
library(dplyr)
library(purrr)
df %>%
  mutate(pmap_dfr(across(everything()), ~ `[<-`(c(...), !seq_along(c(...)) %in% sample(which(!is.na(c(...))), 2), NA)))
a b c d e
1 56 43.0 NA NA NA
2 23 NA 7 NA NA
3 15 NA NA NA 2
4 NA 30.5 2 NA NA
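Because the result is random, it cannot be compared against a fixed expected data frame. One way to check the output of any of the approaches above is sketched here: count the non-NA values per row and confirm that every kept value matches the original. check_kept is a hypothetical helper, and expected is simply the "required output" from the question.

```r
df <- data.frame(a = c(56, 23, 15, 10),
                 b = c(43, NA, 90.7, 30.5),
                 c = c(12, 7, 10, 2),
                 d = c(1, 2, 3, 4),
                 e = c(NA, 45, 2, NA))

# The required output given in the question.
expected <- data.frame(a = c(56, NA, 15, NA),
                       b = c(43, NA, NA, NA),
                       c = c(NA, 7, NA, 2),
                       d = c(NA, NA, 3, 4),
                       e = c(NA, 45, NA, NA))

# Hypothetical helper: TRUE when `result` keeps exactly n values per row
# and every kept value equals the value at the same position in `original`.
check_kept <- function(original, result, n) {
  orig <- as.matrix(original)
  res <- as.matrix(result)
  kept <- !is.na(res)
  right_count <- all(rowSums(kept) == n)
  # isTRUE() guards against NA when a kept position is NA in the original.
  same_values <- isTRUE(all(res[kept] == orig[kept]))
  right_count && same_values
}

check_kept(df, expected, 2)  # TRUE
check_kept(df, df, 2)        # FALSE: nothing was set to NA
```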