How can I replace NAs in multiple columns with the values from other corresponding multiple columns in the same data set?

I am trying to replace NAs across multiple columns with the corresponding values from other columns in the df.


df = data.frame(ID = sample(1000:9999,10),
                Age = sample(18:99,10),
                Gender = sample(c("M","F"),10, replace = TRUE),              
                Test1 = sample(60:100,10),
                Test2 = sample(60:100,10),
                Test3 = sample(60:100,10),
                Test1.x = rep(NA,10),
                Test2.x = rep(NA,10),
                Test3.x = rep(NA,10))

df$Test1[c(2,3,8)] = NA
df$Test2[c(4,10)] = NA
df$Test3[c(1,7)] = NA
df$Test1.x[c(2,3,4,8)] = sample(60:100,4)
df$Test2.x[c(4,9,10)] = sample(60:100,3)
df$Test3.x[c(1,6,7)] = sample(60:100,3)
print(df)
    ID Age Gender Test1 Test2 Test3 Test1.x Test2.x Test3.x
1  7877  40      M    78    70    NA      NA      NA      84
2  6345  54      F    NA    99    61      62      NA      NA
3  9170  41      F    NA    80    96      82      NA      NA
4  2400  83      M   100    NA   100      94      95      NA
5  5920  66      M    77    62    69      NA      NA      NA
6  2569  34      M    99    96    81      NA      NA     100
7  7879  28      M    64    71    NA      NA      NA      90
8  8652  53      F    NA    74    89      95      NA      NA
9  6357  97      F    92    86    83      NA      86      NA
10 1943  45      M    95    NA    98      NA      72      NA

I would like to replace only the NAs in the test scores with the corresponding Test.x score, while using str_replace. My actual data frame contains more than 3 test columns, but every corresponding column has the same name followed by ".x".

Any ideas to make this quick and easy? I'm torn between mutating across said columns and using replace_na.

Within dplyr, we could use coalesce with across.

library(dplyr)

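# For each TestN column (excluding the ".x" columns), fill its NAs from TestN.x
df |>
  mutate(across(starts_with("Test") & !contains(".x"),
                # cur_column() is the current column's name; get() looks up its ".x" partner
                ~ coalesce(., get(paste0(cur_column(), ".x")))))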
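
To see what coalesce does on its own, here is a minimal sketch on a toy data frame. The name toy and its values are made up for illustration and are not taken from the question's df.

```r
library(dplyr)

# Hypothetical toy data: one test column and its ".x" counterpart
toy <- data.frame(Test1   = c(78, NA, NA),
                  Test1.x = c(NA, 62, NA))

# coalesce() goes element by element and keeps the first non-missing value,
# so an NA in Test1 is filled from Test1.x only when Test1.x has a value there
filled <- coalesce(toy$Test1, toy$Test1.x)
print(filled)
```

Existing values in the first vector are never overwritten, and positions that are NA in both vectors stay NA.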

Output:

     ID Age Gender Test1 Test2 Test3 Test1.x Test2.x Test3.x
1  5022  90      M    94    68    79      NA      NA      79
2  1625  41      M    71    66    89      71      NA      NA
3  6438  86      M    86    94    94      86      NA      NA
4  3249  93      F    74    90    76      68      90      NA
5  7338  70      F    64    63    70      NA      NA      NA
6  9416  27      F    78    74    75      NA      NA      64
7  4374  45      F    82   100    60      NA      NA      60
8  6226  21      F    61    82    63      61      NA      NA
9  5265  97      M    83    83    68      NA      89      NA
10 5441  95      M    70    79    99      NA      79      NA

Using dplyover:

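library(dplyr)     # provides mutate(), coalesce(), and %>%
library(dplyover)  # provides across2()
# Pair each TestN column with its TestN.x counterpart and coalesce them;
# .names = "{xcol}" writes each result back under the TestN name
df <- df %>% 
   mutate(across2(matches("Test\\d+$"), ends_with(".x"),
       coalesce, .names = "{xcol}"))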

Output:

df
     ID Age Gender Test1 Test2 Test3 Test1.x Test2.x Test3.x
1  7877  40      M    78    70    84      NA      NA      84
2  6345  54      F    62    99    61      62      NA      NA
3  9170  41      F    82    80    96      82      NA      NA
4  2400  83      M   100    95   100      94      95      NA
5  5920  66      M    77    62    69      NA      NA      NA
6  2569  34      M    99    96    81      NA      NA     100
7  7879  28      M    64    71    90      NA      NA      90
8  8652  53      F    95    74    89      95      NA      NA
9  6357  97      F    92    86    83      NA      86      NA
10 1943  45      M    95    72    98      NA      72      NA


