Interpolating column names in dplyr

Question

Let's assume I have a data frame with many columns, var1, ..., var100, and a matching named vector of the same length.
I would like to create a function that, wherever the data frame contains NAs, fills them in with the corresponding value from the named vector. This is what I have written so far:

data %>% 
  mutate(var1 = ifelse(is.na(var1), named_vec["var1"], var1),
         var2 = ifelse(is.na(var2), named_vec["var2"], var2),
         ...)

It works, but with hundreds of variables it would be impractical to write out so many conditions. I then tried this:

data %>% 
   mutate_if(~ifelse(is.na(.x), named_vec[colnames(.x)], .x))

Error in selected[[i]] <- eval_tidy(.p(column, ...)) : 
  more elements supplied than there are to replace

However, this does not work. Is there a way in dplyr to extract the column name so that I can index into the named vector?

Here is a small example of data to try:

data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))

named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

Answer 1

It may be easier to do this with coalesce:

library(dplyr)
library(purrr)
library(stringr)
nm1 <- str_c('var', 1:3)
data[nm1] <- map_dfc(nm1, ~ coalesce(data[[.x]], named_vec[.x]))
data
#  var1 var2 var3
#1    1    2    3
#2    1    2    3
#3    1    2    3
#4    1    2    3

Or, if we replicate 'named_vec' so that there is one fill value per cell:

data[] <-  coalesce(as.matrix(data), named_vec[col(data)])
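The same replication idea can also be sketched in base R, without relying on how coalesce treats matrices. This is only an illustration under the assumption that every column is numeric and that every column name appears in named_vec; the names m, na_cells, fill and filled_base are made up for this example.

```r
# Example data from the question
data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

m <- as.matrix(data)            # numeric matrix, same shape as data
na_cells <- is.na(m)            # logical matrix marking the NA cells
fill <- named_vec[colnames(m)]  # fill values ordered like the columns

# col(m) holds the column index of every cell, so this looks up
# the fill value that belongs to each NA cell's column
m[na_cells] <- fill[col(m)[na_cells]]

filled_base <- data
filled_base[] <- m              # write back, keeping the data frame structure
filled_base
```

Looking the fill values up by column name first means the order of named_vec does not have to match the column order of the data.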

Another option is to convert to 'long' format, do a left_join, coalesce the 'value' columns, and reshape back to 'wide' format:

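library(tidyr)
library(tibble)  # provides enframe()
data %>%
   # keep a row id so the rows can be rebuilt after reshaping
   mutate(rn = row_number()) %>%
   pivot_longer(cols = -rn) %>% 
   # enframe() turns named_vec into a two-column (name, value) tibble
   left_join(enframe(named_vec), by = 'name') %>%
   # value.x comes from the data, value.y from named_vec
   transmute(rn, name, value = coalesce(value.x, value.y)) %>% 
   pivot_wider(names_from = name, values_from = value) %>% 
   select(-rn)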
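On the original question of getting at the column name from inside dplyr: one possible sketch, assuming dplyr 1.0.0 or later (where across() and cur_column() are available), is shown below. It is not part of the answer above, and the result name filled is only illustrative.

```r
library(dplyr)

# Example data from the question
data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

filled <- data %>%
  mutate(across(
    all_of(names(named_vec)),
    # cur_column() returns the name of the column being processed,
    # which is used here to pick the matching fill value
    ~ coalesce(.x, named_vec[[cur_column()]])
  ))
filled
```

Selecting with all_of(names(named_vec)) restricts the replacement to columns that actually have an entry in the named vector and raises an error if one of them is missing from the data.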

Answer 1 (2020-05-07 19:16:14)