Insert column names in dplyr

Let's assume I have a data frame with lots of columns, var1, ..., var100, and a named vector of the same length whose names match the column names.
I would like to write a function that fills any NA in the data frame with the value from the named vector for that column. This is what I have written so far:

data %>% 
  mutate(var1 = ifelse(is.na(var1), named_vec["var1"], var1),
         var2 = ifelse(is.na(var2), named_vec["var2"], var2),
         ...)

It works, but with hundreds of variables it would be impractical to write out that many conditions. I then tried this:

data %>% 
   mutate_if(~ifelse(is.na(.x), named_vec[colnames(.x)], .x))

Error in selected[[i]] <- eval_tidy(.p(column, ...)) : 
  more elements supplied than there are to replace

However, this does not work. Is there a way in dplyr to extract the column name so that I can index the named vector with it?

Here is a small example of data to try:

data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))

named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

It may be easier to do this with coalesce:

library(dplyr)
library(purrr)
library(stringr)
nm1 <- str_c('var', 1:3)
data[nm1] <- map_dfc(nm1, ~ coalesce(data[[.x]], named_vec[.x]))
data
#  var1 var2 var3
#1    1    2    3
#2    1    2    3
#3    1    2    3
#4    1    2    3

Or, if we replicate 'named_vec' so that it has one value per cell of the data,

# flatten the matrix so both arguments are plain vectors of the same length
data[] <- coalesce(c(as.matrix(data)), named_vec[col(data)])
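The "one replacement per cell" idea can also be sketched in base R without coalesce. This is only an illustration under the assumption that every column name of the data appears in 'named_vec'; the names na_pos and filled are made up for the example.

```r
data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

# Row/column positions of every missing cell (two-column integer matrix)
na_pos <- which(is.na(data), arr.ind = TRUE)

filled <- data
if (nrow(na_pos) > 0) {
  # Look up the replacement by column name, one value per missing cell,
  # and assign through the two-column matrix index
  filled[na_pos] <- named_vec[names(data)[na_pos[, "col"]]]
}
filled
```

Only the missing cells are touched, so the non-NA values never pass through a matrix conversion.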

Another option is to convert to 'long' format, do a left_join, coalesce the 'value' columns, and then reshape back to 'wide' format:

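library(tidyr)
library(tibble)  # for enframe()
data %>%
   mutate(rn = row_number()) %>%                    # keep track of the original rows
   pivot_longer(cols = -rn) %>%                     # columns 'name' and 'value'
   left_join(enframe(named_vec), by = 'name') %>%   # adds value.x (data) and value.y (named_vec)
   transmute(rn, name, value = coalesce(value.x, value.y)) %>% 
   pivot_wider(names_from = name, values_from = value) %>% 
   select(-rn)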
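As for the literal question of getting at the column name inside a dplyr verb, here is a minimal sketch assuming dplyr 1.0.0 or later, where cur_column() returns the name of the column currently being processed inside across(). Note that mutate_if() expects a predicate as its first argument, which is likely why the attempt in the question fails. The name filled is made up for the example.

```r
library(dplyr)

data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)

filled <- data %>%
  # apply the lambda to every column that has an entry in named_vec;
  # cur_column() gives that column's name, used here to index named_vec
  mutate(across(all_of(names(named_vec)),
                ~ coalesce(.x, named_vec[[cur_column()]])))
filled
```

When every replacement is a single constant per column, as here, tidyr's replace_na(data, as.list(named_vec)) may be a shorter route to the same result.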

