How to apply a function to pairs of columns with purrr?

I have the following data frame resulting from a join with dplyr:

data_frame(id=1:4, a.x = c(1, NA, 3, 4), a.y = c(1, 2, 3, 4), b.x = c(NA, NA, 3, NA), b.y = c(2, 2, NA, 4)) 
# A tibble: 4 x 5
     id   a.x   a.y   b.x   b.y
  <int> <dbl> <dbl> <dbl> <dbl>
1     1     1     1    NA     2
2     2    NA     2    NA     2
3     3     3     3     3    NA
4     4     4     4    NA     4

I would like to replace all the NAs in the columns ending with .x with the value from the corresponding column ending with .y. Eventually, I would like to achieve this:

# A tibble: 4 x 5
     id   a.x   a.y   b.x   b.y
  <int> <dbl> <dbl> <dbl> <dbl>
1     1     1     1     2     2
2     2     2     2     2     2
3     3     3     3     3    NA
4     4     4     4     4     4

I tried something like this with purrr:

data_frame(id=1:4, a.x = c(1, NA, 3, 4), a.y = c(1, 2, 3, 4), b.x = c(NA, NA, 3, NA), b.y = c(2, 2, NA, 4)) %>%
  map2_dfr(.x = ends_with('.y'), .y = ends_with('.x'), ~ case_when(is.na(.x) ~ .y,
                                                                   TRUE ~ .x))

Which is wrong. The documentation is a bit confusing to me. I think the issue here is that .x expects a vector, but how can I pass a list of columns then?

One solution: we can gather the columns, separate the column names at the ".", arrange by the resulting columns, fill the values upward, unite the name columns again, and finally spread the data frame back to its original structure.

library(tidyverse)

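dat2 <- dat %>%
  gather(Column, Value, -id) %>%                    # long format: one row per id/column
  separate(Column, into = c("Col1", "Col2")) %>%    # "a.x" -> Col1 = "a", Col2 = "x"
  arrange(id, Col1, Col2) %>%                       # put each x row directly above its y row
  group_by(id, Col1) %>%
  fill(Value, .direction = "up") %>%                # a missing x takes the y value below it
  unite(Column, Col1, Col2, sep = ".") %>%          # rebuild the original column names
  spread(Column, Value) %>%                         # back to wide format
  ungroup()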
dat2
## A tibble: 4 x 5
#      id   a.x   a.y   b.x   b.y
# * <int> <dbl> <dbl> <dbl> <dbl>
# 1     1  1.00  1.00  2.00  2.00
# 2     2  2.00  2.00  2.00  2.00
# 3     3  3.00  3.00  3.00 NA   
# 4     4  4.00  4.00  4.00  4.00
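
For readers on tidyr 1.0.0 or later, where gather() and spread() are superseded, the same reshape idea can be sketched with pivot_longer() and pivot_wider(). This is only an illustration, not the answer's original code: the intermediate names var, x and y are made up here, and coalesce() stands in for the fill() step.

```r
library(dplyr)
library(tidyr)

dat <- tibble(id = 1:4,
              a.x = c(1, NA, 3, 4), a.y = c(1, 2, 3, 4),
              b.x = c(NA, NA, 3, NA), b.y = c(2, 2, NA, 4))

dat2 <- dat %>%
  # "a.x" is split into var = "a" and a value column named "x",
  # so each row holds the x and y values of one pair side by side
  pivot_longer(-id, names_to = c("var", ".value"), names_sep = "\\.") %>%
  # take x where it is present, otherwise fall back to y
  mutate(x = coalesce(x, y)) %>%
  # rebuild the wide layout with the original "a.x"-style names
  pivot_wider(names_from = var, values_from = c(x, y),
              names_glue = "{var}.{.value}") %>%
  # restore the original column order
  select(all_of(names(dat)))

dat2
```

Because the x and y values end up in the same row, no sorting or grouping is needed before the replacement.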

Or, if each .x column sits immediately before its .y column in the data frame, we can use the transpose function from the data.table package. Be careful that the column types may change after the process, and that an NA in a .y column would also be filled from the next column to its right.

dat2 <- dat %>%
  data.table::transpose() %>%
  fill(everything(), .direction = 'up') %>%
  data.table::transpose() %>%
  setNames(names(dat))
dat2
#   id a.x a.y b.x b.y
# 1  1   1   1   2   2
# 2  2   2   2   2   2
# 3  3   3   3   3  NA
# 4  4   4   4   4   4 
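
The type change mentioned above can be checked with a small round trip. This is a minimal sketch assuming the data.table package is installed; round_trip is just an illustrative name, and a plain data.frame with three of the columns is used to keep it short.

```r
dat <- data.frame(id = 1:4,
                  a.x = c(1, NA, 3, 4),
                  a.y = c(1, 2, 3, 4))

# Transposing mixes the integer id with the double columns,
# so everything is coerced to a common type on the way
round_trip <- setNames(
  data.table::transpose(data.table::transpose(dat)),
  names(dat)
)

class(dat$id)         # integer before the round trip
class(round_trip$id)  # no longer integer afterwards
```

If the original types matter, they can be restored afterwards, for example with as.integer() on the id column.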

Or a solution that first uses select() with ends_with() to create subsets of the columns whose names end with "x" and "y", and then uses map2() to replace the original columns ending with "x".

dat_x <- dat %>% select(ends_with("x"))
dat_y <- dat %>% select(ends_with("y"))

dat[, grepl("x$", names(dat))] <- map2(dat_x, dat_y, ~ifelse(is.na(.x), .y, .x)) 
dat
# # A tibble: 4 x 5
#      id   a.x   a.y   b.x   b.y
#   <int> <dbl> <dbl> <dbl> <dbl>
# 1     1  1.00  1.00  2.00  2.00
# 2     2  2.00  2.00  2.00  2.00
# 3     3  3.00  3.00  3.00 NA   
# 4     4  4.00  4.00  4.00  4.00
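
A variation on the same map2() idea, sketched under the assumption that every .x column has a partner with the same prefix and a .y suffix. Here the pairs are matched by name rather than by position, and dplyr's coalesce() replaces the ifelse() call; x_cols and y_cols are names chosen for this example.

```r
library(dplyr)
library(purrr)

dat <- tibble(id = 1:4,
              a.x = c(1, NA, 3, 4), a.y = c(1, 2, 3, 4),
              b.x = c(NA, NA, 3, NA), b.y = c(2, 2, NA, 4))

# Names of the .x columns and of their .y partners, in matching order
x_cols <- names(dat)[endsWith(names(dat), ".x")]
y_cols <- sub("\\.x$", ".y", x_cols)

dat2 <- dat
# A data frame is a list of columns, so map2() walks the pairs in parallel;
# coalesce() keeps the .x value and falls back to .y where .x is NA
dat2[x_cols] <- map2(dat[x_cols], dat[y_cols], coalesce)

dat2
```

Matching by name means the result does not depend on the column order, and a column that merely ends in "x" without the dot is not picked up.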

DATA

dat <- tibble(id=1:4, a.x = c(1, NA, 3, 4), a.y = c(1, 2, 3, 4), b.x = c(NA, NA, 3, NA), b.y = c(2, 2, NA, 4))
