
Repeat dataframe with new column in R

I have a dataframe:

my_df <- data.frame(var1 = c(1,2,3,4,5), var2 = c(6,7,8,9,10))
my_df
  var1 var2
1    1    6
2    2    7
3    3    8
4    4    9
5    5   10

I also have a vector:

my_vec <- c("a", "b", "c")

I want to repeat the dataframe length(my_vec) times, filling a new variable with the vector's values. Is there a simple way to do this? If possible, I'd like to do it in a dplyr chain. Desired output:

  var1 var2 var3
1    1    6    a
2    2    7    a
3    3    8    a
4    4    9    a
5    5   10    a
6    1    6    b
7    2    7    b
8    3    8    b
9    4    9    b
10   5   10    b
11   1    6    c
12   2    7    c
13   3    8    c
14   4    9    c
15   5   10    c

We can use crossing or expand_grid from tidyr:

library(tidyr)
crossing(my_df, var3 = my_vec)
#expand_grid(my_df, var3 = my_vec)

If the order is important, use arrange:

library(dplyr)
crossing(my_df, var3 = my_vec) %>% 
    arrange(var3)

Output:

# A tibble: 15 × 3
    var1  var2 var3 
   <dbl> <dbl> <chr>
 1     1     6 a    
 2     2     7 a    
 3     3     8 a    
 4     4     9 a    
 5     5    10 a    
 6     1     6 b    
 7     2     7 b    
 8     3     8 b    
 9     4     9 b    
10     5    10 b    
11     1     6 c    
12     2     7 c    
13     3     8 c    
14     4     9 c    
15     5    10 c   
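
Note that arrange(var3) sorts alphabetically, which matches the desired output here only because my_vec happens to be in sorted order already. As a minimal sketch of an alternative (assuming tidyr >= 1.0.0 and dplyr >= 1.0.0 are installed), putting the vector first in expand_grid makes it the slowest-varying input, so the blocks come out in the vector's own order without any sorting. relocate then moves var3 to the end:

```r
library(tidyr)
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# Vector first: it varies slowest, so rows come out in blocks a, b, c.
# expand_grid does not sort its inputs, so the order of my_vec is kept.
result <- expand_grid(var3 = my_vec, my_df) %>%
  relocate(var3, .after = last_col())  # move var3 behind var1 and var2

result
```

The trade-off is the extra relocate step, since expand_grid places columns in the order its inputs are given.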

Though I don't think this is likely to be the simplest answer in practice, I saw that you specifically wanted a dplyr chain, so I tried to solve it without the pre-existing functions that do this for you.
For your example specifically, you could use this chain with the tibble package functions add_column and add_row:

my_df %>%
  tibble::add_column(var3 = my_vec[1]) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[2])) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[3]))

which directly yields

   var1 var2 var3
1     1    6    a
2     2    7    a
3     3    8    a
4     4    9    a
5     5   10    a
6     1    6    b
7     2    7    b
8     3    8    b
9     4    9    b
10    5   10    b
11    1    6    c
12    2    7    c
13    3    8    c
14    4    9    c
15    5   10    c

The principle can be extended, but the chain above is hard-coded to three vector elements. To make it adaptable to whatever you want to apply it to, I wrote a function that does it for you:

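my_fxn <- function(frame, yourVector, new.col.name = paste0("var", NCOL(frame) + 1)) {
  # tibble functions are called with tibble::, so no library()/require() call is needed here
  origcols <- colnames(frame)
  for (i in seq_along(yourVector)) {
    # Copy of the frame with the i-th vector element repeated down a temporary column
    intermediateFrame <- tibble::add_column(
      frame,
      temp.name = rep_len(yourVector[[i]], nrow(frame))
    )
    # Rename the temporary column to the requested name
    colnames(intermediateFrame) <- append(origcols, new.col.name)
    if (i == 1) {
      Frame3 <- intermediateFrame
    } else {
      # Stack this copy below the copies built so far
      Frame3 <- tibble::add_row(Frame3, intermediateFrame)
    }
  }
  return(Frame3)
}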

Running my_fxn(my_df, my_vec) should give you the same data frame as above. I also experimented with using a for loop on its own, outside a function, but decided that was overkill. That approach is definitely possible too, though.
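
A more compact variant of the same loop idea is to build the per-element copies with lapply and stack them with dplyr's bind_rows. This is only a sketch, not the code from the answer above; the helper name repeat_with_column and its arguments are hypothetical, and it assumes dplyr is installed:

```r
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# Hypothetical helper: one copy of `frame` per element of `values`,
# each tagged with that element in a new column named `col_name`.
repeat_with_column <- function(frame, values, col_name = "var3") {
  copies <- lapply(values, function(v) {
    frame[[col_name]] <- v  # the single value is recycled down the column
    frame
  })
  bind_rows(copies)  # stack the copies in the order of `values`
}

# Usable at the end of a pipe, as the question asks
result <- my_df %>% repeat_with_column(my_vec)
result
```

Because the copies are stacked in the order of the vector, no arrange step is needed afterwards.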
