Repeat dataframe with new column in R
I have a dataframe:
my_df <- data.frame(var1 = c(1,2,3,4,5), var2 = c(6,7,8,9,10))
my_df
var1 var2
1 1 6
2 2 7
3 3 8
4 4 9
5 5 10
I also have a vector:
my_vec <- c("a", "b", "c")
I want to repeat the dataframe length(my_vec) times, filling in the values of a new variable with the vector values. Is there a simple way to do this? If possible, I'd like to do this in a dplyr chain.
Desired output:
var1 var2 var3
1 1 6 a
2 2 7 a
3 3 8 a
4 4 9 a
5 5 10 a
6 1 6 b
7 2 7 b
8 3 8 b
9 4 9 b
10 5 10 b
11 1 6 c
12 2 7 c
13 3 8 c
14 4 9 c
15 5 10 c
We can use crossing or expand_grid from tidyr:
library(tidyr)
crossing(my_df, var3 = my_vec)
#expand_grid(my_df, var3 = my_vec)
If the order is important, use arrange:
library(dplyr)
crossing(my_df, var3 = my_vec) %>%
arrange(var3)
Output:
# A tibble: 15 × 3
var1 var2 var3
<dbl> <dbl> <chr>
1 1 6 a
2 2 7 a
3 3 8 a
4 4 9 a
5 5 10 a
6 1 6 b
7 2 7 b
8 3 8 b
9 4 9 b
10 5 10 b
11 1 6 c
12 2 7 c
13 3 8 c
14 4 9 c
15 5 10 c
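One caveat worth sketching: arrange(var3) sorts the new column, so the blocks come out in alphabetical order rather than in the order of my_vec. If the vector is not already sorted, a possible workaround is to pass the vector first to expand_grid (which varies its first argument slowest) and then move the column to the end. The unsorted my_vec and the name res below are made up for illustration and are not part of the original question.

```r
library(tidyr)
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
# Hypothetical vector, deliberately not in alphabetical order
my_vec <- c("b", "a", "c")

# Vector first: each value of var3 is paired with all rows of my_df in turn.
# relocate() then moves var3 behind the original columns.
res <- expand_grid(var3 = my_vec, my_df) %>%
  relocate(var3, .after = last_col())

res
```

With this approach the blocks should follow the order of my_vec ("b", then "a", then "c") without any sorting step.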
Though I don't think this is likely to be the simplest answer in practice, I saw that you specifically wanted a dplyr chain, so I tried to solve this without the pre-existing functions that do it for you.
For your example specifically, you could use this chain with the tibble package functions add_column and add_row:
my_df %>%
tibble::add_column(var3 = my_vec[1]) %>%
tibble::add_row(tibble::add_column(my_df, var3 = my_vec[2])) %>%
tibble::add_row(tibble::add_column(my_df, var3 = my_vec[3]))
which directly yields
var1 var2 var3
1 1 6 a
2 2 7 a
3 3 8 a
4 4 9 a
5 5 10 a
6 1 6 b
7 2 7 b
8 3 8 b
9 4 9 b
10 5 10 b
11 1 6 c
12 2 7 c
13 3 8 c
14 4 9 c
15 5 10 c
This chain is hard-coded for a vector of length three, but the same principle can be wrapped in a function so that it adapts to whatever you want to apply it to:
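my_fxn <- function(frame, yourVector, new.col.name = paste0("var", NCOL(frame) + 1)) {
  # tibble functions are called with an explicit namespace, so no package needs attaching here
  origcols <- colnames(frame)
  for (i in seq_along(yourVector)) {
    # Copy the frame and add a temporary column holding the i-th vector value
    intermediateFrame <- tibble::add_column(
      frame,
      temp.name = rep_len(yourVector[[i]], nrow(frame))
    )
    # Rename the temporary column to the requested name
    colnames(intermediateFrame) <- append(origcols, new.col.name)
    if (i == 1) {
      Frame3 <- intermediateFrame
    } else {
      # Stack this copy below the copies built so far
      Frame3 <- tibble::add_row(Frame3, intermediateFrame)
    }
  }
  return(Frame3)
}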
Running my_fxn(my_df, my_vec) should get you the same data frame/table that we got above. I also experimented with using a for loop outside a function on its own to do this, but decided that it was getting to be overkill. That approach is definitely also possible, though.
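For comparison, the loop-and-stack idea can also be written without an explicit for loop. This is only a sketch of one way to do it, not the code the answer had in mind: it builds one copy of my_df per vector value with lapply and mutate, then stacks the copies with dplyr's bind_rows. The name res2 is made up for illustration.

```r
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# One copy of my_df per element of my_vec, each tagged with that element,
# then all copies stacked in the order of my_vec
res2 <- my_vec %>%
  lapply(function(v) mutate(my_df, var3 = v)) %>%
  bind_rows()

res2
```

Because the copies are stacked in the order they are created, no arrange step should be needed here.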