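How can I implement extract/separate functionality (from dplyr and tidyr) to split one column into multiple columns, based on arbitrary values?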

Question

I have a column:

Y = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

I would like to split it into multiple columns, based on the positions of the column values. For instance, I would like:

Y1=c(1,2,3,4,5)
Y2=c(6,7,8,9,10)
Y3=c(11,12,13,14,15)
Y4=c(16,17,18,19,20)

Since I am working with a large time series data set, the divisions will be arbitrary, depending on the length of one time period.

Answer 1

Not a dplyr solution, but I believe the easiest way would involve using matrices.

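foo = function(data, sep.in = 5) {
  # matrix() fills column by column, so setting nrow = sep.in puts each
  # consecutive block of sep.in values into its own column
  data.matrix = matrix(data, nrow = sep.in)
  data.df = as.data.frame(data.matrix)
  return(data.df)
}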

I have not tested it, but this function should create a data.frame that can be combined with an existing one using cbind(), provided the row counts match.
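As a minimal sketch of the matrix idea, assuming the vector length is an exact multiple of the period length, the snippet below reshapes the vector and names the columns Y1, Y2, and so on. The names split_into_columns and period_length are hypothetical and do not come from the answer.

```r
# Sketch: reshape a vector into one column per period.
# `split_into_columns` and `period_length` are hypothetical names.
split_into_columns <- function(data, period_length = 5) {
  # Assumption: the data divides evenly into periods.
  stopifnot(length(data) %% period_length == 0)
  # matrix() fills column by column, so fixing the number of rows
  # places each consecutive period in its own column.
  m <- matrix(data, nrow = period_length)
  colnames(m) <- paste0("Y", seq_len(ncol(m)))
  as.data.frame(m)
}

Y <- 1:20
wide <- split_into_columns(Y, period_length = 5)
wide$Y2
#> [1]  6  7  8  9 10
```

Fixing the number of rows rather than the number of columns is what keeps each period contiguous within a column.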

Answer 2

You can use the base split to split this vector into vectors that are each 5 items long. You could also use a variable to store this interval length.

Using rep with each = 5 on a programmatically generated sequence gives the numbers 1, 2, ... up to the length divided by 5 (in this case, 4), each repeated 5 times consecutively. split then returns a list of vectors.

It's worth noting that a variety of SO posts recommend storing similar data in a list such as this, rather than creating multiple variables, so I'm leaving it in list form here.

Y <- 1:20

breaks <- rep(1:(length(Y) / 5), each = 5)
split(Y, breaks)
#> $`1`
#> [1] 1 2 3 4 5
#> 
#> $`2`
#> [1]  6  7  8  9 10
#> 
#> $`3`
#> [1] 11 12 13 14 15
#> 
#> $`4`
#> [1] 16 17 18 19 20

Created on 2019-02-12 by the reprex package (v0.2.1)
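The interval length can be kept in a variable, as suggested above. The following is a sketch only, with the hypothetical name period_length, and it swaps rep for ceiling(seq_along(Y) / period_length) so that a vector whose length is not a multiple of the period still gets a group label for every element (the last chunk is simply shorter).

```r
Y <- 1:22           # length is not a multiple of the period
period_length <- 5  # hypothetical variable name

# ceiling(position / period_length) labels positions 1..5 as 1,
# positions 6..10 as 2, and so on.
groups <- ceiling(seq_along(Y) / period_length)
chunks <- split(Y, groups)

unname(lengths(chunks))
#> [1] 5 5 5 5 2
```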

Answer 3

We can use split (posting the code from my comment as an answer) to split the vector into a list of vectors.

lst <- split(Y, as.integer(gl(length(Y), 5, length(Y))))
lst
#$`1`
#[1] 1 2 3 4 5

#$`2`
#[1]  6  7  8  9 10

#$`3`
#[1] 11 12 13 14 15

#$`4`
#[1] 16 17 18 19 20

Here, gl creates a grouping index from its n, k, and length arguments: n is an integer giving the number of levels, k is an integer giving the number of replications, and length is an integer giving the length of the result.

In our case, we want 'k' to be 5.

as.integer(gl(length(Y), 5, length(Y)))
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
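As a variation on the call above, the number of levels can be computed instead of passing length(Y) as n. This is a sketch under the assumption that the chunk size is held in a variable k; the factor then has exactly as many levels as there are chunks, so it can be handed to split without the as.integer conversion.

```r
Y <- 1:20
k <- 5  # chunk size

# n = number of chunks, k = replications per level, length = result length
grp <- gl(ceiling(length(Y) / k), k, length = length(Y))
nlevels(grp)
#> [1] 4

lst2 <- split(Y, grp)
lst2[[3]]
#> [1] 11 12 13 14 15
```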

If we want to have multiple objects in the global environment, use list2env

list2env(setNames(lst, paste0("Y", seq_along(lst))), envir = .GlobalEnv)
Y1
#[1] 1 2 3 4 5
Y2
#[1]  6  7  8  9 10
Y3
#[1] 11 12 13 14 15
Y4
#[1] 16 17 18 19 20

Or, as the OP mentioned dplyr/tidyr in the question, we can use those packages as well

library(tidyverse)
tibble(Y) %>%
   group_by(grp = (row_number()-1) %/% 5 + 1) %>% 
   summarise(Y = list(Y)) %>%
   pull(Y)
#[[1]]
#[1] 1 2 3 4 5

#[[2]]
#[1]  6  7  8  9 10

#[[3]]
#[1] 11 12 13 14 15

#[[4]]
#[1] 16 17 18 19 20

Data

Y <- 1:20

Answer 1
1 2019-02-12 16:19:26

Answer 2
1 accepted 2019-02-12 16:48:17

Answer 3
0 2019-02-12 19:52:12
