
Split a vector into three vectors of unequal length in R

A question from a relative n00b: I'd like to split a vector into three vectors of different lengths, with the values assigned to each vector at random. For example, I'd like to split the vector of length 12 below into vectors of length 2, 3, and 7.

I can get three equal-sized vectors using this:

test<-1:12
split(test,sample(1:3))

Any suggestions on how to split test into vectors of length 2, 3, and 7 instead of three vectors of length 4?

You could use rep to create the group index for each element and then split based on that:

split(1:12, rep(1:3, c(2, 3, 7)))

If you want the items to be assigned at random, so that it's not just the first 2 items in the first vector, the next 3 items in the second vector, and so on, you can add a call to sample:

split(1:12, sample(rep(1:3, c(2, 3, 7))))

If you don't have the specific lengths (2, 3, 7) in mind and just don't want equal-length vectors every time, then SimonO101's answer is the way to go.
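
To confirm that the shuffled labels really produce the requested sizes, the same idea can be wrapped in a small helper. This is only a sketch: the name split_by_sizes and its argument check are assumptions made for illustration, not part of the answer above.

```r
# Hypothetical helper (the name is an assumption for illustration):
# randomly split x into groups whose sizes are given by `sizes`.
split_by_sizes <- function(x, sizes) {
  # The sizes must account for every element of x.
  stopifnot(sum(sizes) == length(x))
  # Build one group label per element: 1, 1, 2, 2, 2, 3, ...
  groups <- rep(seq_along(sizes), times = sizes)
  # Shuffle the labels by index so a length-one label vector is handled safely.
  groups <- groups[sample.int(length(groups))]
  split(x, groups)
}

set.seed(42)  # arbitrary seed, only to make the run repeatable
parts <- split_by_sizes(1:12, c(2, 3, 7))
lengths(parts)  # sizes 2, 3 and 7 for groups 1, 2 and 3
```

Which elements land in which group changes with the seed, but the group sizes stay fixed.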

How about using sample slightly differently...

set.seed(123)
test<-1:12
split( test , sample(3, 12 , replace = TRUE) )

#$`1`
#[1] 1 6

#$`2`
#[1]  3  7  9 10 12

#$`3`
#[1]  2  4  5  8 11

set.seed(1234)
test<-1:12
split( test , sample(3, 12 , replace = TRUE) )

#$`1`
#[1] 1 7 8

#$`2`
#[1]  2  3  4  6  9 10 12

#$`3`
#[1]  5 11

The first argument to sample is the number of groups to split the vector into, and the second argument is the number of elements in the vector. This randomly assigns each successive element to one of 3 vectors. For 4 vectors, just use split( test , sample(4, 12 , replace = TRUE) ).
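
Because each element is drawn independently here, the group sizes are themselves random, and a group can occasionally receive no elements, in which case split leaves it out of the result. The sketch below assumes you want all three groups listed even when one is empty, so it turns the labels into a factor with fixed levels. The prob argument of sample is an optional extra, used here only to make some groups larger on average.

```r
set.seed(1)  # arbitrary seed, only to make the run repeatable
test <- 1:12

# Draw one label per element; prob makes group 3 the largest on average.
labels <- sample(3, length(test), replace = TRUE, prob = c(2, 3, 7) / 12)

# Fixing the factor levels keeps empty groups in the result as
# zero-length vectors instead of dropping them.
parts <- split(test, factor(labels, levels = 1:3))
lengths(parts)  # sizes vary with the seed but always sum to 12
```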

It is easier than you think. To split the vector into three new randomly chosen sets, run the following code:

test <- 1:12
split(sample(test), 1:3)

Each time you run this code you get a new random assignment into three sets (handy for k-fold cross-validation).

You get:

> split(sample(test), 1:3)
$`1`
[1] 5 8 7 3

$`2`
[1]  4  1 10  9

$`3`
[1]  2 11 12  6

> split(sample(test), 1:3)
$`1`
[1] 12  6  4  1

$`2`
[1] 3 8 7 5

$`3`
[1]  9  2 10 11
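
This works cleanly because 12 is a multiple of 3. When the length is not a multiple of the number of groups, one option is to build the labels with rep_len so that they match the data length exactly instead of relying on recycling. A minimal sketch, with k and the 10-element vector chosen only for illustration:

```r
set.seed(7)  # arbitrary seed, only to make the run repeatable
x <- 1:10    # example data whose length is not a multiple of k
k <- 3       # number of folds (assumed for illustration)

# One fold label per element: 1, 2, 3, 1, 2, 3, ...
fold_id <- rep_len(seq_len(k), length(x))

# Shuffle the data, then deal it out to the folds.
folds <- split(sample(x), fold_id)
lengths(folds)  # fold sizes are 4, 3 and 3
```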

You could use an auxiliary vector to specify how you want to split your data. Example:

Data <- c(1,2,3,4,5,6)

Format <- c("X","Y","X","Y","Z","Z")

output <- split(Data,Format)

This generates the output:

$X
[1] 1 3

$Y
[1] 2 4

$Z
[1] 5 6
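
If the pieces later need to go back into their original order, base R's unsplit reverses a split made with the same grouping vector. A brief sketch using the same example values (the name restored is introduced here only for illustration):

```r
Data <- c(1, 2, 3, 4, 5, 6)
Format <- c("X", "Y", "X", "Y", "Z", "Z")

output <- split(Data, Format)

# unsplit() uses the grouping vector to put each value back in its
# original position.
restored <- unsplit(output, Format)
restored
```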
