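R split: Reuse the first i elements of a vector to make the splits equal
I want to split a vector into subvectors so that elements overlap only between two adjacent subvectors. The R function below handles this well.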
blocks <- function(len, ov, n) {
  starts <- unique(sort(c(seq(1, n, len), seq(len-ov+1, n, len))))
  ends <- pmin(starts + len - 1, n)
  # truncate starts and ends to the first num elements
  num <- match(n, ends)
  head(data.frame(starts, ends), num)
}
vec = 1:17 # the vector to split
len = 5 # length of each block
ov = ceiling(len/2) # number of overlapping elements between adjacent blocks
b <- blocks(len, ov, length(vec)) # data frame holding the start and end position of each block
with(b, Map(function(i, j) vec[i:j], starts, ends)) # returns the overlapping blocks as a list
## here is the output below
#[[1]]
#[1] 1 2 3 4 5
#[[2]]
#[1] 3 4 5 6 7
#[[3]]
#[1] 6 7 8 9 10
#[[4]]
#[1] 8 9 10 11 12
#[[5]]
#[1] 11 12 13 14 15
#[[6]]
#[1] 13 14 15 16 17
However, if the block length is 6, the last block does not reach 6 elements, as shown below:
vec = 1:17 # the vector to split
len = 6 # length of each block
ov = ceiling(len/2) # number of overlapping elements between adjacent blocks
b <- blocks(len, ov, length(vec)) # data frame holding the start and end position of each block
with(b, Map(function(i, j) vec[i:j], starts, ends)) # returns the overlapping blocks as a list
## the block of 6 that I get
#[[1]]
#[1] 1 2 3 4 5 6
#[[2]]
#[1] 4 5 6 7 8 9
#[[3]]
#[1] 7 8 9 10 11 12
#[[4]]
#[1] 10 11 12 13 14 15
#[[5]]
#[1] 13 14 15 16 17
As you can see, the 5th subvector has only 5 elements because the vector runs out.
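To see where the short block comes from, the body of blocks can be run step by step for len = 6. This is only an illustrative sketch using the same expressions as the function above; the size column is added here for inspection and is not part of the original code.

```r
len <- 6
ov <- ceiling(len / 2)
n <- 17

# Same start/end computation as in blocks()
starts <- unique(sort(c(seq(1, n, len), seq(len - ov + 1, n, len))))
ends <- pmin(starts + len - 1, n)  # pmin() clips every end at n

# size shows how many elements each candidate block would contain
sizes <- ends - starts + 1
data.frame(starts, ends, size = sizes)
```

The block starting at 13 would need positions 13 to 18, but pmin() cuts it off at 17, so it keeps only 5 elements. The candidate starting at 16 is dropped by head() because match(n, ends) returns the first row whose end equals n.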
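What I want
I want every subvector to contain the same number of elements, including the last one, so the last subvector should reuse elements from the start of the vector to fill itself up. In this case the last subvector has 5 elements instead of 6, so the first element of the vector should be appended to it.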
## the block of 6 that I want
#[[1]]
#[1] 1 2 3 4 5 6
#[[2]]
#[1] 4 5 6 7 8 9
#[[3]]
#[1] 7 8 9 10 11 12
#[[4]]
#[1] 10 11 12 13 14 15
#[[5]]
#[1] 13 14 15 16 17 1
You could try a for loop:
vec = 1:17 # the vector to split
len = 6 # length of each block
ov = ceiling(len/2) # step between block starts
tmp <- list()
mm <- gl(len, ov) # factor with len levels, each repeated ov times
for(i in 1:len){
  mm_start <- which(mm == i)[1] # first position of level i: 1, 1 + ov, 1 + 2*ov, ...
  mm_end <- mm_start+len-1
  if(length(vec) >= mm_end){
    tmp[[i]] <- vec[mm_start:mm_end]
  }else{
    # pad vec with its own first elements, then take the block
    tmp[[i]] <- c(vec, vec[1:(mm_end-length(vec))])[mm_start:mm_end]
  }
}
tmp
[[1]]
[1] 1 2 3 4 5 6
[[2]]
[1] 4 5 6 7 8 9
[[3]]
[1] 7 8 9 10 11 12
[[4]]
[1] 10 11 12 13 14 15
[[5]]
[1] 13 14 15 16 17 1
[[6]]
[1] 16 17 1 2 3 4
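Because the loop above runs len times, it also produces a sixth block (16 17 1 2 3 4) that the question did not ask for. The following is a minimal sketch of one way to stop at the first block that reaches the end of the vector, using circular indexing on positions. The name wrap_blocks and its step argument are hypothetical and not part of the original answer, and the sketch assumes step is no larger than len.

```r
# Hypothetical helper: split vec into blocks of length len whose starts are
# step positions apart, wrapping around to the front of vec when a block
# runs past the end.
wrap_blocks <- function(vec, len, step = ceiling(len / 2)) {
  n <- length(vec)
  starts <- seq(1, n, by = step)
  # keep every start up to and including the first block that reaches position n
  last <- which(starts + len - 1 >= n)[1]
  starts <- starts[seq_len(last)]
  lapply(starts, function(s) {
    idx <- (seq(s, length.out = len) - 1) %% n + 1  # circular 1-based positions
    vec[idx]
  })
}

res <- wrap_blocks(1:17, 6)
res
```

Since the blocks are built from positions rather than values, the same call should also work on vectors other than 1:n, for example a character vector.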
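You can let the sequence continue beyond the maximum of the vector and use the modulo. In blocks2, the starting points st of the sequences are found using modulo ov; the values in the second column, the block ends, are obtained by adding len - 1 to each start. The second column should reach or exceed the maximum of vec only once, so we subset with a boolean cumsum.
Later, in the apply call, we take each index sequence modulo max(vec) + 1 and add the quotient of the integer division (%/%), so that a position one past the end maps to 1 rather than 0. Note that blocks2 uses the values of vec as positions, so it assumes vec is 1:n.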
blocks2 <- function(vec, len) {
  ov <- ceiling(len/2)
  f <- function(vec, len, ov) {
    st <- (vec - 1) %% ov == 0
    b <- cbind(vec[st], vec[st] + len - 1)
    b[cumsum(b[,2] >= max(vec)) <= 1, ]
  }
  res <- apply(f(vec, len, ov), 1, function(x) {
    r <- x[1]:x[2] %% (max(vec) + 1)
    add <- x[1]:x[2] %/% (max(vec) + 1)
    r + add
  })
  unname(split(res, col(res)))
}
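The wrap-around arithmetic inside the apply call can be checked in isolation. The sketch below works through the last block for len = 6, assuming vec is 1:17 so that max(vec) is 17; the variable names n and idx are introduced here only for illustration.

```r
n <- 17       # max(vec) for vec <- 1:17
idx <- 13:18  # the last block for len = 6 runs one position past n

r <- idx %% (n + 1)     # 13 14 15 16 17 0: position 18 wraps to 0
add <- idx %/% (n + 1)  # 0 0 0 0 0 1: 1 for every position past n
r + add                 # 13 14 15 16 17 1
```

Taking the modulo with n + 1 instead of n leaves position 17 untouched, and adding the quotient shifts the wrapped positions from 0, 1, 2, ... to 1, 2, 3, ...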
Length four:
blocks2(vec, 4)
# [[1]]
# [1] 1 2 3 4
#
# [[2]]
# [1] 3 4 5 6
#
# [[3]]
# [1] 5 6 7 8
#
# [[4]]
# [1] 7 8 9 10
#
# [[5]]
# [1] 9 10 11 12
#
# [[6]]
# [1] 11 12 13 14
#
# [[7]]
# [1] 13 14 15 16
#
# [[8]]
# [1] 15 16 17 1
Length five:
blocks2(vec, 5)
# [[1]]
# [1] 1 2 3 4 5
#
# [[2]]
# [1] 4 5 6 7 8
#
# [[3]]
# [1] 7 8 9 10 11
#
# [[4]]
# [1] 10 11 12 13 14
#
# [[5]]
# [1] 13 14 15 16 17
Length six:
blocks2(vec, 6)
# [[1]]
# [1] 1 2 3 4 5 6
#
# [[2]]
# [1] 4 5 6 7 8 9
#
# [[3]]
# [1] 7 8 9 10 11 12
#
# [[4]]
# [1] 10 11 12 13 14 15
#
# [[5]]
# [1] 13 14 15 16 17 1