R split: reuse the first i elements of a vector to make all splits equal in length

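I want to split a vector into sub-vectors so that elements overlap only between two adjacent sub-vectors. The R function below handles this well.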

blocks <- function(len, ov, n) {

  starts <- unique(sort(c(seq(1, n, len), seq(len-ov+1, n, len))))
  ends <- pmin(starts + len - 1, n)

  # truncate starts and ends to the first num elements
  num <- match(n, ends)
  head(data.frame(starts, ends), num)
}

vec <- 1:17                        # the vector to split
len <- 5                           # length of each block
ov <- ceiling(len / 2)             # number of overlapping elements between adjacent blocks
b <- blocks(len, ov, length(vec))  # data frame with the start and end index of each block
with(b, Map(function(i, j) vec[i:j], starts, ends))  # extract the overlapping blocks

## output
#[[1]]
#[1] 1 2 3 4 5

#[[2]]
#[1] 3 4 5 6 7

#[[3]]
#[1]  6  7  8  9 10

#[[4]]
#[1]  8  9 10 11 12

#[[5]]
#[1] 11 12 13 14 15

#[[6]]
#[1] 13 14 15 16 17
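
To make the mechanics of blocks() easier to follow, the sketch below prints the intermediate start/end table for the same inputs (len = 5, ov = 3, n = 17). The function body is copied from the question; only the call with named arguments is added here for illustration. The starts come from two interleaved sequences, seq(1, n, len) and seq(len - ov + 1, n, len), and pmin() caps every end at n, which is also where a final block can get cut short.

```r
# blocks() copied from the question so the snippet runs on its own
blocks <- function(len, ov, n) {
  starts <- unique(sort(c(seq(1, n, len), seq(len - ov + 1, n, len))))
  ends <- pmin(starts + len - 1, n)   # ends are capped at n
  num <- match(n, ends)               # first block that reaches the end of the vector
  head(data.frame(starts, ends), num)
}

b <- blocks(len = 5, ov = 3, n = 17)
print(b)
#   starts ends
# 1      1    5
# 2      3    7
# 3      6   10
# 4      8   12
# 5     11   15
# 6     13   17
```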

However, if each block has length 6, the last block does not reach 6 elements, as shown below.

vec <- 1:17                        # the vector to split
len <- 6                           # length of each block
ov <- ceiling(len / 2)             # number of overlapping elements between adjacent blocks
b <- blocks(len, ov, length(vec))  # data frame with the start and end index of each block
with(b, Map(function(i, j) vec[i:j], starts, ends))  # extract the overlapping blocks

## the block of 6 that I get
#[[1]]
#[1] 1 2 3 4 5 6

#[[2]]
#[1] 4 5 6 7 8 9

#[[3]]
#[1]  7  8  9 10 11 12

#[[4]]
#[1] 10 11 12 13 14 15

#[[5]]
#[1] 13 14 15 16 17

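As you can see, the 5th sub-vector has only 5 elements because the vector runs out.

What I want

I want every sub-vector to contain the same number of elements, including the last one, so that the last sub-vector reuses elements from the start of the vector to fill itself up. In this case the last sub-vector has 5 elements instead of 6, so the first element of the vector should be appended to it.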

## the block of 6 that I want 
#[[1]]
#[1] 1 2 3 4 5 6

#[[2]]
#[1] 4 5 6 7 8 9

#[[3]]
#[1]  7  8  9 10 11 12

#[[4]]
#[1] 10 11 12 13 14 15

#[[5]]
#[1] 13 14 15 16 17 1

You could try a for loop:

vec <- 1:17             # the vector to split
len <- 6                # length of each block
ov <- ceiling(len / 2)  # number of overlapping elements between adjacent blocks

mm <- gl(len, ov)       # factor with each level repeated ov times; block i starts where level i first appears
tmp <- list()
for (i in 1:len) {
  mm_start <- which(mm == i)[1]
  mm_end <- mm_start + len - 1

  if (length(vec) >= mm_end) {
    # the block fits inside vec
    tmp[[i]] <- vec[mm_start:mm_end]
  } else {
    # the block runs past the end: append the first elements of vec, then subset
    tmp[[i]] <- c(vec, vec[1:(mm_end - length(vec))])[mm_start:mm_end]
  }
}
tmp
[[1]]
[1] 1 2 3 4 5 6

[[2]]
[1] 4 5 6 7 8 9

[[3]]
[1]  7  8  9 10 11 12

[[4]]
[1] 10 11 12 13 14 15

[[5]]
[1] 13 14 15 16 17  1

[[6]]
[1] 16 17  1  2  3  4
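
Note that the loop above always runs len times, so for this input it also produces a sixth block (16 17 1 2 3 4) that the question did not ask for. As a minimal sketch of one way around that, the hypothetical helper wrap_blocks() below (the name and its ov default are assumptions, not part of the original answer) computes the start positions up front, stops at the first block that reaches the end of the vector, and wraps positions with modular arithmetic. It assumes, like the loop, that a new block starts every ov positions.

```r
# Hypothetical helper (not from the original answer): split `vec` into blocks
# of `len` elements starting every `ov` positions, reusing the first elements
# of `vec` when the last block runs past the end.
wrap_blocks <- function(vec, len, ov = ceiling(len / 2)) {
  n <- length(vec)
  starts <- seq(1, n, by = ov)
  # Keep starts up to and including the first block that reaches position n
  last <- which(starts + len - 1 >= n)[1]
  starts <- starts[seq_len(last)]
  # Map positions n + 1, n + 2, ... back to 1, 2, ...
  lapply(starts, function(s) vec[((s:(s + len - 1)) - 1) %% n + 1])
}

res <- wrap_blocks(1:17, 6)
length(res)  # 5
res[[5]]     # 13 14 15 16 17  1
```

Because this version indexes by position rather than by value, it should also work for vectors that are not 1:n, such as character vectors.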

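You could let the sequence continue beyond the maximum of the vector and use modulo. In blocks2, the starting points (st) of the sequences are found using modulo ov; the second column (the end of each block) is obtained by adding len - 1 to the start. The second column should reach or exceed the maximum of vec only once, so we subset the rows with a boolean cumsum.

Later, in apply, we take each sequence modulo max(vec) + 1, and add the quotient of the integer division %/% so that wrapped values start at 1 rather than 0.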
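
The wrap-around arithmetic can be checked in isolation. The sketch below works through a single block, 13:18, for a vector whose maximum is 17; the variable names n, idx and wrapped are chosen here for illustration only.

```r
n <- 17                  # maximum of the vector 1:17
idx <- 13:18             # a block that runs one position past the end

r <- idx %% (n + 1)      # 13 14 15 16 17  0  (18 wraps to 0)
add <- idx %/% (n + 1)   #  0  0  0  0  0  1  (1 only for the wrapped value)
wrapped <- r + add       # 13 14 15 16 17  1
wrapped
```

This works because the values of 1:17 coincide with their positions; the same arithmetic would not apply directly to a vector with arbitrary values.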

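blocks2 <- function(vec, len) {
  ov <- ceiling(len/2)
  # start/end table: a block starts at every value v of vec with (v - 1) %% ov == 0
  f <- function(vec, len, ov) {
    st <- (vec - 1) %% ov == 0
    b <- cbind(vec[st], vec[st] + len - 1)
    # keep rows up to and including the first block whose end reaches max(vec)
    b[cumsum(b[,2] >= max(vec)) <= 1, ]
  }
  res <- apply(f(vec, len, ov), 1, function(x) {
    r <- x[1]:x[2] %% (max(vec) + 1)     # values past max(vec) wrap to 0, 1, ...
    add <- x[1]:x[2] %/% (max(vec) + 1)  # 1 for wrapped values, 0 otherwise
    r + add                              # shift wrapped values so they start at 1
  })
  unname(split(res, col(res)))           # one list element per block (column of res)
}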

Length four:

blocks2(vec, 4)
# [[1]]
# [1] 1 2 3 4
# 
# [[2]]
# [1] 3 4 5 6
# 
# [[3]]
# [1] 5 6 7 8
# 
# [[4]]
# [1]  7  8  9 10
# 
# [[5]]
# [1]  9 10 11 12
# 
# [[6]]
# [1] 11 12 13 14
# 
# [[7]]
# [1] 13 14 15 16
# 
# [[8]]
# [1] 15 16 17  1

Length five:

blocks2(vec, 5)
# [[1]]
# [1] 1 2 3 4 5
# 
# [[2]]
# [1] 4 5 6 7 8
# 
# [[3]]
# [1]  7  8  9 10 11
# 
# [[4]]
# [1] 10 11 12 13 14
# 
# [[5]]
# [1] 13 14 15 16 17

Length six:

blocks2(vec, 6)
# [[1]]
# [1] 1 2 3 4 5 6
# 
# [[2]]
# [1] 4 5 6 7 8 9
# 
# [[3]]
# [1]  7  8  9 10 11 12
# 
# [[4]]
# [1] 10 11 12 13 14 15
# 
# [[5]]
# [1] 13 14 15 16 17  1
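
A quick way to confirm the three results above is to check that every block has exactly len elements. The sketch below copies blocks2() from the answer so that it runs on its own; the check itself (the ok vector) is added here for illustration and is not part of the original answer.

```r
# blocks2() copied from the answer above
blocks2 <- function(vec, len) {
  ov <- ceiling(len/2)
  f <- function(vec, len, ov) {
    st <- (vec - 1) %% ov == 0
    b <- cbind(vec[st], vec[st] + len - 1)
    b[cumsum(b[,2] >= max(vec)) <= 1, ]
  }
  res <- apply(f(vec, len, ov), 1, function(x) {
    r <- x[1]:x[2] %% (max(vec) + 1)
    add <- x[1]:x[2] %/% (max(vec) + 1)
    r + add
  })
  unname(split(res, col(res)))
}

vec <- 1:17
# For each block length, TRUE if all blocks have exactly that many elements
ok <- sapply(4:6, function(len) all(lengths(blocks2(vec, len)) == len))
ok  # TRUE TRUE TRUE
```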

