
R split numeric vector at position

I am wondering about the simple task of splitting a vector into two at a certain index:

splitAt <- function(x, pos){
  # Note: 1:pos-1 parses as (1:pos) - 1, i.e. 0:(pos - 1); the zero index is silently dropped.
  list(x[1:pos-1], x[pos:length(x)])
}

a <- c(1, 2, 2, 3)

> splitAt(a, 4)
[[1]]
[1] 1 2 2

[[2]]
[1] 3

My question: there must be some existing function for this, but I can't find it. Is split perhaps a possibility? My naive implementation also does not work if pos = 0 or pos > length(a).
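To make those two failure cases concrete, here is a small sketch that reruns the implementation from the question under a hypothetical name, splitAtNaive, so it does not clash with the other definitions on this page. The results noted in the comments follow from R's indexing rules: a zero index is dropped, a negative index removes an element, and an out-of-range index yields NA.

```r
# Copy of the question's function under a hypothetical name.
splitAtNaive <- function(x, pos) {
  list(x[1:pos-1], x[pos:length(x)])
}

a <- c(1, 2, 2, 3)

# pos = 0: 1:0-1 is c(0, -1), so the first piece drops element 1,
# and 0:4 selects the whole vector for the second piece.
r0 <- splitAtNaive(a, 0)
str(r0)   # pieces: 2 2 3 and 1 2 2 3

# pos = 5 > length(a): 5:4 counts down, so the second piece is c(NA, 3).
r5 <- splitAtNaive(a, 5)
str(r5)   # pieces: 1 2 2 3 and NA 3
```

Neither call raises an error, which is what makes these cases easy to miss.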

An improvement would be:

splitAt <- function(x, pos) unname(split(x, cumsum(seq_along(x) %in% pos)))

which can now take a vector of positions:

splitAt(a, c(2, 4))
# [[1]]
# [1] 1
# 
# [[2]]
# [1] 2 2
# 
# [[3]]
# [1] 3

And it does behave properly (subjective) if pos <= 0 or pos > length(x), in the sense that it returns the whole original vector in a single list item. If you'd like it to error out instead, use stopifnot at the top of the function.
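A minimal sketch of that stopifnot variant could look like the following. The name splitAtStrict and the exact accepted range (1 < pos <= length(x)) are assumptions made for illustration; adjust the conditions to whatever you consider valid input.

```r
# Hypothetical strict variant of the split()-based function above.
splitAtStrict <- function(x, pos) {
  stopifnot(
    is.numeric(pos),
    !anyNA(pos),
    all(pos > 1),           # assumption: position 1 would leave an empty first piece
    all(pos <= length(x))   # assumption: positions past the end are rejected
  )
  unname(split(x, cumsum(seq_along(x) %in% pos)))
}

a <- c(1, 2, 2, 3)
ok <- splitAtStrict(a, c(2, 4))                  # valid positions: splits as before

# Invalid positions now raise an error instead of returning the whole vector.
bad <- try(splitAtStrict(a, 0), silent = TRUE)
inherits(bad, "try-error")
```

The checks run once per call, so their cost is small next to the split itself.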

I tried to use flodel's answer, but it was too slow in my case with a very large x (and the function has to be called repeatedly). So I created the following function, which is much faster but also very ugly and not robust. In particular, it does not check its inputs and will return wrong results (or fail) at least for pos > length(x) or pos <= 1, and perhaps in some other cases as well, so be careful. You can add those checks yourself if you are unsure about your inputs and not too concerned about speed.

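splitAt2 <- function(x, pos) {
    # Piece boundaries: start of x, each split position, one past the end.
    pos2 <- c(1, pos, length(x) + 1)
    # Preallocate one list slot per piece.
    out <- vector("list", length(pos2) - 1)
    for (i in seq_along(out)) {
        # Piece i runs from its start up to just before the next start.
        out[[i]] <- x[pos2[i]:(pos2[i + 1] - 1)]
    }
    out
}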

However, splitAt2 runs about 20 times faster than splitAt with an x of length 10^6:

library(microbenchmark)
W <- rnorm(1e6)
splits <- cumsum(rep(1e5, 9))
tm <- microbenchmark(
    splitAt(W, splits),
    splitAt2(W, splits),
    times = 10
)
tm
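Before trusting a timing comparison, it is worth confirming that both functions return the same pieces for valid positions. The sketch below repeats both definitions so it runs on its own and uses only base R; the vector length and the split positions are arbitrary choices for illustration.

```r
# split()-based version from the earlier answer.
splitAt <- function(x, pos) unname(split(x, cumsum(seq_along(x) %in% pos)))

# Loop-based version from this answer.
splitAt2 <- function(x, pos) {
    out <- list()
    pos2 <- c(1, pos, length(x) + 1)
    for (i in seq_along(pos2[-1])) {
        out[[i]] <- x[pos2[i]:(pos2[i + 1] - 1)]
    }
    out
}

set.seed(1)                        # arbitrary seed, for reproducibility only
w <- rnorm(1000)
cuts <- c(101, 501, 901)           # sorted, with 1 < cuts <= length(w)

same <- identical(splitAt(w, cuts), splitAt2(w, cuts))
same
```

This agreement holds only for sorted positions strictly between 1 and length(x) inclusive of the last element; outside that range the two functions diverge, as noted above.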

Another alternative that might be faster and/or more readable/elegant than flodel's solution:

splitAt <- function(x, pos) {
  # Group by element position, not by element value; pos must be sorted.
  unname(split(x, findInterval(seq_along(x), pos)))
}
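A quick check of the findInterval approach, assuming the grouping should be driven by element positions (seq_along(x)): passing the values themselves would group by magnitude instead, which only happens to coincide with the positions for some inputs. The name splitAtFI is hypothetical, chosen to avoid overwriting the other definitions.

```r
# Hypothetical name; groups elements by the interval their index falls into.
splitAtFI <- function(x, pos) {
  unname(split(x, findInterval(seq_along(x), pos)))
}

a <- c(1, 2, 2, 3)
pos <- c(2, 4)

byIndex <- findInterval(seq_along(a), pos)   # group ids from positions 1..4
byValue <- findInterval(a, pos)              # group ids from the values of a
byIndex
byValue

res <- splitAtFI(a, pos)
str(res)
```

Here the two groupings differ on the last element, so only the index-based one reproduces the three pieces shown earlier for splitAt(a, c(2, 4)). Note that findInterval expects pos to be sorted in non-decreasing order.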
