
R: subset of character vector

I want to get a subset of a character vector. Specifically, I want to obtain vector2 containing the elements of the initial vector that lie between two specific elements.

vector <- c("a", "", "b", "c","","d", "e")
vector

How can I grab all elements between the elements "b" and "e" and get vector2?

#Expected result:
vector2
"c","","d"

Here is one option:

f <- function(x, left, right) {
  idx <- x %in% c(left, right)
  x[as.logical(cumsum(idx) * !idx)]
}

f(vector, "b", "e")
# [1] "c" ""  "d"

The first step is to calculate idx:

vector %in% c("b", "e")
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE

Then calculate the cumulative sum:

cumsum(vector %in% c("b", "e"))
# [1] 0 0 1 1 1 1 2

Multiply by !vector %in% c("b", "e"), which zeroes out the marker positions and gives:

cumsum(vector %in% c("b", "e")) * !vector %in% c("b", "e")
# [1] 0 0 0 1 1 1 0

Finally, convert this to a logical vector (zero becomes FALSE, any non-zero value becomes TRUE) and use it to subset x.
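
One thing to watch with f: as.logical() treats any non-zero count as TRUE, so elements that come after the right marker are kept as well. Below is a minimal sketch of a variant, assuming the two markers should act as on/off toggles. The name between_toggle is made up for illustration and is not part of the original answer.

```r
# Original function from the answer, repeated so the block runs on its own.
f <- function(x, left, right) {
  idx <- x %in% c(left, right)
  x[as.logical(cumsum(idx) * !idx)]
}

# Hypothetical variant: an element is kept only while an odd number of
# markers has been seen, i.e. after an opening marker and before the next one.
between_toggle <- function(x, left, right) {
  idx <- x %in% c(left, right)
  x[cumsum(idx) %% 2 == 1 & !idx]
}

v <- c("a", "", "b", "c", "", "d", "e", "g")  # note the trailing "g"
f(v, "b", "e")               # "g" is kept: its running count is 2, which is non-zero
between_toggle(v, "b", "e")  # "g" is dropped
```

Neither version checks that left comes before right; both treat the two values as interchangeable markers.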


For the given example, another option is charmatch:

x <- charmatch(c("b", "e"), vector) + c(1, -1)
vector[seq.int(x[1], x[2])]
# [1] "c" ""  "d"
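
Note that charmatch() does partial matching, and seq.int(x[1], x[2]) counts backwards when the two markers sit next to each other. The following is a hedged sketch that uses match() for exact matching and seq_len() for the range. It assumes both markers are present and that the left one comes first; between_exact, pos and n_between are names chosen here for illustration.

```r
# Hypothetical helper; assumes both markers occur and left precedes right.
between_exact <- function(x, left, right) {
  pos <- match(c(left, right), x)            # exact match, first occurrence of each
  n_between <- max(pos[2] - pos[1] - 1, 0)   # number of elements strictly between
  x[pos[1] + seq_len(n_between)]             # seq_len(0) yields an empty index
}

between_exact(c("a", "", "b", "c", "", "d", "e"), "b", "e")  # "c" "" "d"
between_exact(c("b", "e"), "b", "e")                          # character(0)
```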

You can also do something like this:

vector <- c("a", "", "b", "c","","d", "e")
vector[seq(which(vector=="b")+1,which(vector=="e")-1)]
#[1] "c" ""  "d"

With negative subscripts:

x <- c("a", "", "b", "c", "", "d", "e")  # same data as vector above
x[-c(1:which(x == 'b'), which(x =='e'):length(x))]
#[1] "c" ""  "d"

When "e" occurs before "b", this returns an empty vector:

(y <- rev(x))
#[1] "e" "d" ""  "c" "b" ""  "a"
y[-c(1:which(y == 'b'), which(y =='e'):length(y))]
#character(0)
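
To see why the reversed case comes back empty, it can help to look at the positions being dropped. This is only an illustrative sketch; drop_idx and drop_rev are names introduced here.

```r
x <- c("a", "", "b", "c", "", "d", "e")

# Positions removed in the normal case: everything up to "b", plus "e" onward.
drop_idx <- c(1:which(x == "b"), which(x == "e"):length(x))
drop_idx        # 1 2 3 7
x[-drop_idx]    # "c" "" "d"

# Reversed vector: "e" is at position 1, so the second range spans the
# whole vector and no position survives.
y <- rev(x)
drop_rev <- c(1:which(y == "b"), which(y == "e"):length(y))
setdiff(seq_along(y), drop_rev)   # integer(0)
```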

You can also try:

vector[cumsum(vector %in% c("b", "e")) == 1][-1]

# [1] "c" ""  "d"
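
The one-liner above can be unpacked into its steps, as in the sketch below (vec and run are names chosen here). The running count equals 1 from the first marker up to the element just before the second marker, so the selection still includes the first marker itself, and [-1] drops it.

```r
vec <- c("a", "", "b", "c", "", "d", "e")

run <- cumsum(vec %in% c("b", "e"))  # running count of markers seen: 0 0 1 1 1 1 2
vec[run == 1]                        # "b" "c" "" "d"  (first marker still included)
vec[run == 1][-1]                    # "c" "" "d"      (first marker dropped)
```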

