R: how to check whether a vector is ascending/descending
vector1 = c(2, 2, 2, 2, 2, 2)
vector2 = c(2, 2, 3, 3, 3, 3)
vector3 = c(2, 2, 1, 2, 2, 2)
I want to know whether the numbers in a vector are ascending (or staying the same) or descending. For vector1 and vector2 the result should be TRUE, whereas for vector3 it should be FALSE. Simply put, it should return FALSE if there's a reversal in the vector. Is there a quick way to do this without writing a loop?
There is a base R function called is.unsorted that is ideal for this situation:
!is.unsorted(vector1)
# [1] TRUE
!is.unsorted(vector2)
# [1] TRUE
!is.unsorted(vector3)
# [1] FALSE
This function is very fast because it calls compiled C code almost directly.
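As a minimal sketch of how is.unsorted could cover both directions: the helper name is_monotone below is hypothetical (not part of base R), and the sketch assumes the vector contains no NA values. It treats a vector as acceptable if it is non-decreasing either as given or when reversed. The last line shows the documented strictly argument, which makes ties count as unsorted.

```r
vector1 <- c(2, 2, 2, 2, 2, 2)
vector2 <- c(2, 2, 3, 3, 3, 3)
vector3 <- c(2, 2, 1, 2, 2, 2)

# Hypothetical helper: TRUE if x is non-decreasing or non-increasing.
# Assumes x has no NA values.
is_monotone <- function(x) {
  !is.unsorted(x) || !is.unsorted(rev(x))
}

is_monotone(vector1)        # constant vector
is_monotone(vector2)        # non-decreasing
is_monotone(vector3)        # has a reversal
is_monotone(c(3, 2, 2, 1))  # non-increasing

# With strictly = TRUE, repeated values count as unsorted.
is.unsorted(vector2, strictly = TRUE)
```

Reversing the vector costs an extra copy, so this is a convenience rather than the fastest possible check.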
My initial thought was to use sort and identical, à la identical(sort(vector1), vector1), but this is pretty slow. That said, I think this approach can be extended to more flexible situations.
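One way the sort-and-compare idea might be extended, sketched here under the assumption that the vector has no NA values (sort drops them by default, which would make the comparison fail): passing decreasing = TRUE checks for a non-increasing vector instead. The helper name is_sorted_desc is made up for illustration.

```r
# Hypothetical helper: TRUE if x is already in non-increasing order.
# Assumes x has no NA values (sort() removes them by default).
is_sorted_desc <- function(x) {
  identical(sort(x, decreasing = TRUE), x)
}

is_sorted_desc(c(3, 2, 2, 1))        # already descending
is_sorted_desc(c(2, 2, 2, 2, 2, 2))  # constant counts as non-increasing
is_sorted_desc(c(2, 2, 1, 2, 2, 2))  # has a reversal
```

This still pays the full cost of sorting, so it is a readability choice rather than a performance one.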
If speed were really crucial, we could skip some of the overhead of is.unsorted and call the internal function directly:
.Internal(is.unsorted(vector1, FALSE))
(The second argument, FALSE, is passed to the strictly parameter.) This offered a ~4x speed-up on a small vector.
To get a sense of just how fast the final option is, here's a benchmark:
library(microbenchmark)
set.seed(10101)
srtd <- sort(sample(1e6, replace = TRUE)) # a sorted test case
unsr <- sample(1e6, replace = TRUE)       # an unsorted test case
microbenchmark(times = 1000L,
  josilber = {all(diff(srtd) >= 0)
              all(diff(unsr) >= 0)},
  mikec    = {identical(sort(srtd), srtd)
              identical(sort(unsr), unsr)},
  baser    = {!is.unsorted(srtd)
              !is.unsorted(unsr)},
  intern   = {!.Internal(is.unsorted(srtd, FALSE))
              !.Internal(is.unsorted(unsr, FALSE))})
Results on my machine:
# Unit: microseconds
# expr min lq mean median uq max neval cld
# josilber 30349.108 30737.6440 34550.6599 34113.5970 34964.171 155283.320 1000 c
# mikec 93167.836 94183.8865 97119.4493 94852.7530 97528.859 229692.328 1000 d
# baser 1089.670 1168.7400 1322.9341 1296.7375 1347.946 6301.866 1000 b
# intern 514.816 532.4405 576.2867 560.5955 566.236 2456.237 1000 a
So calling the internal function directly (caveat: you need to be sure your vector is perfectly clean, with no NAs, etc.) gives you ~2x speed versus the base R function, which is in turn ~30x faster than using diff, which is in turn ~3x as fast as my initial choice.
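To illustrate why the NA caveat matters, here is a small sketch using only the documented arguments of is.unsorted. The example vector x is made up for illustration. With the default na.rm = FALSE the public function reports NA rather than guessing, and na.rm = TRUE drops the missing values before checking.

```r
# Made-up example vector containing a missing value.
x <- c(1, 2, NA, 3)

is.unsorted(x)                # NA: sortedness is undetermined
is.unsorted(x, na.rm = TRUE)  # FALSE: 1, 2, 3 is sorted once NA is dropped

# A guard one might add before relying on a TRUE/FALSE answer.
if (anyNA(x)) {
  message("x contains NA; decide how to handle it before checking order")
}
```

The public wrapper performs this kind of NA handling itself, which is part of the overhead the direct internal call skips.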
You can use diff to compute the differences between consecutive elements and all to check whether they are all non-negative:
all(diff(vector1) >= 0)
# [1] TRUE
all(diff(vector2) >= 0)
# [1] TRUE
all(diff(vector3) >= 0)
# [1] FALSE
The above code checks whether each vector is non-decreasing; you could replace >= 0 with <= 0 to check whether it is non-increasing. If instead your goal is to identify vectors that are either non-decreasing or non-increasing (i.e., they don't have both an increasing and a decreasing step in the same vector), there's a simple modification:
!all(c(-1, 1) %in% sign(diff(vector1)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector2)))
# [1] TRUE
!all(c(-1, 1) %in% sign(diff(vector3)))
# [1] FALSE
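The checks above could be folded into a single function that also reports which direction was found. This is only a sketch: classify_trend and its return labels are hypothetical names, and it assumes the input has no NA values.

```r
# Hypothetical helper: label a numeric vector by the direction of its steps.
# Assumes x has no NA values.
classify_trend <- function(x) {
  steps <- sign(diff(x))
  up   <- any(steps > 0)  # at least one increasing step
  down <- any(steps < 0)  # at least one decreasing step
  if (up && down) {
    "mixed"
  } else if (up) {
    "non-decreasing"
  } else if (down) {
    "non-increasing"
  } else {
    "constant"
  }
}

classify_trend(c(2, 2, 2, 2, 2, 2))  # "constant"
classify_trend(c(2, 2, 3, 3, 3, 3))  # "non-decreasing"
classify_trend(c(2, 2, 1, 2, 2, 2))  # "mixed"
classify_trend(c(3, 2, 2, 1))        # "non-increasing"
```

Anything other than "mixed" corresponds to a TRUE result from the %in% check above.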