Fastest way to check if an array is sorted

Consider a very large array returned from a function.

What is the fastest way to test whether the array is sorted?

The simplest approach would be:

/// <summary>
/// Determines if int array is sorted from 0 -> Max
/// </summary>
public static bool IsSorted(int[] arr)
{
    for (int i = 1; i < arr.Length; i++)
    {
        if (arr[i - 1] > arr[i])
        {
            return false;
        }
    }
    return true;
}

You will have to visit each element of the array to see if anything is unsorted.

Your O(n) approach is about as fast as it gets without any special knowledge about the likely state of the array.

Your code specifically tests whether the array is sorted with smaller values at lower indices. If that is not what you intend, your if becomes slightly more complex. Your code comment does suggest that is what you're after.

If you have special knowledge of the probable state (say, you know the array is generally sorted but new data might be added to the end), you can optimize the order in which you visit array elements so that the test fails faster when the array is unsorted.
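As a minimal sketch of that idea, assuming any disorder is most likely near the end (for example, new items appended to an otherwise sorted array), the scan can simply run from the last element backwards. The names `TailFirstCheck` and `IsSortedTailFirst` are made up for illustration, and the worst case is still O(n).

```csharp
public static class TailFirstCheck
{
    // Hypothetical helper: returns the same result as a front-to-back scan,
    // but inspects the most recently appended elements first, so it fails
    // sooner when the disorder is at the end of the array.
    public static bool IsSortedTailFirst(int[] arr)
    {
        for (int i = arr.Length - 1; i > 0; i--)
        {
            if (arr[i - 1] > arr[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

For an array such as { 1, 2, 3, 0 } this returns false after a single comparison, where a front-to-back scan needs three.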

You can leverage knowledge of the hardware architecture to check multiple parts of the array in parallel: partition the array, first compare the elements at the partition boundaries (a fail-fast check), and then run one partition per core on separate threads (no more than one thread per CPU core). Note, though, that if an array partition is much smaller than a cache line, the threads will tend to compete with each other for access to the memory containing the array. Multithreading will only be efficient for fairly large arrays.
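The partitioning scheme described above could be sketched roughly as follows. This is an untuned illustration, not a benchmarked implementation: the names `ParallelSortedCheck` and `IsSortedParallel` and the 4096-element minimum chunk size are assumptions of mine, and `Environment.ProcessorCount` reports logical rather than physical cores.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSortedCheck
{
    // Hypothetical helper: splits the array into at most one chunk per
    // logical core, checks the chunk boundaries first, then scans the
    // chunks in parallel.
    public static bool IsSortedParallel(int[] arr)
    {
        int n = arr.Length;
        // Assumed threshold: do not create chunks smaller than 4096 elements.
        int parts = Math.Min(Environment.ProcessorCount, Math.Max(1, n / 4096));
        int chunk = (n + parts - 1) / parts;   // ceiling division

        // Fail-fast pass: compare the pair straddling each chunk boundary.
        for (int start = chunk; start < n; start += chunk)
        {
            if (arr[start - 1] > arr[start]) return false;
        }

        int unsorted = 0;   // set to 1 by any worker that finds a descent
        Parallel.For(0, parts, (p, state) =>
        {
            int lo = p * chunk;
            int hi = Math.Min(lo + chunk, n);   // exclusive upper bound
            for (int i = lo + 1; i < hi; i++)
            {
                if (state.IsStopped) return;    // another worker already failed
                if (arr[i - 1] > arr[i])
                {
                    Interlocked.Exchange(ref unsorted, 1);
                    state.Stop();
                    return;
                }
            }
        });
        return unsorted == 0;
    }
}
```

Small arrays fall back to a single chunk, which matches the point that threading only pays off for fairly large inputs.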

Faster approach. Platform target: Any CPU, Prefer 32-bit.
A sorted array with 512 elements: ~25% faster.

static bool isSorted(int[] a)
{
    int j = a.Length - 1;
    if (j < 1) return true;
    int ai = a[0], i = 1;
    while (i <= j && ai <= (ai = a[i])) i++;
    return i > j;
}

Target: x64, same array: ~40% faster.

static bool isSorted(int[] a)
{
    int i = a.Length - 1;
    if (i <= 0) return true;
    if ((i & 1) > 0) { if (a[i] < a[i - 1]) return false; i--; }
    for (int ai = a[i]; i > 0; i -= 2)
        if (ai < (ai = a[i - 1]) || ai < (ai = a[i - 2])) return false;
    return a[0] <= a[1];
}

Forgot one; it is marginally slower than my first code block.

static bool isSorted(int[] a)
{
    int i = a.Length - 1; if (i < 1) return true;
    int ai = a[i--]; while (i >= 0 && ai >= (ai = a[i])) i--;
    return i < 0;
}

Measuring it (see greybeard's comment).

using System;                                  //  ????????? DEBUG ?????????
using sw = System.Diagnostics.Stopwatch;       //  static bool abc()    
class Program                                  //  {   // a <= b <= c ?  
{                                              //      int a=4,b=7,c=9;  
    static void Main()                         //      int i = 1;  
    {                                          //      if (a <= (a = b))  
        //abc();                               //      {  
        int i = 512;                           //          i++;  
        int[] a = new int[i--];                //          if (a <= (a = c))
        while (i > 0) a[i] = i--;              //          {    
        sw sw = sw.StartNew();                 //              i++;  
        for (i = 10000000; i > 0; i--)         //          }  
            isSorted(a);                       //      }  
        sw.Stop();                             //      return i > 2;  
        Console.Write(sw.ElapsedMilliseconds); //  }  
        Console.Read();                        //  static bool ABC();
    }                                          //  {
                                               //      int[]a={4,7,9};    
    static bool isSorted(int[] a) // OP Cannon //      int i=1,j=2,ai=a[0]; 
    {                                          //  L0: if(i<=j)    
        for (int i = 1; i < a.Length; i++)     //        if(ai<=(ai=a[i]))  
            if (a[i - 1] > a[i]) return false; //          {i++;goto L0;}  
        return true;                           //      return i > j;  
    }                                          //  }  
}

Target: x64. Four cores/threads. A sorted array with 100,000 elements: ~55%.

static readonly object _locker = new object();
static bool isSorted(int[] a)  // a.Length > 3
{
    bool b = true;
    Parallel.For(0, 4, k =>
    {
        int i = 0, j = a.Length, ai = 0;
        if (k == 0) { j /= 4; ai = a[0]; }                        // 0 1
        if (k == 1) { j /= 2; i = j / 2; ai = a[i]; }             // 1 2
        if (k == 2) { i = j - 1; ai = a[i]; j = j / 2 + j / 4; }  // 4 3
        if (k == 3) { i = j - j / 4; ai = a[i]; j = j / 2; }      // 3 2
        if (k < 2)
            while (b && i <= j)
            {
                if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2])) i += 2;
                else lock (_locker) b = false;
            }
        else
            while (b && i >= j)
            {
                if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2])) i -= 2;
                else lock (_locker) b = false;
            }
    });
    return b;
}

1,000,000 items?

if (k < 2)
    while (b && i < j)
        if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2]) &&
            ai <= (ai = a[i + 3]) && ai <= (ai = a[i + 4])) i += 4;
        else lock (_locker) b = false;
else
    while (b && i > j)
        if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2]) &&
            ai >= (ai = a[i - 3]) && ai >= (ai = a[i - 4])) i -= 4;
        else lock (_locker) b = false;

Let's forget percentages.
Original: 0.77 ns/item, now: 0.22 ns/item.
2,000,000 items? Four cores: 4 times faster.

LINQ solution.

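public static bool IsSorted<T>(IEnumerable<T> list) where T : IComparable<T>
{
    if (!list.Any()) return true;   // an empty sequence is trivially sorted
    var y = list.First();
    return list.Skip(1).All(x =>
    {
        bool b = y.CompareTo(x) <= 0;   // <= so that equal neighbours count as sorted
        y = x;
        return b;
    });
}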
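A variant that avoids mutating the captured variable `y` inside the lambda is to pair each element with its successor using `Enumerable.Zip`. This is only a sketch: the names `LinqSortedCheck` and `IsSortedByZip` are hypothetical, it treats equal neighbours as sorted, it enumerates the source twice, and it assumes the sequence contains no null elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSortedCheck
{
    // Hypothetical variant: pairs each element with its successor and
    // requires every pair to be in non-decreasing order.
    public static bool IsSortedByZip<T>(IEnumerable<T> list) where T : IComparable<T>
    {
        return list.Zip(list.Skip(1), (prev, next) => prev.CompareTo(next) <= 0)
                   .All(inOrder => inOrder);
    }
}
```

Empty and single-element sequences produce no pairs, so `All` returns true for them without a special case.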

Here is my version of the function IsSorted:

public static bool IsSorted(int[] arr)
{               
    int last = arr.Length - 1;
    if (last < 1) return true;

    int i = 0;

    while(i < last && arr[i] <= arr[i + 1])
        i++;

    return i == last;
}

This function is a bit faster than the one in the question, and it does fewer assignments and comparisons than anything posted so far. In the worst case, it does 2n+1 comparisons. It can still be improved if you can make a reasonable assumption about the nature of the data, such as a minimum data size or an even number of elements in the array.

The only improvement I can think of is to check both ends of the array at the same time. This small change halves the number of loop iterations, although each iteration now does two comparisons:

public static bool IsSorted(int[] arr)
{
    int l = arr.Length;
    for (int i = 1; i < l / 2 + 1; i++)
    {
        if (arr[i - 1] > arr[i] || arr[l - i] < arr[l - i - 1])
        {
            return false;
        }
    }
    return true;
}

This is what I came up with, and I find it works better, particularly with larger arrays. The function is recursive; the first call passes index 0, for example in a while loop like this:

while (IsSorted(yourArray, 0))

The if statement checks whether the end of the array has been reached.

The else if statement calls the function recursively and stops as soon as the condition becomes false. Note that the recursion depth can reach the array length, so a very large array may overflow the stack.

public static bool IsSorted(int[] arr, int index)
{
    if (index >= arr.Length - 1)
    {
        return true;
    }
    else if ((arr[index] <= arr[index + 1]) && IsSorted(arr, index + 1))
    {
        return true;
    }
    else
    {
        return false;
    }
}

If the order doesn't matter (ascending or descending):

private bool IsSorted<T>(T[] values) where T:IComparable<T>
{
    if (values == null || values.Length == 0) return true;

    int sortOrder = 0;

    for (int i = 0; i < values.Length - 1; i++)
    {
        // CompareTo may return any negative or positive value, so keep only the sign.
        int newSortOrder = Math.Sign(values[i].CompareTo(values[i + 1]));

        if (sortOrder == 0) sortOrder = newSortOrder;

        if (newSortOrder != 0 && sortOrder != newSortOrder) return false;
    }

    return true;
}

The question that comes to my mind is "why"?

Is it to avoid re-sorting an already-sorted list? If yes, just use Timsort (standard in Python and Java). It's quite good at taking advantage of an array/list being already sorted, or almost sorted. Despite how good Timsort is at this, it's better not to sort inside a loop.

Another alternative is to use a data structure that is innately sorted, like a treap, red-black tree or AVL tree. These are good alternatives to sorting inside a loop.
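In C#, one readily available container of this kind is `SortedSet<T>` from `System.Collections.Generic`, which keeps its elements in ascending order as they are inserted (it is commonly described as a red-black tree; treat that detail as an assumption here). The sketch below uses a hypothetical helper, `SortedContainerDemo.InsertAll`, only to show the behaviour. Note that a set discards duplicates, so it is not a drop-in replacement for an array that may contain repeated values.

```csharp
using System.Collections.Generic;

public static class SortedContainerDemo
{
    // Hypothetical helper: inserts values into a SortedSet<int> and returns
    // them in iteration order, which is always ascending.
    public static List<int> InsertAll(IEnumerable<int> values)
    {
        var set = new SortedSet<int>();
        foreach (int v in values)
        {
            set.Add(v);   // Add returns false and ignores v if it is already present
        }
        return new List<int>(set);
    }
}
```

Because the container never leaves sorted order, there is nothing to check or re-sort after each insertion.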

This might not be the fastest, but it's the complete solution. Every value with an index lower than i is checked against the current value at i, which costs O(n²) comparisons. This is written in PHP but can easily be translated into C# or JavaScript.

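function isSortedAsc(array $chekASCSort): bool {
    $tot = count($chekASCSort);
    for ($i = 1; $i < $tot; $i++) {
        // Start at 1 so that an element is never compared with itself
        for ($j = 1; $j <= $i; $j++) {
            // Check all previous values with indexes lower than $i
            if ($chekASCSort[$i - $j] > $chekASCSort[$i]) {
                return false;
            }
        }
    }
    return true;
}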
