How was Array.Sort implemented in .NET?

I am using structures in my program, and I sort them by one of their values using IComparer.

How did Microsoft implement the Array.Sort() method? Is there any documentation (references) for this? Is it the same for all types of Sort() in Visual Basic?

This is a simple example of what I want.

Dim MyArray(6) As Integer
MyArray(0) = 1
MyArray(1) = 45
MyArray(2) = 45
' Some Code.....
'.........
'..........
MyArray(3) = 1
MyArray(4) = 10
' Some Code.....
'.........
'..........
MyArray(5) = 1
MyArray(6) = 57

Array.Sort(MyArray)

Array.Sort() will sort this array as: (1 1 1 10 45 45 57)

How do the 1s get sorted? Does the sort move the first one to the end of the group, or does it keep it at the same index?

In my original example (before sorting), MyArray(0) = 1, and after sorting MyArray(0) = 1.

Is this the same original 1, or was another 1 (the newest one added to the array) moved to that position?

In other words, could the MyArray(0) = 1 after sorting be the element that was MyArray(5) = 1 before sorting?

It uses the Quicksort algorithm, which is not stable when implemented efficiently (in place). This means it doesn't guarantee that equal values retain their prior relative position after sorting.

For example, if you have a bunch of points:

Point[] points = new Point[]
{
   new Point(0, 1),
   new Point(0, 2),
   new Point(0, 3),
   new Point(1, 1),
   new Point(1, 2),
   new Point(1, 3)
};

And you sort these points by x-coordinate only, using this comparer:

private int CompareByX(Point a, Point b)
{
    return a.X - b.X;
}

It will only guarantee that the points are sorted by their x-coordinate, meaning you could easily end up with a mixed-up order (when looking at the y-coordinate):

Point(0, 3)
Point(0, 2)
Point(0, 1)
Point(1, 3)
Point(1, 2)
Point(1, 1)
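If the original order of equal elements matters, one common workaround is to sort the element indices and use the index as a tie-breaker, so the comparer never reports two different elements as equal. The following is only a minimal sketch under assumptions: the Point class is a stand-in for the one in this answer, and StableSortDemo and SortByXStable are hypothetical names, not part of .NET.

```csharp
using System;
using System.Linq;

// Stand-in for the Point type used in this answer (assumption).
class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public override string ToString() => $"({X},{Y})";
}

static class StableSortDemo
{
    // Hypothetical helper: returns the points ordered by X, with points
    // that share an X value kept in their original relative order.
    public static Point[] SortByXStable(Point[] points)
    {
        // Sort the indices 0..n-1 instead of the points themselves.
        int[] order = Enumerable.Range(0, points.Length).ToArray();
        Array.Sort(order, (i, j) =>
        {
            int byX = points[i].X.CompareTo(points[j].X);
            // Equal X: fall back to the original position.
            return byX != 0 ? byX : i.CompareTo(j);
        });
        return order.Select(i => points[i]).ToArray();
    }
}

class Program
{
    static void Main()
    {
        var points = new[]
        {
            new Point(1, 1), new Point(0, 1), new Point(1, 2),
            new Point(0, 2), new Point(1, 3), new Point(0, 3)
        };
        Point[] sorted = StableSortDemo.SortByXStable(points);
        // Prints: (0,1) (0,2) (0,3) (1,1) (1,2) (1,3)
        Console.WriteLine(string.Join<Point>(" ", sorted));
    }
}
```

Because the tie-breaker makes every comparison strict, the result no longer depends on which algorithm Array.Sort uses internally.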

[Edit]

This doesn't mean that the sorting algorithm is non-deterministic (random). For the same input data, you will get the same output data on each run. You could also predict exactly how the data will be reorganized by studying the algorithm closely, but that is unnecessary. It is sufficient to know that this can happen when using the sort routine.

Here is a working example for your problem. Try changing the test data sizes (first line in Main) and watch how the array gets reorganized on each run:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        // Change these sizes to see how the array gets reorganized.
        Point[] points = CreateTestData(1, 4).ToArray();
        DisplayItems("Before", points);
        Array.Sort(points, CompareByX);
        DisplayItems("After", points);
        Console.ReadLine();
    }

    private class Point
    {
        public int X { get; private set; }
        public int Y { get; private set; }
        public override string ToString()
        { return string.Format("({0},{1})", X, Y); }
        public Point(int x, int y)
        { X = x; Y = y; }
    }

    // Compares by X only, so points with equal X count as equal.
    private static int CompareByX(Point a, Point b)
    { return a.X - b.X; }

    // Yields every point with 0 <= x <= maxX and 0 <= y <= maxY.
    private static IEnumerable<Point> CreateTestData(int maxX, int maxY)
    {
        for (int x = 0; x <= maxX; x++)
            for (int y = 0; y <= maxY; y++)
                yield return new Point(x, y);
    }

    private static void DisplayItems(string msg, Point[] points)
    {
        Console.WriteLine(msg);
        foreach (Point p in points)
            Console.WriteLine(p.ToString());
        Console.WriteLine();
    }
}

Of course, if you extend the comparer delegate to include the Y coordinate, you will not have this problem:

    private static int CompareByX(Point a, Point b)
    {
        if (a.X == b.X)
            return a.Y - b.Y;
        else
            return a.X - b.X;
    }
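One caveat about comparers built on subtraction: a.X - b.X can overflow when the two values are far apart (for example int.MinValue - 1 wraps around to a positive number), which would report the wrong order. A hedged sketch of the same X-then-Y comparison using int.CompareTo instead; the Point class here is a stand-in and the names PointComparers and CompareByXThenY are made up for illustration:

```csharp
using System;

// Stand-in for the Point type used in this answer (assumption).
class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public override string ToString() => $"({X},{Y})";
}

static class PointComparers
{
    // Hypothetical comparer: orders by X, then by Y, without subtraction.
    public static int CompareByXThenY(Point a, Point b)
    {
        int byX = a.X.CompareTo(b.X);
        return byX != 0 ? byX : a.Y.CompareTo(b.Y);
    }
}

class Program
{
    static void Main()
    {
        var points = new[]
        {
            new Point(1, 2), new Point(0, 3), new Point(1, 1), new Point(0, 1)
        };
        Array.Sort(points, PointComparers.CompareByXThenY);
        // Prints: (0,1) (0,3) (1,1) (1,2)
        Console.WriteLine(string.Join<Point>(" ", points));
    }
}
```

Since no two distinct points compare as equal here, the instability of Array.Sort has nothing left to reorder.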

Array.Sort is an unstable sort, so the relative order of equal elements is undefined and not preserved. The article on Array.Sort in MSDN states:

This method uses the QuickSort algorithm. This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal.

LINQ's OrderBy methods, on the other hand, are stable. The article on OrderBy in MSDN states:

This method performs a stable sort; that is, if the keys of two elements are equal, the order of the elements is preserved. In contrast, an unstable sort does not preserve the order of elements that have the same key.
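As a small illustration of the OrderBy behaviour quoted above, here is a sketch that sorts by X only and relies on OrderBy to keep equal-X items in their original order. It assumes C# 7 or later for the tuple syntax, and the sample data is made up for the example. Note that OrderBy returns a new sequence rather than sorting the array in place.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // (X, Y) pairs; Y records the original order within each X group.
        var points = new[] { (X: 1, Y: 1), (X: 0, Y: 1), (X: 1, Y: 2), (X: 0, Y: 2) };

        // Stable: items with the same X keep their relative order.
        var sorted = points.OrderBy(p => p.X).ToArray();

        // Prints: (0, 1) (0, 2) (1, 1) (1, 2)
        Console.WriteLine(string.Join(" ", sorted));
    }
}
```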

Use .NET Reflector and see for yourself... Judging by the method names, it uses the QuickSort algorithm: System.Array+SorterObjectArray.QuickSort

Array.Sort(), like most built-in sorters, uses a QuickSort implementation in a helper class behind the scenes. The sort is relatively efficient, and customizable using the IComparable and IComparer interfaces, but it's unstable; the three 1s in your example may end up in a different relative order than they were in before the sort. You can see this if you use a more complex structure:

struct TestStruct
{
   public int a;
   public int b;
}

...

// As declared, this array is already sorted by both the "a" and "b" fields
var myStructArray = new[] { new TestStruct { a = 1, b = 1 }, new TestStruct { a = 1, b = 2 }, new TestStruct { a = 1, b = 3 } };

// Array.Sort sorts in place and returns nothing, so sort a copy, comparing on "a" only
var newArray = (TestStruct[])myStructArray.Clone();
Array.Sort(newArray, (x, y) => x.a.CompareTo(y.a));

// newArray may now be in a different order than myStructArray
for (var i = 0; i < myStructArray.Length; i++)
{
   // NUnit assertion; may fail given a sufficient array length, since equal "a" values can be reordered
   Assert.AreEqual(myStructArray[i].b, newArray[i].b);
}

First of all, let's address several issues in your current plan with regard to best practices for .NET (VB or C#):

  1. Prefer Class over Structure unless you have a good reason to do otherwise
  2. Avoid using Arrays
  3. You can build that array as a one-liner: Dim MyArray() As Integer = {1, 45, 45, 1, 10, 1, 57}

As to your question of whether it's the "same" value 1, the answer is that it depends on how you look at it. For the general case, the answer depends on whether the sorting algorithm is stable. .NET's sorting algorithm is not stable.

For this specific case, you're asking the wrong question. 1 is 1 is 1. There is no distinction between them. If you feel like it matters, I challenge you to provide code that detects a difference between any two of the "1s" from that list in your original code (aside from array index).

Other answers are based on old documentation, so here is an updated answer. According to the latest documentation (emphasis mine):

The .NET Framework 4 and earlier versions used only the Quicksort algorithm. Now, Array.Sort uses the introspective sort (introsort) algorithm as follows:

  • If the partition size is fewer than 16 elements, it uses an insertion sort algorithm.

  • If the number of partitions exceeds 2 * Log N, where N is the range of the input array, it uses a Heapsort algorithm.

  • Otherwise, it uses a Quicksort algorithm.

It is still an unstable sort.
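The three rules quoted above can be sketched roughly as follows. This is a simplified illustration for int arrays, not the actual .NET source: the class name IntroSortSketch, the middle-element pivot, and reading "number of partitions exceeds 2 * Log N" as a recursion-depth limit of 2 * floor(log2 N) are all assumptions made for the example.

```csharp
using System;

// Simplified introsort sketch (assumption: not the real .NET implementation).
static class IntroSortSketch
{
    const int InsertionSortThreshold = 16;

    public static void Sort(int[] a)
    {
        // Depth limit of 2 * floor(log2(N)), computed with integer shifts.
        int depthLimit = 0;
        for (int n = a.Length; n > 1; n >>= 1) depthLimit += 2;
        IntroSort(a, 0, a.Length - 1, depthLimit);
    }

    static void IntroSort(int[] a, int lo, int hi, int depthLimit)
    {
        // Partitions with fewer than 16 elements fall through to insertion sort.
        while (hi - lo + 1 >= InsertionSortThreshold)
        {
            // Too many partitioning steps: switch to heapsort for this range.
            if (depthLimit == 0) { HeapSort(a, lo, hi); return; }
            depthLimit--;
            int p = Partition(a, lo, hi);
            IntroSort(a, p + 1, hi, depthLimit); // recurse on the right part
            hi = p - 1;                          // keep looping on the left part
        }
        InsertionSort(a, lo, hi);
    }

    // Quicksort step: returns the final index of the pivot.
    static int Partition(int[] a, int lo, int hi)
    {
        Swap(a, lo + (hi - lo) / 2, hi); // use the middle element as pivot
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++)
        {
            if (a[i] < pivot) { Swap(a, i, store); store++; }
        }
        Swap(a, store, hi);
        return store;
    }

    static void InsertionSort(int[] a, int lo, int hi)
    {
        for (int i = lo + 1; i <= hi; i++)
        {
            int value = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > value) { a[j + 1] = a[j]; j--; }
            a[j + 1] = value;
        }
    }

    static void HeapSort(int[] a, int lo, int hi)
    {
        int n = hi - lo + 1;
        // Build a max-heap over the range, then repeatedly move the max to the end.
        for (int i = n / 2 - 1; i >= 0; i--) SiftDown(a, lo, i, n);
        for (int end = n - 1; end > 0; end--)
        {
            Swap(a, lo, lo + end);
            SiftDown(a, lo, 0, end);
        }
    }

    // Restores the max-heap property below 'root' within the first 'size'
    // elements of the range that starts at 'offset'.
    static void SiftDown(int[] a, int offset, int root, int size)
    {
        while (true)
        {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && a[offset + child + 1] > a[offset + child]) child++;
            if (a[offset + root] >= a[offset + child]) return;
            Swap(a, offset + root, offset + child);
            root = child;
        }
    }

    static void Swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}

class Program
{
    static void Main()
    {
        int[] data = { 1, 45, 45, 1, 10, 1, 57 };
        IntroSortSketch.Sort(data);
        Console.WriteLine(string.Join(" ", data)); // prints: 1 1 1 10 45 45 57
    }
}
```

With plain integers the instability cannot be observed, which is the point made earlier about the three 1s; it only shows up when equal keys carry other data that tells the elements apart.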
