Alphanumeric sorting using LINQ

I have a string[] in which every element ends with some numeric value.

string[] partNumbers = new string[] 
{ 
    "ABC10", "ABC1","ABC2", "ABC11","ABC10", "AB1", "AB2", "Ab11" 
};

I am trying to sort the above array with LINQ as follows, but I am not getting the expected result.

var result = partNumbers.OrderBy(x => x);

Actual result:

AB1
Ab11
AB2
ABC1
ABC10
ABC10
ABC11
ABC2

Expected result:

AB1
AB2
AB11
..

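That is because the default ordering for strings is standard alphanumeric dictionary (lexicographic) ordering. ABC11 comes before ABC2 because the comparison always proceeds from left to right, one character at a time.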
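A small sketch of that left-to-right comparison follows. It uses StringComparer.Ordinal rather than the culture-sensitive default that OrderBy(x => x) picks up, purely so the character-by-character behaviour is explicit; for these inputs the digits compare the same way. The class and method names are invented for illustration.

```csharp
using System;
using System.Linq;

public static class LexicographicDemo
{
    // Sorts by comparing characters from left to right. "ABC11" and "ABC2"
    // first differ at index 3, where '1' < '2', so "ABC11" sorts first.
    public static string[] SortOrdinal(string[] items) =>
        items.OrderBy(x => x, StringComparer.Ordinal).ToArray();
}
```

Calling LexicographicDemo.SortOrdinal(new[] { "ABC2", "ABC11", "ABC1" }) yields ABC1, ABC11, ABC2.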

To get what you want, you need to pad the numeric portion in your order by clause, for example:

 var result = partNumbers.OrderBy(x => PadNumbers(x));

where PadNumbers could be defined as:

public static string PadNumbers(string input)
{
    return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(10, '0'));
}

This pads every number (or numbers) that appears in the input string with zeros, so that OrderBy sees:

ABC0000000010
ABC0000000001
...
AB0000000011

The padding only happens on the key used for comparison. The original strings (without padding) are preserved in the result.

Note that this approach assumes a maximum number of digits for the numbers in the input (10 in the code above).
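The following is a self-contained sketch of the padding approach, written to make that digit limit visible. The PaddedSort class, the width parameter, and the choice of StringComparer.OrdinalIgnoreCase (used so the result does not depend on the current culture) are assumptions added for illustration, not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PaddedSort
{
    // Left-pads every run of ASCII digits to `width` characters.
    // Runs that are already longer than `width` are left unchanged.
    public static string PadNumbers(string input, int width = 10) =>
        Regex.Replace(input, "[0-9]+", m => m.Value.PadLeft(width, '0'));

    // Orders by the padded key; the original strings are what comes back.
    public static string[] Sort(string[] items, int width = 10) =>
        items.OrderBy(x => PadNumbers(x, width), StringComparer.OrdinalIgnoreCase)
             .ToArray();
}
```

With the array from the question, PaddedSort.Sort returns AB1, AB2, Ab11, ABC1, ABC2, ABC10, ABC10, ABC11. Numbers longer than the width break the scheme: sorting "A99999999999" (11 digits) and "A100000000000" (12 digits) puts the 12-digit value first, because neither is padded and '1' < '9'.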

A proper implementation of an alphanumeric sort that "just works" can be found on Dave Koelle's site, which also has a C# version.

If you want to sort a list of objects by a specific property using LINQ and a custom comparer such as Dave Koelle's, you can do the following:

...

items = items.OrderBy(x => x.property, new AlphanumComparator()).ToList();

...

You also have to change Dave's class to inherit from System.Collections.Generic.IComparer<object> instead of the basic IComparer, so the class signature becomes:

...

public class AlphanumComparator : System.Collections.Generic.IComparer<object>
{

    ...

Personally, I prefer James McCormack's implementation because it implements IDisposable, although my benchmarks show it is slightly slower.

You can use PInvoke to get fast and good results:

class AlphanumericComparer : IComparer<string>
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    static extern int StrCmpLogicalW(string s1, string s2);

    public int Compare(string x, string y) => StrCmpLogicalW(x, y);
}

You can use it just like AlphanumComparatorFast from the answer above.

You can PInvoke StrCmpLogicalW (a Windows function) to do this. See the question "Natural Sort Order in C#".
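Because shlwapi.dll only exists on Windows, a comparer built on StrCmpLogicalW fails at call time elsewhere. The sketch below guards the call with RuntimeInformation.IsOSPlatform. The LogicalComparer name, the null handling, and the fallback to StringComparer.OrdinalIgnoreCase are assumptions for illustration; the fallback is not a natural sort, it only keeps the code from throwing on other platforms.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public sealed class LogicalComparer : IComparer<string>
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    private static extern int StrCmpLogicalW(string s1, string s2);

    // Evaluated once; the native function is only called when this is true.
    private static readonly bool OnWindows =
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows);

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;   // nulls sort first
        if (y == null) return 1;

        return OnWindows
            ? StrCmpLogicalW(x, y)
            : StringComparer.OrdinalIgnoreCase.Compare(x, y); // plain fallback
    }
}
```

It plugs into LINQ in the usual way: partNumbers.OrderBy(x => x, new LogicalComparer()).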

using System;
using System.Collections;
using System.Collections.Generic;

public class AlphanumComparatorFast : IComparer
{
    List<string> GetList(string s1)
    {
        List<string> SB1 = new List<string>();
        string st1, st2, st3;
        st1 = "";
        // Guard against empty input: an empty string has no first character.
        bool flag = s1.Length > 0 && char.IsDigit(s1[0]);
        foreach (char c in s1)
        {
            if (flag != char.IsDigit(c) || c=='\'')
            {
                if(st1!="")
                SB1.Add(st1);
                st1 = "";
                flag = char.IsDigit(c);
            }
            if (char.IsDigit(c))
            {
                st1 += c;
            }
            if (char.IsLetter(c))
            {
                st1 += c;
            }


        }
        SB1.Add(st1);
        return SB1;
    }



    public int Compare(object x, object y)
    {
        string s1 = x as string;
        if (s1 == null)
        {
            return 0;
        }
        string s2 = y as string;
        if (s2 == null)
        {
            return 0;
        }
        if (s1 == s2)
        {
            return 0;
        }
        // Split both strings into alternating runs of digits and letters.
        List<string> str1 = GetList(s1);
        List<string> str2 = GetList(s2);
        while (str1.Count != str2.Count)
        {
            if (str1.Count < str2.Count)
            {
                str1.Add("");
            }
            else
            {
                str2.Add("");
            }
        }
        int x1 = 0; int res = 0; int x2 = 0; string y2 = "";
        bool status = false;
        string y1 = ""; bool s1Status = false; bool s2Status = false;
        // s1Status == false: chunk i of s1 is treated as text; true: as an integer
        // s2Status == false: chunk i of s2 is treated as text; true: as an integer
        int result = 0;
        for (int i = 0; i < str1.Count && i < str2.Count; i++)
        {
            status = int.TryParse(str1[i].ToString(), out res);
            if (res == 0)
            {
                y1 = str1[i].ToString();
                s1Status = false;
            }
            else
            {
                x1 = Convert.ToInt32(str1[i].ToString());
                s1Status = true;
            }

            status = int.TryParse(str2[i].ToString(), out res);
            if (res == 0)
            {
                y2 = str2[i].ToString();
                s2Status = false;
            }
            else
            {
                x2 = Convert.ToInt32(str2[i].ToString());
                s2Status = true;
            }
            // Compare the current pair of chunks
            if(!s2Status && !s1Status )    //both are strings
            {
                result = str1[i].CompareTo(str2[i]);
            }
            else if (s2Status && s1Status) //both are integers
            {
                if (x1 == x2)
                {
                    if (str1[i].ToString().Length < str2[i].ToString().Length)
                    {
                        result = 1;
                    }
                    else if (str1[i].ToString().Length > str2[i].ToString().Length)
                        result = -1;
                    else
                        result = 0;
                }
                else
                {
                    int st1ZeroCount=str1[i].ToString().Trim().Length- str1[i].ToString().TrimStart(new char[]{'0'}).Length;
                    int st2ZeroCount = str2[i].ToString().Trim().Length - str2[i].ToString().TrimStart(new char[] { '0' }).Length;
                    if (st1ZeroCount > st2ZeroCount)
                        result = -1;
                    else if (st1ZeroCount < st2ZeroCount)
                        result = 1;
                    else
                    result = x1.CompareTo(x2);

                }
            }
            else
            {
                result = str1[i].CompareTo(str2[i]);
            }
            if (result == 0)
            {
                continue;
            }
            else
                break;

        }
        return result;
    }
}

Usage of this class:

List<string> marks = new List<string>();
marks.Add("M'00Z1");
marks.Add("M'0A27");
marks.Add("M'00Z0");
marks.Add("0000A27");
marks.Add("100Z0");

string[] Markings = marks.ToArray();

Array.Sort(Markings, new AlphanumComparatorFast());

It looks like it is doing a lexicographic ordering, regardless of whether the characters are lowercase or uppercase.

You can try using a custom expression in that lambda to do this.

There is no built-in natural way to do this in .NET, but have a look at this blog post on natural sorting.

You could put this into an extension method and use it instead of OrderBy.
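As a rough sketch of what such an extension method could look like, the code below pairs a small natural-order comparer with an OrderByNatural extension. It is not the code from the blog post; NaturalStringComparer and OrderByNatural are hypothetical names. The sketch assumes ASCII digits only, ignores leading zeros, and compares everything else ordinally without regard to case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class NaturalStringComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsDigit(x[i]) && IsDigit(y[j]))
            {
                // Find the end of each digit run.
                int iEnd = i; while (iEnd < x.Length && IsDigit(x[iEnd])) iEnd++;
                int jEnd = j; while (jEnd < y.Length && IsDigit(y[jEnd])) jEnd++;

                // Skip leading zeros but keep at least one digit.
                while (i < iEnd - 1 && x[i] == '0') i++;
                while (j < jEnd - 1 && y[j] == '0') j++;

                // More significant digits means a larger number; with equal
                // lengths, an ordinal comparison of the digits is numeric.
                int lenX = iEnd - i, lenY = jEnd - j;
                if (lenX != lenY) return lenX.CompareTo(lenY);
                int byDigits = string.CompareOrdinal(x, i, y, j, lenX);
                if (byDigits != 0) return byDigits;

                i = iEnd; j = jEnd;
            }
            else
            {
                int byChar = char.ToUpperInvariant(x[i])
                    .CompareTo(char.ToUpperInvariant(y[j]));
                if (byChar != 0) return byChar;
                i++; j++;
            }
        }
        // Whichever string still has characters left sorts last.
        return (x.Length - i).CompareTo(y.Length - j);
    }

    private static bool IsDigit(char c) => c >= '0' && c <= '9';
}

public static class NaturalOrderExtensions
{
    // Drop-in alternative to OrderBy for string keys.
    public static IOrderedEnumerable<T> OrderByNatural<T>(
        this IEnumerable<T> source, Func<T, string> keySelector) =>
        source.OrderBy(keySelector, new NaturalStringComparer());
}
```

With the array from the question, partNumbers.OrderByNatural(x => x) yields AB1, AB2, Ab11, ABC1, ABC2, ABC10, ABC10, ABC11.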

For those who like a generic approach, here is Dave Koelle's AlphanumComparator, adjusted slightly.

First step (I renamed the class to the unabbreviated form and gave it a TCompareType generic type parameter):

 public class AlphanumericComparator<TCompareType> : IComparer<TCompareType>

The next adjustment is to import the following namespace:

using System.Collections.Generic;

We change the signature of the Compare method from object to TCompareType:

    public int Compare(TCompareType x, TCompareType y)
    { .... no further modifications

Now we can specify the right type for the AlphanumericComparator when we use it. (I think it should actually be called AlphanumericComparer.)

Example usage from my code:

   if (result.SearchResults.Any()) {
            result.SearchResults = result.SearchResults.OrderBy(item => item.Code, new AlphanumericComparator<string>()).ToList();
        }

Now you have an alphanumeric comparator (comparer) that accepts a generic argument and can be used with different types.

Here is an extension method that uses the comparer:

/// <summary>
/// Returns the collection ordered by the key selector (property expression) using the alphanumeric comparer
/// </summary>
/// <typeparam name="T">The item type in the IEnumerable</typeparam>
/// <typeparam name="TKey">The type of the key selector (property to order by)</typeparam>
/// <param name="coll">The source IEnumerable</param>
/// <param name="keySelector">The key selector, use a member expression in lambda expression</param>
/// <returns>The source items ordered by the selected key</returns>
public static IEnumerable<T> OrderByMember<T, TKey>(this IEnumerable<T> coll, Func<T, TKey> keySelector)
{
    // Uses the generic AlphanumericComparator<TCompareType> class declared earlier in this answer.
    return coll.OrderBy(keySelector, new AlphanumericComparator<TKey>());
}

Since the number of characters at the start is variable, a regular expression helps:

var re = new  Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x => int.Parse(re.Match(x).Value));

If there is a fixed number of prefix characters, you can use the Substring method to extract everything from the relevant character onwards:

// parses the string as a number starting from the 5th character
var result = partNumbers.OrderBy(x => int.Parse(x.Substring(4)));

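If the numbers might contain a decimal separator or thousands separators, the regular expression needs to allow for those characters as well: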

var re = new Regex(@"[\d,]*\.?\d+$");
var result = partNumbers.OrderBy(x => double.Parse(re.Match(x).Value));

If the string returned by the regular expression or by Substring might not be parseable by int.Parse / double.Parse, use the relevant TryParse variant:

var re = new  Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x => {
    int? parsed = null;
    if (int.TryParse(re.Match(x).Value, out var temp)) {
        parsed = temp;
    }
    return parsed;
});
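One consequence of returning a nullable key is worth spelling out: the default comparer treats null as smaller than any number, so strings without a parseable trailing number sort first. The sketch below shows that default alongside a variation that pushes them to the end. The TrailingNumberSort class and its method names are assumptions for illustration, and the pattern is narrowed to ASCII digits.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class TrailingNumberSort
{
    private static readonly Regex TrailingDigits = new Regex(@"[0-9]+$");

    // Returns the trailing number, or null when there is none
    // or when it does not fit in an int.
    public static int? TrailingNumber(string s) =>
        int.TryParse(TrailingDigits.Match(s).Value, out var n) ? n : (int?)null;

    // Default behaviour: null keys compare as smaller, so they come first.
    public static string[] SortNullsFirst(string[] items) =>
        items.OrderBy(x => TrailingNumber(x)).ToArray();

    // Variation: strings without a usable number go to the end
    // (false sorts before true in the first key).
    public static string[] SortNullsLast(string[] items) =>
        items.OrderBy(x => TrailingNumber(x) == null)
             .ThenBy(x => TrailingNumber(x))
             .ToArray();
}
```

For the input "ABC10", "XYZ", "AB2", SortNullsFirst returns XYZ, AB2, ABC10 and SortNullsLast returns AB2, ABC10, XYZ.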

Just expanding on @Nathan's answer here.

var maxStringLength = partNumbers.Max(x => x.Length);
var result = partNumbers.OrderBy(x => PadNumbers(x, maxStringLength));

The width passed to the PadNumbers function is then dynamic.

public static string PadNumbers(string input, int maxStringLength)
{
    return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(maxStringLength, '0'));
}

It looks like the link to Dave Koelle's code is dead. I got the latest version from the WebArchive.

/*
 * The Alphanum Algorithm is an improved sorting algorithm for strings
 * containing numbers.  Instead of sorting numbers in ASCII order like
 * a standard sort, this algorithm sorts numbers in numeric order.
 *
 * The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
 *
 * Based on the Java implementation of Dave Koelle's Alphanum algorithm.
 * Contributed by Jonathan Ruckwood <jonathan.ruckwood@gmail.com>
 *
 * Adapted by Dominik Hurnaus <dominik.hurnaus@gmail.com> to
 *   - correctly sort words where one word starts with another word
 *   - have slightly better performance
 *
 * Released under the MIT License - https://opensource.org/licenses/MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining
 * a copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included
 * in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 * USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 */
using System;
using System.Collections;
using System.Text;

/*
 * Please compare against the latest Java version at http://www.DaveKoelle.com
 * to see the most recent modifications
 */
namespace AlphanumComparator
{
    public class AlphanumComparator : IComparer
    {
        private enum ChunkType {Alphanumeric, Numeric};
        private bool InChunk(char ch, char otherCh)
        {
            ChunkType type = ChunkType.Alphanumeric;

            if (char.IsDigit(otherCh))
            {
                type = ChunkType.Numeric;
            }

            if ((type == ChunkType.Alphanumeric && char.IsDigit(ch))
                || (type == ChunkType.Numeric && !char.IsDigit(ch)))
            {
                return false;
            }

            return true;
        }

        public int Compare(object x, object y)
        {
            String s1 = x as string;
            String s2 = y as string;
            if (s1 == null || s2 == null)
            {
                return 0;
            }

            int thisMarker = 0, thisNumericChunk = 0;
            int thatMarker = 0, thatNumericChunk = 0;

            while ((thisMarker < s1.Length) || (thatMarker < s2.Length))
            {
                if (thisMarker >= s1.Length)
                {
                    return -1;
                }
                else if (thatMarker >= s2.Length)
                {
                    return 1;
                }
                char thisCh = s1[thisMarker];
                char thatCh = s2[thatMarker];

                StringBuilder thisChunk = new StringBuilder();
                StringBuilder thatChunk = new StringBuilder();

                while ((thisMarker < s1.Length) && (thisChunk.Length==0 ||InChunk(thisCh, thisChunk[0])))
                {
                    thisChunk.Append(thisCh);
                    thisMarker++;

                    if (thisMarker < s1.Length)
                    {
                        thisCh = s1[thisMarker];
                    }
                }

                while ((thatMarker < s2.Length) && (thatChunk.Length==0 ||InChunk(thatCh, thatChunk[0])))
                {
                    thatChunk.Append(thatCh);
                    thatMarker++;

                    if (thatMarker < s2.Length)
                    {
                        thatCh = s2[thatMarker];
                    }
                }

                int result = 0;
                // If both chunks contain numeric characters, sort them numerically
                if (char.IsDigit(thisChunk[0]) && char.IsDigit(thatChunk[0]))
                {
                    thisNumericChunk = Convert.ToInt32(thisChunk.ToString());
                    thatNumericChunk = Convert.ToInt32(thatChunk.ToString());

                    if (thisNumericChunk < thatNumericChunk)
                    {
                        result = -1;
                    }

                    if (thisNumericChunk > thatNumericChunk)
                    {
                        result = 1;
                    }
                }
                else
                {
                    result = thisChunk.ToString().CompareTo(thatChunk.ToString());
                }

                if (result != 0)
                {
                    return result;
                }
            }

            return 0;
        }
    }
}
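The class above implements the non-generic System.Collections.IComparer, which the LINQ OrderBy overloads do not accept. One way to use it from LINQ without editing it is a thin adapter to IComparer<T>. The ComparerAdapter name is hypothetical, and this is only a sketch: value-type arguments are boxed on every comparison.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class ComparerAdapter<T> : IComparer<T>
{
    private readonly IComparer _inner;

    public ComparerAdapter(IComparer inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Forwards to the wrapped non-generic comparer.
    public int Compare(T x, T y) => _inner.Compare(x, y);
}
```

Assuming the AlphanumComparator class is in scope, usage would look like partNumbers.OrderBy(x => x, new ComparerAdapter<string>(new AlphanumComparator())).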

I don't know how to do this in LINQ, but maybe you will like this way:

Array.Sort(partNumbers, new AlphanumComparatorFast());

// Show the results

foreach (string h in partNumbers)
{
    Console.WriteLine(h);
}


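Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.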

 