SQL OrderBy clause for alphanumeric data in LINQ not sorting as expected
Alphanumeric sorting using LINQ
I have a string[] in which every element ends with a numeric value.
string[] partNumbers = new string[]
{
"ABC10", "ABC1","ABC2", "ABC11","ABC10", "AB1", "AB2", "Ab11"
};
I am trying to sort the array above with LINQ as follows, but I am not getting the expected result.
var result = partNumbers.OrderBy(x => x);
Actual result:
AB1
Ab11
AB2
ABC1
ABC10
ABC10
ABC11
ABC2
Expected result:
AB1
AB2
AB11
..
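That is because the default ordering for strings is standard alphanumeric dictionary (lexicographic) ordering, and ABC11 comes before ABC2 because the comparison always proceeds from left to right, one character at a time.
To get what you want, you need to pad the numeric portion in your order-by clause, for example: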
var result = partNumbers.OrderBy(x => PadNumbers(x));
where PadNumbers could be defined as:
public static string PadNumbers(string input)
{
return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(10, '0'));
}
This pads any number (or numbers) that appear in the input string with zeros, so that OrderBy sees:
ABC0000000010
ABC0000000001
...
AB0000000011
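The padding only happens on the key used for comparison. The original strings (without padding) are preserved in the result.
Note that this approach assumes a maximum number of digits for the numbers in the input (10 in the code above).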
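To make the padded-key idea concrete, here is a minimal self-contained sketch. The class name PaddedKeySort and the choice of StringComparer.OrdinalIgnoreCase are my own assumptions (the answer above uses the default comparer); the explicit comparer just makes the order independent of the current culture.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PaddedKeySort
{
    // Left-pads every run of digits to a fixed width.
    // The width of 10 is an assumed upper bound on the digit count.
    public static string PadNumbers(string input) =>
        Regex.Replace(input, "[0-9]+", m => m.Value.PadLeft(10, '0'));

    // Sorts by the padded key; the returned strings are the unpadded originals.
    // OrdinalIgnoreCase is an assumption made here for a culture-independent order.
    public static string[] Sort(string[] items) =>
        items.OrderBy(x => PadNumbers(x), StringComparer.OrdinalIgnoreCase).ToArray();
}
```

Calling PaddedKeySort.Sort(partNumbers) on the array from the question should give AB1, AB2, Ab11, ABC1, ABC2, ABC10, ABC10, ABC11.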
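A proper implementation of an alphanumeric sorting method that "just works" can be found on Dave Koelle's site. The C# version is here.
If you want to order a list of objects by a specific property using LINQ and a custom comparer such as Dave Koelle's, you can do something like this: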
...
items = items.OrderBy(x => x.property, new AlphanumComparator()).ToList();
...
You also have to change Dave's class to inherit from System.Collections.Generic.IComparer<object> instead of the basic IComparer, so the class signature becomes:
...
public class AlphanumComparator : System.Collections.Generic.IComparer<object>
{
...
Personally, I prefer the implementation by James McCormack because it implements IDisposable, although my benchmarks show that it is slightly slower.
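If you would rather not edit the downloaded class, one alternative is a small adapter that wraps any non-generic IComparer. This is only a sketch; ComparerAdapter is a hypothetical name of mine and is not part of Dave Koelle's code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical adapter: exposes a non-generic IComparer as IComparer<T>
// so it can be passed to LINQ's OrderBy without modifying the wrapped class.
public sealed class ComparerAdapter<T> : IComparer<T>
{
    private readonly IComparer _inner;

    public ComparerAdapter(IComparer inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Delegates to the wrapped comparer; the arguments are boxed if T is a value type.
    public int Compare(T x, T y) => _inner.Compare(x, y);
}
```

Usage would then look like items.OrderBy(x => x.property, new ComparerAdapter<string>(new AlphanumComparator())), assuming property is a string.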
You can use PInvoke to get fast and good results:
class AlphanumericComparer : IComparer<string>
{
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
static extern int StrCmpLogicalW(string s1, string s2);
public int Compare(string x, string y) => StrCmpLogicalW(x, y);
}
You can use it in the same way as AlphanumComparatorFast from the answer above.
You can PInvoke to StrCmpLogicalW (a Windows function) to do this. See here: Natural Sort Order in C#
public class AlphanumComparatorFast : IComparer
{
List<string> GetList(string s1)
{
List<string> SB1 = new List<string>();
string st1, st2, st3;
st1 = "";
bool flag = s1.Length > 0 && char.IsDigit(s1[0]); // guard: s1[0] would throw on an empty string
foreach (char c in s1)
{
if (flag != char.IsDigit(c) || c=='\'')
{
if(st1!="")
SB1.Add(st1);
st1 = "";
flag = char.IsDigit(c);
}
if (char.IsDigit(c))
{
st1 += c;
}
if (char.IsLetter(c))
{
st1 += c;
}
}
SB1.Add(st1);
return SB1;
}
public int Compare(object x, object y)
{
string s1 = x as string;
if (s1 == null)
{
return 0;
}
string s2 = y as string;
if (s2 == null)
{
return 0;
}
if (s1 == s2)
{
return 0;
}
// Split both strings into alternating text and digit chunks.
List<string> str1 = GetList(s1);
List<string> str2 = GetList(s2);
while (str1.Count != str2.Count)
{
if (str1.Count < str2.Count)
{
str1.Add("");
}
else
{
str2.Add("");
}
}
int x1 = 0; int res = 0; int x2 = 0; string y2 = "";
bool status = false;
string y1 = ""; bool s1Status = false; bool s2Status = false;
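// s1Status == false: the current chunk of the first string is text; true: it is an integer.
// s2Status == false: the current chunk of the second string is text; true: it is an integer.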
int result = 0;
for (int i = 0; i < str1.Count && i < str2.Count; i++)
{
// A chunk counts as a number only if it parses as an int.
// Testing the parsed value against 0 instead would misclassify "0" and "00" as text.
s1Status = int.TryParse(str1[i], out x1);
if (!s1Status)
{
y1 = str1[i];
}
s2Status = int.TryParse(str2[i], out x2);
if (!s2Status)
{
y2 = str2[i];
}
// Compare the two chunks according to their types.
if(!s2Status && !s1Status ) // both are strings
{
result = str1[i].CompareTo(str2[i]);
}
else if (s2Status && s1Status) // both are integers
{
if (x1 == x2)
{
if (str1[i].ToString().Length < str2[i].ToString().Length)
{
result = 1;
}
else if (str1[i].ToString().Length > str2[i].ToString().Length)
result = -1;
else
result = 0;
}
else
{
int st1ZeroCount=str1[i].ToString().Trim().Length- str1[i].ToString().TrimStart(new char[]{'0'}).Length;
int st2ZeroCount = str2[i].ToString().Trim().Length - str2[i].ToString().TrimStart(new char[] { '0' }).Length;
if (st1ZeroCount > st2ZeroCount)
result = -1;
else if (st1ZeroCount < st2ZeroCount)
result = 1;
else
result = x1.CompareTo(x2);
}
}
else
{
result = str1[i].CompareTo(str2[i]);
}
if (result == 0)
{
continue;
}
else
break;
}
return result;
}
}
Usage of this class:
List<string> marks = new List<string>();
marks.Add("M'00Z1");
marks.Add("M'0A27");
marks.Add("M'00Z0");
marks.Add("0000A27");
marks.Add("100Z0");
string[] Markings = marks.ToArray();
Array.Sort(Markings, new AlphanumComparatorFast());
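It looks like it is sorting lexicographically, regardless of whether the characters are lowercase or uppercase.
You could try using a custom expression in that lambda to do this.
There is no natural way to do this in .NET, but have a look at this blog post on natural sorting.
You could put this into an extension method and use it instead of OrderBy.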
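As a rough illustration of such an extension method, here is a sketch that pads every digit run to the longest digit run found in the sequence. The names NaturalOrderExtensions and OrderByAlphaNumeric are hypothetical, and OrdinalIgnoreCase is an assumed choice of comparer, not something the blog post prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NaturalOrderExtensions
{
    private static readonly Regex Digits = new Regex("[0-9]+");

    // Hypothetical extension: orders by a string key whose digit runs are
    // zero-padded to the width of the longest digit run in the sequence.
    public static IOrderedEnumerable<T> OrderByAlphaNumeric<T>(
        this IEnumerable<T> source, Func<T, string> selector)
    {
        List<T> items = source.ToList(); // enumerate the source only once

        // Width of the longest digit run, or 0 when no key contains digits.
        int width = items
            .SelectMany(i => Digits.Matches(selector(i)).Cast<Match>())
            .Select(m => m.Value.Length)
            .DefaultIfEmpty(0)
            .Max();

        return items.OrderBy(
            i => Digits.Replace(selector(i), m => m.Value.PadLeft(width, '0')),
            StringComparer.OrdinalIgnoreCase);
    }
}
```

With this in scope, partNumbers.OrderByAlphaNumeric(x => x) replaces partNumbers.OrderBy(x => x). Note that keys differing only in leading zeros ("A01" and "A1") compare as equal here and keep their original relative order.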
For those who like a generic approach, Dave Koelle's AlphanumComparator can be adjusted slightly.
Step one (I renamed the class to the non-abbreviated form and gave it a TCompareType generic type parameter):
public class AlphanumericComparator<TCompareType> : IComparer<TCompareType>
The next adjustment is to import the following namespace:
using System.Collections.Generic;
We change the signature of the Compare method from object to TCompareType:
public int Compare(TCompareType x, TCompareType y)
{ .... no further modifications
Now we can specify the right type for AlphanumericComparator when we use it. (I think it should actually be called AlphanumericComparer.)
Sample usage from my code:
if (result.SearchResults.Any()) {
result.SearchResults = result.SearchResults.OrderBy(item => item.Code, new AlphanumericComparator<string>()).ToList();
}
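Now you have an alphanumeric comparator (comparer) that accepts a generic type argument and can be used with different types.
Here is an extension method that uses the comparer: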
/// <summary>
/// Returns an ordered collection by key selector (property expression) using alpha numeric comparer
/// </summary>
/// <typeparam name="T">The item type in the ienumerable</typeparam>
/// <typeparam name="TKey">The type of the key selector (property to order by)</typeparam>
/// <param name="coll">The source ienumerable</param>
/// <param name="keySelector">The key selector, use a member expression in lambda expression</param>
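/// <returns>The source items ordered by the selected key in alphanumeric order</returns>
public static IEnumerable<T> OrderByMember<T, TKey>(this IEnumerable<T> coll, Func<T, TKey> keySelector)
{
// Uses the class name declared above (AlphanumericComparator<TCompareType>).
var result = coll.OrderBy(keySelector, new AlphanumericComparator<TKey>());
return result;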
}
Since the number of characters at the beginning is variable, a regular expression would help:
var re = new Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x => int.Parse(re.Match(x).Value));
If there is a fixed number of prefix characters, you can use the Substring method to extract the number starting from the relevant character:
// parses the string as a number starting from the 5th character
var result = partNumbers.OrderBy(x => int.Parse(x.Substring(4)));
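If the numbers might contain a decimal separator or thousands separators, the regular expression needs to allow for those characters as well:
var re = new Regex(@"[\d,]*\.?\d+$");
// Parse the text matched by the regex, not a fixed-offset substring.
var result = partNumbers.OrderBy(x => double.Parse(re.Match(x).Value));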
If the string returned by the regular expression or by Substring might not be parseable by int.Parse / double.Parse, use the relevant TryParse variant:
var re = new Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x => {
int? parsed = null;
if (int.TryParse(re.Match(x).Value, out var temp)) {
parsed = temp;
}
return parsed;
});
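The snippets above order by the trailing number alone, so the prefix plays no part in the result (AB11 would land after ABC2). A minimal sketch that orders by prefix first and trailing number second might look like the following; the TrailingNumberSort class, its Split helper and the case-insensitive prefix comparison are all assumptions of mine.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TrailingNumberSort
{
    private static readonly Regex TrailingDigits = new Regex(@"\d+$");

    // Splits "ABC10" into ("ABC", 10). The number is null when the string
    // has no trailing digits or the digits do not fit into an int.
    public static (string Prefix, int? Number) Split(string s)
    {
        Match m = TrailingDigits.Match(s);
        if (m.Success && int.TryParse(m.Value, out int n))
        {
            return (s.Substring(0, m.Index), n);
        }
        return (s, null);
    }

    // Orders by prefix (case-insensitive, an assumed choice), then by number.
    // Null numbers sort before any value under the default int? comparer.
    public static string[] Sort(string[] items) =>
        items.OrderBy(x => Split(x).Prefix, StringComparer.OrdinalIgnoreCase)
             .ThenBy(x => Split(x).Number)
             .ToArray();
}
```

For the array in the question, TrailingNumberSort.Sort(partNumbers) should return AB1, AB2, Ab11, ABC1, ABC2, ABC10, ABC10, ABC11.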
Just expanding on @Nathan's answer here.
var maxStringLength = partNumbers.Max(x => x.Length); // length of the longest string, an upper bound for any digit run
var result = partNumbers.OrderBy(x => PadNumbers(x, maxStringLength));
The padding width passed to the PadNumbers function is then dynamic.
public static string PadNumbers(string input, int maxStringLength)
{
return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(maxStringLength, '0'));
}
It looks like the link to Dave Koelle's code is dead. I got the latest version from WebArchive.
/*
* The Alphanum Algorithm is an improved sorting algorithm for strings
* containing numbers. Instead of sorting numbers in ASCII order like
* a standard sort, this algorithm sorts numbers in numeric order.
*
* The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
*
* Based on the Java implementation of Dave Koelle's Alphanum algorithm.
* Contributed by Jonathan Ruckwood <jonathan.ruckwood@gmail.com>
*
* Adapted by Dominik Hurnaus <dominik.hurnaus@gmail.com> to
* - correctly sort words where one word starts with another word
* - have slightly better performance
*
* Released under the MIT License - https://opensource.org/licenses/MIT
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
using System;
using System.Collections;
using System.Text;
/*
* Please compare against the latest Java version at http://www.DaveKoelle.com
* to see the most recent modifications
*/
namespace AlphanumComparator
{
public class AlphanumComparator : IComparer
{
private enum ChunkType {Alphanumeric, Numeric};
private bool InChunk(char ch, char otherCh)
{
ChunkType type = ChunkType.Alphanumeric;
if (char.IsDigit(otherCh))
{
type = ChunkType.Numeric;
}
if ((type == ChunkType.Alphanumeric && char.IsDigit(ch))
|| (type == ChunkType.Numeric && !char.IsDigit(ch)))
{
return false;
}
return true;
}
public int Compare(object x, object y)
{
String s1 = x as string;
String s2 = y as string;
if (s1 == null || s2 == null)
{
return 0;
}
int thisMarker = 0, thisNumericChunk = 0;
int thatMarker = 0, thatNumericChunk = 0;
while ((thisMarker < s1.Length) || (thatMarker < s2.Length))
{
if (thisMarker >= s1.Length)
{
return -1;
}
else if (thatMarker >= s2.Length)
{
return 1;
}
char thisCh = s1[thisMarker];
char thatCh = s2[thatMarker];
StringBuilder thisChunk = new StringBuilder();
StringBuilder thatChunk = new StringBuilder();
while ((thisMarker < s1.Length) && (thisChunk.Length==0 ||InChunk(thisCh, thisChunk[0])))
{
thisChunk.Append(thisCh);
thisMarker++;
if (thisMarker < s1.Length)
{
thisCh = s1[thisMarker];
}
}
while ((thatMarker < s2.Length) && (thatChunk.Length==0 ||InChunk(thatCh, thatChunk[0])))
{
thatChunk.Append(thatCh);
thatMarker++;
if (thatMarker < s2.Length)
{
thatCh = s2[thatMarker];
}
}
int result = 0;
// If both chunks contain numeric characters, sort them numerically
if (char.IsDigit(thisChunk[0]) && char.IsDigit(thatChunk[0]))
{
thisNumericChunk = Convert.ToInt32(thisChunk.ToString());
thatNumericChunk = Convert.ToInt32(thatChunk.ToString());
if (thisNumericChunk < thatNumericChunk)
{
result = -1;
}
if (thisNumericChunk > thatNumericChunk)
{
result = 1;
}
}
else
{
result = thisChunk.ToString().CompareTo(thatChunk.ToString());
}
if (result != 0)
{
return result;
}
}
return 0;
}
}
}
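One thing to be aware of in the listing above: each numeric chunk goes through Convert.ToInt32, which throws an OverflowException for digit runs outside the int range. The following is a compact sketch of my own (not part of Dave Koelle's code, and NaturalStringComparer is a hypothetical name) that compares digit runs by length and then digit by digit, so nothing is ever parsed. It assumes ASCII digits, ignores leading zeros and compares letters case-insensitively.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: natural ordering without parsing digit runs into int.
public sealed class NaturalStringComparer : IComparer<string>
{
    public int Compare(string a, string b)
    {
        if (ReferenceEquals(a, b)) return 0;
        if (a == null) return -1;
        if (b == null) return 1;

        int i = 0, j = 0;
        while (i < a.Length && j < b.Length)
        {
            if (IsAsciiDigit(a[i]) && IsAsciiDigit(b[j]))
            {
                // Skip leading zeros, then find the end of each digit run.
                while (i < a.Length && a[i] == '0') i++;
                while (j < b.Length && b[j] == '0') j++;
                int si = i, sj = j;
                while (i < a.Length && IsAsciiDigit(a[i])) i++;
                while (j < b.Length && IsAsciiDigit(b[j])) j++;

                // More significant digits means a larger number.
                int lenDiff = (i - si) - (j - sj);
                if (lenDiff != 0) return lenDiff;

                // Same length: the first differing digit decides.
                for (int k = 0; k < i - si; k++)
                {
                    int d = a[si + k] - b[sj + k];
                    if (d != 0) return d;
                }
            }
            else
            {
                int cmp = char.ToUpperInvariant(a[i]).CompareTo(char.ToUpperInvariant(b[j]));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // Whichever string still has characters left sorts last.
        return (a.Length - i) - (b.Length - j);
    }

    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';
}
```

It plugs into LINQ the usual way, for example partNumbers.OrderBy(x => x, new NaturalStringComparer()).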
I don't know how to do this in LINQ, but maybe you will like this way:
Array.Sort(partNumbers, new AlphanumComparatorFast());
// Show the results
foreach (string h in partNumbers )
{
Console.WriteLine(h);
}