What determines sort order in Collections.sort where the List contains non-alphanumeric characters?

I have code that sorts an ArrayList of elements based on one attribute called 'title', which is of type String. The code uses a Collator like this:

Collator collator = Collator.getInstance();

I have two objects: one has the title "@a" and the other has the title "#a".

I pass these objects as a List and call

Collections.sort(list,comparator)

This gives the order as

"@a" "#a"

Why is "#a" appearing last even though its ASCII value is less than that of "@a"?

Based on one of your comments, you're using a collator to sort your titles. Why you didn't say that in your question is beyond me.

Anyway, the collator sorts Strings according to locale preferences. It doesn't sort in lexicographic order. And the collator you're using considers that the right order is the one you observe. If you want lexicographic order, you should not use a collator.

Also note that a collator is always associated with a locale. The javadoc of the Collator.getInstance() method says:

Gets the Collator for the current default locale.
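To make the difference concrete, here is a minimal sketch that sorts the same two titles twice: once in String's natural order (which compares UTF-16 code units) and once with a Collator. The class name CollatorVsNaturalOrder and its two helper methods are made up for illustration. The locale is passed explicitly so the result doesn't silently depend on the machine's default locale. The collated order depends on the locale and on the JDK's collation rules, so no output is shown for it.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorVsNaturalOrder {

    // Sorts a copy of the input using String.compareTo (code unit order).
    public static List<String> sortNaturally(List<String> titles) {
        List<String> copy = new ArrayList<>(titles);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy of the input using the collation rules of the given locale.
    public static List<String> sortWithCollator(List<String> titles, Locale locale) {
        List<String> copy = new ArrayList<>(titles);
        Collator collator = Collator.getInstance(locale);
        Collections.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("@a", "#a");

        // '#' (0x23) comes before '@' (0x40) in code unit order.
        System.out.println("Natural:  " + sortNaturally(titles)); // Natural:  [#a, @a]

        // Locale-dependent; compare this line with the one above on your own JDK.
        System.out.println("Collated: " + sortWithCollator(titles, Locale.getDefault()));
    }
}
```

If the two printed lines differ, the collator is what changes the order, not Collections.sort itself.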

No, the output is "#a", "@a", which is absolutely right.

Why is # appearing last even though its ASCII value is less than @ ?

My clean-room implementation:

final List<String> list = Arrays.asList("@a", "#a");
Collections.sort(list);
System.out.println(list);

Output:

[#a, @a]

This code doesn't reproduce your problem.

For reference:
'#' is 0x23
'@' is 0x40

Everything looks normal.
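These values can be checked directly. The following small sketch (the class name CodePointCheck and the hex helper are arbitrary) prints each character's code and confirms that String.compareTo, which compares UTF-16 code units, puts "#a" first:

```java
public class CodePointCheck {

    // Formats a character's numeric code as hex, e.g. 0x23.
    public static String hex(char c) {
        return "0x" + Integer.toHexString(c);
    }

    public static void main(String[] args) {
        System.out.println("'#' is " + hex('#')); // '#' is 0x23
        System.out.println("'@' is " + hex('@')); // '@' is 0x40

        // A negative compareTo result means "#a" sorts before "@a" in natural order.
        System.out.println("#a".compareTo("@a") < 0); // true
    }
}
```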


EDIT: new code following your comment "The code uses Collator but its used as Collator collator = Collator.getInstance(); not specific to any locale.":

final List<String> list = Arrays.asList("@a", "#a");
final Collator c = Collator.getInstance();

Collections.sort(list, c);
System.out.println(list);

Output:

[@a, #a]

This reproduces your problem.

If I use Collator.getInstance() to sort the ASCII table, this is the output I get:

-, _, ,, ;, :, !, ?, /, ., `, ^, ', ", (, ), [, ], {, }, @, $, *, \, &, #, %, +, <, =, >, |, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, A, b, B, c, C, d, D, e, E, f, F, g, G, h, H, i, I, j, J, k, K, l, L, m, M, n, N, o, O, p, P, q, Q, r, R, s, S, t, T, u, U, v, V, w, W, x, X, y, Y, z, Z

You can see this is quite different from the ASCII collating order:

!, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, [, \, ], ^, _, `, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, {, |, }

For the OP's interest, this is the code used to create this output:

final List<String> list = new ArrayList<String>();
final Collator col = Collator.getInstance();

for (char c = '!'; c < '~'; c++)
{
  list.add(c+"");
}

Collections.sort(list, col);
System.out.println(list);

You normally provide a Comparator when you want the order to be different from the natural order.

List<String> words = new ArrayList<>();
words.add("#a");
words.add("@a");
Collections.sort(words);
System.out.println("Natural order: " + words);

Collections.sort(words, Collections.reverseOrder());
System.out.println("Reverse natural order: " + words);

prints

Natural order: [#a, @a]
Reverse natural order: [@a, #a]

So if the order is the reverse of ASCII, it's because that is what your Comparator defines the order to be (whether or not you are aware that you did this).
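Applied to the question's setup, the comparator is where that choice gets made. The sketch below assumes a hypothetical Item class with a title attribute (the question doesn't show the real element type) and builds two comparators: one that uses String's natural order and one that delegates to a Collator. The names TitleSortSketch, byTitleNatural and byTitleCollated are invented for this example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class TitleSortSketch {

    // Hypothetical stand-in for the question's element type.
    public static final class Item {
        private final String title;
        public Item(String title) { this.title = title; }
        public String getTitle() { return title; }
        @Override public String toString() { return title; }
    }

    // Orders items by title using String.compareTo (code unit order).
    public static Comparator<Item> byTitleNatural() {
        return Comparator.comparing(Item::getTitle);
    }

    // Orders items by title using the given locale's collation rules.
    public static Comparator<Item> byTitleCollated(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        return Comparator.comparing(Item::getTitle, collator);
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>(Arrays.asList(new Item("@a"), new Item("#a")));

        items.sort(byTitleNatural());
        System.out.println("Natural:  " + items); // Natural:  [#a, @a]

        // The result of this sort depends on the locale's collation rules.
        items.sort(byTitleCollated(Locale.getDefault()));
        System.out.println("Collated: " + items);
    }
}
```

Switching between the two comparators is the only change needed to move between code unit order and locale-aware order.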
