
How do I sort JavaScript strings by codePoint?

I am looking to sort objects by a string field that contains Unicode characters. However, I want to sort the strings by code point, not by locale. Here is an example where JavaScript sorts the objects as if Ⓑ and b were the same character.

Incorrect sort order:

> [{name: 'a'}, {name: 'b'}, {name: 'd'}, {name: '\u24B7'}].sort((a,b)=> a.name.localeCompare(b.name))
[ { name: 'a' }, { name: 'b' }, { name: 'Ⓑ' }, { name: 'd' } ]

However, this is not what I want. I want the following sort order, where they are treated as different characters. This is the default behavior when sorting strings without a comparator function.

Correct sort order (notice that b and Ⓑ are no longer treated as the same character for sorting):

> ['a','b','\u24B7','d'].sort()
[ 'a', 'b', 'd', 'Ⓑ' ]
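
The order above follows from the numeric values of the characters: when no comparator is given, `Array.prototype.sort` converts the elements to strings and compares them by UTF-16 code unit value. A quick check of those values, as an illustrative sketch (the variable names are only for this example):

```javascript
// Code points of the characters from the example above.
const chars = ['a', 'b', '\u24B7', 'd'];
const codePoints = chars.map(c => c.codePointAt(0));
console.log(codePoints); // [ 97, 98, 9399, 100 ]

// The default sort (no comparator) orders by these values,
// so 'Ⓑ' (U+24B7) lands after 'd' (U+0064).
const sorted = [...chars].sort();
console.log(sorted); // [ 'a', 'b', 'd', 'Ⓑ' ]
```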

In the real application, the strings will be longer than one character and may contain multiple Unicode characters, and we want them sorted by Unicode number (i.e., code point).

My question: is there a simple way to sort strings by code point? I'd rather not re-implement a custom comparator for this.

I usually do it like this:

let cmp = (a, b) => a > b ? 1 : a < b ? -1 : 0;

objects.sort((a, b) => cmp(a.name, b.name));

or rather

let sortBy = (a, f) => a.sort((x, y) => cmp(f(x), f(y)));

sortBy(objects, x => x.name);
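
One caveat worth noting: the `<` and `>` operators compare strings by UTF-16 code unit, which matches code point order only as long as no character outside the Basic Multilingual Plane is involved. If strict code point order matters, a comparator along these lines might work. This is only a sketch; `cmpCodePoints` is a hypothetical helper, not part of the answer above or of any library.

```javascript
// Hypothetical helper: compares two strings by Unicode code point
// rather than by UTF-16 code unit.
const cmpCodePoints = (a, b) => {
  // Array.from iterates a string by code point, not by code unit.
  const xs = Array.from(a, ch => ch.codePointAt(0));
  const ys = Array.from(b, ch => ch.codePointAt(0));
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    if (xs[i] !== ys[i]) return xs[i] - ys[i];
  }
  // Common prefix: the shorter string sorts first.
  return xs.length - ys.length;
};

// The code unit comparator from the answer, repeated so the sketch is self-contained.
const cmp = (a, b) => a > b ? 1 : a < b ? -1 : 0;

// For the question's data both comparators agree.
const names = ['a', 'b', '\u24B7', 'd'];
const byUnits = [...names].sort(cmp);
const byPoints = [...names].sort(cmpCodePoints);

// They differ once a character above U+FFFF is involved:
// U+1F600 is stored as the surrogate pair D83D DE00, and 0xD83D < 0xFF5E.
const mixed = ['\u{1F600}', '\uFF5E'];
const mixedByUnits = [...mixed].sort(cmp);            // U+1F600 first
const mixedByPoints = [...mixed].sort(cmpCodePoints); // U+FF5E first
```

For data that stays within the Basic Multilingual Plane, the shorter `cmp` above gives the same result and is likely the simpler choice.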
