
How to sort an ArrayList alphabetically by fields that contain non-English characters? Java

I have an ArrayList of custom objects that I want to sort alphabetically. The problem is that the field I want to sort by sometimes contains non-English characters, like á or é. I wanted to do this with Collections.sort(), but with this method the items that have non-English strings in their fields don't get sorted correctly.

This is what I've tried:

public List<Video> sortDatabase(List<Video> videos) {
    List<Video> sortedList = new ArrayList<>(videos);
    Collections.sort(sortedList, new Comparator<Video>() {
        @Override
        public int compare(Video video, Video t1) {
            // String.compareTo orders by UTF-16 code unit value, not by alphabet
            return video.getTitle().compareTo(t1.getTitle());
        }
    });
    return sortedList;
}

I am using Android Studio with minSdkVersion 21.

I've found a good answer here. This is what it looks like in my case:

public List<Video> sortDatabase(List<Video> videos) {
    List<Video> sortedList = new ArrayList<>(videos);
    sortedList.sort(new VideoComparator());
    return sortedList;
}

private static class VideoComparator implements Comparator<Video> {
    // Collator compares strings using the rules of the given locale
    private final Collator spCollator = Collator.getInstance(new Locale("hu", "HU"));
    @Override
    public int compare(Video e1, Video e2) {
        return spCollator.compare(e1.getTitle(), e2.getTitle());
    }
}

In my case, the Locale had to be constructed with ("hu", "HU"). You can check all supported languages here.
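If the linked list of languages is not at hand, the locales that the runtime's Collator knows about can also be queried in code. This is only a small sketch: the class name CollatorLocales and its helper methods are made up for illustration, and the result depends on the JDK or Android version it runs on.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorLocales {
    // Hypothetical helper: returns the locales for which this runtime
    // reports localized Collator instances.
    static Locale[] supportedLocales() {
        return Collator.getAvailableLocales();
    }

    // Hypothetical helper: checks whether a given locale is in that list.
    static boolean isSupported(Locale wanted) {
        for (Locale locale : supportedLocales()) {
            if (locale.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Locale hungarian = new Locale("hu", "HU");
        // The answer depends on the runtime, so no fixed output is assumed here.
        System.out.println("hu_HU supported: " + isSupported(hungarian));
        System.out.println("Locales available: " + supportedLocales().length);
    }
}
```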

This solution works fine; however, I am trying to use it in an Android application, where this method only works with API level 24 or higher. So if anybody has a different solution, I will make that the accepted answer.
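One possible workaround for lower API levels, as far as I can tell: java.text.Collator itself has been available on Android since API level 1, and it is the List.sort(Comparator) call that needs API level 24. Passing the same comparator to Collections.sort() should therefore work with minSdkVersion 21. The sketch below assumes that; the Video class here is a minimal stand-in for the one in the question, and the sample titles are invented.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {
    // Hypothetical stand-in for the Video class from the question.
    static class Video {
        private final String title;
        Video(String title) { this.title = title; }
        String getTitle() { return title; }
    }

    static List<Video> sortDatabase(List<Video> videos) {
        // Locale-aware comparison using Hungarian rules, as in the question.
        final Collator collator = Collator.getInstance(new Locale("hu", "HU"));
        List<Video> sortedList = new ArrayList<>(videos);
        // Collections.sort(List, Comparator) instead of List.sort(Comparator).
        Collections.sort(sortedList, new Comparator<Video>() {
            @Override
            public int compare(Video a, Video b) {
                return collator.compare(a.getTitle(), b.getTitle());
            }
        });
        return sortedList;
    }

    public static void main(String[] args) {
        // "\u00e1rv\u00edz" is "arviz" with acute accents on the a and the i;
        // "ban\u00e1n" is "banan" with an acute accent on the second a.
        List<Video> videos = Arrays.asList(
                new Video("zebra"),
                new Video("\u00e1rv\u00edz"),
                new Video("ban\u00e1n"));
        for (Video video : sortDatabase(videos)) {
            System.out.println(video.getTitle());
        }
    }
}
```

With the collator, the accented title is placed before "banán" and "zebra" instead of after them, and the input list is left unmodified because the sort runs on a copy.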

but with this method, the items with non-english (sic) strings in their fields don't get sorted correctly.

Are you sure that's the problem? Could it be that you do not understand what is involved in sorting? I am pretty sure the problem is the Unicode values of the characters. String.compareTo() compares strings character by character using those values, not alphabetical order. The basic English letters all have lower Unicode values than accented letters such as á or é, so a title starting with á ends up after one starting with z, which gives the impression that the sorting went wrong.
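To make that concrete, here is a small sketch comparing the two approaches on a single pair of strings. The class and method names are invented for the example, and the Hungarian locale is taken from the question.

```java
import java.text.Collator;
import java.util.Locale;

public class CompareDemo {
    // Sign of the plain String.compareTo result (UTF-16 code unit order).
    static int byCodeUnit(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    // Sign of a locale-aware comparison using a Hungarian Collator.
    static int byCollator(String a, String b) {
        Collator collator = Collator.getInstance(new Locale("hu", "HU"));
        return Integer.signum(collator.compare(a, b));
    }

    public static void main(String[] args) {
        String aAcute = "\u00e1"; // a with acute accent, U+00E1
        String z = "z";           // U+007A

        System.out.println((int) z.charAt(0));      // 122
        System.out.println((int) aAcute.charAt(0)); // 225

        // compareTo sees 225 > 122, so the accented letter sorts after z.
        System.out.println(byCodeUnit(aAcute, z));  // 1
        // The collator treats it as a variant of a, so it sorts before z.
        System.out.println(byCollator(aAcute, z));  // -1
    }
}
```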

Maybe you should take a look at the Unicode chart to fully understand what I am saying.
