
Compare and sort strings Java

I have an array of strings: 15MB, 12MB, 1TB, 1GB. I want to sort them by the number while following the rule that MB is smaller than GB, which is smaller than TB. So at the end I want to get: 12MB, 15MB, 1GB, 1TB. I found a way to compare the letters:

final static String ORDER = "MGT";

public int compare(String o1, String o2) {
    int pos1 = 0;
    int pos2 = 0;
    for (int i = 0; i < Math.min(o1.length(), o2.length()) && pos1 == pos2; i++) {
        pos1 = ORDER.indexOf(o1.charAt(i));
        pos2 = ORDER.indexOf(o2.charAt(i));
    }

    if (pos1 == pos2 && o1.length() != o2.length()) {
        return o1.length() - o2.length();
    }

    return pos1 - pos2;
}

I'm thinking of splitting each string into its number and its unit letters, but then how can I sort first by the unit ("MB", ...) and then by the number? Do I use two comparators or something else?

It is much easier to compare the values if you first convert them to a common unit (e.g. MB). If two values are equal after the conversion, fall back to comparing their unit letters. It may look like this:

final static String ORDER = "MGT";

// Converts a size such as "12MB", "1GB" or "1TB" to megabytes.
private long convertToMegaBytes(String s) {
    // the unit letter is the second-to-last character; the number is everything before it
    char c = s.charAt(s.length() - 2);
    long value = Long.parseLong(s.substring(0, s.length() - 2));

    if (c == 'G')
        return 1024L * value;
    if (c == 'T')
        return 1024L * 1024L * value;

    return value;
}

public int compare(String o1, String o2) {
    // Long.compare avoids the overflow that subtracting the two values could cause
    int v = Long.compare(convertToMegaBytes(o1), convertToMegaBytes(o2));
    // if values are equal then compare by unit letter (M < G < T)
    return v == 0 ? ORDER.indexOf(o1.charAt(o1.length() - 2)) - ORDER.indexOf(o2.charAt(o2.length() - 2)) : v;
}
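
To show how a comparator like this could be plugged into a sort, here is a minimal self-contained sketch. The class name CommonUnitSortSketch and the helpers toMegabytes and unitRank are made up for this illustration, and it assumes every input ends in MB, GB or TB with a whole number in front. The extra "1024MB" entry is only there to exercise the tie-break against "1GB".

```java
import java.util.Arrays;
import java.util.Comparator;

final class CommonUnitSortSketch {
    // Unit letters from smallest to largest; the index is also the power of 1024.
    private static final String ORDER = "MGT";

    // Hypothetical helper: position of the unit letter in ORDER (M=0, G=1, T=2).
    static int unitRank(String s) {
        int rank = ORDER.indexOf(s.charAt(s.length() - 2));
        if (rank < 0) {
            throw new IllegalArgumentException("Unsupported unit: " + s);
        }
        return rank;
    }

    // Hypothetical helper: "12MB" -> 12, "1GB" -> 1024, "1TB" -> 1048576.
    static long toMegabytes(String s) {
        long number = Long.parseLong(s.substring(0, s.length() - 2));
        // 1L << (10 * rank) is 1024^rank; multiplyExact throws instead of overflowing.
        return Math.multiplyExact(number, 1L << (10 * unitRank(s)));
    }

    // Compare by size in MB first, then by unit letter when the sizes are equal.
    static final Comparator<String> BY_SIZE =
            Comparator.comparingLong(CommonUnitSortSketch::toMegabytes)
                      .thenComparingInt(CommonUnitSortSketch::unitRank);

    public static void main(String[] args) {
        String[] sizes = {"15MB", "12MB", "1TB", "1GB", "1024MB"};
        Arrays.sort(sizes, BY_SIZE);
        System.out.println(Arrays.toString(sizes)); // [12MB, 15MB, 1024MB, 1GB, 1TB]
    }
}
```

Building the comparator with Comparator.comparingLong and thenComparingInt keeps the "size first, unit second" rule in one place and avoids hand-written subtraction.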

This might do the trick. The compare method gets the number of bytes that each String represents as a long (10KB becomes 10240, since the multipliers are powers of 1024) and then compares those. The getSizeOfString method parses a String with a regular expression and turns it into the number of bytes it represents.

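  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  // compiled once; T is included so that sizes such as "1TB" are accepted too
  private static final Pattern VALID_SIZE_PATTERN = Pattern.compile("(\\d+)([KMGT])B");

  public int compare(String o1, String o2) {
    long size1 = getSizeOfString(o1);
    long size2 = getSizeOfString(o2);
    return Long.compare(size1, size2);
  }

  private long getSizeOfString(String sizeString) {
    Matcher matcher = VALID_SIZE_PATTERN.matcher(sizeString);
    // without this check, group() would throw IllegalStateException on bad input
    if (!matcher.find()) {
      throw new IllegalArgumentException("Not a valid size: " + sizeString);
    }
    long size = Long.parseLong(matcher.group(1));

    switch (matcher.group(2)) {
      case "K":
        size *= 1024L;
        break;
      case "M":
        size *= 1024L * 1024;
        break;
      case "G":
        size *= 1024L * 1024 * 1024;
        break;
      case "T":
        // long arithmetic is required here: 1024^4 does not fit in an int
        size *= 1024L * 1024 * 1024 * 1024;
        break;
    }
    return size;
  }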
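
As a rough, self-contained variant of the regex idea, the sketch below validates the whole string with matches() and sorts a copy of a list. The names ByteSizeSortSketch, toBytes and sortedBySize are invented for this example, and accepting a bare "B" suffix as well as KB/MB/GB/TB is an assumption, not something the question requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ByteSizeSortSketch {
    // Assumed input shape: digits, an optional K/M/G/T prefix, then "B".
    private static final Pattern SIZE = Pattern.compile("(\\d+)([KMGT]?)B");
    private static final String PREFIXES = "KMGT";

    // Hypothetical helper: "10KB" -> 10240, "1MB" -> 1048576, "7B" -> 7.
    static long toBytes(String text) {
        Matcher m = SIZE.matcher(text);
        // matches() requires the entire string to fit the pattern.
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a size: " + text);
        }
        long number = Long.parseLong(m.group(1));
        // "" -> 1024^0, "K" -> 1024^1, ..., "T" -> 1024^4
        int power = m.group(2).isEmpty() ? 0 : PREFIXES.indexOf(m.group(2)) + 1;
        return Math.multiplyExact(number, 1L << (10 * power));
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortedBySize(List<String> sizes) {
        List<String> copy = new ArrayList<>(sizes);
        copy.sort(Comparator.comparingLong(ByteSizeSortSketch::toBytes));
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("15MB", "12MB", "1TB", "1GB");
        System.out.println(sortedBySize(input)); // [12MB, 15MB, 1GB, 1TB]
    }
}
```

With matches(), malformed input such as "12XB" is rejected with an IllegalArgumentException instead of being partially parsed.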

This now sorts first on units and then on values within units. This was changed to reflect the last comment by the OP.

import java.util.*;

enum Memory {
   B(1), KB(2), MB(3), GB(4), TB(5);
   public long val;

   private Memory(long val) {
      this.val = val;
   }
}

public class MemorySort {
   public static void main(String[] args) {
      List<String> memory = Arrays.asList("122003B",
            "1TB",
            "2KB",
            "100000MB",
            "1027MB",
            "2024GB");

      Comparator<String> units = Comparator.comparing(
            a -> Memory.valueOf(a.replaceAll("\\d+", "")).val);

      Comparator<String> values = Comparator.comparing(
            a -> Integer.parseInt(a.replaceAll("[A-Z]+", "")));

      Collections.sort(memory, units.thenComparing(values));
      System.out.println(memory);
   }
}
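
Sorting on units first is not the same as sorting by actual size, which matters when a smaller unit carries a large number. The sketch below puts the two orderings side by side. The class UnitThenValueSketch and its helpers are hypothetical names for this illustration, and it assumes binary units (1KB = 1024B).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class UnitThenValueSketch {
    // Units from smallest to largest; the list index is the rank.
    private static final List<String> UNITS = Arrays.asList("B", "KB", "MB", "GB", "TB");

    // Strip the digits and look up the remaining unit text.
    static int unitRank(String s) {
        int rank = UNITS.indexOf(s.replaceAll("\\d+", ""));
        if (rank < 0) {
            throw new IllegalArgumentException("Unknown unit in: " + s);
        }
        return rank;
    }

    // Strip the unit letters and parse what is left.
    static long number(String s) {
        return Long.parseLong(s.replaceAll("[A-Z]+", ""));
    }

    // Actual size in bytes, assuming each unit step is a factor of 1024.
    static long bytes(String s) {
        return Math.multiplyExact(number(s), 1L << (10 * unitRank(s)));
    }

    // Ordering 1: unit first, then the number within the same unit.
    static final Comparator<String> UNIT_THEN_VALUE =
            Comparator.comparingInt(UnitThenValueSketch::unitRank)
                      .thenComparingLong(UnitThenValueSketch::number);

    // Ordering 2: the real size, regardless of how it is written.
    static final Comparator<String> BY_SIZE =
            Comparator.comparingLong(UnitThenValueSketch::bytes);

    static List<String> sorted(List<String> in, Comparator<String> order) {
        List<String> copy = new ArrayList<>(in);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> memory = Arrays.asList("1TB", "2024GB", "1027MB", "100000MB");
        System.out.println(sorted(memory, UNIT_THEN_VALUE)); // [1027MB, 100000MB, 2024GB, 1TB]
        System.out.println(sorted(memory, BY_SIZE));         // [1027MB, 100000MB, 1TB, 2024GB]
    }
}
```

The two results differ only in where 2024GB lands: it is larger than 1TB in bytes, but unit-first ordering places every GB value before any TB value.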

