
Compare and sort strings in Java

I have an array of strings: 15MB, 12MB, 1TB, 1GB. I want to compare them lexicographically, with the added rule that MB is smaller than GB and GB is smaller than TB. So at the end I want to get: 12MB, 15MB, 1GB, 1TB. I found a way to compare the letters:

final static String ORDER = "MGT";

public int compare(String o1, String o2) {
    int pos1 = 0;
    int pos2 = 0;
    for (int i = 0; i < Math.min(o1.length(), o2.length()) && pos1 == pos2; i++) {
        pos1 = ORDER.indexOf(o1.charAt(i));
        pos2 = ORDER.indexOf(o2.charAt(i));
    }

    if (pos1 == pos2 && o1.length() != o2.length()) {
        return o1.length() - o2.length();
    }

    return pos1 - pos2;
}

I'm thinking of splitting each string into its number and its letters, but then how can I sort first by the letters ("MB", ...) and then by the numbers? Do I use two comparators or something else?

It is much easier to compare the values if you first convert them to a common unit (e.g. MB). If two values are the same after this conversion, fall back to comparing their unit letters. It may look like this:

// Converts a size such as "15MB", "1GB" or "1TB" to megabytes.
// Returns long: with int, 1024 * 1024 * n overflows once n reaches 2048 (TB).
private long convertToMegaBytes(String s) {

    char c = s.charAt(s.length() - 2); // unit letter: 'M', 'G' or 'T'
    long n = Long.parseLong(s.substring(0, s.length() - 2));

    if (c == 'G')
        return 1024L * n;
    if (c == 'T')
        return 1024L * 1024L * n;

    return n;

}

final static String ORDER = "MGT";

public int compare(String o1, String o2) {
    // Long.compare avoids the overflow that subtracting the two values can cause
    int v = Long.compare(convertToMegaBytes(o1), convertToMegaBytes(o2));
    // if values are equal then compare the unit letters
    return v == 0 ? ORDER.indexOf(o1.charAt(o1.length() - 2)) - ORDER.indexOf(o2.charAt(o2.length() - 2)) : v;
}
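
For anyone who wants to run the common-unit idea end to end, here is a minimal sketch applied to the array from the question. The class name CommonUnitSort and the helpers toMegaBytes and sortBySize are made up for this example, and it assumes every input ends in MB, GB or TB with a plain integer in front. It reuses the "MGT" order string as a lookup for how many times to multiply by 1024.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not part of the answer above.
class CommonUnitSort {
    // Unit letters from the question, smallest first.
    static final String ORDER = "MGT";

    // Returns the size in megabytes; assumes input like "15MB", "1GB" or "1TB".
    static long toMegaBytes(String s) {
        long number = Long.parseLong(s.substring(0, s.length() - 2));
        int unitIndex = ORDER.indexOf(s.charAt(s.length() - 2));
        if (unitIndex < 0) {
            throw new IllegalArgumentException("Unsupported size: " + s);
        }
        // Each step along ORDER multiplies by 1024, i.e. shifts left by 10 bits.
        return number << (10 * unitIndex);
    }

    // Returns a sorted copy and leaves the input array untouched.
    static String[] sortBySize(String[] sizes) {
        String[] sorted = sizes.clone();
        Arrays.sort(sorted, Comparator.comparingLong(CommonUnitSort::toMegaBytes));
        return sorted;
    }

    public static void main(String[] args) {
        String[] sizes = {"15MB", "12MB", "1TB", "1GB"};
        System.out.println(Arrays.toString(sortBySize(sizes)));
        // prints [12MB, 15MB, 1GB, 1TB]
    }
}
```

Because the sort is stable, strings that convert to the same number of megabytes (say 1024MB and 1GB) keep their original relative order here; add a tie-breaker on the unit letter if you need a fixed order for those.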

This might do the trick. The compare method gets the number of bytes that each String represents as a long (10KB becomes 10240) and then compares those. The getSizeOfString method turns a String into a long holding the number of bytes it represents.

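import java.util.regex.Matcher;
import java.util.regex.Pattern;

  // Compiled once; T is included because the question's input contains "1TB"
  private static final Pattern VALID_SIZE_PATTERN = Pattern.compile("(\\d+)([KMGT])B");

  public int compare(String o1, String o2) {
    long size1 = getSizeOfString(o1);
    long size2 = getSizeOfString(o2);
    return Long.compare(size1, size2);
  }

  private long getSizeOfString(String sizeString) {
    Matcher matcher = VALID_SIZE_PATTERN.matcher(sizeString);
    // Without this check, group(1) throws IllegalStateException on bad input
    if (!matcher.find()) {
      throw new IllegalArgumentException("Invalid size: " + sizeString);
    }
    long size = Long.parseLong(matcher.group(1));

    // The L suffix forces long arithmetic; as an int, the factor for "T" would overflow
    switch (matcher.group(2)) {
      case "K":
        size *= 1024L;
        break;
      case "M":
        size *= (1024L * 1024);
        break;
      case "G":
        size *= (1024L * 1024 * 1024);
        break;
      case "T":
        size *= (1024L * 1024 * 1024 * 1024);
        break;
    }
    return size;
  }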
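
If you want the parsing to fail loudly instead of silently producing a wrong number, something along these lines may help. It is only a sketch: the class SizeParser and the methods toBytes and sortBySize are hypothetical names, and accepting a T prefix and a bare "B" suffix is my own extension for illustration. It requires the whole string to match and uses Math.multiplyExact so that an overflowing value throws rather than wrapping around.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the answer above.
class SizeParser {
    // Digits, an optional K/M/G/T prefix, then "B".
    private static final Pattern SIZE = Pattern.compile("(\\d+)([KMGT]?)B");
    private static final String PREFIXES = "KMGT";

    static long toBytes(String text) {
        Matcher m = SIZE.matcher(text);
        if (!m.matches()) { // the whole string must match
            throw new IllegalArgumentException("Not a size string: " + text);
        }
        long size = Long.parseLong(m.group(1));
        // No prefix means plain bytes; K is one factor of 1024, M two, and so on.
        int steps = m.group(2).isEmpty() ? 0 : PREFIXES.indexOf(m.group(2)) + 1;
        for (int i = 0; i < steps; i++) {
            size = Math.multiplyExact(size, 1024L); // ArithmeticException on overflow
        }
        return size;
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortBySize(List<String> sizes) {
        List<String> sorted = new ArrayList<>(sizes);
        sorted.sort(Comparator.comparingLong(SizeParser::toBytes));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortBySize(Arrays.asList("10KB", "1TB", "512B", "1GB", "2MB")));
        // prints [512B, 10KB, 2MB, 1GB, 1TB]
    }
}
```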

This now sorts first on units and then on values within units. It was changed to reflect the last comment by the OP.

import java.util.*;

// val is only a rank used for ordering the units, not a byte multiplier
enum Memory {
   B(1), KB(2), MB(3), GB(4), TB(5);
   public final long val;

   private Memory(long val) {
      this.val = val;
   }
}

public class MemorySort {
   public static void main(String[] args) {
      List<String> memory = Arrays.asList("122003B",
            "1TB",
            "2KB",
            "100000MB",
            "1027MB",
            "2024GB");

      // Strip the digits and rank what is left ("MB" -> Memory.MB -> 3)
      Comparator<String> units = Comparator.comparing(
            a -> Memory.valueOf(a.replaceAll("\\d+", "")).val);

      // Strip the letters and compare the number; long avoids int overflow
      Comparator<String> values = Comparator.comparing(
            a -> Long.parseLong(a.replaceAll("[A-Z]+", "")));

      Collections.sort(memory, units.thenComparing(values));
      System.out.println(memory);
      // prints [122003B, 2KB, 1027MB, 100000MB, 2024GB, 1TB]
   }
}
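
Note that sorting on units first is not the same as sorting by actual size. The sketch below runs both orderings on the same list so you can see where they differ. OrderingComparison, UNIT_FIRST, BY_SIZE and the helper methods are names I made up for the comparison, and the byte conversion assumes 1024-based units.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the answer above.
class OrderingComparison {
    // Unit names in ascending order; the list index doubles as the rank.
    static final List<String> UNITS = Arrays.asList("B", "KB", "MB", "GB", "TB");

    static int unitRank(String s) {
        int rank = UNITS.indexOf(s.replaceAll("\\d+", ""));
        if (rank < 0) {
            throw new IllegalArgumentException("Unknown unit in: " + s);
        }
        return rank;
    }

    static long number(String s) {
        return Long.parseLong(s.replaceAll("[A-Z]+", ""));
    }

    // Size in bytes, assuming each unit is 1024 times the previous one.
    static long bytes(String s) {
        return number(s) << (10 * unitRank(s));
    }

    // Units first, then the number within the same unit.
    static final Comparator<String> UNIT_FIRST =
            Comparator.comparingInt(OrderingComparison::unitRank)
                      .thenComparingLong(OrderingComparison::number);

    // Actual size, regardless of how it is written.
    static final Comparator<String> BY_SIZE =
            Comparator.comparingLong(OrderingComparison::bytes);

    static List<String> sorted(List<String> in, Comparator<String> order) {
        List<String> out = new ArrayList<>(in);
        out.sort(order);
        return out;
    }

    public static void main(String[] args) {
        List<String> memory = Arrays.asList(
                "122003B", "1TB", "2KB", "100000MB", "1027MB", "2024GB");
        System.out.println(sorted(memory, UNIT_FIRST));
        // prints [122003B, 2KB, 1027MB, 100000MB, 2024GB, 1TB]
        System.out.println(sorted(memory, BY_SIZE));
        // prints [2KB, 122003B, 1027MB, 100000MB, 1TB, 2024GB]
    }
}
```

With units first, 122003B lands before 2KB and 2024GB before 1TB even though they are the larger values, so pick the comparator that matches what you actually want to order by.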

