
Sorting an array of strings that contain numbers

I'm implementing some code for college and I have to sort the objects of two classes by name. I started with Java's compareTo for Strings, but it doesn't order them the way I need. For example, given the two names TEST-6 and TEST-10, the result puts TEST-10 ahead of TEST-6.
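To make the problem concrete, here is a minimal sketch of what String's natural order does with these two names. The class name LexicographicDemo and the helper sortNaturally are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LexicographicDemo {
    // Sorts a copy of the input using String's natural (lexicographic) order.
    static List<String> sortNaturally(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Both names share the prefix "TEST-". At index 5 the characters are
        // '1' and '6'; '1' is smaller, so the comparison is decided there and
        // the trailing '0' of "TEST-10" is never looked at.
        System.out.println("TEST-10".compareTo("TEST-6") < 0); // true
        System.out.println(sortNaturally(Arrays.asList("TEST-6", "TEST-10"))); // [TEST-10, TEST-6]
    }
}
```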

I've searched and got this solution:

private int compare(String o1, String o2) {
    return extractInt(o1) - extractInt(o2);
}
private int extractInt(String s) {
    String num = s.replaceAll("\\D", "");
    // return 0 if no digits found
    return num.isEmpty() ? 0 : Integer.parseInt(num);
}

But my strings can take any form. When I tried it with TEST-6 and TEST10, the result was TEST-6 ahead of TEST10, but what I expect is TEST10 and then TEST-6.
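The reason is that the digits-only comparison ignores the text completely. A small sketch of that behaviour follows; DigitsOnlyDemo and compareByDigits are illustrative names, and Integer.compare is swapped in for the subtraction as the more idiomatic way to compare two ints.

```java
class DigitsOnlyDemo {
    // Same idea as the snippet above: strip every non-digit and parse the rest.
    static int extractInt(String s) {
        String num = s.replaceAll("\\D", "");
        // return 0 if no digits were found
        return num.isEmpty() ? 0 : Integer.parseInt(num);
    }

    // Compares two strings by their extracted number only.
    static int compareByDigits(String o1, String o2) {
        return Integer.compare(extractInt(o1), extractInt(o2));
    }

    public static void main(String[] args) {
        // 6 < 10, so TEST-6 comes first no matter what the text part looks like.
        System.out.println(compareByDigits("TEST-6", "TEST10") < 0); // true
        // Different prefixes but the same number: reported as equal.
        System.out.println(compareByDigits("ABC-7", "XYZ-7")); // 0
    }
}
```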

The expected result is a normal string comparison that compares the full number where needed: if the substrings before the numbers are equal, the numbers are compared; if not, the string comparison decides. Something like this:

TE
TES-100
TEST-1
TEST-6
TESTT-0
TEXT-2
109

You can do something like this:

list.sort(Comparator.comparing(YourClass::removeNumbers).thenComparing(YourClass::keepNumbers));

These are two methods:

private static String removeNumbers(String s) {
    return s.replaceAll("\\d", "");
}

private static Integer keepNumbers(String s) {
    String number = s.replaceAll("\\D", "");
    if (!number.isEmpty()) {
        return Integer.parseInt(number);
    }
    return 0;
}

For following data:

List<String> list = new ArrayList<>();
list.add("TEXT-2");
list.add("TEST-6");
list.add("TEST-1");
list.add("109");
list.add("TE");
list.add("TESTT-0");
list.add("TES-100");

This is the sorting result:

[109, TE, TES-100, TEST-1, TEST-6, TESTT-0, TEXT-2]
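Put together as one runnable sketch (the class name TwoKeySortDemo, the constant BY_TEXT_THEN_NUMBER and the helper sorted are placeholders standing in for YourClass), this also shows a caveat that seems worth knowing: all digits of a string are concatenated into a single number, so strings with several numbers can collapse to the same keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TwoKeySortDemo {
    // First sort key: the string with every digit removed.
    static String removeNumbers(String s) {
        return s.replaceAll("\\d", "");
    }

    // Second sort key: all digits of the string read as one number (0 if none).
    static Integer keepNumbers(String s) {
        String number = s.replaceAll("\\D", "");
        return number.isEmpty() ? 0 : Integer.parseInt(number);
    }

    static final Comparator<String> BY_TEXT_THEN_NUMBER =
            Comparator.comparing(TwoKeySortDemo::removeNumbers)
                      .thenComparing(TwoKeySortDemo::keepNumbers);

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(BY_TEXT_THEN_NUMBER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList(
                "TEXT-2", "TEST-6", "TEST-1", "109", "TE", "TESTT-0", "TES-100");
        System.out.println(sorted(list));
        // [109, TE, TES-100, TEST-1, TEST-6, TESTT-0, TEXT-2]

        // Caveat: both strings reduce to the keys ("AB", 12), so they compare as equal.
        System.out.println(BY_TEXT_THEN_NUMBER.compare("A1B2", "A12B")); // 0
    }
}
```

"109" ends up first because its text key is the empty string, which sorts before every other key.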

Here's a compare method that we use to sort strings that can contain multiple numbers at any position (e.g. strings like "TEST-10.5" or "TEST-42-Subsection-3"):

boolean isDigit( char c ) {
  return '0' <= c && c <= '9';
}

int compare( String left, String right, Collator collator ) {
  if ( left == null || right == null ) {
    return left == right ? 0 : ( left == null ? -1 : 1 );
  }

  String s1 = left.trim();
  String s2 = right.trim();

  int l1 = s1.length();
  int l2 = s2.length();
  int i1 = 0;
  int i2 = 0;
  while ( i1 < l1 && i2 < l2 ) {
    boolean isSectionNumeric = isDigit( s1.charAt( i1 ) );
    if ( isSectionNumeric != isDigit( s2.charAt( i2 ) ) ) {
      // one string enters a digit section while the other is in a text section, so we're done;
      // switch to -1 : 1 if you want numbers before text
      return isSectionNumeric ? 1 : -1;
    }

    // read next section
    int start1 = i1;
    int start2 = i2;
    for ( ++i1; i1 < l1 && isDigit( s1.charAt( i1 ) ) == isSectionNumeric; ++i1 ){/* no operation */}
    for ( ++i2; i2 < l2 && isDigit( s2.charAt( i2 ) ) == isSectionNumeric; ++i2 ){/* no operation */}
    String section1 = s1.substring( start1, i1 );
    String section2 = s2.substring( start2, i2 );

    // compare the sections:
    int result =
        isSectionNumeric ? Long.valueOf( section1 ).compareTo( Long.valueOf( section2 ) )
      : collator == null ? section1.trim().compareTo( section2.trim() )
      :                    collator.compare( section1.trim(), section2.trim() );

    if ( result != 0 ) {
      return result;
    }

    if ( isSectionNumeric ) {
      // skip whitespace
      for (; i1 < l1 && Character.isWhitespace( s1.charAt( i1 ) ); ++i1 ){/* no operation */}
      for (; i2 < l2 && Character.isWhitespace( s2.charAt( i2 ) ); ++i2 ){/* no operation */}
    }
  }

  // we've reached the end of both strings but they still are equal, so let's do a "normal" comparison
  if ( i1 == l1 && i2 == l2 ) {      
    return collator == null ? left.compareTo( right ) : collator.compare( left, right );
  }

  // we've reached the end of only one string, so the other must either be greater or smaller
  return ( i1 == l1 )? -1 : 1;
}

The idea is to "split" the strings into text and numeric sections and to compare the sections one by one. Decimal numbers are supported in the sense that the integer part, the decimal point and the fraction part form three sections that are compared individually.
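To illustrate how the sections fall out, here is a small sketch that splits a string with a regular expression instead of the index loops used above. SectionSplitDemo and sections are illustrative names, and the regex is a stand-in for the loop, not the code we actually run. It also hints at a consequence of comparing the fraction part as its own number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionSplitDemo {
    // A section is a run of digits or a run of non-digits.
    // By default \d matches only the ASCII digits 0-9, like isDigit above.
    private static final Pattern SECTION = Pattern.compile("\\d+|\\D+");

    static List<String> sections(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = SECTION.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sections("TEST-10.5"));            // [TEST-, 10, ., 5]
        System.out.println(sections("TEST-42-Subsection-3")); // [TEST-, 42, -Subsection-, 3]

        // The fraction digits are compared as whole numbers: 5 < 25,
        // so "1.5" would sort before "1.25" even though 1.5 > 1.25.
        System.out.println(Long.valueOf("5").compareTo(Long.valueOf("25")) < 0); // true
    }
}
```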

This is basically the same as splitting each string into an array of substrings and comparing the elements at each corresponding index. You then have the following situations:

  • both elements are texts: do a normal string comparison
  • both elements represent numbers: parse and compare the numbers
  • one element is a text and the other represents a number: decide which one is greater
  • we've reached the end of both strings and all elements are equal: we could stop here, or do a "normal" comparison on the entire strings to get an order if possible
  • we've reached the end of only one string and the elements so far are equal: the longer string is reported as greater (it must be, because it has more content)
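The situations above can be sketched as a Comparator that really does split into a list first. This is a simplified illustration, not the method shown earlier: the name SectionComparator is made up, there is no Collator, no trimming and no whitespace skipping, nulls are not handled, and BigInteger is used (my choice) so that very long digit runs cannot overflow.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionComparator implements Comparator<String> {
    private static final Pattern SECTION = Pattern.compile("\\d+|\\D+");

    // Splits a string into alternating text and digit sections.
    static List<String> split(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = SECTION.matcher(s);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    private static boolean isNumeric(String section) {
        char c = section.charAt(0);
        return '0' <= c && c <= '9';
    }

    @Override
    public int compare(String left, String right) {
        List<String> a = split(left);
        List<String> b = split(right);
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            String x = a.get(i);
            String y = b.get(i);
            boolean xNum = isNumeric(x);
            boolean yNum = isNumeric(y);
            int result;
            if (xNum && yNum) {
                result = new BigInteger(x).compareTo(new BigInteger(y)); // both numbers
            } else if (!xNum && !yNum) {
                result = x.compareTo(y);                                 // both texts
            } else {
                result = xNum ? 1 : -1;                                  // text before numbers
            }
            if (result != 0) {
                return result;
            }
        }
        if (a.size() != b.size()) {
            return a.size() < b.size() ? -1 : 1; // the longer one is greater
        }
        return left.compareTo(right);            // tie-breaker on the whole strings
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList(
                "TEXT-2", "TEST-6", "TEST-1", "109", "TE", "TESTT-0", "TES-100"));
        list.sort(new SectionComparator());
        System.out.println(list); // [TE, TES-100, TEST-1, TEST-6, TESTT-0, TEXT-2, 109]
    }
}
```

With text ordered before numbers, this reproduces the order listed in the question, including TEST10 ahead of TEST-6.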

Note that this is just our way of doing it; there are other ways as well (e.g. ones that don't skip whitespace).

If I'm right, the problem is the '-' character. Remove it with string.replace("-", "") and then proceed with the normal sorting, leaving the rest of the string as it is; hopefully that works as you expect.

String num = s.replaceAll("\\D", "").replace("-","");

This should work as long as you don't have any negative values. If you do, use a regex to check whether the '-' marks a negative number or is just part of the string.
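One possible shape for such a check, as a rough sketch only: it assumes that a string counts as a negative number only when it consists of a minus sign followed by nothing but digits. The names MinusSignDemo and isNegativeNumber are hypothetical.

```java
import java.util.regex.Pattern;

class MinusSignDemo {
    // Assumption: "negative number" means a minus sign followed solely by digits.
    private static final Pattern NEGATIVE_INTEGER = Pattern.compile("-\\d+");

    static boolean isNegativeNumber(String s) {
        // matches() requires the whole string to fit the pattern.
        return NEGATIVE_INTEGER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNegativeNumber("-6"));     // true
        System.out.println(isNegativeNumber("TEST-6")); // false: the '-' is part of the name
    }
}
```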
