
How to split a String into digits and letters in Java

Test data, for example:

1a, 12a, 1ab, 12ab, 123a, 123abc

So if the input is:

String input = "1a";

the expected output is:

String number = "1";
String letter = "a";

As you can see, each string consists of 1-3 digits (0-9) followed by 1-3 letters (A-Z).

My first attempt:

I tried to use .substring()

But that only works if the number of digits or letters is always the same.

My second attempt was:

.split(" ");

But that only works if there is a space or some other separator between the digits and the letters.

PS. Thanks for the answers. I checked most of them and they all work. The question now is: which one is the best?

If your string sequence starts with digits and ends with letters, then the code below will work.

String str = "123abc";                     // example input
StringBuilder strB = new StringBuilder();  // collects the leading digits
int asciiRepresentation, startCharIndex = -1;
for (int i = 0; i < str.length(); i++) {
    asciiRepresentation = (int) str.charAt(i);
    // ASCII codes 48..57 are the digits '0'..'9'
    if (asciiRepresentation > 47 && asciiRepresentation < 58)
        strB.append(str.charAt(i));
    else {
        // first non-digit character: the letters start here
        startCharIndex = i;
        break;
    }
}
System.out.println(strB.toString());
if (startCharIndex != -1)
    System.out.println(str.substring(startCharIndex, str.length()));
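The same idea can be wrapped in a small helper so that it returns both parts instead of printing them. This is only a sketch: the class name `DigitPrefixSplitter` and the method `splitLeadingDigits` are made up for illustration, and it assumes, like the code above, that the digits come first.

```java
class DigitPrefixSplitter {
    // Returns {digits, rest}: the leading run of ASCII digits and whatever follows it.
    static String[] splitLeadingDigits(String str) {
        int i = 0;
        // Advance while the current character is an ASCII digit '0'..'9'.
        while (i < str.length() && str.charAt(i) >= '0' && str.charAt(i) <= '9') {
            i++;
        }
        return new String[] { str.substring(0, i), str.substring(i) };
    }

    public static void main(String[] args) {
        String[] parts = splitLeadingDigits("123abc");
        System.out.println(parts[0]); // prints 123
        System.out.println(parts[1]); // prints abc
    }
}
```

If the input has no letters, the second element is simply an empty string.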

A simple solution without regular expressions: find the index of the first letter and split the string at that position.

// requires: import java.util.OptionalInt; import java.util.stream.IntStream;
private String[] splitString(String s) {
  // returns an OptionalInt holding the index of the first letter
  OptionalInt firstLetterIndex = IntStream.range(0, s.length())
    .filter(i -> Character.isLetter(s.charAt(i)))
    .findFirst();

  // Default if there is no letter, only numbers
  String numbers = s;
  String letters = "";
  // if there are letters, split the string at the first letter
  if(firstLetterIndex.isPresent()) {
    numbers = s.substring(0, firstLetterIndex.getAsInt());
    letters = s.substring(firstLetterIndex.getAsInt());
  }

  return new String[] {numbers, letters};
}

Gives you:

splitString("123abc") 
returns ["123", "abc"]

splitString("123") 
returns ["123", ""]

splitString("abc") 
returns ["", "abc"]
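To try it against the test data from the question, a small driver along these lines could be used. The class name `SplitStringDemo` is hypothetical, and `splitString` is copied in as a static method only so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class SplitStringDemo {
    // Static copy of splitString: splits at the index of the first letter.
    static String[] splitString(String s) {
        OptionalInt firstLetterIndex = IntStream.range(0, s.length())
            .filter(i -> Character.isLetter(s.charAt(i)))
            .findFirst();

        // Default if there is no letter: everything counts as numbers.
        String numbers = s;
        String letters = "";
        if (firstLetterIndex.isPresent()) {
            numbers = s.substring(0, firstLetterIndex.getAsInt());
            letters = s.substring(firstLetterIndex.getAsInt());
        }
        return new String[] {numbers, letters};
    }

    public static void main(String[] args) {
        // The test data is comma-separated, so split it into single items first.
        for (String item : "1a, 12a, 1ab, 12ab, 123a, 123abc".split(", ")) {
            System.out.println(item + " -> " + Arrays.toString(splitString(item)));
        }
        // first line printed: 1a -> [1, a]
    }
}
```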

You can use regex:

String str = "1a, 12a, 1ab, 12ab, 123a, 123abc";
Pattern p = Pattern.compile("(?<digit>\\d{1,3})(?<letter>[a-z]{1,3})");
Matcher m = p.matcher(str);

while (m.find()){
    System.out.println(m.group("digit")+"/"+m.group("letter"));
}
// Output:
// 1/a
// 12/a
// 1/ab...
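For completeness, here is a self-contained sketch of the same named-group approach with the imports in place. The class name `NamedGroupDemo` and the `extract` helper are illustrative only; the pattern is the one used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // 1-3 digits captured as "digit", followed by 1-3 lowercase letters captured as "letter".
    private static final Pattern PAIR =
            Pattern.compile("(?<digit>\\d{1,3})(?<letter>[a-z]{1,3})");

    // Collects every match in the input as a "digits/letters" string.
    static List<String> extract(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.add(m.group("digit") + "/" + m.group("letter"));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(extract("1a, 12a, 1ab, 12ab, 123a, 123abc"));
        // prints [1/a, 12/a, 1/ab, 12/ab, 123/a, 123/abc]
    }
}
```

Compiling the pattern once into a constant avoids recompiling it on every call, which matters only if the method is called often.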

Below is my proposal. It works correctly for the test data mentioned in the question (1a, 12a, 1ab, 12ab, 123a, 123abc).

Solution:

// requires: import java.util.ArrayList; import java.util.regex.Matcher; import java.util.regex.Pattern;
public ArrayList<String> split(String text) {
    Pattern pattern = Pattern.compile("(\\d+)([a-zA-Z]+)");
    Matcher matcher = pattern.matcher(text);
    ArrayList<String> result = new ArrayList<>();

    // group 1 = digits, group 2 = letters
    if (matcher.find() && matcher.groupCount() == 2) {
        result.add(matcher.group(1));
        result.add(matcher.group(2));
    }
    return result;
}
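A usage sketch, assuming the method is placed in a class (here a hypothetical `Splitter`, with the method made static for brevity). Two things may be worth knowing: `groupCount()` reports the number of capturing groups in the pattern, so the `== 2` check is always true for this regex and is left out below; and `find()` accepts a digits-then-letters run anywhere in the text. If the whole input has to have that shape, `matches()` is the stricter check, shown as the illustrative `splitStrict`.

```java
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Splitter {
    private static final Pattern PATTERN = Pattern.compile("(\\d+)([a-zA-Z]+)");

    // Same logic as the answer: first digits-then-letters run found anywhere in the text.
    static ArrayList<String> split(String text) {
        Matcher matcher = PATTERN.matcher(text);
        ArrayList<String> result = new ArrayList<>();
        if (matcher.find()) {
            result.add(matcher.group(1));
            result.add(matcher.group(2));
        }
        return result;
    }

    // Stricter variant (an assumption about what may be wanted): the whole input must match.
    static ArrayList<String> splitStrict(String text) {
        Matcher matcher = PATTERN.matcher(text);
        ArrayList<String> result = new ArrayList<>();
        if (matcher.matches()) {
            result.add(matcher.group(1));
            result.add(matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("12ab"));         // prints [12, ab]
        System.out.println(split("x12ab!"));       // prints [12, ab]
        System.out.println(splitStrict("x12ab!")); // prints []
        System.out.println(split("123"));          // prints [] (no letters, so no match)
    }
}
```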

Solution:

(also take a look at the edit that I made at the end of my answer)

"\\b(\\d{1,3})([a-z]{1,3})(?=,*|\\b)"

Example:

String s = "1a, 12a, 1ab, 12ab, 123a, 123abc";
Pattern p = Pattern.compile("\\b(\\d{1,3})([a-z]{1,3})(?=,*|\\b)");
Matcher m = p.matcher(s);
while(m.find()) {
    System.out.println("Group: " + m.group() + ", digits: " + m.group(1) + ", letters: " + m.group(2));
}

Output that you get:

Group: 1a, digits: 1, letters: a
Group: 12a, digits: 12, letters: a
Group: 1ab, digits: 1, letters: ab
Group: 12ab, digits: 12, letters: ab
Group: 123a, digits: 123, letters: a
Group: 123abc, digits: 123, letters: abc

Explanation:

\\b(\\d{1,3})([a-z]{1,3})(?=,*|\\b) - the whole regex (the backslashes are doubled because it is written as a Java string literal)

\\b - word boundary

\\d{1,3} - a digit, one to three times

[a-z]{1,3} - a character from a to z, one to three times

(?=,*|\\b) - a positive lookahead: it says that the letters should be followed by a comma or a word boundary, without including them in the match (the one returned by m.group())

() - capturing groups are written in parentheses. My regex uses two of them: #1: (\\d{1,3}) and #2: ([a-z]{1,3}) (they are printed with m.group(1) and m.group(2))

If you're not very familiar with regular expression syntax yet, you might want to have a look at the Java API documentation of the Pattern class, which lists the supported regular-expression constructs. It's worth giving regular expressions a try, as they might save you a lot of time when working with Strings in the future.


Edit:

Actually this regex can be changed to:

(?<=\\b)(\\d{1,3})([a-z]{1,3})(?=\\b)

There is a positive lookbehind (?<=\\b), which means that the digits must be preceded by a word boundary (including commas in the lookahead and lookbehind was redundant, so I deleted them).
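A small sketch of how the two versions behave in practice (the class name `BoundaryRegexDemo` and the `findAll` helper are hypothetical). On the test data both give the same matches. They differ on inputs with more than three letters: since ,* can also match zero commas, the lookahead of the first regex always succeeds, so it accepts a prefix of a longer letter run, while the edited regex insists on a word boundary right after the letters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryRegexDemo {
    // First version of the regex from this answer.
    static final Pattern ORIGINAL = Pattern.compile("\\b(\\d{1,3})([a-z]{1,3})(?=,*|\\b)");
    // Version from the edit: a word boundary is required on both sides.
    static final Pattern EDITED = Pattern.compile("(?<=\\b)(\\d{1,3})([a-z]{1,3})(?=\\b)");

    // Returns every full match (m.group()) of the pattern in the input.
    static List<String> findAll(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String data = "1a, 12a, 1ab, 12ab, 123a, 123abc";
        System.out.println(findAll(ORIGINAL, data).equals(findAll(EDITED, data))); // prints true
        System.out.println(findAll(ORIGINAL, "12abcd")); // prints [12abc]
        System.out.println(findAll(EDITED, "12abcd"));   // prints []
    }
}
```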
