
How to split a string between letters and digits (or between digits and letters)?

I'm trying to work out a way of splitting up a string in Java that follows a pattern like so:

String a = "123abc345def";

The results from this should be the following:

x[0] = "123";
x[1] = "abc";
x[2] = "345";
x[3] = "def";

However, I'm completely stumped as to how I can achieve this. Can someone please help me out? I have tried searching online for a similar problem, but it's very difficult to phrase it correctly in a search.

Please note: the number of letters and digits may vary (e.g. there could be a string like "1234a5bcdef").

You could try to split on (?<=\\D)(?=\\d)|(?<=\\d)(?=\\D), like:

str.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");

It matches positions between a digit and a non-digit (in either order).

  • (?<=\\D)(?=\\d) - matches a position between a non-digit ( \\D ) and a digit ( \\d )
  • (?<=\\d)(?=\\D) - matches a position between a digit and a non-digit.
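The two lookaround alternatives above can be tried out with a small program. This is a minimal sketch, not part of the original answer: the class name LookaroundSplitDemo and the helper splitLettersAndDigits are made up for illustration. It runs the question's sample input and the variable-length example through String.split.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Zero-width boundary between a non-digit and a digit, in either order.
    static final String BOUNDARY = "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)";

    // Hypothetical helper: splits at every digit/non-digit boundary.
    static String[] splitLettersAndDigits(String s) {
        return s.split(BOUNDARY);
    }

    public static void main(String[] args) {
        // prints [123, abc, 345, def]
        System.out.println(Arrays.toString(splitLettersAndDigits("123abc345def")));
        // prints [1234, a, 5, bcdef]
        System.out.println(Arrays.toString(splitLettersAndDigits("1234a5bcdef")));
    }
}
```

Because both alternatives are zero-width, no characters are consumed by the split, so every character of the input ends up in exactly one token.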

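How about:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Note: lower-case and upper-case runs are matched separately, so "abcDEF" yields "abc" and "DEF".
private List<String> parse(String str) {
    List<String> output = new ArrayList<String>();
    Matcher match = Pattern.compile("[0-9]+|[a-z]+|[A-Z]+").matcher(str);
    while (match.find()) {
        output.add(match.group());
    }
    return output;
}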

You can try this:

Pattern p = Pattern.compile("[a-z]+|\\d+");
Matcher m = p.matcher("123abc345def");
ArrayList<String> allMatches = new ArrayList<>();
while (m.find()) {
    allMatches.add(m.group());
}

The result (allMatches) will be:

["123", "abc", "345", "def"]

Use two different patterns, [0-9]* and [a-zA-Z]*, and split the string twice, once by each.
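One way to read this two-pass suggestion is sketched below. It is an interpretation, not the answerer's code: the class TwoPassSplitDemo and the helper splitNonEmpty are hypothetical, and the sketch uses the + quantifier rather than *, on the assumption that non-empty delimiters were intended (a pattern that can match the empty string would also split between individual characters).

```java
import java.util.ArrayList;
import java.util.List;

class TwoPassSplitDemo {
    // Hypothetical helper: splits on the regex and drops the empty string
    // that String.split produces when the input starts with a delimiter.
    static List<String> splitNonEmpty(String s, String regex) {
        List<String> parts = new ArrayList<>();
        for (String p : s.split(regex)) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String a = "123abc345def";
        // Letters act as delimiters, leaving the digit runs: prints [123, 345]
        System.out.println(splitNonEmpty(a, "[a-zA-Z]+"));
        // Digits act as delimiters, leaving the letter runs: prints [abc, def]
        System.out.println(splitNonEmpty(a, "[0-9]+"));
    }
}
```

Note that this yields two separate lists, so the original interleaving of digit and letter runs is lost unless it is reconstructed afterwards.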

If you are looking for a solution that does not use Java's String functionality (i.e. split, match, etc.), then the following should help:

import java.util.ArrayList;
import java.util.List;

List<String> splitString(String string) {
    List<String> list = new ArrayList<String>();
    int i = 0;
    // An empty input simply yields an empty list.
    while (i < string.length()) {
        // The first character decides whether this run is digits or non-digits.
        boolean digits = isNumber(string.charAt(i));
        StringBuilder token = new StringBuilder();
        // Consume characters for as long as they are of the same kind.
        while (i < string.length() && isNumber(string.charAt(i)) == digits) {
            token.append(string.charAt(i++));
        }
        list.add(token.toString());
    }
    return list;
}

boolean isNumber(char c) {
    return c >= '0' && c <= '9';
}

This solution splits the input into numbers and 'words', where 'words' are strings that don't contain digits. If you only want 'words' made of English letters, you can modify it by adding more conditions (like the isNumber method call) depending on your requirements (for example, you may wish to skip words that contain non-English letters). Also note that the splitString method returns an ArrayList (typed as List<String>), which can later be converted to a String array.
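As one possible reading of "adding more conditions", the sketch below classifies each character as a digit, an English letter, or anything else, and drops the runs of other characters. The class LetterDigitTokens and the helpers isEnglishLetter and kind are hypothetical names introduced for this illustration; the answer does not specify how non-English characters should be handled.

```java
import java.util.ArrayList;
import java.util.List;

class LetterDigitTokens {
    static final int DIGIT = 0, LETTER = 1, OTHER = 2;

    static boolean isNumber(char c) {
        return c >= '0' && c <= '9';
    }

    // Hypothetical extra condition: ASCII letters only.
    static boolean isEnglishLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static int kind(char c) {
        if (isNumber(c)) return DIGIT;
        if (isEnglishLetter(c)) return LETTER;
        return OTHER;
    }

    static List<String> splitString(String s) {
        List<String> list = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int k = kind(s.charAt(i));
            StringBuilder token = new StringBuilder();
            // Collect the run of characters that share the same kind.
            while (i < s.length() && kind(s.charAt(i)) == k) {
                token.append(s.charAt(i++));
            }
            // Assumption: runs of other characters are skipped, not reported.
            if (k != OTHER) {
                list.add(token.toString());
            }
        }
        return list;
    }

    public static void main(String[] args) {
        // prints [12, ab, 34, cd]
        System.out.println(splitString("12ab--34cd"));
    }
}
```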

I was doing this sort of thing for mission-critical code, where every fraction of a second counts because I need to process 180k entries in an unnoticeable amount of time. So I skipped the regex and split altogether and allowed for inline processing of each element (though adding them to an ArrayList<String> would be fine). If you want to do this exact thing but need it to be something like 20x faster...

void parseGroups(String text) {
    int last = 0;
    int state = 0;
    for (int i = 0, s = text.length(); i < s; i++) {
        switch (text.charAt(i)) {
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                if (state == 2) {
                    processElement(text.substring(last, i));
                    last = i;
                }
                state = 1;
                break;
            default:
                if (state == 1) {
                    processElement(text.substring(last, i));
                    last = i;
                }
                state = 2;
                break;
        }
    }
    processElement(text.substring(last));
}
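The method above calls processElement, which the answer does not define. A self-contained variant is sketched here, assuming processElement is simply a callback that receives each group; the class name ParseGroupsDemo, the Consumer parameter, and the empty-input guard are additions for illustration, and the switch is replaced by a range check for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ParseGroupsDemo {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Calls processElement once per run of digits or non-digits.
    static void parseGroups(String text, Consumer<String> processElement) {
        if (text.isEmpty()) {
            return; // assumption: nothing to report for empty input
        }
        int last = 0;
        boolean prevDigit = isDigit(text.charAt(0));
        for (int i = 1, n = text.length(); i < n; i++) {
            boolean digit = isDigit(text.charAt(i));
            if (digit != prevDigit) {
                // The kind of character changed: emit the finished run.
                processElement.accept(text.substring(last, i));
                last = i;
                prevDigit = digit;
            }
        }
        processElement.accept(text.substring(last));
    }

    public static void main(String[] args) {
        List<String> groups = new ArrayList<>();
        parseGroups("123abc345def", groups::add);
        System.out.println(groups); // prints [123, abc, 345, def]
    }
}
```

Passing groups::add collects the tokens into a list; any other Consumer<String> processes each element inline instead.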

I haven't used Java in ages, so take this as a rough sketch to get you started:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String a = "123abc345def";
List<String> result = new ArrayList<>();
Pattern digits = Pattern.compile("\\d+");
Pattern letters = Pattern.compile("[a-zA-Z]+");
while (!a.isEmpty()) {
    Matcher m = digits.matcher(a);
    if (!m.lookingAt()) {              // no digits at the start, so try letters
        m = letters.matcher(a);
        if (!m.lookingAt()) {
            break;                     // something invalid: neither digit nor letter
        }
    }
    String part = m.group();
    result.add(part);
    a = a.substring(part.length());    // remove the part we've found
}

Wouldn't "\\d+|\\D+" do the job instead of the cumbersome "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)"?
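A pattern like \\d+|\\D+ cannot be handed to split, because it would consume the tokens themselves as delimiters, but it does work with Matcher.find. The following is a minimal sketch under that reading; the class name DigitOrNonDigitRuns and the method runs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitOrNonDigitRuns {
    // A token is either a run of digits or a run of non-digits.
    private static final Pattern RUN = Pattern.compile("\\d+|\\D+");

    static List<String> runs(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = RUN.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runs("123abc345def")); // prints [123, abc, 345, def]
    }
}
```

Unlike the [a-z]+ based patterns, \\D+ also captures spaces and punctuation as part of the non-digit tokens.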


 