
Splitting a string using Regex in Java

Would anyone be able to assist me with some regex?

I want to split the following string into a number, a string, and a number:

"810LN15"

One method needs to return 810, another needs to return LN, and a third should return 15.

The only real solution is a regex, because the numbers will grow in length.

What regex can I use to accommodate this?

String.split with a pattern that matches actual characters won't give you the desired result, which I guess would be "810", "LN", "15", because it has to look for a token to split at and strips that token from the result.
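To make the point above concrete, here is a minimal sketch of what happens when the letters themselves are used as the delimiter. The class and method names (SplitStripsDelimiter, splitOnLetters) are made up for this illustration.

```java
import java.util.Arrays;

final class SplitStripsDelimiter {
    // Splits on runs of letters; the matched letters are consumed and discarded.
    static String[] splitOnLetters(String s) {
        return s.split("[a-zA-Z]+");
    }

    public static void main(String[] args) {
        // Prints [810, 15]: the "LN" part is lost because it was the delimiter.
        System.out.println(Arrays.toString(splitOnLetters("810LN15")));
    }
}
```

The numbers survive, but the text between them does not, which is why a pattern that consumes characters cannot return all three parts.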

Try Pattern and Matcher instead, using this regex: (\\d+)|([a-zA-Z]+), which matches any run of digits or any run of letters, so each match is a distinct number or text token (e.g. "AA810LN15QQ12345" would yield "AA", "810", "LN", "15", "QQ" and "12345").

Example:

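import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("(\\d+)|([a-zA-Z]+)");
Matcher m = p.matcher("810LN15");
List<String> tokens = new LinkedList<String>();
while (m.find()) {
  // group() (group 0) is always the entire match; group(1) is null when the letters matched
  String token = m.group();
  tokens.add(token);
}
// now iterate through 'tokens' and check whether you have a number or text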
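One possible way to handle that last step is to look at which capture group participated in the match instead of inspecting the token afterwards. The sketch below assumes the same regex; the class name TokenClassifier and the "NUM:"/"TXT:" prefixes are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenClassifier {
    private static final Pattern TOKEN = Pattern.compile("(\\d+)|([a-zA-Z]+)");

    // Returns each token prefixed with its kind, e.g. "NUM:810" or "TXT:LN".
    static List<String> classify(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // group(1) is non-null only when the digit alternative matched.
            if (m.group(1) != null) {
                out.add("NUM:" + m.group(1));
            } else {
                out.add("TXT:" + m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [TXT:AA, NUM:810, TXT:LN, NUM:15, TXT:QQ, NUM:12345]
        System.out.println(classify("AA810LN15QQ12345"));
    }
}
```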

In Java, as in most regex flavors (Python before 3.7 being a notable exception), the split() regex isn't required to consume any characters when it finds a match. Here I've used lookaheads and lookbehinds to match any position that has a digit on one side of it and a non-digit on the other:

import java.util.Arrays;

String source = "810LN15";
// split at every zero-width boundary between a digit and a non-digit
String[] parts = source.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(Arrays.toString(parts));

Output:

[810, LN, 15]
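Because the split points are zero-width, the same pattern should keep working as the input grows. A small sketch, with the class name BoundarySplit and the constant BOUNDARY chosen only for illustration:

```java
import java.util.Arrays;

final class BoundarySplit {
    // Zero-width split points: digit on the left and non-digit on the right, or the reverse.
    private static final String BOUNDARY = "(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)";

    static String[] split(String s) {
        return s.split(BOUNDARY);
    }

    public static void main(String[] args) {
        // Prints [AA, 810, LN, 15, QQ, 12345]
        System.out.println(Arrays.toString(split("AA810LN15QQ12345")));
    }
}
```

Note that \D matches any non-digit, not only letters, so punctuation or spaces would stay attached to the neighbouring text part.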

(\\d+)([a-zA-Z]+)(\\d+) should do the trick. The first capture group is the first number, the second capture group is the letters in between, and the third capture group is the second number. The double backslashes are needed because the regex is written as a Java string literal.
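A minimal sketch of how the three groups might be read, assuming the whole input has the number-letters-number shape. The class name ThreePartCode, the helper parts, and the choice of matches() (which requires the entire string to match) are assumptions of this example, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ThreePartCode {
    private static final Pattern CODE = Pattern.compile("(\\d+)([a-zA-Z]+)(\\d+)");

    // Returns {first number, letters, second number}; throws if the input has another shape.
    static String[] parts(String input) {
        Matcher m = CODE.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + input);
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] p = parts("810LN15");
        // Prints: 810 LN 15
        System.out.println(p[0] + " " + p[1] + " " + p[2]);
    }
}
```

Each of the three methods from the question could then return one element of the array.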

This gives you exactly what you are looking for:

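        import java.util.LinkedList;
        import java.util.List;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        // The first alternative already matches every letter or digit run, so the
        // second alternative is never reached and group(1) is always set.
        Pattern p = Pattern.compile("(([a-zA-Z]+)|(\\d+))|((\\d+)|([a-zA-Z]+))");
        Matcher m = p.matcher("810LN15");
        List<String> tokens = new LinkedList<String>();
        while (m.find()) {
          String token = m.group(1);
          tokens.add(token);
        }
        System.out.println(tokens); // prints [810, LN, 15]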
