Java Regex Help: Splitting String on spaces, "=>", and commas

I need to split a string on any of the following sequences:

1 or more spaces
0 or more spaces, followed by a comma, followed by 0 or more spaces
0 or more spaces, followed by "=>", followed by 0 or more spaces

I haven't had experience with Java regexes before, so I'm a little confused. Thanks!

Example:
add r10,r12 => r10
store r10 => r1

Just create a regex matching any of your three cases and pass it to the split method:

string.split("\\s*(=>|,|\\s)\\s*");

The regex here literally means:

  1. Zero or more whitespace characters ( \\s* )
  2. An arrow, or a comma, or a whitespace character ( =>|,|\\s )
  3. Zero or more whitespace characters ( \\s* )

You can replace the whitespace class \\s (which matches spaces, tabs, line breaks, etc.) with a plain space character if necessary.
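As a quick check, the pattern above can be run against the two sample lines from the question. This is only a minimal sketch; the class name `ArrowSplitDemo` and the helper `tokens` are made up for illustration.

```java
import java.util.Arrays;

class ArrowSplitDemo {

    // Hypothetical helper: splits one line with the pattern from the answer above.
    static String[] tokens(final String line) {
        return line.split("\\s*(=>|,|\\s)\\s*");
    }

    public static void main(final String[] args) {
        // Sample lines taken from the question.
        System.out.println(Arrays.toString(tokens("add r10,r12 => r10")));
        // prints [add, r10, r12, r10]
        System.out.println(Arrays.toString(tokens("store r10 => r1")));
        // prints [store, r10, r1]
    }
}
```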

Strictly translated

For simplicity, I'm going to interpret your indication of "space" ( ) as "any whitespace" ( \\s ).

Translating your spec more or less "word for word" means delimiting on any of:

  • 1 or more spaces
    • \\s+
  • 0 or more spaces ( \\s* ), followed by a comma ( , ), followed by 0 or more spaces ( \\s* )
    • \\s*,\\s*
  • 0 or more spaces ( \\s* ), followed by a "=>" ( => ), followed by 0 or more spaces ( \\s* )
    • \\s*=>\\s*

To match any of the above: (\\s+|\\s*,\\s*|\\s*=>\\s*)
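To make the alternation concrete, here is a small sketch that assembles the strict pattern from its three parts and exercises each alternative on its own. The class and constant names are invented for this example, and the inputs are chosen so that only one alternative applies at a time.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class StrictFormDemo {

    // The three delimiter cases from the list above, one regex each.
    static final String SPACES = "\\s+";
    static final String COMMA = "\\s*,\\s*";
    static final String ARROW = "\\s*=>\\s*";

    // Alternation of the three cases, wrapped in a group.
    static final Pattern STRICT =
        Pattern.compile("(" + String.join("|", SPACES, COMMA, ARROW) + ")");

    static String[] tokens(final String line) {
        return STRICT.split(line);
    }

    public static void main(final String[] args) {
        System.out.println(STRICT.pattern());
        // prints (\s+|\s*,\s*|\s*=>\s*)
        System.out.println(Arrays.toString(tokens("one   two")));  // prints [one, two]
        System.out.println(Arrays.toString(tokens("r10, r12")));   // prints [r10, r12]
        System.out.println(Arrays.toString(tokens("r12=> r10")));  // prints [r12, r10]
    }
}
```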

Reduced form

However, your spec can be "reduced" to:

  • 0 or more spaces
    • \\s*
  • followed by either a space, comma, or "=>"
    • (\\s|,|=>)
  • followed by 0 or more spaces
    • \\s*

Put it all together: \\s*(\\s|,|=>)\\s*

The reduced form avoids some corner cases in which the strictly translated form produces unexpected empty "matches". In the strict form, the \\s+ alternative is tried first, so a space that precedes a comma or "=>" is consumed as a delimiter of its own, and the comma or "=>" that follows becomes a second delimiter with an empty string in between.
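The corner case shows up on the first sample line from the question. The sketch below (class name `EmptyTokenDemo` is hypothetical) splits the same line with both forms:

```java
import java.util.Arrays;

class EmptyTokenDemo {

    static final String STRICT = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";
    static final String REDUCED = "\\s*(\\s|,|=>)\\s*";

    public static void main(final String[] args) {
        final String line = "add r10,r12 => r10"; // sample line from the question

        // Strict: \s+ is tried first and consumes the space before "=>",
        // so "=> " becomes a second delimiter with nothing in between.
        System.out.println(Arrays.toString(line.split(STRICT)));
        // prints [add, r10, r12, , r10]

        // Reduced: the leading \s* absorbs that space, so " => " is one delimiter.
        System.out.println(Arrays.toString(line.split(REDUCED)));
        // prints [add, r10, r12, r10]
    }
}
```

Reordering the strict alternation so that \\s+ comes last would presumably also avoid the empty token, but the reduced form is shorter.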

Code

Here's some code:

import java.util.regex.Pattern;

public class Temp {

    // Strictly translated form:
    //private static final String REGEX = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";

    // "Reduced" form:
    private static final String REGEX = "\\s*(\\s|=>|,)\\s*";

    private static final String INPUT =
        "one two,three=>four , five   six   => seven,=>";

    public static void main(final String[] args) {
        final Pattern p = Pattern.compile(REGEX);
        final String[] items = p.split(INPUT);
        // Shorthand for above:
        // final String[] items = INPUT.split(REGEX);
        for(final String s : items) {
            System.out.println("Match: '"+s+"'");
        }
    }
}

Output:

Match: 'one'
Match: 'two'
Match: 'three'
Match: 'four'
Match: 'five'
Match: 'six'
Match: 'seven'

String[] splitArray = subjectString.split(" *(,|=>| ) *");

That should do it.
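One difference from the \\s-based patterns is worth seeing in action: the plain-space variant does not treat tabs as delimiters. A minimal sketch, with an invented class name and helper:

```java
import java.util.Arrays;

class PlainSpaceDemo {

    // Hypothetical helper: splits with the plain-space pattern shown above.
    static String[] tokens(final String line) {
        return line.split(" *(,|=>| ) *");
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(tokens("add r10,r12 => r10")));
        // prints [add, r10, r12, r10]

        // A tab is not a plain space, so "store\tr10" stays one token.
        System.out.println(tokens("store\tr10 => r1").length);
        // prints 2
    }
}
```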
