
Split words from string into array but not if they are in between backslashes

I have this code:

String path = main.getInput(); // let's say getInput() returns: Hello \Wo rld\
String[] args = path.split("\\s+"); // splits on every run of whitespace

for (String arg : args) {
    System.out.println(arg);
}

Is there a way to split the string so that the words are put into an array, but without splitting anything that sits between two backslashes, so that "Wo rld" stays one word instead of becoming two?

You could try splitting only on whitespace that is followed by an even number of backslashes up to the end of the string; such whitespace lies outside any backslash-delimited section. Raw regex:

\s+(?=(?:[^\\]*\\[^\\]*\\)*[^\\]*$)

Java-escaped regex:

\\s+(?=(?:[^\\\\]*\\\\[^\\\\]*\\\\)*[^\\\\]*$)

ideone demo
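To see the lookahead in action, here is a minimal sketch that applies the Java-escaped regex above to the sample input from the question. The class name `LookaheadSplit` and the method `split` are made up for illustration. It assumes the backslashes in the input are balanced (an even count).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LookaheadSplit {
    // Whitespace followed by an even number of backslashes up to the end of
    // the input, i.e. whitespace that is outside any \...\ section.
    private static final Pattern OUTSIDE_BACKSLASHES =
            Pattern.compile("\\s+(?=(?:[^\\\\]*\\\\[^\\\\]*\\\\)*[^\\\\]*$)");

    static String[] split(String input) {
        return OUTSIDE_BACKSLASHES.split(input);
    }

    public static void main(String[] args) {
        // In Java source "\\" is one backslash, so this is: Hello \Wo rld\
        String input = "Hello \\Wo rld\\";
        System.out.println(Arrays.toString(split(input))); // [Hello, \Wo rld\]
    }
}
```

Note that the backslashes stay in the resulting tokens, and that with an odd number of backslashes the whitespace before the last backslash never satisfies the lookahead, so it is not split.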

Try this one:

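import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "John Hello \\Wo rld\\ our world";
// group 1: a \...\ section (lazy match); group 2: a run of non-whitespace characters
Pattern pattern = Pattern.compile("(\\\\.*?\\\\)|(\\S+)");
Matcher m = pattern.matcher(s);
while (m.find()) {
    if (m.group(1) != null) {
        System.out.println(m.group(1));
    } else {
        System.out.println(m.group(2));
    }
}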

Output:

John
Hello
\Wo rld\
our
world
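Since the question asks for an array rather than printed output, the same matching approach can collect the tokens instead. This is only a sketch: the class name `TokenCollector` and the method `tokenize` are hypothetical, and the capture groups are dropped because the whole match is the token either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class TokenCollector {
    // First alternative: a \...\ section (lazy); second: a run of non-whitespace.
    private static final Pattern TOKEN = Pattern.compile("\\\\.*?\\\\|\\S+");

    static String[] tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group()); // the whole match is the token
        }
        return tokens.toArray(new String[0]);
    }
}
```

Calling `TokenCollector.tokenize("John Hello \\Wo rld\\ our world")` should give the same five tokens as the printed output. One caveat: a backslash section is only recognized when it starts a token, so in input like `ab\c d\` the `\S+` alternative matches `ab\c` first and the space is treated as a separator.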

If it doesn't have to be a regex, you can use this simple parser and get your result in a single pass over the string.

import java.util.ArrayList;
import java.util.List;

public static List<String> spaceSplit(String str) {
    List<String> tokens = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    boolean insideEscaped = false; // true while between a pair of backslashes

    for (char ch : str.toCharArray()) {
        if (ch == '\\') {
            insideEscaped = !insideEscaped; // each backslash opens or closes a section
        }

        // split only on spaces that are outside a backslash-delimited section
        if (ch == ' ' && !insideEscaped) {
            if (sb.length() > 0) {
                tokens.add(sb.toString());
                sb.setLength(0); // reset the buffer for the next token
            }
        } else {
            // keep everything else, including the backslashes and the spaces between them
            sb.append(ch);
        }
    }
    if (sb.length() > 0) {
        tokens.add(sb.toString()); // flush the last token
    }

    return tokens;
}

Usage:

for (String s : spaceSplit("hello \\wo rld\\"))
    System.out.println(s);

Output:

hello
\wo rld\
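The output above keeps the backslashes in the token. If the goal is to get `wo rld` without the delimiters, which is how the question could be read, the parser can skip the backslash instead of appending it. The following is a sketch under that assumption. The class name `BackslashTokenizer` and the method `splitDroppingBackslashes` are hypothetical, and it also splits on any whitespace character rather than only on `' '`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant, not part of the original answer.
class BackslashTokenizer {
    static List<String> splitDroppingBackslashes(String str) {
        List<String> tokens = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean inside = false; // true while between a pair of backslashes

        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (ch == '\\') {
                inside = !inside; // toggle the state, but do not keep the delimiter
            } else if (Character.isWhitespace(ch) && !inside) {
                if (sb.length() > 0) { // whitespace outside a section ends a token
                    tokens.add(sb.toString());
                    sb.setLength(0);
                }
            } else {
                sb.append(ch); // regular character, or whitespace inside a section
            }
        }
        if (sb.length() > 0) {
            tokens.add(sb.toString()); // flush the last token
        }
        return tokens;
    }
}
```

With this variant, `splitDroppingBackslashes("hello \\wo rld\\")` yields the two tokens `hello` and `wo rld`.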
