Split Java String by space and not by double quotation (") that includes space

I want to split a phrase on spaces, but not on spaces inside a quoted string (i.e., a string enclosed in a pair of double quotation marks).

For example:

software term "on the fly" and "synchrony"

Should be split into these 5 segments:

software  
term   
on the fly  
and  
synchrony

So how could I implement this in Java?

This regex performs the split for you and also strips the delimiting quotes:

String[] terms = input.split("\"?( |$)(?=(([^\"]*\"){2})*[^\"]*$)\"?");

It works by splitting on a space, but only if that space is followed by an even number of quotes, which means the space is not inside a quoted section.
The quotes themselves are consumed by including them optionally on either side of the split term, so they don't end up in the output.
The ( |$) alternation is needed so that the trailing quote at the end of the input is consumed too.
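
To see the "even number of quotes" check in isolation, here is a minimal sketch that keeps only the space and the lookahead from the regex above and reports which spaces would be accepted as split points. The class name LookaheadDemo and the method splittableSpaces are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Only the "space followed by an even number of quotes" part of the regex;
    // the optional quotes and the end-of-input alternative are left out here.
    private static final Pattern SPLITTABLE_SPACE =
            Pattern.compile(" (?=(([^\"]*\"){2})*[^\"]*$)");

    // Returns the indexes of the spaces that lie outside any quoted section.
    static List<Integer> splittableSpaces(String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = SPLITTABLE_SPACE.matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Index:        0123456 78
        String input = "a \"b c\" d";
        // The space at index 4 has one quote after it (odd), so it is skipped.
        System.out.println(splittableSpaces(input)); // prints [1, 7]
    }
}
```

The space between b and c is rejected because only one quote follows it, while the other two spaces have two and zero quotes after them.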

Note that if the first term could be quoted, you'll need to clean up that leading quote first:

String[] terms = input.replaceAll("^\"", "").split("\"?( |$)(?=(([^\"]*\"){2})*[^\"]*$)\"?");
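
As a quick illustration of why the extra replaceAll matters, the sketch below runs both variants on an input whose first term is quoted. The class and method names (LeadingQuoteDemo, splitRaw, splitCleaned) and the sample input are invented for this example.

```java
import java.util.Arrays;

class LeadingQuoteDemo {
    // The split regex from the answer, unchanged.
    private static final String SPLIT_REGEX =
            "\"?( |$)(?=(([^\"]*\"){2})*[^\"]*$)\"?";

    // Splits without any pre-processing.
    static String[] splitRaw(String input) {
        return input.split(SPLIT_REGEX);
    }

    // Removes a leading quote first, then splits.
    static String[] splitCleaned(String input) {
        return input.replaceAll("^\"", "").split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        String input = "\"on the fly\" and term";
        // The leading quote is not next to a split point, so it survives.
        System.out.println(Arrays.toString(splitRaw(input)));     // prints ["on the fly, and, term]
        System.out.println(Arrays.toString(splitCleaned(input))); // prints [on the fly, and, term]
    }
}
```

The leading quote is never adjacent to a space that qualifies as a split point, so the split alone cannot consume it.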

Test code:

String input = "software term \"on the fly\" and \"synchron\"";
String[] terms = input.split("\"?( |$)(?=(([^\"]*\"){2})*[^\"]*$)\"?");
System.out.println(Arrays.toString(terms));

Output:

[software, term, on the fly, and, synchron]

Another approach is to split on the quote character first, then split the unquoted pieces on whitespace:

    // requires: import java.util.LinkedList; import java.util.List;
    String str = "software term \"on the fly\" and \"synchron\"";
    String[] arr = str.split("\""); // split on quotes first
    List<String> res = new LinkedList<>();
    for (int i = 0; i < arr.length; i++) {
        arr[i] = arr[i].trim();
        if (arr[i].isEmpty()) {
            continue; // skip empty pieces, e.g. before a leading quote
        }
        if (i % 2 == 0) {
            // even index: text outside quotes, so split it on whitespace
            for (String t : arr[i].split("\\s+")) {
                res.add(t);
            }
        } else {
            // odd index: text that was inside quotes; put the quotes back
            res.add("\"" + arr[i] + "\"");
        }
    }
    System.out.println(res);

Output:

[software, term, "on the fly", and, "synchron"]

An alternative to the previous answer:

    boolean quoted = false;
    for(String q : str.split("\"")) {
        if(quoted)
            System.out.println(q.trim());
        else
            for(String s : q.split(" "))
                if(!s.trim().isEmpty())
                    System.out.println(s.trim());
        quoted = !quoted;
    }
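
For reuse, the same toggle idea can be wrapped in a method that collects the terms instead of printing them. This is only a sketch: the names QuoteToggleSplitter and split are invented here, and it assumes the quotes in the input are balanced.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteToggleSplitter {
    // Splits on quotes, then alternates between "outside quotes" and
    // "inside quotes" for each piece. Assumes balanced quotes.
    static List<String> split(String str) {
        List<String> result = new ArrayList<>();
        boolean quoted = false;
        for (String q : str.split("\"")) {
            if (quoted) {
                // Inside quotes: keep the whole piece as one term.
                result.add(q.trim());
            } else {
                // Outside quotes: every non-empty word is its own term.
                for (String s : q.split(" ")) {
                    if (!s.trim().isEmpty()) {
                        result.add(s.trim());
                    }
                }
            }
            quoted = !quoted;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("software term \"on the fly\" and \"synchron\""));
        // prints [software, term, on the fly, and, synchron]
        System.out.println(split("\"on the fly\" software term"));
        // prints [on the fly, software, term]
    }
}
```

A leading quote needs no special handling here, because splitting on the quote yields an empty first piece and the toggle still lines up.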
