
Using split in Java String

I need to split a String on the dot character '.', but with one catch, explained below. For example, if a String looks like this:

   String str = "A.B.C";

then splitting on the dot gives A, B and C.

But if some part of the String is enclosed in single quotes, then split should ignore the dots inside it:

   String str = "A.B.'C.D'";

then splitting on the dot should give A, B and C.D.

How can I achieve this?

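If the String is always in the given format, you could try \\.(?![A-Za-z]') as the regex (written here as it appears inside a Java string literal). It splits at every dot except one that is followed by a single letter and a closing quote.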
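
To make the behaviour of that lookahead concrete, here is a minimal sketch. The class name LookaheadSplitDemo and the helper splitOnDots are made up for illustration; only the regex comes from the answer. The second call shows the limitation implied by "always in the given format": with more than one letter after the quoted dot, the lookahead no longer protects it.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {
    // Hypothetical helper: splits at every dot that is NOT followed by
    // exactly one ASCII letter and then a single quote.
    static String[] splitOnDots(String s) {
        return s.split("\\.(?![A-Za-z]')");
    }

    public static void main(String[] args) {
        // Single-letter tokens: the dot inside the quotes is left alone.
        System.out.println(Arrays.toString(splitOnDots("A.B.'C.D'")));   // [A, B, 'C.D']
        // Multi-letter tokens: the quoted dot is split as well.
        System.out.println(Arrays.toString(splitOnDots("A.B.'CC.DD'"))); // [A, B, 'CC, DD']
    }
}
```

Note that the quote characters stay in the result, so they would need to be stripped in a separate step if they are not wanted.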


First, split at ' and afterwards, if any of the split results ends in . then split that result at . as well.

"A.B.'C.D'"
=>
"A.B.", "C.D"
=> "A", "B", "C.D"

Java 8 Example

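import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public static void main(String[] args) {
    final String str = "A.B.'C.D'";
    final List<String> result = new ArrayList<>();

    // Splitting at ' yields "A.B." and "C.D" for this input.
    for (String singleQuoteSplitResultArrayElement : str.split("'")) {
        if (singleQuoteSplitResultArrayElement.endsWith(".")) {
            // Heuristic: a piece ending in a dot is unquoted, so split it at the dots.
            Collections.addAll(result, singleQuoteSplitResultArrayElement.split("\\."));
        } else {
            // Otherwise treat the piece as quoted and keep it whole.
            result.add(singleQuoteSplitResultArrayElement);
        }
    }

    System.out.println(result.stream().collect(Collectors.joining(", "))); // A, B, C.D
}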

What you can do is as follows. This works with both single-letter and multiple-letter tokens:

// requires: import java.util.Arrays;
String input = "A.B.'C.D'";
//                                              | not preceded by capital letter(s) and '
//                                              |           | dot (escaped)
//                                              |           |  | not followed by
//                                              |           |  | capital letter(s) and '
System.out.println(Arrays.toString(input.split("(?<![A-Z]+?')\\.(?![A-Z]+?')")));

Output

[A, B, 'C.D']

Note

If you want it case-insensitive, prepend (?i) to the Pattern: "(?i)(?<![A-Z]+?')\\.(?![A-Z]+?')"
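
As an alternative that does not depend on which characters the tokens contain, here is a sketch of a different, commonly used regex technique (not the one from the answer above): treat a dot as a separator only when an even number of single quotes follows it up to the end of the input. The class name EvenQuoteSplitDemo, the constant DOT_OUTSIDE_QUOTES and the helper splitOutsideQuotes are hypothetical, and the sketch assumes the quotes in the input are balanced.

```java
import java.util.Arrays;

public class EvenQuoteSplitDemo {
    // A dot followed, up to the end of the input, by zero or more complete
    // pairs of single quotes and then no further quote. Assumes balanced quotes.
    private static final String DOT_OUTSIDE_QUOTES = "\\.(?=(?:[^']*'[^']*')*[^']*$)";

    // Hypothetical helper: splits at dots that lie outside quoted sections.
    static String[] splitOutsideQuotes(String s) {
        return s.split(DOT_OUTSIDE_QUOTES);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOutsideQuotes("A.B.'C.D'")));         // [A, B, 'C.D']
        System.out.println(Arrays.toString(splitOutsideQuotes("'Foo.Bar'.x.'y.z'"))); // ['Foo.Bar', x, 'y.z']
    }
}
```

The lookahead rescans the rest of the input for every dot, so for very long inputs a single-pass loop is likely the better choice.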

I don't know of a method in the standard library that does this. It is not too difficult to write yourself, though:

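import java.util.ArrayList;
import java.util.List;

public static String[] splitByDots(String s)
{
    List<String> ss = new ArrayList<>();
    boolean inString = false; // true while we are between two single quotes
    int start = 0;            // index where the current token begins

    for (int p = 0; p < s.length(); p++) {
        char ch = s.charAt(p);
        if (ch == '\'') {
            // A quote opens or closes a quoted section.
            inString = !inString;
        }
        else if (ch == '.') {
            // Only a dot outside quotes ends the current token.
            if (!inString) {
                ss.add(s.substring(start, p));
                start = p + 1;
            }
        }
    }

    // Add the last token (everything after the final separator).
    ss.add(s.substring(start));
    return ss.toArray(new String[ss.size()]);
}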

If you want to trim whitespace or remove the quote characters, you will have to tweak the above code a bit, but otherwise it does what you asked for.
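
One possible shape for that tweak is sketched below. It is only an illustration under a few assumptions: the class name QuoteStrippingSplit and the method splitByDotsStripped are hypothetical, tokens are collected in a StringBuilder so the quote characters can be skipped, and trim() is applied to every token, which also removes leading and trailing whitespace from quoted parts.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteStrippingSplit {
    // Hypothetical variant of splitByDots: drops the quote characters
    // and trims surrounding whitespace from each token.
    static List<String> splitByDotsStripped(String s) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;

        for (int p = 0; p < s.length(); p++) {
            char ch = s.charAt(p);
            if (ch == '\'') {
                inString = !inString;          // toggle, but do not copy the quote
            } else if (ch == '.' && !inString) {
                parts.add(current.toString().trim());
                current.setLength(0);          // start a new token
            } else {
                current.append(ch);            // includes dots inside quotes
            }
        }

        parts.add(current.toString().trim());  // last token
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitByDotsStripped("A. B .'C.D'")); // [A, B, C.D]
    }
}
```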
