
Split string into two separated by colon with regex in Java and ignore colons within tags and quotes

I have been trying for a while to figure out a Java regex that would split something like the following into two pieces:

l&<6:98>9"hello:world"-45:&<78:89>"hedhed:hdeh"+56

It should be split at the colon after "-45", ignoring all colons inside tags and quotes. Neither side necessarily contains any tags or quotes.

Help would be greatly appreciated :)

This would be a starting point for a parsing function:

/** example: findCharIndex(subject, ':'); */
public static int findCharIndex(String subject, char findChar)
{
    boolean insideQuotes = false;
    boolean insideTags = false;
    for (int index = 0; index < subject.length(); index++)
    {
        char ch = subject.charAt(index);
        if (ch == '"')
            insideQuotes = !insideQuotes;
        else if (!insideQuotes)
        {
            if (ch == '<')
                insideTags = true;
            else if (insideTags && ch == '>')
                insideTags = false;
        }
        if (!insideQuotes && !insideTags && ch == findChar)
            return index;
    }
    return -1;
}
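
To actually get the two pieces, the returned index can be fed to substring. The following is only a sketch: the class name ColonSplitScan and the helper splitAtFirst are made up for illustration, and findCharIndex is repeated from above (same logic) so the example compiles on its own.

```java
class ColonSplitScan {
    /** Same scanning logic as findCharIndex above. */
    public static int findCharIndex(String subject, char findChar) {
        boolean insideQuotes = false;
        boolean insideTags = false;
        for (int index = 0; index < subject.length(); index++) {
            char ch = subject.charAt(index);
            if (ch == '"') {
                insideQuotes = !insideQuotes;
            } else if (!insideQuotes) {
                if (ch == '<') insideTags = true;
                else if (insideTags && ch == '>') insideTags = false;
            }
            if (!insideQuotes && !insideTags && ch == findChar) return index;
        }
        return -1;
    }

    /**
     * Hypothetical helper: returns {left, right}, or null when the
     * separator does not occur outside quotes and tags.
     */
    public static String[] splitAtFirst(String subject, char separator) {
        int index = findCharIndex(subject, separator);
        if (index < 0) return null;
        // The separator itself is dropped from both halves.
        return new String[] { subject.substring(0, index), subject.substring(index + 1) };
    }

    public static void main(String[] args) {
        String input = "l&<6:98>9\"hello:world\"-45:&<78:89>\"hedhed:hdeh\"+56";
        String[] parts = splitAtFirst(input, ':');
        System.out.println(parts[0]); // l&<6:98>9"hello:world"-45
        System.out.println(parts[1]); // &<78:89>"hedhed:hdeh"+56
    }
}
```

Returning null for "no separator found" is just one possible choice; returning the whole input as the left side would work as well.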

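Matching is easier than splitting. This pattern matches everything up to the first colon that lies outside quotes and tags: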

(?:[^"<:]|"[^"]*"|<[^>]*)*
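
One way this pattern might be used from Java is sketched below. The class name ColonSplitRegex and the method splitAtTopLevelColon are assumptions made for illustration; only java.util.regex is used. The pattern is anchored at the start with lookingAt(), and whatever follows the matched prefix is checked for a colon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonSplitRegex {
    // The pattern from above, escaped for a Java string literal.
    private static final Pattern LEFT_SIDE =
            Pattern.compile("(?:[^\"<:]|\"[^\"]*\"|<[^>]*)*");

    /**
     * Hypothetical helper: returns {left, right}, or null when the matched
     * prefix is not followed by a colon (no colon outside quotes and tags,
     * or an unterminated quote stopped the match early).
     */
    public static String[] splitAtTopLevelColon(String subject) {
        Matcher m = LEFT_SIDE.matcher(subject);
        m.lookingAt(); // always succeeds, since the pattern can match the empty string
        int end = m.end();
        if (end >= subject.length() || subject.charAt(end) != ':') {
            return null;
        }
        return new String[] { subject.substring(0, end), subject.substring(end + 1) };
    }

    public static void main(String[] args) {
        String input = "l&<6:98>9\"hello:world\"-45:&<78:89>\"hedhed:hdeh\"+56";
        String[] parts = splitAtTopLevelColon(input);
        System.out.println(parts[0]); // l&<6:98>9"hello:world"-45
        System.out.println(parts[1]); // &<78:89>"hedhed:hdeh"+56
    }
}
```

Note that the tag branch `<[^>]*` has no closing `>`; the `>` is simply consumed by the first alternative on the next iteration, so the result is the same for well-formed tags.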
