
Group and pattern matching in Java

How can I split this string into its parts using the Matcher and Pattern classes?

I tried this:

String question="A: this is data i want first  B: this is data i want second  C: this is data i want  third A: this is data i want first  B: this is data i want second  C: this is data i want  third ";

Pattern pattern = Pattern.compile("A:(.*?)B:(.*?)C:(.*?)A:", Pattern.DOTALL | Pattern.MULTILINE);           
Matcher m = pattern.matcher(question);
while (m.find()) {
    m.group(1);
    m.group(2);
    m.group(3);
}

This is a bit of a hack, but it works if you don't find a better answer.

Use this regex:

A:(.*?)B:(.*?)C:(.*?)(?=A:)

But you'll have to append a delimiter to the string (your question variable), so that the lookahead (?=A:) can also match after the last record:

Matcher m = pattern.matcher(question + "A:");

Used with println:

while (m.find()) {
    System.out.println(m.group(1));
    System.out.println(m.group(2));
    System.out.println(m.group(3));
}

It outputs:

this is data i want first
this is data i want second
this is data i want third
this is data i want first
this is data i want second
this is data i want third
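
Putting the pieces above together, a minimal self-contained sketch might look like the following. The class name LookaheadDemo and the parse helper are made up for illustration, and the trim() calls are an addition (the captured groups otherwise keep the whitespace around each field). It also assumes that "A:", "B:" and "C:" never occur inside the data itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and Matcher come from the JDK.
class LookaheadDemo {
    // Same regex as above: (?=A:) checks for the next "A:" without consuming it,
    // so the following find() can start at that "A:".
    private static final Pattern RECORD =
            Pattern.compile("A:(.*?)B:(.*?)C:(.*?)(?=A:)", Pattern.DOTALL);

    // Returns one String[3] per A/B/C record, with surrounding whitespace trimmed.
    public static List<String[]> parse(String question) {
        List<String[]> records = new ArrayList<>();
        // Append "A:" as a sentinel so the last record also has a terminator.
        Matcher m = RECORD.matcher(question + "A:");
        while (m.find()) {
            records.add(new String[] {
                m.group(1).trim(), m.group(2).trim(), m.group(3).trim()
            });
        }
        return records;
    }

    public static void main(String[] args) {
        String question = "A: this is data i want first  B: this is data i want second  "
                + "C: this is data i want  third A: this is data i want first  "
                + "B: this is data i want second  C: this is data i want  third ";
        for (String[] record : parse(question)) {
            System.out.println(String.join(" | ", record));
        }
    }
}
```

Collecting the groups into a list rather than printing them directly keeps each record's three fields together, which is usually what the caller wants next.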

Because the meaning of each delimiter depends on context (which delimiter came before it), you could use a parser generator like ANTLR, or you could code your own solution.

I would go with something like:

// SplitterStringMatcher is an interface you define yourself, not a library type
SplitterStringMatcher matcher = new SplitterStringMatcher() {
    // The delimiter letter expected next
    private char delimiter = 'A';
    // Return the count of characters matched, 0 if none
    @Override public int matches(String str, int pos) {
        if (str.length() > pos + 1
                && str.charAt(pos) == delimiter
                && str.charAt(pos + 1) == ':') {
            // Cycle the expected delimiter: A -> B -> C -> A
            if (++delimiter == 'D') { delimiter = 'A'; }
            return 2;
        }
        return 0;
    }
};

String[] strs = Splitter.split(question, matcher);

Then you implement the Splitter: it must split the input at every position where matches() returns a value greater than 0, skipping the number of characters returned.

You can also improve the matcher so that it matches whitespace before the delimiter letter and after the ':'.
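
One way the Splitter described above could be written is sketched below. Splitter and SplitterStringMatcher are the names from the snippet, but they are not library classes; the interface shape, the SplitterDemo class and the cyclingMatcher helper are assumptions made for this example. The sketch drops the empty piece before a leading delimiter and leaves whitespace in the pieces untouched.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class wiring the matcher and the splitter together.
class SplitterDemo {
    // Builds the matcher from the answer: expects "A:", "B:", "C:" in rotation.
    public static SplitterStringMatcher cyclingMatcher() {
        return new SplitterStringMatcher() {
            private char delimiter = 'A';
            @Override public int matches(String str, int pos) {
                if (str.length() > pos + 1
                        && str.charAt(pos) == delimiter
                        && str.charAt(pos + 1) == ':') {
                    if (++delimiter == 'D') { delimiter = 'A'; }
                    return 2;
                }
                return 0;
            }
        };
    }

    public static void main(String[] args) {
        String question = "A: first  B: second  C: third A: first  B: second  C: third ";
        for (String part : Splitter.split(question, cyclingMatcher())) {
            System.out.println("[" + part.trim() + "]");
        }
    }
}

// Assumed interface: returns the number of characters matched at pos, 0 if none.
interface SplitterStringMatcher {
    int matches(String str, int pos);
}

// Assumed implementation of the Splitter the answer asks the reader to write.
class Splitter {
    public static String[] split(String input, SplitterStringMatcher matcher) {
        List<String> parts = new ArrayList<>();
        int start = 0; // start of the piece currently being collected
        int pos = 0;   // current scan position
        while (pos < input.length()) {
            int len = matcher.matches(input, pos);
            if (len > 0) {
                // Emit the piece before the delimiter, except an empty leading one.
                if (pos > 0) {
                    parts.add(input.substring(start, pos));
                }
                pos += len; // skip the delimiter itself
                start = pos;
            } else {
                pos++;
            }
        }
        parts.add(input.substring(start)); // whatever follows the last delimiter
        return parts.toArray(new String[0]);
    }
}
```

Because the matcher is stateful, a "C:" that appears while "B:" is expected is treated as ordinary data, which is the context sensitivity a plain split on a regex does not give you. Create a fresh matcher for each call to split.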
