
Returning substring without markers using Pattern&Matcher

I want to use Pattern and Matcher to return the following string as multiple variables.

    ArrayList <Pattern> pArray = new ArrayList <Pattern>();
    pArray.add(Pattern.compile("\\[[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}\\]"));
    pArray.add(Pattern.compile("\\[\\d{1,5}\\]"));
    pArray.add(Pattern.compile("\\[[a-zA-Z[^#0-9]]+\\]"));
    pArray.add(Pattern.compile("\\[#.+\\]"));
    pArray.add(Pattern.compile("\\[[0-9]{10}\\]"));
    Matcher iMatcher;
    String infoString = "[03/12/13 10:00][30][John Smith][5554215445][#Comment]";
    for (int i = 0 ; i < pArray.size() ; i++)
    {
        //out.println(pArray.get(i).toString());
        iMatcher = pArray.get(i).matcher(infoString);

        while (iMatcher.find())
        {
                String found = iMatcher.group();
                out.println(found.substring(1, found.length()-1));
        }
    }

The program outputs:

[03/12/13 10:00]

[30]

[John Smith]

[#Comment]

[5554215445]

The only thing I need is for the program not to print the brackets and the # character. I can easily avoid printing the brackets by taking a substring inside the loop, but I cannot avoid the # character. The # is only a comment identifier in the string.

Can this be done inside the loop?

How about this?

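    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The class name is arbitrary; it is only here so the snippet compiles.
    public class Main {
        public static void main(String[] args) {
            String infoString = "[03/12/13 10:00][30][John Smith][5554215445][#Comment]";
            // One pattern for every field: an optional '#', then a lazy capture up to the next ']'.
            final Pattern pattern = Pattern.compile("\\[#?(.+?)\\]");
            final Matcher matcher = pattern.matcher(infoString);
            while (matcher.find()) {
                System.out.println(matcher.group(1));
            }
        }
    }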

You just need to make the .+ non-greedy (.+?) so that it matches everything between a single pair of square brackets. We then use a capturing group, written (pattern), to grab only the part we want rather than the whole match. The #? matches an optional hash before the group so that it does not end up inside the group.

The match group is retrieved using matcher.group(1).

Output:

03/12/13 10:00
30
John Smith
5554215445
Comment
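
To see why the non-greedy quantifier matters here, the following is a minimal sketch that runs the greedy and the lazy variant of the same pattern against the string from the question. The class name GreedyVsLazyDemo and the helper capture are made up for this illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original answer.
class GreedyVsLazyDemo {
    // Collects group(1) of every match of the given regex in the input.
    static List<String> capture(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        String infoString = "[03/12/13 10:00][30][John Smith][5554215445][#Comment]";
        // Greedy: .+ runs on to the last ']' so a single match spans all fields.
        System.out.println(capture("\\[#?(.+)\\]", infoString));
        // Lazy: .+? stops at the first ']' so each field becomes its own match.
        System.out.println(capture("\\[#?(.+?)\\]", infoString));
    }
}
```

The greedy call returns a list with one long element that still contains the inner brackets, while the lazy call returns the five fields separately.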

Use lookarounds, i.e. replace every \\[ in your regexes with a positive lookbehind:

(?<=\\[)

and then replace every \\] in your regexes with a positive lookahead:

(?=\\])

Finally, in the comment pattern, replace \\[# with a positive lookbehind that also covers the #:

(?<=\\[#)
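
As a rough sketch of how this could look when applied to the five patterns from the question, assuming they are otherwise left unchanged: since lookarounds are zero-width, group() already returns the text without brackets or the #, so no substring call is needed. The class name LookaroundFieldsDemo and the method extractFields are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original answer.
class LookaroundFieldsDemo {
    // Runs each pattern over the input and collects the matched text.
    static List<String> extractFields(String input) {
        List<Pattern> patterns = new ArrayList<>();
        // Same patterns as in the question, with \\[ and \\] turned into lookarounds.
        patterns.add(Pattern.compile("(?<=\\[)[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}(?=\\])"));
        patterns.add(Pattern.compile("(?<=\\[)\\d{1,5}(?=\\])"));
        patterns.add(Pattern.compile("(?<=\\[)[a-zA-Z[^#0-9]]+(?=\\])"));
        patterns.add(Pattern.compile("(?<=\\[#).+(?=\\])"));
        patterns.add(Pattern.compile("(?<=\\[)[0-9]{10}(?=\\])"));

        List<String> fields = new ArrayList<>();
        for (Pattern p : patterns) {
            Matcher m = p.matcher(input);
            while (m.find()) {
                // The lookarounds are not part of the match, so group() is already clean.
                fields.add(m.group());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String infoString = "[03/12/13 10:00][30][John Smith][5554215445][#Comment]";
        for (String field : extractFields(infoString)) {
            System.out.println(field);
        }
    }
}
```

The fields come out in pattern order (date, number, name, comment, phone number), the same order as in the question's loop.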
