
Extract a String from within a String using a Regular Expression


I have a very large String containing within it some markers like:

{codecitation class="brush: java; gutter: true;" width="700px"}

I need to collect all the markers contained in the long String. The difficulty is that each marker has different parameter values. The only thing they have in common is the initial part:

{codecitation class="brush: [VARIABLE PART] }

Do you have any suggestions for collecting all the markers in Java using a regular expression?

Use Pattern and Matcher to find the markers, as shown below. I hope this helps.

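import java.util.regex.Matcher;
import java.util.regex.Pattern;

String xmlString = "{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}efasf{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}";
// Match "{codecitation", then any run of letters, digits, spaces, quotes, ':', ';' and '=', then "}"
Pattern pattern = Pattern.compile("(\\{codecitation)([0-9 a-z A-Z \":;=]*)(\\})");
Matcher matcher = pattern.matcher(xmlString);

// Print each complete marker found in the input
while (matcher.find()) {
    System.out.println(matcher.group());
}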
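
The character class in the pattern above only admits letters, digits, spaces, quotes, ':', ';' and '=', so a marker whose parameters contain anything else (a hyphen, for instance) would be skipped. A minimal sketch of a more permissive variant follows. It assumes a marker never contains a literal '}' inside its parameters; the class name MarkerCollector, the method collectMarkers, and the first-line parameter in the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MarkerCollector {

    // "{codecitation" as a whole word, then anything up to the first '}', then '}'.
    // Assumption: the parameters themselves never contain a '}' character.
    private static final Pattern MARKER = Pattern.compile("\\{codecitation\\b[^}]*\\}");

    // Returns every marker found in the text, in order of appearance.
    public static List<String> collectMarkers(String text) {
        List<String> markers = new ArrayList<>();
        Matcher matcher = MARKER.matcher(text);
        while (matcher.find()) {
            markers.add(matcher.group());
        }
        return markers;
    }

    public static void main(String[] args) {
        // Hypothetical input: the second marker uses a hyphenated parameter name.
        String text = "intro {codecitation class=\"brush: java; gutter: true;\" width=\"700px\"} middle "
                + "{codecitation class=\"brush: xml; first-line: 10;\"} end";
        for (String marker : collectMarkers(text)) {
            System.out.println(marker);
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which may matter when the input String is very large.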

I guess you are particularly interested in the brush: java; and gutter: true; parts.

Maybe this snippet helps:

package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CodecitationParserTest {

    public static void main(String[] args) {
        String testString = "{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}";
        Pattern codecitationPattern = Pattern
                .compile("\\{codecitation class=[\"]([^\"]*)[\"][^}]*\\}");
        Matcher matcher = codecitationPattern.matcher(testString);

        Pattern attributePattern = Pattern
                .compile("\\s*([^:]*): ([^;]*);(.*)$");
        Matcher attributeMatcher;
        while (matcher.find()) {
            System.out.println(matcher.group(1));
            attributeMatcher = attributePattern.matcher(matcher.group(1));
            while (attributeMatcher.find()) {
                System.out.println(attributeMatcher.group(1) + "->"
                        + attributeMatcher.group(2));
                attributeMatcher = attributePattern.matcher(attributeMatcher
                        .group(3));
            }
        }
    }

}

The codecitationPattern captures the content of the class attribute of a codecitation marker. The attributePattern captures the first key, its value, and the remainder of the string, so it can be applied again to the remainder until no key/value pairs are left.
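
As an alternative to re-matching on the remainder, a single Matcher can walk through the attribute string with repeated find() calls. This is only a sketch under the assumption that every pair has the form "key: value;" as in the example; the class name BrushAttributeParser and the method parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BrushAttributeParser {

    // One "key: value;" pair. Whitespace around the key and the value is skipped.
    // Assumption: keys contain no ':' or ';' and values contain no ';'.
    private static final Pattern PAIR = Pattern.compile("\\s*([^:;]+?)\\s*:\\s*([^;]*?)\\s*;");

    // Collects all key/value pairs of a class attribute, preserving their order.
    public static Map<String, String> parse(String classAttribute) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(classAttribute);
        // Each find() resumes where the previous match ended.
        while (matcher.find()) {
            result.put(matcher.group(1), matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("brush: java; gutter: true;"));
    }
}
```

For the sample attribute this prints {brush=java, gutter=true}. The argument to parse would typically be group(1) of the codecitationPattern match.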

