
Regex: extract String from String

I need a regex that extracts part of a String. I get this String by parsing an XML document with DOM. I then look for the "§regex" part in this String and try to extract its value, e.g. "([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3})", from the rest.

The problem is that I don't know how to make sure the extracted part ends with a ")". The regex needs to work for every value given. The goal is to write only the value in brackets after "§regex=", including the brackets, into a String.

<UML:TaggedValue tag="description" value=" random Text §regex=([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3}) random text"/>

private List<String> findRegex() {
    List<String> forReturn = new ArrayList<String>();
    for (String str : attDescription) {
        if (str.contains("§regex=")) {
            String s = str.replaceAll(regex);
            forReturn.add(s);
        }
    }
    return forReturn;
}

attDescription is a list that contains all attributes found in the parsed XML document.

So far I have tried replaceAll with the regex ".*(§regex=)(.*)[)$].*" and the replacement "$2", but this cuts off the ")" and does not delete the text in front of the searched part. Even with the help of http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html I really don't understand how to get what I need.

It seems to work for me (with this example anyway) if I use this in place of String s = str.replaceAll(regex);

String s = str.replaceAll( ".*§regex=(\\(.*\\)).*", "$1" );

It just looks for a substring enclosed in parentheses directly after §regex=. Because .* is greedy, the captured group runs up to the last ")" in the string.
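If you would rather not match the whole string, the same pattern can probably be used with Pattern and Matcher instead of replaceAll. This is only a sketch: the class name RegexTagExtractor and the method extract are made up for illustration, and it assumes the text after the value contains no further ")". One difference worth noting is that replaceAll returns the input unchanged when nothing matches, while this version returns null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class RegexTagExtractor {
    // Group 1 starts at the "(" right after "§regex=" and, because .* is
    // greedy, ends at the LAST ")" in the input.
    private static final Pattern TAG = Pattern.compile("§regex=(\\(.*\\))");

    /** Returns the bracketed value including its brackets, or null if there is none. */
    static String extract(String description) {
        Matcher m = TAG.matcher(description);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String str = " random Text §regex=([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3}) random text";
        System.out.println(extract(str));
    }
}
```

With find() there is no need for the leading and trailing .* that replaceAll requires to swallow the surrounding text.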

This seems to work:

String s = str.replaceAll(".*§regex=\\((.*)[)].*", "$1");

Note:

  • Escape the leading bracket
  • The $ inside a character class is a literal $, not an end-of-input anchor. Drop it, because your regex should always end with a bracket
  • No need to capture the fixed text
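To illustrate the second point, here is a small sketch comparing "[)$]" with an escaped bracket followed by a real anchor. The class and method names are hypothetical; only String.matches from the standard library is used.

```java
// Hypothetical demo class, for illustration only.
final class CharClassDemo {
    // Inside [...] both ")" and "$" are plain characters.
    static boolean endsWithBracketOrDollar(String s) {
        return s.matches(".*[)$]");
    }

    // Outside a character class, "\\)" is a literal bracket and "$" anchors the end.
    static boolean endsWithBracket(String s) {
        return s.matches(".*\\)$");
    }

    public static void main(String[] args) {
        System.out.println(endsWithBracketOrDollar("abc)")); // true
        System.out.println(endsWithBracketOrDollar("abc$")); // true, "$" matched literally
        System.out.println(endsWithBracket("abc)"));         // true
        System.out.println(endsWithBracket("abc$"));         // false
    }
}
```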

Test code; note that this also works when the value contains brackets of its own (here it is wrapped in an extra pair):

String str = "random Text §regex=(([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3})) random text";
String s = str.replaceAll(".*§regex=\\((.*)[)].*", "$1");
System.out.println(s);

Output:

([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3})
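Since the question asks for the value including its brackets, it may help to run both suggested patterns side by side on the single-bracket input from the question. A sketch, with made-up names keepBrackets and dropOuterBrackets: the first pattern captures the brackets, the second captures only what is between the outermost pair.

```java
// Hypothetical comparison class, for illustration only.
final class BracketComparison {
    // Brackets are inside the capturing group, so they are kept.
    static String keepBrackets(String s) {
        return s.replaceAll(".*§regex=(\\(.*\\)).*", "$1");
    }

    // Brackets are matched outside the group, so the outer pair is dropped.
    static String dropOuterBrackets(String s) {
        return s.replaceAll(".*§regex=\\((.*)[)].*", "$1");
    }

    public static void main(String[] args) {
        String str = " random Text §regex=([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3}) random text";
        System.out.println(keepBrackets(str));      // ([A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3})
        System.out.println(dropOuterBrackets(str)); // [A-ZÄÖÜ]{1,3}[- ][A-Z]{1,2}[1-9][0-9]{0,3}
    }
}
```

Both variants rely on greedy matching, so a ")" in the trailing text would be pulled into the result.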
