
Java regex pattern matcher

I have a string of the following format:

String name = "A|DescA+B|DescB+C|DescC+...X|DescX+"

So the repeating pattern is ?|?+ , and I don't know how many repetitions there will be. I want to extract the part before each | , so for my example I want a list (an ArrayList, for example) that contains:

[A, B, C, ... X]

I have tried the following pattern:

(.+)\\|.*\\+

but that doesn't work the way I want it to. Any suggestions?

To convert this into a list, you can do the following:

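import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
Matcher m = Pattern.compile("([^|]+)\\|.*?\\+").matcher(name);
List<String> matches = new ArrayList<String>();
// Each find() consumes one "key|description+" unit; group 1 is the key.
while (m.find()) {
    matches.add(m.group(1));
}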

This gives you the list:

[A, B, C, X]

Note the ? in the middle: it makes the * lazy instead of greedy, which prevents the second part of the regex from consuming the rest of the string.
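To see the difference the lazy quantifier makes, here is a minimal sketch that runs the same loop with both variants of the pattern. The class name LazyVsGreedyDemo and the helper firstGroups are made up for illustration; only the two patterns and the sample string come from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the helper are illustrative only.
final class LazyVsGreedyDemo {

    // Collects group 1 of every successive match of the given regex.
    static List<String> firstGroups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        // Greedy .* runs to the last '+', so one match swallows the whole string.
        System.out.println(firstGroups("([^|]+)\\|.*\\+", name));  // [A]
        // Lazy .*? stops at the first '+', leaving the rest for later matches.
        System.out.println(firstGroups("([^|]+)\\|.*?\\+", name)); // [A, B, C, X]
    }
}
```

With the greedy version the first match already covers the entire input, so find() has nothing left to match on the second iteration.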

Your . matches any character, and that includes the | , so the greedy .+ first consumes the entire string. Only then does the engine look for a | , and it backtracks no further than it has to, so the group ends at the last | instead of the first one.
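A quick way to check what the pattern from the question actually captures is to print group 1 of its first match. This is only a sketch; the class GreedyCaptureDemo and the method firstCapture are hypothetical names, not something from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing what the question's pattern captures.
final class GreedyCaptureDemo {

    // Returns group 1 of the first match, or null if the regex never matches.
    static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        // The greedy (.+) grabs everything, then backtracks only to the LAST '|'.
        System.out.println(firstCapture("(.+)\\|.*\\+", name));
        // prints: A|DescA+B|DescB+C|DescC+X
    }
}
```

So the pattern does match, but the group holds almost the whole string rather than just A.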

So, try to match any character but | like this:

"([^|]+)\\|.*\\+"

And if it fits your input, anchor the pattern: use ^ to make sure the all-but-| part is at the beginning of the string, and $ to make sure there's a + at the end of the string:

"^([^|]+)\\|.*\\+$"

UPDATE: Tim Pietzcker makes a good point: since you are already matching until you find a | , you could just as well match the rest of the string and be done with it:

"^([^|]+).*\\+$"

UPDATE2: By the way, if you want to simply get the first part of the string, you can simplify things with:

myString.split("\\|")[0]
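As a rough comparison, the sketch below extracts the first part both ways, with the anchored regex and with split. Note that both return only the first element (A for the sample string), not the full list. The class FirstPartDemo and its method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two ways to get the first part.
final class FirstPartDemo {

    // Anchored regex: group 1 is everything before the first '|'.
    static String viaRegex(String input) {
        Matcher m = Pattern.compile("^([^|]+).*\\+$").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // split treats its argument as a regex, so the '|' must be escaped.
    static String viaSplit(String input) {
        return input.split("\\|")[0];
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        System.out.println(viaRegex(name)); // A
        System.out.println(viaSplit(name)); // A
    }
}
```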

Another idea: Find all characters between + (or start of string) and | :

List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile("(?<=^|[+])[^|]+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}

I think the simplest solution is to split on \\+ and then apply the pattern (.+?)\\|.* to each part to extract the group you need.
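A minimal sketch of that split-then-match approach might look like this. The class SplitThenMatchDemo and the method extractKeys are hypothetical names; the sketch also assumes descriptions never contain a + or a line break.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: split on '+', then match each part.
final class SplitThenMatchDemo {

    // Lazy (.+?) stops at the first '|'; .* takes the description.
    private static final Pattern PART = Pattern.compile("(.+?)\\|.*");

    static List<String> extractKeys(String input) {
        List<String> keys = new ArrayList<>();
        // String.split drops trailing empty strings, so the final '+'
        // does not produce an empty last part.
        for (String part : input.split("\\+")) {
            Matcher m = PART.matcher(part);
            if (m.matches()) {
                keys.add(m.group(1));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(extractKeys("A|DescA+B|DescB+C|DescC+X|DescX+"));
        // prints: [A, B, C, X]
    }
}
```

Parts that do not contain a | are silently skipped here; whether that is the right behaviour depends on how strict you want to be about malformed input.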

