
Regular Expression in Java. Splitting a string using pattern and matcher

I am trying to get all the matching groups in my string. My regular expression is "(?<!')/|/(?!')". I am trying to split the string using a regular expression with Pattern and Matcher. The string needs to be split on /, but a / surrounded by single quotes ('/') must be skipped. For example, "One/Two/Three'/'3/Four" needs to be split as ["One", "Two", "Three'/'3", "Four"], but without using the .split method.

I am currently using the code below:

      // String to be scanned to find the pattern.
      String line = "Test1/Test2/Tt";
      String pattern = "(?<!')/|/(?!')";

      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);

      // Now create matcher object.
      Matcher m = r.matcher(line);
      
      if (m.matches()) {
         System.out.println("Found value: " + m.group(0) );
        
      } else {
         System.out.println("NO MATCH");
      }

But it always prints "NO MATCH". Where am I going wrong, and how do I fix it?

Thanks in advance

To get the matches without using split, you might use

[^'/]+(?:'/'[^'/]*)*

Explanation

  • [^'/]+ Match 1+ times any char except ' or /
  • (?: Non capture group
    • '/'[^'/]* Match '/' followed by optionally matching any char except ' or /
  • )* Close group and optionally repeat it


String regex = "[^'/]+(?:'/'[^'/]*)*";
String string = "One/Two/Three'/'3/Four";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);

while (matcher.find()) {
    System.out.println(matcher.group(0));
}

Output

One
Two
Three'/'3
Four
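
As for why the code in the question prints NO MATCH: Matcher.matches() succeeds only when the whole input matches the pattern, while find() looks for the next matching subsequence. The pattern in the question matches a single /, so it can never match the entire line. A minimal sketch of the difference, where the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the question's code.
class MatchesVersusFind {
    // Pattern from the question: a slash not preceded by ' or not followed by '.
    static final Pattern SLASH = Pattern.compile("(?<!')/|/(?!')");

    // True only if the entire input is one delimiter match.
    static boolean wholeInputMatches(String input) {
        return SLASH.matcher(input).matches();
    }

    // Counts the delimiter occurrences that find() locates in the input.
    static int countDelimiters(String input) {
        Matcher m = SLASH.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "Test1/Test2/Tt";
        System.out.println(wholeInputMatches(line)); // false
        System.out.println(countDelimiters(line));   // 2
    }
}
```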

Edit

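If you want to avoid splitting on a / only when it is surrounded by single quotes, you might also use a pattern that matches runs of characters other than / and allows a / inside a match only when it has a single quote on both sides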

[^/]+(?:(?<=')/(?=')[^/]*)*
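
A small sketch of how this second pattern could be used with find(), assuming the same sample input as above. The class and method names are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class QuotedSlashTokens {
    // Runs of non-slash characters; a slash is allowed inside a token
    // only when it has a single quote directly before and after it.
    static final Pattern TOKEN = Pattern.compile("[^/]+(?:(?<=')/(?=')[^/]*)*");

    // Collects every token that find() locates, in order.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [One, Two, Three'/'3, Four]
        System.out.println(tokens("One/Two/Three'/'3/Four"));
    }
}
```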


Try this.

String line = "One/Two/Three'/'3/Four";
Pattern pattern = Pattern.compile("('/'|[^/])+");
Matcher m = pattern.matcher(line);
while (m.find())
    System.out.println(m.group());

output:

One
Two
Three'/'3
Four

Try something like this:

  String line = "One/Two/Three'/'3/Four";
  String pattern = "([^/]+'/'\\d)|[^/]+";

  Pattern r = Pattern.compile(pattern);
  Matcher m = r.matcher(line);
  
  boolean found = false;
  while(m.find()) {
     System.out.println("Found value: " + m.group() );
     found = true;
  } 
  if(!found) {
     System.out.println("NO MATCH");
  }

Output:

Found value: One
Found value: Two
Found value: Three'/'3
Found value: Four

Here is a simple pattern that matches all the desired / characters, so you can split on them:

(?<=[^'])\/(?=')|(?<=')\/(?=[^'])|(?<=[^'])\/(?=[^'])

The logic is as follows: we have 4 cases:

  1. / is surrounded by ' , i.e. '/'

  2. / is preceded by ' , i.e. '/

  3. / is followed by ' , i.e. /'

  4. / is surrounded by characters other than '

You want to exclude only case 1. So we need a regex for the other three cases; I have written three similar regexes and combined them with alternation.

Explanation of the first part (the other two are analogous):

(?<=[^']) - positive lookbehind, asserts that what precedes is a character other than ' (negated character class [^'])

\/ - match / literally

(?=') - positive lookahead, asserts that what follows is '
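
Since the question asks to avoid .split, one way to use this delimiter pattern is to walk the matches with find() and cut the input between them. This is only a sketch; the class and method names are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class SplitOnUnquotedSlash {
    // The three-way alternation from above: every slash except one
    // that has a single quote on both sides.
    static final Pattern DELIMITER = Pattern.compile(
            "(?<=[^'])\\/(?=')|(?<=')\\/(?=[^'])|(?<=[^'])\\/(?=[^'])");

    // Cuts the input at each delimiter match, without calling split.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = DELIMITER.matcher(input);
        int start = 0;
        while (m.find()) {
            parts.add(input.substring(start, m.start()));
            start = m.end();
        }
        parts.add(input.substring(start)); // text after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        // Prints [One, Two, Three'/'3, Four]
        System.out.println(split("One/Two/Three'/'3/Four"));
    }
}
```

Note that each lookaround requires an actual character on that side, so a / at the very start or end of the input is not treated as a delimiter by this pattern.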

