Java regex (java.util.regex). Search for dollar sign

I have a search string. When it contains a dollar symbol, I want to capture all the characters after it, up to but not including a dot or a subsequent dollar symbol. The latter would start the next match. So for either of these search strings:

"/bla/$V_N.$XYZ.bla";
"/bla/$V_N.$XYZ";

I would want to return:

  • V_N
  • XYZ

If the search string contains percent symbols, I also want to return what's between the pair of % symbols.

The following regex seems to do the trick for that.

 "%([^%]*?)%";

Meaning:

  • start and end with a %,
  • have a capture group, the (),
  • have a character class containing anything except a % symbol (the leading caret negates the class),
  • repeated, but not greedily: *?
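As a quick check of the breakdown above, here is a minimal sketch that runs the percent pattern through java.util.regex. The class name PercentDemo, the method name betweenPercents and the sample input are made up for illustration; only the pattern itself comes from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentDemo {
    // Returns the text found between each pair of % symbols.
    static List<String> betweenPercents(String input) {
        Matcher m = Pattern.compile("%([^%]*?)%").matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1)); // group 1 is the part inside the parentheses
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical input, not taken from the question.
        System.out.println(betweenPercents("/bla/%FOO%/x/%BAR%.txt")); // [FOO, BAR]
    }
}
```

Because the class [^%] can never run past a % anyway, the lazy *? and a greedy * should give the same captures here.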

Where some languages use %1, %2 to refer to capture groups, Java uses a backslash followed by the group number (\1, \2) instead. So this string compiles and generates output.
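To illustrate the backreference syntax, here is a small sketch. The class BackrefDemo, the method sameDelimiters and the test strings are hypothetical; the point is only that \\1 in a Java string literal reaches the regex engine as \1 and must repeat whatever group 1 matched.

```java
import java.util.regex.Pattern;

class BackrefDemo {
    // True when the string starts and ends with the same delimiter (% or $)
    // and has no other % or $ in between.
    static boolean sameDelimiters(String s) {
        // "\\1" in Java source is the regex \1: a backreference to group 1.
        return Pattern.matches("([%$])[^%$]*\\1", s);
    }

    public static void main(String[] args) {
        System.out.println(sameDelimiters("%abc%")); // true
        System.out.println(sameDelimiters("$abc$")); // true
        System.out.println(sameDelimiters("%abc$")); // false: \1 must repeat the %
    }
}
```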

I suspect the dollar symbol and dot need escaping, as they are special symbols:

  • $ usually means end of string
  • . is a metacharacter that matches any character.

I have tried using double backslashes (\\):

  • both inside character classes, e.g. [^\\.\\$%]
  • and using alternation, e.g. %|\\$

in attempts to combine this logic, and can't seem to get anything to play ball.

I wonder if another pair of eyes can see how to solve this conundrum!

My attempts so far:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main {
    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        String pattern = "([%\\$])([^%\\.\\$]*?)\\1?";
        /* Either % or $ in first capture group ([%\\$])
         * Second capture group - anything except %, dot or dollar sign
         * non greedy group ( *?)
         * then a backreference to an optional first capture group \\1?
         * Have to use two \, since you escape \ in a Java string.
         */
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(search);
        List<String> results = new ArrayList<String>();
        while (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                results.add(m.group(i));
            }
        }
        for (String result : results) {
            System.out.println(result);
        }
    }
}

You may use

String search = "/bla/$V_N.$XYZ.bla";
String pattern = "[%$]([^%.$]*)";
Matcher matcher = Pattern.compile(pattern).matcher(search);
while (matcher.find()){
    System.out.println(matcher.group(1)); 
} // => V_N, XYZ

See the Java demo and the regex demo.
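For anyone who wants to run the snippet above as a whole program, here is a minimal self-contained sketch. The class name PlaceholderNames and the method extract are invented for this example; the pattern and both search strings are the ones from the question and answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderNames {
    // Compiled once and reused; the pattern is the one suggested above.
    private static final Pattern NAME = Pattern.compile("[%$]([^%.$]*)");

    // Collects Group 1 of every match, i.e. the text after each % or $.
    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(extract("/bla/$V_N.$XYZ.bla")); // [V_N, XYZ]
        System.out.println(extract("/bla/$V_N.$XYZ"));     // [V_N, XYZ]
    }
}
```

One thing to watch with %...% pairs: the closing % also starts a match of its own, so an input such as "/bla/%FOO%" (a made-up example) yields FOO followed by an empty string. Skipping empty Group 1 values in the loop is one possible way to handle that, if it matters for your data.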

NOTE

  • You do not need the optional \1? at the end of the pattern. Because it is optional, it does not restrict the match context and is redundant (the negated character class already cannot match $ or %).
  • [%$]([^%.$]*) matches % or $, then captures into Group 1 any zero or more chars other than %, . and $. You only need the Group 1 value, hence matcher.group(1) is used.
  • In a character class, neither . nor $ is special, so they do not need escaping in [%.$] or [%$].
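The notes can be checked with a short experiment. The sketch below uses a hypothetical helper, NotesCheck.groups, to compare the escaped and unescaped character classes. It also shows one observation that goes slightly beyond the notes: in the original attempt, the lazy *? is followed only by the optional \1?, so the engine can stop after zero characters and the second group comes back empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NotesCheck {
    // Collects the given capture group from every match of regex in input.
    static List<String> groups(String regex, int group, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";

        // Unescaped and escaped character classes give the same captures.
        System.out.println(groups("[%$]([^%.$]*)", 1, search));       // [V_N, XYZ]
        System.out.println(groups("[%\\$]([^%\\.\\$]*)", 1, search)); // [V_N, XYZ]

        // Original attempt: lazy *? with nothing mandatory after it
        // matches zero characters, so group 2 is empty for both matches.
        System.out.println(groups("([%\\$])([^%\\.\\$]*?)\\1?", 2, search)); // [, ]
    }
}
```

Escaping . and $ inside the class is harmless, just unnecessary; the empty captures in the last line come from the lazy quantifier, not from the escaping.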
