
Regex when pattern involves dollar sign ($)

I'm running into a bit of an issue when it comes to matching sub-patterns that involve the dollar sign. For example, consider the following chunk of text:

(en $) foo
oof ($).
ofo (env. 80 $US)

I'm using the following regex and replacement code:

Pattern p = Pattern.compile(
            "\\([\\p{InARABIC}\\s]+\\)|\\([\\p{InBasic_Latin}\\s?\\$]+\\)|\\)([\\p{InARABIC}\\s]+)\\(",
            Pattern.CASE_INSENSITIVE);

public String replace(String text) {
    Matcher m = p.matcher(text);
    String replacement = m.replaceAll(match -> {
        if (m.group(1) == null) {
            return m.group();
        } else {
            return "(" + match.group(1) + ")";
        }
    });
    return replacement;
}

but it fails on text that contains $.

This code behaves like replaceAll(regex, replacement). The problem is that $ is special not only in the regex argument but also in the replacement, where it refers to matched groups: $x (where x is the group number) or ${groupName} if your regex contains (?<groupName>subregex).

This allows us to write code like

String doubled = "abc".replaceAll(".", "$0$0");
System.out.println(doubled); //prints: aabbcc

This replaces each character with two copies of itself: each character is matched by . and stored in group 0, so $0$0 stands for the matched character repeated twice.
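The same mechanism works for numbered and named groups other than group 0. Here is a small sketch; the class name, method names and sample input are made up for illustration:

```java
class GroupReferenceDemo {
    // Swaps two space-separated words using the numbered references $2 and $1.
    static String swapNumbered(String input) {
        return input.replaceAll("(\\w+) (\\w+)", "$2 $1");
    }

    // Same swap, but the groups are named and referenced as ${name}.
    static String swapNamed(String input) {
        return input.replaceAll("(?<first>\\w+) (?<second>\\w+)", "${second} ${first}");
    }

    public static void main(String[] args) {
        System.out.println(swapNumbered("hello world")); // prints: world hello
        System.out.println(swapNamed("hello world"));    // prints: world hello
    }
}
```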

But in your case the text itself contains $, and when it is matched you replace the match with itself. The replacement then contains a $ with no group number or group name after it, which results in IllegalArgumentException: Illegal group reference.
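You can reproduce the failure in isolation. This is only a sketch with a made-up class name and input; the replacement "$)" stands in for what your lambda returns when the match is "($)":

```java
class IllegalGroupReferenceDemo {
    // Returns true if a replacement containing a bare $ makes replaceAll throw.
    static boolean throwsOnBareDollar() {
        try {
            // "$)" has a $ that is not followed by a digit or {name}
            "price: 80".replaceAll("\\d+", "$)");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnBareDollar()); // prints: true
    }
}
```

Note that the exception only shows up when the regex actually matches something, because the replacement string is not parsed until a match has to be replaced.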

The solution is to escape that $ in the replacement. You can do it manually with a backslash (written as \\ in a Java string literal), but it is better to use the method designed for that purpose, Matcher#quoteReplacement. If the replacement syntax ever gains more special characters, this method should evolve along with the regex engine, which saves you trouble later.
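To see what Matcher.quoteReplacement does, here is a minimal sketch (class name, method names and sample strings are assumptions for illustration):

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Returns the string with \ and $ escaped so it is treated literally as a replacement.
    static String quoted(String s) {
        return Matcher.quoteReplacement(s);
    }

    // Replaces every run of digits with the given text, taken literally.
    static String replaceDigitsWith(String input, String literal) {
        return input.replaceAll("\\d+", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(quoted("($)"));                       // prints: (\$)
        System.out.println(replaceDigitsWith("env. 80", "$US")); // prints: env. $US
    }
}
```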

So try changing your code to

public String replace(String text) {
    Matcher m = p.matcher(text);
    String replacement = m.replaceAll(match -> {
        if (match.group(1) == null) {
            // Group 1 did not participate: keep the match, with any $ escaped
            return Matcher.quoteReplacement(match.group());
        } else {
            // Swap the reversed parentheses, again escaping the result
            return Matcher.quoteReplacement("(" + match.group(1) + ")");
        }
    });
    return replacement;
}
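If you want something you can paste and run, here is a self-contained sketch. It assumes Java 9 or later (for the replaceAll overload that takes a function); the class name ParenFixer and the sample inputs in main are made up, and the pattern is copied from the question:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenFixer {
    // Same pattern as in the question; only the third alternative captures group 1.
    private static final Pattern P = Pattern.compile(
            "\\([\\p{InARABIC}\\s]+\\)|\\([\\p{InBasic_Latin}\\s?\\$]+\\)|\\)([\\p{InARABIC}\\s]+)\\(",
            Pattern.CASE_INSENSITIVE);

    static String replace(String text) {
        return P.matcher(text).replaceAll(match -> {
            // Keep the match unchanged unless group 1 captured text between reversed parentheses
            String result = match.group(1) == null
                    ? match.group()
                    : "(" + match.group(1) + ")";
            // Escape $ and \ so the result is inserted literally
            return Matcher.quoteReplacement(result);
        });
    }

    public static void main(String[] args) {
        System.out.println(replace("(en $) foo"));        // prints: (en $) foo
        System.out.println(replace("ofo (env. 80 $US)")); // prints: ofo (env. 80 $US)
    }
}
```

Quoting the result once, after choosing between the two branches, keeps the escaping in a single place.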
