
Escape special characters in a text when text is either enclosed in double quotes or not

I am writing a regex to escape a few special characters, including the double quote, in the input.

The input may be enclosed in double quotes, and those enclosing quotes should not be escaped.

Example inputs:

"te(st", te(st, te"st 

Expected outputs:

"te\(st", te\(st, te\"st

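Code used:

import java.util.regex.Pattern;

String regex = "^\".*\"$";
String value = "\"strin'g\"";
// The double quote inside the character class must be escaped in the Java string literal.
Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[()'\"\\[\\]*]");

if (Pattern.matches(regex, value)) {
    // Enclosed in quotes: escape only the text between them, then restore the quotes.
    String val = value.substring(1, value.length() - 1);
    String replaceAll = SPECIAL_REGEX_CHARS.matcher(val).replaceAll("\\\\$0");
    replaceAll = "\"" + replaceAll + "\"";
    System.out.println(replaceAll);
} else {
    // Not enclosed in quotes: escape the whole text.
    String replaceAll = SPECIAL_REGEX_CHARS.matcher(value).replaceAll("\\\\$0");
    System.out.println(replaceAll);
}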

1 - Check whether the text is enclosed in double quotes. If it is, escape the special characters in the text between the quotes.

2 - Otherwise, escape the special characters in the whole text.

Is there a single regular expression that can combine #1 and #2?

Regards, Anil

Simple solution with one escaping regex only

You can use if (s.startsWith("\"") && s.endsWith("\"")) to check whether the string has both a leading and a trailing ". If it does, strip them with replaceAll("^\"|\"$", ""), escape the rest using your escaping regex, and then add the " back. Otherwise, just escape the characters in your set.

String SPECIAL_REGEX_CHARS = "[()'\"\\[\\]*]";
String s = "\"te(st\""; // => "te\(st"
String result;
if (s.startsWith("\"") && s.endsWith("\"")) {
    result = "\"" + s.replaceAll("^\"|\"$", "").replaceAll(SPECIAL_REGEX_CHARS, "\\\\$0") + "\"";
}
else {
    result = s.replaceAll(SPECIAL_REGEX_CHARS, "\\\\$0");
}
System.out.println(result);

See another IDEONE demo
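As a rough sketch, the same check can be wrapped in a small helper and run against the three inputs from the question. The class name QuoteAwareEscaper and the method name escape are made up for illustration and are not part of the answer above. The length check is an added assumption: it keeps a string consisting of a single " from being treated as an enclosed pair.

```java
import java.util.regex.Pattern;

final class QuoteAwareEscaper {
    // Same character set as in the answer: ( ) ' " [ ] *
    private static final Pattern SPECIAL = Pattern.compile("[()'\"\\[\\]*]");

    // Hypothetical helper, not part of the original answer.
    static String escape(String s) {
        // Assumption: a lone " is not an enclosing pair, so require at least two characters.
        boolean quoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        // Work on the text between the quotes, or on the whole string if unquoted.
        String body = quoted ? s.substring(1, s.length() - 1) : s;
        // "\\\\$0" is the replacement \\$0: one literal backslash followed by the match.
        String escaped = SPECIAL.matcher(body).replaceAll("\\\\$0");
        return quoted ? "\"" + escaped + "\"" : escaped;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"\"te(st\"", "te(st", "te\"st"}) {
            System.out.println(s + " -> " + escape(s));
        }
    }
}
```

Using substring instead of replaceAll("^\"|\"$", "") is a design choice here: once the quoted check has passed, the positions of the two quotes are already known.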

Alternative solution with appendReplacement "callback"

Here is how I would do that with one regex using an alternation:

String SPECIAL_REGEX_CHARS = "[()'\"\\[\\]*]";
//String s = "\"te(st\""; // => "te\(st"
//String s = "te(st"; // => te\(st
String s = "te\"st"; // => te\"st
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile("(?s)\"(.*)\"|(.*)").matcher(s);
if (m.matches()) {
    if (m.group(1) == null) { // we have no quotes around
        m.appendReplacement(result, m.group(2).replaceAll(SPECIAL_REGEX_CHARS, "\\\\\\\\$0"));
    }
    else {
        m.appendReplacement(result, "\"" + m.group(1).replaceAll(SPECIAL_REGEX_CHARS, "\\\\\\\\$0") + "\"");
    }
}
m.appendTail(result);
System.out.println(result.toString());

See IDEONE demo

Main points:

  • Matcher#appendReplacement() together with Matcher#appendTail() lets you manipulate the captured groups before writing the result.
  • The regex (?s)"(.*)"|(.*) has two alternative branches: "(.*)" matches a string that starts and ends with " (the (?s) inline DOTALL modifier lets . match line breaks as well), and (.*) matches any other string.
  • If the first alternative matches, we escape the selected special characters in Group 1 and then add the " back on both ends.
  • If the second alternative matches, we escape the special characters in the whole of Group 2.
  • To end up with one literal backslash, you need \\\\\\\\ (eight backslashes in the Java source) in the inner replacement pattern, because the text passes through two replacement steps: replaceAll() and then appendReplacement().
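The stacked backslashes are needed because the escaped text is interpreted a second time as a replacement string by appendReplacement(). A possible variation, sketched below, escapes only once and then passes the result through Matcher.quoteReplacement(), so that backslashes and $ in the input are taken literally. The class name AlternationEscaper and the method name escape are hypothetical; the two patterns are the ones from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationEscaper {
    private static final Pattern SPECIAL = Pattern.compile("[()'\"\\[\\]*]");
    // Branch 1: text enclosed in quotes (Group 1). Branch 2: anything else (Group 2).
    private static final Pattern WHOLE = Pattern.compile("(?s)\"(.*)\"|(.*)");

    // Hypothetical helper, not part of the original answer.
    static String escape(String s) {
        Matcher m = WHOLE.matcher(s);
        StringBuffer out = new StringBuffer();
        if (m.matches()) {
            boolean quoted = m.group(1) != null;
            String body = quoted ? m.group(1) : m.group(2);
            // One level of escaping: a single literal backslash plus the matched character.
            String escaped = SPECIAL.matcher(body).replaceAll("\\\\$0");
            String replacement = quoted ? "\"" + escaped + "\"" : escaped;
            // quoteReplacement() stops appendReplacement() from re-interpreting \ and $.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("\"te(st\""));
        System.out.println(escape("te\"st"));
        System.out.println(escape("a$b("));
    }
}
```

The third call in main contains a $, which appendReplacement() would otherwise try to read as a group reference.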

You can use a negative lookbehind and a negative lookahead:

System.out.println(value.replaceAll("([()'\\[\\]*]|(?<!^)\"(?!$))", "\\\\$0"));

This essentially says: escape anything in the character class [()'\\[\\]*], or any " that is neither the first nor the last character of the string.

The only catch is that a leading or trailing quote will be left unescaped regardless of whether it has a corresponding quote at the other end. If that's a problem, you can chain these replacements to escape an unmatched leading or trailing quote:

.replaceAll("^\".*[^\"]$", "\\\\$0")
.replaceAll("(^[^\"].*)(\"$)", "$1\\\\$2")
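Put together, the three replacements might be chained as in the sketch below. The class name LookaroundEscaper and the method name escape are invented for illustration; the regexes are copied from the snippets above. Note that . does not match line breaks here, so this sketch assumes single-line input.

```java
final class LookaroundEscaper {
    // Hypothetical helper, not part of the original answer.
    static String escape(String value) {
        return value
            // Escape the special characters, plus any " that is not the first or last character.
            .replaceAll("([()'\\[\\]*]|(?<!^)\"(?!$))", "\\\\$0")
            // Leading " with no trailing ": prefix the whole match, and so the quote, with a backslash.
            .replaceAll("^\".*[^\"]$", "\\\\$0")
            // Trailing " with no leading ": insert a backslash in front of the final quote.
            .replaceAll("(^[^\"].*)(\"$)", "$1\\\\$2");
    }

    public static void main(String[] args) {
        String[] inputs = {"\"te(st\"", "te(st", "te\"st", "\"te(st", "te(st\""};
        for (String s : inputs) {
            System.out.println(s + " -> " + escape(s));
        }
    }
}
```

The last two inputs in main have an unmatched quote at one end, which is the case the chained replacements are meant to handle.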
