
Java: replace a double quote that is not part of a pair

I have a string like

"ABC def" xxy"u

I want to replace any double quote that is not part of a pair.

So in the above example, I want to replace only the double quote in xxy"u, not the first two, which form a pair.

The output should be in this format:

"ABC def" xxy\"u

This should work for every unpaired double quote. For example, in "111" "222" "333" "4 the " before the 4 should be replaced with \"

Thanks in advance.

It would be great if it detected the actual pairs too, instead of just replacing the last double quote. For example, "AAA" "bbb" "CCC "DDD" should become "AAA" "bbb" \"CCC "DDD"

This is what I am using:

    int totalCountOfDQ = countOccurence(s, '"');
    int lastIndexOfDQ = s.lastIndexOf('"');
    if(totalCountOfDQ % 2 == 1){
        String start = s.substring(0, lastIndexOfDQ);
        String end = s.substring(lastIndexOfDQ+1);
        s = start + "\\\"" + end;
    }

It works for my example, though it does not work correctly for "4 "111" "222"
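The snippet above calls countOccurence, which the question does not show. The following is a minimal sketch of what such a helper might look like, with the snippet wrapped in a method so the failing case can be reproduced. The class name LastQuoteDemo and the method name escapeLastIfOdd are assumptions made for this illustration, not part of the original code.

```java
public class LastQuoteDemo {
    // Hypothetical helper: the question uses countOccurence but does not show it.
    static int countOccurence(String s, char c) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }

    // The question's snippet wrapped in a method (the method name is an assumption).
    static String escapeLastIfOdd(String s) {
        int totalCountOfDQ = countOccurence(s, '"');
        int lastIndexOfDQ = s.lastIndexOf('"');
        if (totalCountOfDQ % 2 == 1) {
            String start = s.substring(0, lastIndexOfDQ);
            String end = s.substring(lastIndexOfDQ + 1);
            s = start + "\\\"" + end;
        }
        return s;
    }

    public static void main(String[] args) {
        // Unpaired quote is the last one: escaped as wanted.
        System.out.println(escapeLastIfOdd("\"ABC def\" xxy\"u"));
        // Unpaired quote is the first one: the last quote gets escaped instead.
        System.out.println(escapeLastIfOdd("\"4 \"111\" \"222\""));
    }
}
```

For the second input the result is "4 "111" "222\" because the code always escapes the last quote, even though the unpaired one is the quote before the 4. Counting alone cannot tell which quote is the odd one out.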

You can try the following:

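// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Matches a complete pair: a quote, words separated by single spaces, a quote.
private static final Pattern REGEX_PATTERN =
        Pattern.compile("\\B\"\\w*( \\w*)*\"\\B");

// Copies matched pairs unchanged and escapes every quote found between them.
private static String replaceNotPairs(String input) {
    StringBuilder sb = new StringBuilder();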
    Matcher matcher = REGEX_PATTERN.matcher(input);
    int start = 0;
    int last = 0;
    while (matcher.find()) {
        start = matcher.start();
        sb.append(input.substring(last, start).replace("\"", "\\\""));
        last = matcher.end();
        sb.append(matcher.group());
    }
    sb.append(input.substring(last).replace("\"", "\\\""));
    return sb.toString();
}

For example:

public static void main(String[] args) {
    System.out.printf("src: %s%nout: %s%n%n",
            "\"ABC def\" xxy\"u",
            replaceNotPairs("\"ABC def\" xxy\"u"));
    System.out.printf("src: %s%nout: %s%n%n",
            "\"111\" \"222\" \"333\" \"4",
            replaceNotPairs("\"111\" \"222\" \"333\" \"4"));
    System.out.printf("src: %s%nout: %s%n%n",
            "\"AAA\" \"bbb\" \"CCC \"DDD\"",
            replaceNotPairs("\"AAA\" \"bbb\" \"CCC \"DDD\""));
    System.out.printf("src: %s%nout: %s%n%n",
            "\"4 \"111\" \"222\"",
            replaceNotPairs("\"4 \"111\" \"222\""));
    System.out.printf("src: %s%nout: %s%n%n",
            "\"11\" \"2 \"333\"",
            replaceNotPairs("\"11\" \"2 \"333\""));
}

The output for the example input:

src: "ABC def" xxy"u
out: "ABC def" xxy\"u

src: "111" "222" "333" "4
out: "111" "222" "333" \"4

src: "AAA" "bbb" "CCC "DDD"
out: "AAA" "bbb" \"CCC "DDD"

src: "4 "111" "222"
out: \"4 "111" "222"

src: "11" "2 "333"
out: "11" \"2 "333"

See the explanation for the regex:

\B\"\w*( \w*)*\"\B

[Figure: regular expression visualization]

(from http://rick.measham.id.au/paste/explain.pl?regex ):

NODE                     EXPLANATION
----------------------------------------------------------------------------
  \B                       the boundary between two word chars (\w)
                           or two non-word chars (\W)
----------------------------------------------------------------------------
  \"                       '"'
----------------------------------------------------------------------------
  \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------------
  (                        group and capture to \1 (0 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------------
  )*                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
----------------------------------------------------------------------------
  \"                       '"'
----------------------------------------------------------------------------
  \B                       the boundary between two word chars (\w)
                           or two non-word chars (\W)
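
To see the \B anchors at work, the following sketch lists what the pattern actually matches in two of the sample inputs. The class name BoundaryDemo and the method findPairs are made up for this illustration. Only java.util.regex and the collections classes are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Same pattern as in the answer above.
    private static final Pattern REGEX_PATTERN =
            Pattern.compile("\\B\"\\w*( \\w*)*\"\\B");

    // Collects every substring the pattern accepts as a complete pair.
    static List<String> findPairs(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = REGEX_PATTERN.matcher(input);
        while (matcher.find()) {
            pairs.add(matcher.group());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(findPairs("\"ABC def\" xxy\"u"));
        System.out.println(findPairs("\"AAA\" \"bbb\" \"CCC \"DDD\""));
    }
}
```

For the first input only "ABC def" is reported. The quote in xxy"u sits right after the word character y, so there is a word boundary in front of it and \B rejects it. For the second input the pairs are "AAA", "bbb" and "DDD". The candidate "CCC " is rejected because its closing quote is followed by the word character D. Note that the pattern only allows word characters and single spaces between the quotes, so quoted text containing punctuation would not be recognized as a pair.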

Do you mean this algorithm?

Count the number of double quotes. If there is an even number, do nothing. If there is an odd number, replace the last double quote with \"

I would suggest using regex matching to check for this.

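// Requires: import java.util.regex.Pattern;
Pattern myPattern = Pattern.compile("\"[^\"]*\"");   // one complete "..." pair (non-greedy by construction)
Pattern myPattern1 = Pattern.compile("\"([^\"]*)$"); // the last quote, with no quote after it
String input = yourString; // work on a copy of your string
input = myPattern.matcher(input).replaceAll(" match "); // replace every quoted portion with your own string
if (input.contains("\"")) {
   // a dangling " is left over: escape the last quote in your original string
   // (strings are immutable, so the result has to be assigned back)
   yourString = myPattern1.matcher(yourString).replaceAll("\\\\\"$1");
}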

Without using a loop, the following code should work:

String s = "\"111 \" \" 222\" \" 333\" \"4";
// s.replaceAll("[^\"]+", "").length() gives count of " in String
if (s.replaceAll("[^\"]+", "").length() % 2 == 1) {
    int i = s.lastIndexOf('"');
    s = s.substring(0, i) + "\\\"" + s.substring(i+1);
}
System.out.println(s); // "111 " " 222" " 333" \"4


 