Why does GSON parse both “\n” and “\\n” as newline?

I have the following code:

public static void main(String[] args) {
    String key = "myjsonkey";
    String baseJson = "{\"" + key + "\":\"my json %svalue\"}";

    String inBackslashAndN = String.format(baseJson, "\\n");
    String inNewline = String.format(baseJson, "\n");

    String outBackslashAndN = valueFromJson(key, inBackslashAndN);
    String outNewLine = valueFromJson(key, inNewline);

    System.out.print("\nInput strings matching: ");
    System.out.println(inBackslashAndN.equals(inNewline));
    System.out.print("Output strings matching: ");
    System.out.println(outBackslashAndN.equals(outNewLine));
}

private static String valueFromJson(String key, String jsonStr) {
    System.out.println("\nINPUT: " + jsonStr);
    JsonObject json = new JsonParser().parse(jsonStr).getAsJsonObject();
    String output = json.get(key).getAsString();
    System.out.println("\nOUTPUT: " + output);
    return output;
}

Output:

INPUT: {"myjsonkey":"my json \nvalue"}

OUTPUT: my json 
value

INPUT: {"myjsonkey":"my json 
value"}

OUTPUT: my json 
value

Input strings matching: false
Output strings matching: true

My question is: Why does JSON parse both "\n" and "\\n" as newline, and is there a way to force different parsing of these two without changing the original data?

I am using gson 2.7

EDIT: I am aware that in Java "\n" is processed into the newline control character and "\\n" is the sequence of the character 'backslash' and the character 'n'. My question remains the same.

JSON does not support literal newlines inside strings. source: http://json.org/

A newline must be represented as \n. Gson most likely accepts either an already escaped backslash + n or a literal newline and normalizes both to backslash + n in its JSON representation; converting that back to a string turns the backslash + n into a literal newline again.

In Java source, \n is the line feed control character, and \\n is two characters: a backslash and the letter n.

Both of these are inserted into a JSON string "...". Hence the second version is converted to a line feed by the parser. And evidently, in the first case, a raw line feed character inside a string is tolerated.
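To make the difference between the two Java literals concrete, here is a small sketch that prints the length and code points of each. It uses only the standard library; the class and method names (EscapeLiteralDemo, codePoints) are made up for this illustration.

```java
// Illustrative only: EscapeLiteralDemo and codePoints are hypothetical names.
final class EscapeLiteralDemo {
    // Render every char of the string as a hexadecimal code point, e.g. "U+000A".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lineFeed = "\n";        // one character: the line feed control character
        String backslashAndN = "\\n";  // two characters: a backslash, then the letter n

        System.out.println(lineFeed.length() + " -> " + codePoints(lineFeed));           // 1 -> U+000A
        System.out.println(backslashAndN.length() + " -> " + codePoints(backslashAndN)); // 2 -> U+005C U+006E
    }
}
```

So the JSON text built from "\n" contains a raw line feed, while the one built from "\\n" contains the two-character JSON escape sequence for a line feed.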

Why does JSON parse both "\n" and "\\n" as newline?

In Java source, \n is processed into an actual, literal newline character (i.e. U+000A). \\n is equivalent to the two-character string \n, which the JSON parser (correctly) parses as a newline, since \n is the escape sequence for a newline in JSON. You might need \\\\n if you want an actual backslash followed by n in the parsed value. See JSON.org; the escape sequences are listed on the right under "char". When you end up operating through several languages (e.g. Java + regex/JSON), you tend to get some confusing nesting of escape sequences.
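The layering can be sketched without Gson. The decoder below is a deliberately tiny stand-in, not Gson's implementation: it handles only the \n, \\ and \" escapes, and it passes a raw line feed through unchanged, which is an assumption that mirrors the lenient behaviour shown in the question. The names JsonEscapeSketch and decode are hypothetical.

```java
// A minimal sketch, NOT Gson's parser: it decodes only the body of a JSON
// string (the text between the quotes) and knows three escapes.
final class JsonEscapeSketch {
    static String decode(String body) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c); // ordinary character (a raw line feed lands here too)
                continue;
            }
            if (i + 1 >= body.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char next = body.charAt(++i);
            switch (next) {
                case 'n':  out.append('\n'); break; // JSON \n  -> line feed
                case '\\': out.append('\\'); break; // JSON \\  -> one backslash
                case '"':  out.append('"');  break; // JSON \"  -> quote
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String fromRawNewline = decode("my json \nvalue");     // Java "\n": JSON text holds a raw line feed
        String fromEscaped = decode("my json \\nvalue");       // Java "\\n": JSON text holds \n
        String fromDoubleEscaped = decode("my json \\\\nvalue"); // Java "\\\\n": JSON text holds \\n

        System.out.println(fromRawNewline.equals(fromEscaped));                 // true
        System.out.println(fromDoubleEscaped.equals("my json \\nvalue"));       // true
    }
}
```

Only the four-backslash Java literal survives both layers as a backslash followed by n.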

JSON itself technically doesn't support newlines in strings, either. Gson takes care of this for you, though, by converting it to "\n".

Is there a way to force different parsing of these two without changing the original data?

I believe Gson does not provide a way to do this, and it wouldn't make much sense according to JSON standards. You could escape the backslashes yourself before parsing:

String unescaped = myString.replace("\\", "\\\\");

or with regular expressions:

String unescaped = myString.replaceAll("\\\\", "\\\\\\\\");
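As a rough check of what this pre-escaping does to the two inputs from the question, here is a standard-library-only sketch. PreEscapeDemo and escapeBackslashes are hypothetical names. One caveat that is my own observation rather than part of the answer: doubling every backslash also affects other escapes such as \" or \uXXXX, so this only seems safe when the JSON text contains no other escape sequences.

```java
// Illustrative only: PreEscapeDemo and escapeBackslashes are hypothetical names.
final class PreEscapeDemo {
    // Double every backslash so a JSON parser later reads "\\n" (backslash + n)
    // instead of the escape sequence "\n" (line feed).
    static String escapeBackslashes(String jsonText) {
        return jsonText.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String base = "{\"myjsonkey\":\"my json %svalue\"}";
        String inBackslashAndN = String.format(base, "\\n"); // JSON text contains \n
        String inNewline = String.format(base, "\n");        // JSON text contains a raw line feed

        // The escaped form now contains two backslashes before the n.
        System.out.println(escapeBackslashes(inBackslashAndN));

        // The raw-line-feed input has no backslash, so it is left untouched.
        System.out.println(escapeBackslashes(inNewline).equals(inNewline)); // true

        // The regex variant from the answer gives the same result.
        String viaRegex = inBackslashAndN.replaceAll("\\\\", "\\\\\\\\");
        System.out.println(viaRegex.equals(escapeBackslashes(inBackslashAndN))); // true
    }
}
```

After this step the two JSON texts no longer decode to the same value: one yields a backslash followed by n, the other a line feed.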
