JSON.parse with escape characters

Trying to get a handle on JSON.parse and having a difficult time understanding how escape characters are handled; specifically - why does:

JSON.parse('["\\\\\\"\\"a\\""]')

Evaluate to:

["\""a""]

How do the multiple backslashes work with each other?

Thanks!

First of all, let's clarify what value we're actually working with:

var str = '["\\\\\\"\\"a\\""]';
console.log(str);
// => ["\\\"\"a\""]

As you can see, half of those backslashes had nothing to do with JSON. They were just escaping characters in the JavaScript string. The actual characters of that string are these:

["\\\"\"a\""]
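If you want to confirm this without relying on how the console prints things, a small sketch like the following counts what is really stored in the string. The variable name backslashes is only for illustration.

```javascript
// The literal exactly as written in the question (single-quoted).
const str = '["\\\\\\"\\"a\\""]';

// Count what is really stored in the string.
const backslashes = [...str].filter((ch) => ch === '\\').length;

console.log(str.length);  // 13 characters in total
console.log(backslashes); // 5 backslashes, half of the 10 typed in the source
```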

We know that the square brackets ([]) indicate a JSON array, and the outermost quotation marks indicate a JSON string, so let's drop those:

\\\"\"a\"

Now, to figure out what JavaScript string this JSON will deserialize to, let's break it up into its parts:

\\  \"  \"   a  \"
 1   2   3   4   5

I've paired up each backslash with the character that follows it (which is sometimes another backslash—backslashes are escaped with backslashes just like quotation marks are). Now for each character that's preceded by a backslash we just drop the backslash:

\   "   "   a   "
1   2   3   4   5

Now mash it all together again:

\""a"

Did it work?

var str = '["\\\\\\"\\"a\\""]';
var array = JSON.parse(str);
console.log(array[0]);
// => \""a"

Yep!
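The pairing rule used above can also be sketched as a tiny function and compared against JSON.parse. This is only an illustration: dropEscapes is a hypothetical helper that handles nothing beyond "drop the backslash, keep the next character", so it would be wrong for escapes such as \n or \uXXXX.

```javascript
// Walk the text; whenever a backslash appears, drop it and keep the
// character it was paired with. Only valid for \\ and \" style escapes.
function dropEscapes(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '\\' && i + 1 < text.length) {
      i++; // skip the backslash itself
    }
    out += text[i];
  }
  return out;
}

// The JSON string body from the walkthrough, taken verbatim via String.raw.
const body = String.raw`\\\"\"a\"`;
const manual = dropEscapes(body);
const parsed = JSON.parse('["' + body + '"]')[0];

console.log(manual);            // \""a"
console.log(manual === parsed); // true
```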

P.S. Because JSON and JavaScript treat escaped backslashes and quotation marks the same way, you could apply the same process to the original JavaScript string:

["\\\\\\"\\"a\\""]

Split it up again:

[   "  \\  \\  \\   "  \\   "   a  \\   "   "   ]
1   2   3   4   5   6   7   8   9  10  11  12  13

You'll notice that in this case only backslashes are escaped—that's because in our JavaScript the string was surrounded by single-quotes, so the double-quotes didn't have to be escaped. Now, drop the initial backslashes again and we get:

[   "   \   \   \   "   \   "   a   \   "   "   ]
1   2   3   4   5   6   7   8   9  10  11  12  13

And squash it together again:

["\\\"\"a\""]

You'll recognize this as the string value we logged at the start.
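This can be checked in code as well. In the sketch below, String.raw returns the source text of the literal with its backslashes untouched, and dropEscapes is a hypothetical helper that applies only the pairing rule described here (it does not handle escapes such as \n or \uXXXX).

```javascript
// Drop each escaping backslash and keep the character that follows it.
function dropEscapes(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '\\' && i + 1 < text.length) {
      i++; // skip the backslash itself
    }
    out += text[i];
  }
  return out;
}

// The literal's source text, exactly as typed between the single quotes.
const source = String.raw`["\\\\\\"\\"a\\""]`;
// The same literal after JavaScript has processed the escapes.
const str = '["\\\\\\"\\"a\\""]';

console.log(source.length);               // 18
console.log(dropEscapes(source) === str); // true
```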

In this case the unescaping actually happens in two steps: the JavaScript parser unescapes the string literal first, and JSON.parse then unescapes the result a second time. The first step works like so:

Step 1: ["\\\\\\"\\"a\\""] ==> ["(\\)(\\)(\\)"(\\)"a(\\)""] ==> ["\\\"\"a\""]

In this first step each \\ converts to a \ . The double quotes are left alone, because the literal is delimited by single quotes. I've added (..) around the items that are converted in each step.

Step 2: ["\\\"\"a\""] ==> ["(\\)(\")(\")a(\")"] ==> ["\""a""]

In this second step (\\) converts to \ and each (\") converts to " .
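One way to convince yourself of the two layers is to run them in the opposite direction. The sketch below starts from the final string, lets JSON.stringify add the JSON layer of escaping, and compares the result with the literal from the question. The variable names are only for illustration.

```javascript
// The final string value: \""a"
const value = '\\""a"';

// Adding the JSON layer of escaping gives back the JSON text.
const jsonText = JSON.stringify([value]);

console.log(jsonText);                          // ["\\\"\"a\""]
console.log(jsonText === '["\\\\\\"\\"a\\""]'); // true
console.log(JSON.parse(jsonText)[0] === value); // true
```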

I ran into the same problem and solved it with this sample Python code, which escapes a string so that it can be embedded in JSON:

def escape(s):
    # Escape backslashes first so the backslashes added afterwards are not doubled.
    s = (s.replace('\\', '\\\\').replace('"', '\\"')
          .replace('\n', '\\n').replace('\t', '\\t'))
    result = []
    for ch in s:
        n = ord(ch)
        if n < 32:
            # Any remaining control character becomes a \u00XX escape.
            result.append('\\u%04x' % n)
        else:
            result.append(ch)
    return ''.join(result)
