
PHP's json_encode does not escape all JSON control characters

Is there any reason why PHP's json_encode function does not escape all JSON control characters in a string?

For example, take a string that spans two lines and contains characters that need escaping (\r \n " / \):

<?php
$s = <<<END
First row.
Second row w/ "double quotes" and backslash: \.
END;

$s = json_encode($s);
echo $s;
// Will output: "First row.\r\nSecond row w\/ \"double quotes\" and backslash: \\."
?>

Note that carriage return and newline chars are unescaped. Why?

I'm using jQuery as my JS library, and its $.getJSON() function works fine when you fully, 100% trust the incoming data. Otherwise I use JSON.org's library json2.js like everybody else. But if you try to parse that encoded string, it throws an error:

<script type="text/javascript">

JSON.parse(<?php echo $s ?>);  // Will throw SyntaxError 

</script>

And you can't get the data! If you remove or escape \r, \n, " and \ in that string, JSON.parse() does not throw an error.

Is there an existing, good PHP function for escaping control characters? A simple str_replace with search and replace arrays will not work.

function escapeJsonString($value) {
    # list from www.json.org: (\b backspace, \f formfeed)
    # the backslash must come first so that later replacements are not escaped again
    # "\x08" is backspace (\b) and "\x0c" is formfeed (\f)
    $escapers =     array("\\",     "/",   "\"",  "\n",  "\r",  "\t", "\x08", "\x0c");
    $replacements = array("\\\\", "\\/", "\\\"", "\\n", "\\r", "\\t",  "\\b",  "\\f");
    $result = str_replace($escapers, $replacements, $value);
    return $result;
}

I'm using the above function, which escapes the backslash first (it must be first in the arrays) and should deal with formfeeds and backspaces. I don't think \f and \b are supported as escape sequences in PHP strings, so they are written as \x0c and \x08.

D'oh - you need to double-encode: JSON.parse is expecting a string of course:

<script type="text/javascript">

JSON.parse(<?php echo json_encode($s) ?>);

</script>
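
To see why the second encoding helps, here is a minimal sketch that runs without PHP. It assumes JSON.stringify is a fair stand-in for json_encode (it does not escape "/", which does not matter here), and evaluateAsSource is a hypothetical helper that mimics the browser reading whatever PHP echoed between the parentheses as JavaScript source.

```javascript
// JSON.stringify plays the role of PHP's json_encode in this sketch.
const original = 'First row.\r\nSecond row w/ "double quotes" and backslash: \\.';
const encodedOnce = JSON.stringify(original);      // like json_encode($s)
const encodedTwice = JSON.stringify(encodedOnce);  // like json_encode(json_encode($s))

// Hypothetical helper: evaluate text as a JavaScript expression, which is
// what the browser does with the text PHP echoes into the <script> block.
function evaluateAsSource(text) {
  return new Function('return ' + text + ';')();
}

// Encoded once: the JS parser already turns the literal back into the
// original string (real CR/LF included), so JSON.parse is handed plain
// text instead of JSON and throws.
let onceFails = false;
try {
  JSON.parse(evaluateAsSource(encodedOnce));
} catch (e) {
  onceFails = e instanceof SyntaxError;
}

// Encoded twice: the JS parser undoes one layer, JSON.parse undoes the other.
const roundTripped = JSON.parse(evaluateAsSource(encodedTwice));

console.log(onceFails);
console.log(roundTripped === original);
```

In other words, the single-encoded value is already a usable JavaScript string literal; JSON.parse is only needed when the page receives the JSON as text.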

I still haven't figured out any solution without str_replace ..

Try this code.

$json_encoded_string = json_encode(...);
$json_encoded_string = str_replace("\r", '\r', $json_encoded_string);
$json_encoded_string = str_replace("\n", '\n', $json_encoded_string);

Hope that helps...

$search = array("\n", "\r", "\u", "\t", "\f", "\b", "/", '"');
$replace = array("\\n", "\\r", "\\u", "\\t", "\\f", "\\b", "\/", "\"");
$encoded_string = str_replace($search, $replace, $json);

This is the correct way

Converting to and from JSON in PHP should not be an issue. PHP's json_encode does the proper encoding, but reinterpreting its output inside JavaScript can cause issues. For example:

1) Original string: [string with nnn newline in it] (where nnn is an actual newline character).

2) json_encode converts this to [string with \n newline in it]: the control character becomes the two-character sequence \n (a literal backslash followed by n).

3) However, when you echo this with PHP inside a JavaScript string literal, the JavaScript parser interprets \n as an escape sequence and turns it back into a real newline (nnn). JSON.parse then receives a raw control character inside a string, which is not valid JSON, and that causes heartache.

so to work around this: -

A) First encode the object in PHP using json_encode to get a string. Then run it through a filter that makes it safe to use inside HTML and JavaScript.

B) use the JSON string coming from PHP as a "literal" and put it inside single quotes instead of double quotes.
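
The failure in step 3 and the effect of the filter in A can be reproduced in plain JavaScript. This is a sketch under assumptions: asSingleQuotedLiteral is a hypothetical helper that mimics the browser reading `'<?php echo ... ?>'` as source, and makeLiteralSafe is an illustrative filter that escapes single quotes with a backslash rather than with an HTML entity.

```javascript
// JSON text as json_encode would emit it for a description containing one
// newline: the newline is the two characters backslash + n.
const jsonText = '{"description":"Tiger was caged\\nin a Zoo"}';

// Hypothetical helper: evaluate the text wrapped in single quotes as JS
// source, the way the browser reads a literal that PHP echoed into the page.
function asSingleQuotedLiteral(text) {
  return new Function("return '" + text + "';")();
}

// Unfiltered: string-literal parsing turns \n into a real newline, and
// JSON.parse rejects a raw control character inside a JSON string.
let unfilteredFails = false;
try {
  JSON.parse(asSingleQuotedLiteral(jsonText));
} catch (e) {
  unfilteredFails = e instanceof SyntaxError;
}

// Illustrative filter (an assumption, not the PHP function from this
// answer): double every backslash and escape single quotes.
function makeLiteralSafe(text) {
  return text.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
}

// Filtered: the literal now evaluates to the exact JSON text again.
const filtered = asSingleQuotedLiteral(makeLiteralSafe(jsonText));
const book = JSON.parse(filtered);

console.log(unfilteredFails);
console.log(filtered === jsonText);
console.log(book.description);
```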


<?php
       function form_safe_json($json) {
            $json = empty($json) ? '[]' : $json ;
            $search = array('\\',"\n","\r","\f","\t","\b","'") ;
            $replace = array('\\\\',"\\n", "\\r","\\f","\\t","\\b", "&#039;");
            $json = str_replace($search,$replace,$json);
            return $json;
        }


        $title = "Tiger's   /new \\found \/freedom " ;
        $description = <<<END
        Tiger was caged
        in a Zoo 
        And now he is in jungle
        with freedom
    END;

        $book = new \stdClass ;
        $book->title = $title ;
        $book->description = $description ;
        $strBook = json_encode($book);
        $strBook = form_safe_json($strBook);

        ?>


    <!DOCTYPE html>
    <html>

        <head>
            <title> title</title>

            <meta charset="utf-8">


            <script type="text/javascript" src="/3p/jquery/jquery-1.7.1.min.js"></script>


            <script type="text/javascript">
                $(document).ready(function(){
                    var strBookObj = '<?php echo $strBook; ?>' ;
                    try{
                        bookObj = JSON.parse(strBookObj) ;
                        console.log(bookObj.title);
                        console.log(bookObj.description);
                        $("#title").html(bookObj.title);
                        $("#description").html(bookObj.description);
                    } catch(ex) {
                        console.log("Error parsing book object json");
                    }

                });
            </script>

        </head>

         <body>

             <h2> Json parsing test page </h2>
             <div id="title"> </div>
             <div id="description"> </div>
        </body>
    </html>

Put the string inside single quotes in JavaScript. Putting the JSON string inside double quotes would cause the JavaScript parser to fail at the attribute markers (something like { "id" : "value" } ). No other escaping should be required if you embed the filtered string as a literal and let the JSON parser do the work.

I don't fully understand how var_export works, so I will update if I run into trouble, but this seems to be working for me:

<script>
    window.things = JSON.parse(<?php var_export(json_encode($s)); ?>);
</script>

Maybe I'm blind, but in your example they ARE escaped. What about

<script type="text/javascript">

JSON.parse("<?php echo $s ?>");  // Will throw SyntaxError 

</script>

(note different quotes)

Just an addition to Greg's response: when the input is a string, the output of json_encode() is already wrapped in double quotes ("), so there is no need to surround it with quotes again:

<script type="text/javascript">
    JSON.parse(<?php echo $s ?>);
</script>

Control characters have no special meaning in HTML, except for newlines in textarea.value. json_encode in PHP > 5.2 will escape them as you expected.

If you just want to show text, you don't need JSON. JSON is for arrays and objects in JavaScript (and for indexed and associative arrays in PHP).

If you need a line feed for the textarea tag:

$s=preg_replace('/\r */','',$s);
echo preg_replace('/ *\n */','&#13;',$s);

This is what I use personally, and it has never failed. I had similar problems originally.

Source script (ajax) will take an array and json_encode it. Example:

$return['value'] = 'test';
$return['value2'] = 'derp';

echo json_encode($return);

My javascript will make an AJAX call and get the echoed "json_encode($return)" as its input, and in the script I'll use the following:

myVar = jQuery.parseJSON(msg.replace(/&quot;/ig,'"'));

with "msg" being the returned value. So, for you, something like...

var msg = '<?php echo $s ?>';
myVar = jQuery.parseJSON(msg.replace(/&quot;/ig,'"'));

...might work for you.
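
A minimal sketch of that replacement, assuming the response arrived with its double quotes turned into &quot; entities. It uses the built-in JSON.parse in place of jQuery.parseJSON so that it runs without jQuery; the sample payload mirrors the $return array above.

```javascript
// Assumed response body: the JSON from the PHP script, but with every
// double quote delivered as the HTML entity &quot;.
const msg = '{&quot;value&quot;:&quot;test&quot;,&quot;value2&quot;:&quot;derp&quot;}';

// Restore the quotes, then parse.
const myVar = JSON.parse(msg.replace(/&quot;/gi, '"'));

console.log(myVar.value, myVar.value2);
```

If the response is not entity-encoded in the first place, the replace call changes nothing and the parse works as usual.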

When using any form of Ajax, detailed documentation for the format of responses received from the CGI server seems to be lacking on the Web. Some notes here and entries at stackoverflow.com point out that newlines in returned text or JSON data must be escaped to prevent infinite loops (hangs) in JSON conversion (possibly created by throwing an uncaught exception), whether the conversion is done automatically by jQuery or manually using JavaScript system or library JSON parsing calls.

In each case where programmers post this problem, inadequate solutions are presented (most often replacing \n by \\n on the sending side) and the matter is dropped. Their inadequacy is revealed when passing string values that accidentally embed control escape sequences, such as Windows pathnames. An example is "C:\Chris\Roberts.php", which contains the control characters ^c and ^r, which can cause JSON conversion of the string {"file":"C:\Chris\Roberts.php"} to loop forever. One way of generating such values is deliberately to attempt to pass PHP warning and error messages from server to client, a reasonable idea.

By definition, Ajax uses HTTP connections behind the scenes. Such connections pass data using GET and POST, both of which require encoding sent data to avoid incorrect syntax, including control characters.

This gives enough of a hint to construct what seems to be a solution (it needs more testing): use rawurlencode on the PHP (sending) side to encode the data, and unescape (or, preferably, decodeURIComponent, since unescape is deprecated) on the JavaScript (receiving) side to decode the data. In some cases you will apply these to entire text strings; in other cases you will apply them only to values inside JSON.

If this idea turns out to be correct, simple examples can be constructed to help programmers at all levels solve this problem once and for all.
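
As a starting point for such an example, here is a hedged sketch in JavaScript. It assumes encodeURIComponent is an acceptable stand-in for PHP's rawurlencode (the two differ slightly in which punctuation they leave unescaped, but decodeURIComponent decodes both). It also shows that hand-built JSON with single backslashes is rejected by JSON.parse with a SyntaxError, while a real encoder doubles the backslashes.

```javascript
// A value with backslashes, such as a Windows path in an error message.
const payload = { file: 'C:\\Chris\\Roberts.php' };

// Hand-built JSON text with single backslashes: \C and \R are not valid
// JSON escape sequences, so the parser rejects the text.
const handBuilt = '{"file":"C:\\Chris\\Roberts.php"}';
let handBuiltFails = false;
try {
  JSON.parse(handBuilt);
} catch (e) {
  handBuiltFails = e instanceof SyntaxError;
}

// A real encoder doubles the backslashes, which makes the text valid JSON.
const encoded = JSON.stringify(payload);

// Percent-encode the whole text for transport (stand-in for rawurlencode),
// then decode and parse on the receiving side.
const wire = encodeURIComponent(encoded);
const decoded = JSON.parse(decodeURIComponent(wire));

console.log(handBuiltFails);
console.log(decoded.file === payload.file);
```

With a proper JSON encoder on the sending side the percent-encoding step is optional; it only guarantees that the text on the wire contains no quotes, backslashes, or control characters.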

$val = array("\n","\r");

$string = str_replace($val, "", $string);

This removes all newlines from the JSON string in PHP.

There are 2 solutions if AJAX is not used:

  1. Write the data into a hidden input and read it in JS:

     <input type="hidden" value="<?= htmlspecialchars(json_encode($data)) ?>"/>
  2. Use addslashes:

     var json = '<?= addslashes(json_encode($data)) ?>';
