Correctly decoding double-encoded UTF-8 on PHP

I'm trying to send data from an HTML page to a PHP page through Ajax.

This is the jQuery code I'm using:

$.ajax({
    url: "test.php",
    type: "POST",
    data: {
        name: "João"
    }
}).done(function (data) {
    alert(data);
})

As you can see, the parameter I'm sending is "João". Before making the Ajax request, jQuery encodes it in the background: "João" becomes "Jo%C3%A3o", which is double-encoded UTF-8.

My problem arises when the request is sent and PHP tries to decode it in the background. PHP automatically decodes it only once when I use $_POST, so instead of getting "João" I get "JoÃ£o". That happens because PHP is decoding every %XX sequence individually, so %C3 becomes Ã and %A3 becomes £.

If I decode it manually with utf8_decode() it works, but I'm here to find out whether there's a better solution. What I really need is a way for PHP to decode my data correctly, even if it's double-encoded or triple-encoded.

That's not double-encoded, it's correct UTF-8. It looks like PHP is expecting latin-1 encoding instead, and is showing you what the same bytes would mean if they were not UTF-8.
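To see this concretely, here is a minimal sketch, assuming Node.js (it relies on the `Buffer` global, which browsers don't have). The variable names are only for illustration. It shows that "Jo%C3%A3o" is simply the percent-encoded UTF-8 bytes of "João", and that reading those same bytes as latin-1 produces the garbled "JoÃ£o":

```javascript
// Assumes Node.js: Buffer is a Node global, not available in browsers.

// jQuery serializes POST data with encodeURIComponent, which escapes
// each UTF-8 byte of a non-ASCII character as %XX.
const encoded = encodeURIComponent("João");

// The raw bytes the server ends up with after percent-decoding.
const bytes = Buffer.from(decodeURIComponent(encoded), "utf8");

// The same bytes read with the right charset and with the wrong one.
const asUtf8 = bytes.toString("utf8");
const asLatin1 = bytes.toString("latin1");

console.log(encoded);               // Jo%C3%A3o
console.log(bytes.toString("hex")); // 4a6fc3a36f
console.log(asUtf8);                // João
console.log(asLatin1);              // JoÃ£o
```

The two bytes C3 A3 are one character in UTF-8 but two characters in latin-1, so the data itself is fine and only the charset used to interpret or display it is off.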

In this case, since your characters seem to be below 0xFF, you could also URL-encode them first as Jo%E3o in latin-1 if you can't work out how to have PHP recognize UTF-8.
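As a rough sketch of that workaround, the function below percent-encodes a string as latin-1 bytes instead of UTF-8. `latin1PercentEncode` is a hypothetical helper written for this answer, not part of jQuery or any library, and it only works for characters up to U+00FF:

```javascript
// Hypothetical helper: percent-encode a string as latin-1 (ISO-8859-1) bytes.
// Throws if any character is above U+00FF, since latin-1 cannot represent it.
function latin1PercentEncode(text) {
  let out = "";
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code > 0xff) {
      throw new RangeError(`Cannot encode U+${code.toString(16)} as latin-1`);
    }
    // Keep unreserved ASCII characters as-is; escape everything else as %XX.
    if (/[A-Za-z0-9\-_.~]/.test(ch)) {
      out += ch;
    } else {
      out += "%" + code.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(latin1PercentEncode("João")); // Jo%E3o
```

You would have to build the request body yourself with this (passing a plain object as `data` makes jQuery encode it as UTF-8 again), and it breaks as soon as a value contains something outside latin-1, so getting PHP to treat the input as UTF-8 is the more robust route.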
