简体   繁体   中英

htmlspecialchars & ENT_QUOTES not working?

Basically on displaying data from MySQL database I have a htmlspecialchars() function below that should convert single and double quotes to their safe entity(s). The problem I'm having is on viewing source code, it is only converting < > & when I also need it to convert single and double quotes.

//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}

Then when I want to use for example I do:

htmlsan($row['comment']);

Can someone tell me why it's not converting single and double quotes?

UPDATE

What's strange is htmlsan() is used on comment in email and when I view source code of email it converts them, it seems that it won't convert the single/double quotes from the database on displaying on webpage. My database collation is also set to utf8_general_ci and I declare I am using utf8 on database connection etc.

How are you exactly testing it?

<?php

//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}

var_dump(htmlsan('<>\'"'));

... prints:

string(20) "&lt;&gt;&#039;&quot;"

My guess is that your input string comes from Microsoft Word and contains typographical quotes:

var_dump(htmlsan('“foo”')); // string(9) "“foo”" 

If you do need to convert them for whatever the reason, you need htmlentities() rather than htmlspecialchars() :

var_dump(htmlentities('“foo”', ENT_QUOTES, 'UTF-8')); // string(17) "&ldquo;foo&rdquo;"

Update #1

Alright, it's time for some proper testing. Type a single quote ( ' ) in your comment database field and run the following code when you retrieve it:

var_dump(bin2hex("'"));
var_dump(htmlspecialchars("'", ENT_QUOTES, 'UTF-8'));
var_dump(bin2hex($row['comment']));
var_dump(htmlspecialchars($row['comment'], ENT_QUOTES, 'UTF-8'));

It should print this:

string(2) "27"
string(6) "&#039;"
string(2) "27"
string(6) "&#039;"

Please update your question and confirm whether you ran this test and got the same or a different output.

Update #2

Please look carefully at the output you claim to be obtaining:

string(6) "'"

That's not a string with 6 characters. You are not looking at the real output: you are looking at the output as rendered by a browser. I'm pretty sure you are getting the expected result, ie string(6) "&#039;" . If you render &#039; with a web browser it becomes ' . Use the View Source menu in your browser to see the real output.

When you view sourcecode using Firebug, Firebug shows it like the web browser displays it, I thought it would have shown the source code the same as if you went to View Source in Browser Menu Bar. A headache learnt and will be remembered. Thanks everyone for your valuable time and input.

Had the same problem. My database is with utf-8_unicode_ci and my html charset utf-8, and htmlentities only converted everything but quotes. I thought that having same charset in both db and html would work fine, but it didn't. So I changed the charset on the html to iso-8859-1 and it worked. I don't know why, but it worked. My db is still with utf-8_unicode_ci.

Not sure if this will make any difference but have you tried removing the $htmlsanitize .

function htmlsan($htmlsanitize){
    return htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}

Using

htmlentities($htmlsin, ENT_QUOTES, 'UTF-8');

or

mb_convert_encoding($htmlsan, "HTML-ENTITIES", "UTF-8");

Would probably do what you want them to.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM