
htmlspecialchars “Forbidden code point” validation error

My PHP script fetches rows from a MySQL table that contain strings such as the following:

$string = 'ï¼’ã¤ã®ä¹³é…¸èŒã®ç¨';

Is there a way to echo these sorts of strings to the browser without getting "Forbidden code point" when running the document through an HTML5 validator?

I have tried the following:

htmlspecialchars($string);
htmlspecialchars($string, ENT_SUBSTITUTE, 'UTF-8');
htmlspecialchars($string, ENT_DISALLOWED, 'UTF-8');
htmlspecialchars(mb_convert_encoding($string, 'UTF-8'));

but all of these expressions still result in the "Forbidden code point" error. The encoding of the webpage is already set to UTF-8 via a meta tag:

<meta charset="UTF-8">

The PHP function htmlentities() may be what you are looking for. It converts every character that has a named HTML entity equivalent into that entity.

For example:

$string = 'ï¼’ã¤ã®ä¹³é…¸èŒã®ç¨';
$string = htmlentities($string);
echo $string;

This converts your string ï¼’ã¤ã®ä¹³é…¸èŒã®ç¨ into &iuml;&frac14;&rsquo;&atilde;&curren;&atilde;&reg;&auml;&sup1;&sup3;&eacute;&hellip;&cedil;&egrave;&OElig;&atilde;&reg;&ccedil;&uml;, which can be displayed on an HTML page without error.

More information on this function can be found here: https://secure.php.net/manual/en/function.htmlentities.php

The solution that worked for me was:

htmlspecialchars($string, ENT_SUBSTITUTE | ENT_DISALLOWED);

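This kept every character that is valid and replaced the rest: according to the PHP manual, ENT_SUBSTITUTE replaces invalid code unit sequences, and ENT_DISALLOWED replaces code points that are invalid for the document type, with the Unicode replacement character (U+FFFD).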
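
As a minimal sketch of that approach, the snippet below runs a short sample through htmlspecialchars() with both flags set. Several things here are assumptions rather than part of the original answer: the sample string is hypothetical (it is what the UTF-8 bytes of "つ" look like after being misread as Windows-1252, which leaves the C1 control character U+0081 in the middle), the guess that such control characters are what the validator rejects, and the addition of ENT_QUOTES, ENT_HTML5 and an explicit 'UTF-8' argument.

```php
<?php
// Hypothetical sample (assumption, not the asker's real data):
// U+00E3, the C1 control U+0081, then U+00A4.
$string = "\u{00E3}\u{0081}\u{00A4}";

// ENT_SUBSTITUTE handles byte sequences that are not valid UTF-8;
// ENT_DISALLOWED handles code points not allowed in the chosen document
// type. ENT_HTML5 and ENT_QUOTES are additions assumed for an HTML5 page.
$flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_DISALLOWED | ENT_HTML5;

$safe = htmlspecialchars($string, $flags, 'UTF-8');

echo $safe, PHP_EOL;
```

The replacement character is visible in the browser, so if the original text matters it is probably better to repair the encoding where the data is stored or read, and keep the flags as a safety net.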


 