PHP: convert HTML &nbsp; to space, &gt; to >, etc.
I want to convert all HTML entities (&nbsp;, &gt;, &lt;, etc.) to plain text. I have tried html_entity_decode(), but it returns ? for &nbsp;.
Use htmlspecialchars_decode(), which is the opposite of htmlspecialchars().
Example from the PHP documentation page:
$str = '<p>this -&gt; &quot;</p>';
echo htmlspecialchars_decode($str);
//Output: <p>this -> "</p>
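Note that htmlspecialchars_decode() only handles a small set of entities (such as &amp;, &quot;, &lt; and &gt;), so it may not cover the &nbsp; case from the question. The following is a minimal sketch comparing the two decoders; the sample string and variable names are made up for illustration.

```php
<?php
// Illustrative input (an assumption, not from the thread): mixes &lt;, &amp;, &gt; and &nbsp;.
$input = '1 &lt; 2 &amp;&amp; 3 &gt; 2&nbsp;!';

// Decodes only the special characters handled by htmlspecialchars().
$special = htmlspecialchars_decode($input);

// Decodes all named entities; the charset is passed explicitly as UTF-8.
$all = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $special, "\n";                    // 1 < 2 && 3 > 2&nbsp;!
echo bin2hex(substr($all, -3)), "\n";   // c2a021 (U+00A0 as UTF-8, then "!")
```

The second line of output shows that &nbsp; became the two-byte UTF-8 sequence for U+00A0 rather than an ordinary space.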
html_entity_decode() is the opposite of htmlentities() in that it converts all HTML entities in the string to their applicable characters.
$orig = "I'll \"walk\" the <b>dog</b> now";
$a = htmlentities($orig);
$b = html_entity_decode($a);
echo $a; // I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now (PHP 8.1+ also encodes ' as &#039;)
echo $b; // I'll "walk" the <b>dog</b> now
Use html_entity_decode() instead of html_entity_encode().
If you check the html_entity_decode() manual:
You might wonder why trim(html_entity_decode('&nbsp;')); doesn't reduce the string to an empty string. That's because the '&nbsp;' entity is not ASCII code 32 (which is stripped by trim()) but ASCII code 160 (0xa0) in the default ISO 8859-1 character set.
You can nest your html_entity_decode() call inside a str_replace() to convert ASCII #160 to a space:
<?php
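// Pin the charset: in ISO-8859-1, &nbsp; decodes to the single byte 0xA0.
echo str_replace("\xA0", ' ', html_entity_decode('ABC&nbsp;XYZ', ENT_QUOTES, 'ISO-8859-1'));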
?>
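On PHP 5.4 and later the default charset for html_entity_decode() is UTF-8, where the no-break space is the two bytes 0xC2 0xA0 rather than the single byte 0xA0. A sketch of the same idea for UTF-8 strings might look like this; the helper name decode_entities_plain_spaces is hypothetical and not part of PHP or of the answer above.

```php
<?php
// Hypothetical helper: decode entities as UTF-8, then turn every
// no-break space (U+00A0) into a regular ASCII space.
function decode_entities_plain_spaces(string $html): string
{
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // In UTF-8, U+00A0 is encoded as the byte sequence C2 A0.
    return str_replace("\xC2\xA0", ' ', $text);
}

echo decode_entities_plain_spaces('ABC&nbsp;XYZ &gt; 1'), "\n"; // ABC XYZ > 1
```

Because the result contains only ordinary spaces, trim() then behaves as expected on strings that begin or end with &nbsp;.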
I know my answer is coming in really late, but I thought it might help someone else. I find that the best way to extract all special characters is to use utf8_decode() in PHP. Even for dealing with &nbsp; or any other special character representing blank space, use utf8_decode().
After using utf8_decode(), one can manipulate these characters directly in the code. For example, in the following code, the function clean() replaces &nbsp; with an empty string. Then it replaces all extra white spaces with a single white space using preg_replace(). Leading and trailing white spaces are removed using trim().
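function clean($str)
{
    // Note: utf8_decode() is deprecated as of PHP 8.2.
    $str = utf8_decode($str);
    $str = str_replace("&nbsp;", "", $str);   // remove literal &nbsp; entities
    $str = preg_replace("/\s+/", " ", $str);  // collapse runs of whitespace
    $str = trim($str);                        // strip leading/trailing whitespace
    return $str;
}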
$html = "&nbsp;Hello world! lorem ipsum.";
$output = clean($html);
echo $output;
// Output: Hello world! lorem ipsum.
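Since utf8_decode() is deprecated in recent PHP versions, a variant of clean() that stays in UTF-8 may be preferable. The sketch below is one possible approach, not the answerer's method: the name clean_utf8 is an assumption, and unlike clean() it turns &nbsp; into a space instead of removing it.

```php
<?php
// Hypothetical UTF-8 variant of clean(); not from the original answer.
function clean_utf8(string $str): string
{
    // Decode entities such as &nbsp;, &gt; and &lt; into UTF-8 characters.
    $str = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace, including U+00A0, into a single space.
    // preg_replace() returns null on invalid UTF-8, so fall back to the input.
    $str = preg_replace('/[\s\x{00A0}]+/u', ' ', $str) ?? $str;
    // Strip leading and trailing spaces.
    return trim($str);
}

echo clean_utf8("&nbsp;Hello   world!&nbsp;&nbsp;lorem ipsum. "), "\n";
// Hello world! lorem ipsum.
```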