Currently, I use strip_tags to remove all HTML tags from the strings I process. However, I noticed lately that it joins words that were separated only by the removed tags, e.g.
$str = "<li>Hello</li><li>world</li>";
$result = strip_tags($str);
echo $result;
(prints Helloworld)
How can you get around this?
You can experiment with which regex pattern works best and what to replace the tags with :)
// ------------------------------------
function strip_html_tags($string) {
    // Turn line breaks and tabs into plain spaces first.
    $string = str_replace(array("\r", "\n", "\t"), ' ', $string);
    // Optional: turn list items into bullets before the tags are removed.
    // $string = str_replace('<li>', "\n* ", $string);
    // Alternative pattern: '/<.*?>/'
    $pattern = '/<[^>]*>/';
    // Replace every tag with a space, then collapse runs of spaces.
    $string = preg_replace($pattern, ' ', $string);
    $string = trim(preg_replace('/ {2,}/', ' ', $string));
    return $string;
}
// ------------------------------------
You can also add special replacements, such as '<li>' to "\n* ", or whatever suits your output :)
It all depends on what output you want after stripping HTML tags. For example, if you want the <li> tags to be converted into a plain list of items, I would suggest using str_replace to replace <li> with * and </li> with \n.
The purpose of strip_tags is to get rid of HTML tags without any other conversion.
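The str_replace suggestion above could look roughly like the following. This is only a sketch: the function name list_to_text is made up for illustration, and it assumes the list items are written exactly as <li> and </li> (str_replace is case-sensitive and will not match tags that carry attributes).

```php
<?php
// Hypothetical helper: turn <li> items into "* item" lines, then strip the rest.
function list_to_text(string $html): string
{
    // Mark the start of each item with a bullet and its end with a newline.
    $html = str_replace(['<li>', '</li>'], ['* ', "\n"], $html);
    // Remove any remaining tags (<ul>, <b>, ...) and the trailing newline.
    return trim(strip_tags($html));
}

// Prints the two items on separate lines, each prefixed with "* ".
echo list_to_text("<ul><li>Hello</li><li>world</li></ul>");
```

Doing the replacement before calling strip_tags matters, because afterwards the <li> markers are gone.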
This replaces every HTML tag (in fact, anything of the form < ABC >, without checking whether it really is HTML) with a space, then replaces double spaces with single spaces and removes leading and trailing whitespace.
$str = preg_replace("/<.*?>/", " ", $str);
$str = trim(str_replace("  ", " ", $str));
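One thing to watch with this approach: a single str_replace pass that turns two spaces into one still leaves a double space wherever three or more were adjacent, which happens easily when several tags sit next to each other. A possible variant, assuming it is acceptable to collapse all whitespace (tabs and newlines included), uses preg_replace with \s+ instead. The function name tags_to_spaces is hypothetical.

```php
<?php
// Hypothetical helper name; the tag pattern is the same lazy match as above.
function tags_to_spaces(string $str): string
{
    // Replace anything that looks like <...> with a single space.
    $str = preg_replace('/<.*?>/', ' ', $str);
    // Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    return trim(preg_replace('/\s+/', ' ', $str));
}

echo tags_to_spaces("<ul><li>Hello</li>  <li>world</li></ul>"); // Hello world
```

Because the pattern does not check for real HTML, text such as 1 < 2 and 3 > 2 is treated as one tag and comes out as 1 2.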
echo strip_tags( str_replace( '>', '> ', $string ));
This should do exactly what you are looking for in all cases.
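Note that padding every > with a space also leaves leading, trailing and doubled spaces in the result. If that matters, the one-liner could be wrapped along these lines; strip_tags_spaced is a made-up name, and the cleanup step is an addition, not part of the original suggestion.

```php
<?php
// Hypothetical wrapper around the one-liner above.
function strip_tags_spaced(string $html): string
{
    // Put a space after every '>' so adjacent elements cannot fuse together.
    $text = strip_tags(str_replace('>', '> ', $html));
    // The padding leaves leading, trailing and doubled spaces; tidy them up.
    return trim(preg_replace('/ {2,}/', ' ', $text));
}

echo strip_tags_spaced("<li>Hello</li><li>world</li>"); // Hello world
```

A literal > in the text itself also gets a space after it, which is usually harmless.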
From your code I can see that there is no space between the words Hello and world in the input, and you can't expect strip_tags to add one for you. So, for strip_tags to produce exactly what you want, I added a space after the first closing list tag, and the result was Hello world.
You can copy and paste this code and run it to see the difference.
$str = "<li>Hello</li> <li>world</li>";
$result = strip_tags($str);
echo $result;
//Expected result after Execution is Hello world
You would be better off with htmlentities().
It won't remove the < and > characters, but it will escape them.
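For comparison, here is what htmlentities() does with the string from the question. Keep in mind that this solves a different problem: the markup stays in the string as visible text instead of being removed, so it only helps if the goal is to display the input safely.

```php
<?php
$str = "<li>Hello</li><li>world</li>";

// Escape the markup instead of removing it, so a browser shows it as text.
$escaped = htmlentities($str);

echo $escaped; // &lt;li&gt;Hello&lt;/li&gt;&lt;li&gt;world&lt;/li&gt;
```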