What is the best way to strip out all HTML tags from a string?
Using PHP, given a string such as:
this is a <strong>string</strong>
I need a function to strip out ALL HTML tags so that the output is:
this is a string
Any ideas? Thanks in advance.
PHP has a built-in function that does exactly what you want: strip_tags
$text = '<b>Hello</b> World';
print strip_tags($text); // outputs Hello World
If you expect broken HTML, you are going to need to load it into a DOM parser and then extract the text.
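As a rough illustration of the DOM-parser route, here is a minimal sketch using PHP's bundled DOMDocument class. The helper name html_to_text is hypothetical, the wrapping <div> is just a convenience so the parser always produces a body element, and character-encoding handling is deliberately left out (the sketch assumes ASCII input).

```php
<?php
// Hypothetical helper: extract plain text from possibly malformed HTML.
// Assumes ASCII input; encoding handling is out of scope for this sketch.
function html_to_text(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment so the input is never empty and always has a container.
    $doc->loadHTML('<div>' . $html . '</div>');

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);

    // textContent concatenates every text node below <body>.
    return $body === null ? '' : $body->textContent;
}
```

Calling html_to_text('<p>Unclosed <b>bold text</p>') lets the parser repair the markup before the text is read out. Note that textContent also includes the contents of script and style elements, so those would still need to be removed separately.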
What about using strip_tags, which should do the job?
For instance (quoting the doc):
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";
will give you:
Test paragraph. Other text
Edit: but note that strip_tags doesn't validate what you give it, which means that this code:
$text = "this is <10 a test";
var_dump(strip_tags($text));
will get you:
string 'this is ' (length=8)
(Everything after the thing that looks like a starting tag gets removed.)
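One possible workaround for this truncation, offered only as a sketch: escape any "<" that cannot start a tag before calling strip_tags. The helper name strip_tags_keep_stray_lt is hypothetical, and the rule "a tag starts with a letter, /, ! or ?" is a simplifying assumption rather than anything strip_tags itself documents. The stray "<" stays in the result as the entity &lt;, so the output is HTML-escaped text.

```php
<?php
// Hypothetical helper: protect stray "<" characters from strip_tags().
function strip_tags_keep_stray_lt(string $text): string
{
    // Escape every "<" that is NOT followed by a letter, "/", "!" or "?",
    // i.e. one that cannot begin a tag, comment, or processing instruction.
    $escaped = preg_replace('~<(?![a-zA-Z/!?])~', '&lt;', $text);

    // preg_replace() returns null on failure; fall back to the raw input.
    return strip_tags($escaped ?? $text);
}

var_dump(strip_tags_keep_stray_lt("this is <10 a test"));
```

Real tags such as <b> are still stripped as usual; only the "<" in "<10" is preserved, in escaped form.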
strip_tags is the function you're after. You'd use it something like this:
$text = '<strong>Strong</strong>';
$text = strip_tags($text);
// Now $text = 'Strong'
I find this to be a little more effective than strip_tags() alone, since strip_tags() removes the script and style tags themselves but leaves the JavaScript and CSS between them in the output:
$search = array(
"'<head[^>]*?>.*?</head>'si",
"'<script[^>]*?>.*?</script>'si",
"'<style[^>]*?>.*?</style>'si",
);
$replace = array("","","");
$text = strip_tags(preg_replace($search, $replace, $html));
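To make the difference concrete, here is a small self-contained sketch that wraps the same idea in a function and runs it on a sample page. The function name strip_html and the sample $html are made up for illustration, and the patterns use ~ as the delimiter instead of a single quote, which does not change their meaning.

```php
<?php
// Hypothetical wrapper around the approach above: drop <head>, <script>
// and <style> blocks together with their contents, then strip the rest.
function strip_html(string $html): string
{
    $search = array(
        '~<head[^>]*?>.*?</head>~si',     // s: "." matches newlines, i: case-insensitive
        '~<script[^>]*?>.*?</script>~si',
        '~<style[^>]*?>.*?</style>~si',
    );

    // A single empty string replaces every pattern in $search.
    $cleaned = preg_replace($search, '', $html);

    // preg_replace() returns null on failure; fall back to the raw input.
    return strip_tags($cleaned ?? $html);
}

// Sample input, invented for this example.
$html = '<html><head><title>T</title><style>p{color:red}</style></head>'
      . '<body><script>alert(1)</script><p>Hello</p></body></html>';

echo strip_tags($html), "\n"; // Tp{color:red}alert(1)Hello
echo strip_html($html), "\n"; // Hello
```

Regular expressions like these only work on reasonably well-formed markup; for badly broken input a DOM parser is the safer choice.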