PHP: remove all HTML but comments
How would I remove everything from an HTML input except the comments? For example, this:
<html><body><!-- hello paragraph --><p>hello</p></body></html>
would turn into this:
<!-- hello paragraph -->
How would I do this? Thanks!
Edit: I know you can do stuff like this with regular expressions, but I don't know how.
Instead of replacing the HTML, I would extract all the comments with the following:
preg_match_all('#(<!--.*?-->)#s', '<html><body><!-- hello paragraph --><p>hello</p></body></html>', $m);
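To get from the matches to the single string the question asks for, the matched comments can be joined back together. This is only a minimal sketch: the helper name keep_only_comments is hypothetical (it is not part of the answer), and it assumes comments are not nested and that nothing resembling a comment appears inside script blocks or attribute values.

```php
<?php
// Hypothetical helper: returns only the HTML comments found in $html,
// concatenated in document order.
function keep_only_comments(string $html): string
{
    // The s modifier lets . match newlines, so multi-line comments are found.
    if (!preg_match_all('#<!--.*?-->#s', $html, $m)) {
        return ''; // no comments found (or the regex engine reported an error)
    }
    return implode('', $m[0]);
}

echo keep_only_comments('<html><body><!-- hello paragraph --><p>hello</p></body></html>');
// prints: <!-- hello paragraph -->
```

Joining with an empty string keeps the output close to the example in the question; a newline separator would work just as well if one comment per line is preferred.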
That's indeed a bit more complex, but doable with regular expressions:
$text = preg_replace('~<(?!!--)/?\w[^>]*(?<!--)>~', "", $text);
This strips the tags in your example (the text between them, such as hello, is left in place), but it can fail on other inputs. Amusingly, it also removes HTML tags from within comments.
$regex = '~
< # opening html bracket
(?!!--) # negative assertion, no "!--" may follow
/?\w # tags must start with letter or optional /
[^>]* # matches html tag innards
(?<!--) # lookbehind assertion, no "--" before closing >
> # closing bracket
~x';
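The behaviour described above can be checked with a short script. This is a sketch only: the wrapper name strip_tags_keep_comments is hypothetical, and it uses the single-line form of the pattern from this answer.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
function strip_tags_keep_comments(string $text): string
{
    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace('~<(?!!--)/?\w[^>]*(?<!--)>~', '', $text) ?? $text;
}

// Tags are removed, but the text between them stays.
echo strip_tags_keep_comments('<html><body><!-- hello paragraph --><p>hello</p></body></html>'), "\n";
// prints: <!-- hello paragraph -->hello

// Tags inside a comment are removed as well.
echo strip_tags_keep_comments('<!-- <b>bold</b> note -->'), "\n";
// prints: <!-- bold note -->
```

If the remaining text has to go too, extracting the comments is probably simpler than deleting everything else.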
$foo="<html><body><!-- hello paragraph --><p>hello</p></body></html>";
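// Matches the first comment; the brackets may be literal or HTML-escaped (&lt; / &gt;).
// The s modifier lets . span line breaks, so multi-line comments match too.
preg_match('/(<|&lt;)!--(\s*.*?\s*)--(>|&gt;)/s', $foo, $result);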
print_r($result);
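Note that preg_match stops at the first comment. A variant that collects the inner text of every comment might look like the sketch below; the helper name comment_texts is hypothetical, and moving \s* outside the capture group (so the surrounding whitespace is dropped) is a small change from the pattern above.

```php
<?php
// Hypothetical helper: returns the trimmed text of every HTML comment in $html.
function comment_texts(string $html): array
{
    // \s* outside the capture group drops the padding around the comment text;
    // the s modifier lets . span line breaks.
    preg_match_all('/<!--\s*(.*?)\s*-->/s', $html, $matches);
    return $matches[1];
}

$foo = "<html><body><!-- hello paragraph --><p>hello</p><!--\n second \n--></body></html>";
print_r(comment_texts($foo));
// yields the two strings 'hello paragraph' and 'second'
```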