RegExp to remove HTML comments

Question

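I'm looking for a regex match-and-replace sequence (preferably PHP, but it doesn't matter) to change this (the text at the start and end is just random text that needs to be preserved).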

From:

fkdshfks khh fdsfsk 
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
    <!--eg1-->
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
    <!--gc2-->
    <!--bXNnYm94-->
    <!--egc2-->
    <!--g2-->
</div>
<!--eg2-->
fdsfdskh

To this:

fkdshfks khh fdsfsk 
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
</div>
fdsfdskh

Thanks.

Answer 1

Do you just want to remove the comments? How about

s/<!--[^>]*-->//g

or the slightly better version (suggested by the asker himself):

<!--(.*?)-->

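But remember that HTML is not a regular language, so parsing it with regular expressions will lead you into a world of hurt as soon as someone throws odd edge cases at it.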
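
To see why the second pattern is the slightly better one, it helps to run both on a comment that contains a ">". The sketch below uses JavaScript purely for illustration; the sample string and variable names are made up.

```javascript
// Hypothetical sample: the second comment contains ">" characters.
const sample = 'a<!-- plain -->b<!-- see <b>this</b> -->c';

// Pattern 1: the comment body may not contain ">".
const noGt = sample.replace(/<!--[^>]*-->/g, '');

// Pattern 2: lazy match of anything up to the first "-->".
const lazy = sample.replace(/<!--(.*?)-->/g, '');

console.log(noGt); // the second comment survives
console.log(lazy); // both comments are removed
```

Neither pattern crosses line breaks as written, so multi-line comments need separate handling.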

Answer 2

preg_replace('/<!--(.*)-->/Uis', '', $html)

This PHP code will remove all HTML comments from the $html string.
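
In the PHP call above, the U modifier makes (.*) lazy, s lets "." match line breaks, and i has no effect on this pattern. JavaScript has no U modifier, so a rough counterpart, offered only as a sketch, writes the lazy quantifier explicitly. The function name and sample string are assumptions made for this illustration.

```javascript
// Rough JavaScript counterpart of the PHP call (illustrative only).
// "*?" replaces PHP's U modifier; the "s" flag lets "." match newlines.
function stripComments(html) {
  return html.replace(/<!--(.*?)-->/gs, '');
}

// Hypothetical sample with a multi-line comment and a single-line one.
const html = 'one<!-- a\nmulti-line\ncomment -->two<!-- x -->three';

console.log(stripComments(html));

// For contrast: a greedy quantifier swallows everything between
// the first "<!--" and the last "-->", including the text "two".
const greedy = html.replace(/<!--(.*)-->/gs, '');
console.log(greedy);
```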

Answer 3

A better version would be:

(?=<!--)([\s\S]*?)-->

It matches HTML comments like this:

<!--
multi line html comment
-->

or

<!-- single line html comment -->

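Most importantly, it matches comments like this one (some of the other regexes shown here, such as <!--[^>]*-->, do not cover this case):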

<!-- this is my blog: <mynixworld.inf> -->

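Note

Although the comment below is syntactically an HTML comment, your browser may parse it differently, so it may carry a special meaning. Stripping such strings may break your code.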

<!--[if !(IE 8) ]><!-->
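
As a small check of the claims above, the following JavaScript sketch applies the pattern to the three sample comments and then to the conditional comment opener from the note. The sample strings are assembled for this illustration only.

```javascript
// The pattern from this answer; the lookahead only asserts that "<!--" starts here,
// so the capture group ends up containing the "<!--" as well.
const pattern = /(?=<!--)([\s\S]*?)-->/g;

// Hypothetical sample built from the three comments shown above.
const text = [
  'A<!--',
  'multi line html comment',
  '-->B<!-- single line html comment -->C',
  '<!-- this is my blog: <mynixworld.inf> -->D',
].join('\n');

const stripped = text.replace(pattern, '');
console.log(JSON.stringify(stripped));

// The conditional comment opener from the note is matched too, so it is stripped.
const conditional = '<!--[if !(IE 8) ]><!-->';
const strippedConditional = conditional.replace(pattern, '');
console.log(JSON.stringify(strippedConditional));
```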

Answer 4

Don't forget to consider conditional comments, as

<!--(.*?)-->

will remove them. Try this instead:

<!--[^\[](.*?)-->

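This will still remove downlevel-revealed conditional comments, though.

EDIT:

This one won't remove downlevel-revealed or downlevel-hidden comments: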

<!--(?!<!)[^\[>].*?-->
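
The three patterns in this answer can be compared side by side. The JavaScript sketch below runs each one on a downlevel-hidden and a downlevel-revealed conditional comment; both sample strings and the names used for the patterns are assumptions made for the illustration.

```javascript
// Hypothetical conditional comments, both on a single line.
const hidden = '<!--[if IE 8]><p>old IE</p><![endif]-->';
const revealed = '<!--[if !IE]><!--><p>others</p><!--<![endif]-->';

const patterns = {
  plain: /<!--(.*?)-->/g,             // removes every comment
  skipBracket: /<!--[^\[](.*?)-->/g,  // skips comments that start with "["
  edited: /<!--(?!<!)[^\[>].*?-->/g,  // also skips "<!-->" and "<!--<!"
};

for (const [name, re] of Object.entries(patterns)) {
  console.log(
    name,
    JSON.stringify(hidden.replace(re, '')),
    JSON.stringify(revealed.replace(re, ''))
  );
}
```

With these inputs, the first pattern wipes out the hidden block entirely, the second leaves only a dangling opener from the revealed block, and the edited pattern leaves both untouched.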

Answer 5

Ah, I've got it:

<!--(.*?)-->

Answer 6

<!--([\s\S]*?)-->

This also works in JavaScript and VBScript, since "." does not match line breaks in every language.
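
A minimal JavaScript sketch of the difference, using a made-up multi-line sample:

```javascript
// Hypothetical sample: one comment that spans several lines.
const multiLine = 'x<!--\nmulti line html comment\n-->y';

// "." does not match line breaks here, so the comment is never matched.
const withDot = multiLine.replace(/<!--(.*?)-->/g, '');

// [\s\S] matches any character, including line breaks.
const withClass = multiLine.replace(/<!--([\s\S]*?)-->/g, '');

console.log(JSON.stringify(withDot));
console.log(JSON.stringify(withClass));
```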

Answer 7

Here is my attempt:

<!--(?!<!)[^\[>][\s\S]*?-->

This also removes multi-line comments, and it won't remove downlevel-revealed or downlevel-hidden comments.
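
As an illustration, the JavaScript sketch below applies this pattern to a small page fragment that mixes ordinary, multi-line and conditional comments. The fragment is invented for the example.

```javascript
// Hypothetical page fragment mixing the different comment kinds.
const page = [
  '<!-- ordinary -->',
  '<!--[if IE 8]><p>old IE</p><![endif]-->',
  '<!--[if !IE]><!--><p>others</p><!--<![endif]-->',
  '<!--',
  'multi line',
  '-->',
  '<p>kept</p>',
].join('\n');

// Ordinary and multi-line comments go; conditional comments stay.
const cleaned = page.replace(/<!--(?!<!)[^\[>][\s\S]*?-->/g, '');

console.log(cleaned);
```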

Answer 8

If your comments contain line breaks, try the following:

/<!--(.|\n)*?-->/g

Answer 9

The following:

/( )*<!--((.*)|[^<]*|[^!]*|[^-]*|[^>]*)-->\n*/g

removes multi-line comments when run against this test string:

fkdshfks khh fdsfsk 
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
    <div class='geshimain'>
    <!--eg1-->
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
    <!--gc2-->
    <!--bXNnYm94-->
    <!--egc2-->
    <!--g2-->
</div>
<!--eg2-->
fdsfdskh

<!-- --
> test
- -->

<!-- --
<- test <
>
- -->

<!--
test !<
- <!--
-->

<script type="text/javascript">//<![CDATA[
    var xxx = 'a';   
    //]]></script>

ok

Answer 10

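function remove_html_comments($html) {
   // Hand every comment to rhc(), which decides whether it is kept or dropped.
   $expr = '/<!--[\s\S]*?-->/';
   $func = 'rhc';
   return preg_replace_callback($expr, $func, $html);
}

function rhc($search) {
   $comment = $search[0];
   // Keep conditional comments ([if ...] / [endif]); drop everything else.
   if (stripos($comment, '[if') !== false || stripos($comment, '[endif') !== false) {
      return $comment;
   }
   return '';
}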
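
The same callback idea can be sketched in JavaScript. This is only an illustrative port, not part of the PHP answer; the function name and the sample string are assumptions.

```javascript
// Illustrative port of the callback approach: each comment is inspected
// and returned unchanged if it looks like a conditional comment.
function removeHtmlComments(html) {
  return html.replace(/<!--[\s\S]*?-->/g, (comment) =>
    /\[(?:if|endif)/i.test(comment) ? comment : ''
  );
}

// Hypothetical sample: one ordinary and one conditional comment.
const input = 'a<!-- gone -->b<!--[if IE 8]><p>x</p><![endif]-->c';
const output = removeHtmlComments(input);

console.log(output);
```

Any ordinary comment that happens to contain the text "[if" is kept as well, which is the trade-off of testing the comment body rather than its exact shape.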

Answer 11

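I know this is a fairly old post, but I felt it would be useful to add to it in case someone wants an easy-to-implement PHP function that directly answers the original question.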

/**
 * Strip all the html comments from $text
 *
 * @param $text - text to modify
 * @param string $new replacement string
 * @return array|string|string[]|null
 */
function strip_html_comments($text, $new=''){
    $search = array ("|<!--[\s\S]*?-->|si");
    $replace = array ($new);
    return preg_replace($search, $replace, $text);
}

Answer 12

These snippets also remove JavaScript code. That's too bad :|

Here is an example of JavaScript code that would be removed by them:

<script type="text/javascript"><!--
    var xxx = 'a';
    //-->
    </script>
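
One possible way around this, sketched here in JavaScript under the assumption that script elements are well formed and never contain the text "</script>" inside a string, is to match whole script elements and comments in a single alternation and only drop the comments. The function name and sample are hypothetical.

```javascript
// Sketch: a whole <script>...</script> element is tried first and returned
// unchanged; anything else the pattern matches is a comment and is dropped.
function stripCommentsOutsideScripts(html) {
  const re = /<script\b[\s\S]*?<\/script>|<!--[\s\S]*?-->/gi;
  return html.replace(re, (match) => (match.startsWith('<!--') ? '' : match));
}

// Hypothetical sample based on the script block shown above.
const script =
  '<script type="text/javascript"><!--\n    var xxx = \'a\';\n    //-->\n    </script>';
const doc = '<!-- note -->' + script + '<p>hi</p>';

const result = stripCommentsOutsideScripts(doc);
console.log(result);
```

This is still a regex over HTML, so the usual caveats about edge cases apply.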

Answer 13

// Note: these patterns strip /* ... */ and // comments from script or style code in $code.
// Remove multi-line comments
    $mlcomment = '/\/\*(?!-)[\x00-\xff]*?\*\//';
    $code = preg_replace($mlcomment, "", $code);
// Remove single-line comments; the lookbehind skips "://" (as in URLs)
// without consuming the character in front of the comment
    $slcomment = '/(?<!:)\/\/.*/';
    $code = preg_replace($slcomment, "", $code);
// Collapse runs of whitespace
    $extra_space = '/\s+/';
    $code = preg_replace($extra_space, " ", $code);
// Remove spaces around punctuation where they are not needed
    $removable_space = '/\s?([\{\};\=\(\)\/\+\*-])\s?/';
    $code = preg_replace($removable_space, "\\1", $code);

Answer 14

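If you only want the text, or the text with specific tags, you can handle it with PHP's strip_tags, which also removes HTML comments. You can keep the HTML tags you need like this: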

$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text, ['p', 'a']);

The output will be:

<p>Test paragraph.</p> <a href="#fragment">Other text</a>

I hope it helps someone!

Answer 15

You can accomplish this with modern JavaScript.

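function RemoveHtmlComments() {
    // Copy the live NodeList first so that removals do not skip siblings.
    // Only direct children of <body> are checked, as in the original loop.
    const children = Array.from(document.body.childNodes);
    for (const child of children) {
        if (child.nodeType === Node.COMMENT_NODE) child.remove();
    }
}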

It should be safer than a regex.
