How to remove empty HTML tags (which contain whitespace and/or its HTML codes)

I need a regex for preg_replace.

This question wasn't answered in "another question" because not all of the tags I want to remove are strictly empty.

I need to remove not only empty tags from an HTML structure, but also tags that contain only line breaks, whitespace, and/or the HTML codes for whitespace.

Possible codes are:

<br /> &nbsp; &thinsp; &ensp; &emsp; &#8201; &#8194; &#8195;

BEFORE removing matching tags:

<div> 
  <h1>This is a html structure.</h1> 
  <p>This is not empty.</p> 
  <p></p> 
  <p><br /></p>
  <p> <br /> &thinsp;</p>
  <p>&nbsp;</p> 
  <p> &nbsp; </p> 
</div>

AFTER removing matching tags:

<div> 
  <h1>This is a html structure.</h1> 
  <p>This is not empty.</p> 
</div>

You can use the following:

<([^>\s]+)[^>]*>(?:\s*(?:<br \/>|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*<\/\1>

And replace with '' (empty string).

See DEMO

Note: This will also work for empty HTML tags with attributes.
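As a rough illustration of how the pattern above might be wired into preg_replace, here is a minimal sketch. The `~` delimiters and the variable names ($pattern, $html, $cleaned) are assumptions made for this example, and the sample input is the question's markup with the entity written as &thinsp;.

```php
<?php
// The pattern from the answer above, wrapped in "~" delimiters (an assumption)
// so that the forward slashes need no additional escaping.
$pattern = '~<([^>\s]+)[^>]*>(?:\s*(?:<br \/>|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*<\/\1>~';

// Sample input taken from the question.
$html = <<<HTML
<div>
  <h1>This is a html structure.</h1>
  <p>This is not empty.</p>
  <p></p>
  <p><br /></p>
  <p> <br /> &thinsp;</p>
  <p>&nbsp;</p>
  <p> &nbsp; </p>
</div>
HTML;

// Replace every matching "empty" element with an empty string.
$cleaned = preg_replace($pattern, '', $html);

echo $cleaned;
```

Two things to keep in mind with this sketch: the indentation and line breaks that surrounded the removed paragraphs stay in the output, and an element containing nothing but plain whitespace (for example `<p> </p>`) is not matched, because every repetition of the inner group requires a `<br />` or one of the listed entities.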

Use tidy, for example with the following function:

function cleaning($string, $tidyConfig = null) {
    $out = array();
    // Default tidy options, used when the caller passes no configuration.
    $config = array(
        'indent' => true,
        'show-body-only' => false,
        'clean' => true,
        'output-xhtml' => true,
        'preserve-entities' => true
    );
    if ($tidyConfig === null) {
        $tidyConfig = $config;
    }
    $tidy = new tidy();
    // Repair the markup; the result is the full document as a UTF-8 string.
    $out['full'] = $tidy->repairString($string, $tidyConfig, 'UTF8');
    // Keep only what lies between <body> and </body>.
    $out['body'] = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $out['full']);
    // Keep only what lies between <style> and </style>, re-wrapped in a style tag.
    $out['style'] = '<style type="text/css">' . preg_replace("/.*<style[^>]*>|<\/style>.*/si", "", $out['full']) . '</style>';
    return $out;
}

I'm not so good with regex, but try this:

\<.*\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\<\s*br\s*\/\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\&.*sp;\s*\<\s*br\s*\/\>\<\/.*\>

Basically it matches:

  • Tags with HTML space entities in them, OR
  • Tags with breaks occurring before HTML space entities in them, OR
  • Tags with breaks occurring after HTML space entities in them
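The cases listed above could also be folded into a single alternation instead of three separate branches. The sketch below is only one possible way to do that, not the method from any of the answers: the helper name removeEmptyTags is hypothetical, the pattern is written for this example, and it deliberately goes a bit further by also accepting plain whitespace, `<br>` without a slash, and elements that only become empty after their children were removed.

```php
<?php
// Hypothetical helper (not from the answers above): removes elements whose
// content consists only of whitespace, <br> tags and the listed space entities.
function removeEmptyTags(string $html): string
{
    // One alternation covers whitespace, line breaks and entities in any order.
    $pattern = '~<(\w+)\b[^>]*>(?:\s|<br\s*/?>|&(?:nbsp|thinsp|ensp|emsp|#8201|#8194|#8195);)*</\1>~i';

    // Repeat until nothing changes, so a parent that becomes empty once its
    // children are gone is removed on a later pass.
    do {
        $previous = $html;
        $html = preg_replace($pattern, '', $previous);
    } while ($html !== null && $html !== $previous);

    // preg_replace returns null on error; fall back to the last good string.
    return $html ?? $previous;
}
```

For example, `removeEmptyTags('<div><p>&nbsp;</p></div>')` strips the paragraph on the first pass and the then-empty div on the second. Note that a pattern like this cannot tell a meaningless empty paragraph from an element that is legitimately empty, such as `<td></td>` or `<textarea></textarea>`, so it is probably best restricted to markup where that distinction does not matter.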
