How to remove empty HTML tags (containing whitespace and/or their HTML codes)

Question

I need a regex for preg_replace.

This question wasn't answered in "another question" because not all of the tags I want to remove are empty.

I need to remove not only empty tags from an HTML structure, but also tags that contain only line breaks, white space, and/or the HTML codes for white space.

Possible codes are:

<br />   <br />       &ensp; &ensp; &emsp; &emsp;            

BEFORE removing matching tags:

<div> 
  <h1>This is a html structure.</h1> 
  <p>This is not empty.</p> 
  <p></p> 
  <p><br /></p>
  <p> <br /> &thinsp;</p>
  <p>&nbsp;</p> 
  <p> &nbsp; </p> 
</div>

AFTER removing matching tags:

<div> 
  <h1>This is a html structure.</h1> 
  <p>This is not empty.</p> 
</div>

Answer 1

You can use the following:

<([^>\s]+)[^>]*>(?:\s*(?:<br \/>|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*<\/\1>

And replace with '' (empty string).

See DEMO

Note: This will also work for empty HTML tags with attributes.
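To make the replacement step concrete, here is a minimal sketch of how the pattern might be passed to preg_replace. The helper name removeEmptyTags, the "~" delimiters, the class="spacer" attribute, and the loop are assumptions for illustration and not part of the answer; the loop simply repeats the replacement so that a wrapper which becomes empty after one pass is removed on the next.

```php
<?php
// Pattern from the answer, wrapped in "~" delimiters so "/" needs no escaping.
$pattern = '~<([^>\s]+)[^>]*>(?:\s*(?:<br />|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*</\1>~';

// Hypothetical helper: repeat the replacement until no match is left, so a
// wrapper that only contained empty tags is removed on a later pass.
function removeEmptyTags(string $html, string $pattern): string
{
    do {
        $html = preg_replace($pattern, '', $html, -1, $count);
    } while ($count > 0);

    return $html;
}

// Sample input based on the question; the class attribute is made up.
$html = <<<'HTML'
<div>
  <h1>This is a html structure.</h1>
  <p>This is not empty.</p>
  <p></p>
  <p><br /></p>
  <p> <br /> &thinsp;</p>
  <p class="spacer"> &nbsp; </p>
</div>
HTML;

$clean = removeEmptyTags($html, $pattern);
echo $clean, PHP_EOL;
```

Only the tags themselves are removed, so the whitespace that separated them stays in the output. A tag that contains nothing but plain white space, such as `<p> </p>`, does not appear to be matched by this pattern, because the inner group requires at least one `<br />` or entity per repetition.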

Answer 2

Use tidy, for example with the following function:

function cleaning($string, $tidyConfig = null) {
    $out = array();
    // Default tidy options, used when the caller passes none.
    $config = array(
        'indent' => true,
        'show-body-only' => false,
        'clean' => true,
        'output-xhtml' => true,
        'preserve-entities' => true
    );
    if ($tidyConfig === null) {
        $tidyConfig = $config;
    }
    // Repair the markup; the result is a complete document.
    $tidy = new tidy();
    $out['full'] = $tidy->repairString($string, $tidyConfig, 'UTF8');
    unset($tidy);
    unset($tidyConfig);
    // Strip everything up to <body ...> and everything from </body> onward.
    $out['body'] = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $out['full']);
    // Do the same for the <style> element and re-wrap its contents.
    $out['style'] = '<style type="text/css">' . preg_replace("/.*<style[^>]*>|<\/style>.*/si", "", $out['full']) . '</style>';
    return $out;
}

Answer 3

I'm not so good with regex, but try this:

\<.*\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\<\s*br\s*\/\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\&.*sp;\s*\<\s*br\s*\/\>\<\/.*\>

It basically matches:

Tags with HTML space entities in them, OR
Tags with breaks occurring before the HTML space entities in them, OR
Tags with breaks occurring after the HTML space entities in them
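As a quick check of the three cases listed above, the following sketch runs the pattern against a few made-up inputs (the inputs and the "~" delimiters are assumptions for illustration). It suggests two things worth knowing before using the pattern: the greedy `.*` is not limited to a single tag, so other content on the same line can be removed along with the empty tag, and a tag with no space entity at all is not matched.

```php
<?php
// Pattern from the answer above, wrapped in "~" delimiters.
$pattern = '~\<.*\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\<\s*br\s*\/\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\&.*sp;\s*\<\s*br\s*\/\>\<\/.*\>~';

// Made-up inputs, for illustration only.
$alone    = '<p>&nbsp;</p>';
$sameLine = '<p>keep</p><p>&nbsp;</p>';
$noEntity = '<p></p>';

var_dump(preg_replace($pattern, '', $alone));    // string(0) ""
var_dump(preg_replace($pattern, '', $sameLine)); // string(0) "" ("keep" is removed as well)
var_dump(preg_replace($pattern, '', $noEntity)); // string(7) "<p></p>" (no space entity, no match)
```

When every tag sits on its own line, as in the question's sample, the first problem is less visible because `.` does not match a newline by default.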

3 answers

Answer 1
Score: 7 (accepted), 2015-06-16 10:55:24

Answer 2
Score: 1, 2015-06-16 10:59:22

Answer 3
Score: 0, 2015-06-16 11:24:17
