
Inserting Ending Tags For Missing Tags In HTML

How can I insert closing HTML tags in the places where they are missing?

For example:

 <tr>
 <td>Index No.</td><td>Name</td>

 <tr>
 <td>1</td><td>Harikrishna</td>

Here two closing tags are missing, namely "</tr>". In this case, how can I find where the closing tags are missing and insert the appropriate closing tag, such as "</tr>", at those places?

This seems like a very tough task if you want to handle all possible cases. HTML is not a regular language. IMHO you should try to solve the problem at the source, that is, find out how you ended up with invalid HTML in the first place.

You could take a look at HTML Tidy and see whether it meets your needs.

I cannot comment on the above, so I'll note it here. You can also use HTML Tidy to clean HTML fragments. See the examples here:
http://www.php.net/manual/en/tidy.examples.basic.php
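As a rough illustration of the HTML Tidy suggestion, here is a minimal sketch that uses PHP's optional tidy extension to repair a fragment. It assumes the extension is installed, and it wraps the rows from the question in a <table> element, which the original snippet does not show. The option values are one plausible choice, not a recommended configuration.

```php
<?php
// Sketch only: relies on PHP's optional tidy extension (ext-tidy).
// Assumption: the rows from the question live inside a <table> element.
$fragment = <<<'HTML'
<table>
<tr>
<td>Index No.</td><td>Name</td>

<tr>
<td>1</td><td>Harikrishna</td>
</table>
HTML;

if (!function_exists('tidy_repair_string')) {
    echo "The tidy extension is not available.\n";
} else {
    $config = [
        'show-body-only' => true, // return just the fragment, without <html>/<head>/<body>
        'indent'         => true, // indent the repaired markup for readability
        'wrap'           => 0,    // do not wrap long lines
    ];

    // Let Tidy parse the fragment and write it back out with the missing tags added.
    $repaired = tidy_repair_string($fragment, $config, 'utf8');

    echo $repaired, "\n";
    echo substr_count($repaired, '<tr'), " opening and ",
         substr_count($repaired, '</tr>'), " closing row tags\n";
}
```

Because Tidy parses the markup instead of pattern-matching it, the same call also copes with other unclosed elements, not only rows.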

An alternative to HTML Tidy is to clean your output code with regular expressions; I provide an example below. Please note, however, that even though this might be faster in terms of processing, it is neither as universal nor as robust (maintenance-wise) as HTML Tidy.

Code

<?php

$html = "
<table>
<tr class=\"lorem\">
<td>Index No.</td>
<td>Name</td>

<tr>
<td>0</td>
<td>FooBaz</td>

<tr>
<td>1</td>
<td>Harikrishna</td>

<tr class=\"ipsum\">
<td>2</td>
<td>Foo</td>
</tr>

<tr>
<td>3</td>
<td>Bar</td>


</table>
";

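// build the pattern pieces
// an opening <tr> tag, with or without attributes
$start_cond = "<tr(?:\s[^>]*)?>";
// a row ends where the next <tr> starts or where the table closes
$end_cond = "(?:{$start_cond}|<\/table>)";
// row contents: every character up to (not including) the next end condition
$row_contents = "(?:(?!{$end_cond}).)*";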

// first remove all </tr> tags
$xhtml = preg_replace( "/<\/tr>/ism", "", $html );

// now re-add </tr> tags where appropriate
$xhtml = preg_replace( "/({$start_cond})({$row_contents})/ism", "$1$2</tr>\n", $xhtml );

// ignore: just for writing comparison output
echo "<h2>Before:</h2>"; show_count( $html );
echo "<h2>After</h2>"; show_count( $xhtml );

function cmp($patt,$html) {
    $count = preg_match_all( "/{$patt}/ism", $html, $matches);
    return htmlentities("\n{$count} x {$patt}");
}
function show_count($html) {
    echo "<pre>"
        . cmp("<tr(\s[^>]*)?>",$html)
        . cmp("<\/tr>",$html)
        . "</pre>";
}
?>

Output


Before:
5 x <tr(\s[^>]*)?>
1 x <\/tr>

After
5 x <tr(\s[^>]*)?>
5 x <\/tr>
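
The caveat about robustness can be made concrete with a small sketch. It wraps the same two preg_replace steps in a helper (the name close_rows_with_regex is made up for this illustration) and runs it on a well-formed table nested inside a cell of another table, a case the sample data above does not cover.

```php
<?php
// Hypothetical helper: the same two-step regex clean-up as in the code above.
function close_rows_with_regex(string $html): string
{
    $start_cond   = '<tr(?:\s[^>]*)?>';
    $end_cond     = '(?:' . $start_cond . '|<\/table>)';
    $row_contents = '(?:(?!' . $end_cond . ').)*';

    // Step 1: drop every existing </tr>.
    $html = preg_replace('/<\/tr>/is', '', $html);

    // Step 2: close each row just before the next <tr> or </table>.
    return preg_replace(
        '/(' . $start_cond . ')(' . $row_contents . ')/is',
        '$1$2</tr>' . "\n",
        $html
    );
}

// A valid table nested inside a cell of another table.
$nested = '<table><tr><td><table><tr><td>x</td></tr></table></td></tr></table>';
$result = close_rows_with_regex($nested);

echo $result, "\n";

// The outer row now gets closed right after the inner <table> opens,
// so valid input has been turned into invalid output.
var_dump(strpos($result, '<table></tr>') !== false);
```

The pattern has no notion of nesting depth, so it treats the inner <tr> as the end of the outer row. This is the kind of case where a real parser such as HTML Tidy is the safer choice.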
