
Remove broken HTML tags from data in PHP

I am working on a PHP app that receives arbitrary text from different sources (email, database, etc.). Some of this text contains broken HTML elements, for example:

$purl  ='FTP details are as 
follow:User name : Mahmud
div>password :1234556Than
ks ';

I tried strip_tags and some preg-based matching, but neither worked. How can I remove HTML elements that are incomplete, like the div> fragment above? I know this kind of question has been asked before, but I could not work out how to apply the answers. Thanks for any help.

For further details I am adding this link. I fetch the emails and then extract specific portions of them using DOM.

http://php.net/manual/en/tidy.parsestring.php



<?php
ob_start();
?>

<html>
    <head>
        <title>test</title>
    </head>
    <body>
        <p>error<br>another line</i>
    </body>
</html>

<?php

$buffer = ob_get_clean();
$config = array('indent' => TRUE,
        'output-xhtml' => TRUE,
        'wrap' => 200);

$tidy = tidy_parse_string($buffer, $config, 'UTF8');

$tidy->cleanRepair();
echo $tidy;

?>

What if I am a user and I want my username to be <span man?

You can't actually know whether a piece of text should be "corrected" because it is a broken tag, or left alone because it is not.

You should handle this where the input is read. Are you getting this text from curl output? Either way, as I said, check how you read your input.
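To make that ambiguity concrete, here is a minimal sketch of a regex heuristic. The function name remove_tag_fragments and the list of tag names are assumptions made for illustration; nothing like it appears in the original thread. It removes well-formed tags first, then orphaned fragments such as div> whose opening < was lost.

```php
<?php
// Hypothetical helper, not from the original thread.
function remove_tag_fragments(string $text): string
{
    // Tag names treated as HTML. This list is an assumption; extend it as needed.
    $names = 'div|span|br|li|td|tr|p|b|i|a';

    // Step 1: drop well-formed tags such as <p>, <br /> or </i>.
    $text = preg_replace('~</?[a-z][a-z0-9]*(?:\s[^<>]*)?/?>~i', '', $text);

    // Step 2: drop orphaned fragments such as "div>" or "/span>" whose "<" was lost.
    // The lookbehind stops the pattern from matching in the middle of a word.
    $text = preg_replace('~(?<![a-z0-9])/?(?:' . $names . ')\s*/?>~i', '', $text);

    return $text;
}

// The fragment from the question is removed:
echo remove_tag_fragments("User name : Mahmud\ndiv>password :1234556"), "\n";

// But legitimate text is damaged too, which is exactly the problem described above:
echo remove_tag_fragments('if a>b then'), "\n"; // prints "if b then"
```

The second call shows why this can only be a heuristic: a comparison such as a>b looks identical to a broken anchor tag, so the narrower the list of tag names, the fewer false positives you get.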

You need HTML Tidy installed and enabled in your PHP; for details, refer to this link:

php.net/manual/en/book.tidy.php

This question has also been asked before; refer to this link for the code (answer):

Remove HTML Entity if Incomplete
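As a rough sketch of the Tidy approach applied to a fragment rather than a whole page, the helper below repairs the markup and then strips the now-balanced tags. The name repair_and_strip is hypothetical, and the code assumes the tidy extension is loaded. Note that Tidy repairs markup structure (unclosed or mismatched tags); a bare fragment such as div> with no opening < is likely to be treated as ordinary text and left in place.

```php
<?php
// Hypothetical helper: repair an HTML fragment with Tidy, then reduce it to plain text.
function repair_and_strip(string $fragment): string
{
    if (!extension_loaded('tidy')) {
        throw new RuntimeException('The tidy extension is not available.');
    }

    $config = [
        'show-body-only' => true, // return only the body content, without <html>/<head>
        'wrap'           => 0,    // disable line wrapping
    ];

    $tidy = tidy_parse_string($fragment, $config, 'utf8');
    if ($tidy === false) {
        throw new RuntimeException('Tidy could not parse the input.');
    }
    $tidy->cleanRepair();

    // After the repair every tag is balanced, so strip_tags() only sees complete tags.
    $text = strip_tags(tidy_get_output($tidy));

    return trim(html_entity_decode($text, ENT_QUOTES, 'UTF-8'));
}

if (extension_loaded('tidy')) {
    // Same broken markup as in the example from the manual: unclosed <p>, stray </i>.
    echo repair_and_strip('<p>error<br>another line</i>'), "\n";
}
```

Repairing first and stripping afterwards avoids feeding half-open tags to strip_tags, which does not validate HTML and can remove more text than expected on broken input.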
