How to strip specific tags and specific attributes from a string?

Here's the deal: I'm making a project to help teach HTML to people. Naturally, I'm afraid of that Scumbag Steve (see figure 1).

So I want to block ALL HTML tags, except those approved on a very specific whitelist.

Within those approved HTML tags, I also want to remove harmful attributes such as onload and onmouseover, again according to a whitelist.

I've thought of regex, but I'm pretty sure it's evil and not very helpful for the job.

Could anyone give me a nudge in the right direction?

Thanks in advance.


Fig 1. Scumbag Steve

// load HTML Purifier (http://htmlpurifier.org/)
require_once 'library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// this setting is needed because otherwise elements considered
// harmful, such as input, are removed automatically
$config->set('HTML.Trusted', true);

// only input, p and div elements will be accepted
$config->set('HTML.AllowedElements', 'input,p,div');

// set attributes for each tag
$config->set('HTML.AllowedAttributes', 'input.type,input.name,p.id,div.style');

// a more fine-grained way to manage attributes and elements; see the docs:
// http://htmlpurifier.org/live/configdoc/plain.html
$def = $config->getHTMLDefinition(true);

$def->addAttribute('input', 'type', 'Enum#text');
$def->addAttribute('input', 'name', 'Text');

// create the purifier with this configuration
$purifier = new HTMLPurifier($config);

// purify the untrusted input ($raw_html holds the user-submitted markup)
$html = $purifier->purify($raw_html);
  • NOTE: as you asked, this code works as a whitelist: only input, p and div are accepted, and only certain attributes on them are accepted.

Use the Zend Framework 2 StripTags filter. The example below accepts ul, li, p, br, img (with only the src attribute) and links (with only the href attribute). Everything else will be stripped. If I'm not mistaken, ZF1 does the same thing.

    $filter = new \Zend\Filter\StripTags(array(
        // tag name => attributes allowed on that tag
        'allowTags' => array(
            'ul'  => array(),
            'li'  => array(),
            'p'   => array(),
            'br'  => array(),
            'img' => array('src'),
            'a'   => array('href'),
        ),
        'allowAttribs'  => array(),
        'allowComments' => false,
    ));

    $value = $filter->filter($value);

For tags, you can use strip_tags.
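A minimal sketch of the strip_tags approach, assuming a made-up whitelist and sample input (the `$allowedTags` and `$rawHtml` values are illustrative, not from the question). It also shows why strip_tags alone is not enough here: the text inside removed tags is kept, and attributes on allowed tags pass through untouched.

```php
<?php
// Hypothetical whitelist for illustration; list only the tags you want to keep.
$allowedTags = '<p><ul><li><br><a><img>';

// Sample untrusted input (assumed for this sketch).
$rawHtml = '<p onmouseover="alert(1)">Hello <script>alert(2)</script><b>world</b></p>';

// strip_tags() drops every tag that is not in the second argument.
// It does not remove the text between stripped tags, and it leaves
// all attributes on the allowed tags in place.
$clean = strip_tags($rawHtml, $allowedTags);

echo $clean, PHP_EOL;
```

Note that the onmouseover attribute on the allowed p tag survives, so the attributes still need a separate pass.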

For attributes, refer to the question "How can I remove attributes from an html tag?"
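One possible way to handle the attribute side without regex is PHP's DOM extension. The sketch below is an assumption on my part, not the method from the linked question: the function name `stripDisallowedAttributes` and the `$allowedAttributes` map are hypothetical, and it only filters attribute names. It does not validate attribute values (a `javascript:` URL in href would still pass), so treat it as a starting point rather than a complete sanitizer.

```php
<?php
// Hypothetical attribute whitelist: tag name => allowed attribute names.
$allowedAttributes = [
    'a'   => ['href'],
    'img' => ['src'],
];

function stripDisallowedAttributes(string $html, array $allowed): string
{
    // Collect parser warnings internally so sloppy markup does not emit notices.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so it has a single root element, and
    // ask libxml not to add <html>/<body> or a doctype around it.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        $keep = $allowed[$element->nodeName] ?? [];

        // Copy the names first: removing attributes while iterating the
        // live attribute map would skip entries.
        $names = [];
        foreach ($element->attributes as $attribute) {
            $names[] = $attribute->nodeName;
        }
        foreach ($names as $name) {
            if (!in_array($name, $keep, true)) {
                $element->removeAttribute($name);
            }
        }
    }

    // Serialize only the children of the wrapper <div>.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$dirty = '<p onmouseover="alert(1)">Hi <a href="/x" onclick="evil()">link</a>'
       . '<img src="a.png" onload="evil()"></p>';

$result = stripDisallowedAttributes($dirty, $allowedAttributes);
echo $result, PHP_EOL;
```

Running strip_tags first and this function second gives a tag whitelist plus an attribute whitelist; for anything user-facing, a dedicated library such as HTML Purifier is likely the safer choice.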
