Should I use strip_tags() before HTML Purifier?

I'm integrating Redactor (a WYSIWYG editor) into my website, and it outputs HTML rather than BBCode or Markdown. I need to allow the following tags, because the editor uses them for formatting:

<code><span><div><label><a><br><p><b><i><del><strike><u><img><video><audio><iframe><object><embed><param><blockquote><mark><cite><small><ul><ol><li><hr><dl><dt><dd><sup><sub><big><pre><code><figure><figcaption><strong><em><table><tr><td><th><tbody><thead><tfoot><h1><h2><h3><h4><h5><h6>

From what I've read and been told on here, in order to display the content safely I should store the original data in my database along with a sanitized version (the output of HTML Purifier). The sanitized version is what I will actually output; the unsanitized original is kept in case anything goes wrong during sanitization.

My question is: should I also call strip_tags() on the data (passing the tags above as the allowed-tags argument), or should I pass it directly to HTML Purifier?

While it's true that you can likely reduce the parsing work that a parser like HTML Purifier does by filtering out tags beforehand, there's no security gain in running strip_tags() first, and in your use case it is unlikely to make much of a difference.

The reason it won't make much of a difference is, of course, that your average submission will not be malicious and will therefore come from your WYSIWYG editor, which only generates the tags you already want to allow. As such, the preliminary strip_tags() run wouldn't strip any tags out of those submissions.

Meanwhile, a malicious submission is likely to bypass any benefit strip_tags() would give you anyway. However, using strip_tags() before the parser won't do harm, and it could help guard against attempts to turn the parser against you by making it consume a lot of resources. That said, if the parser can be made to cause issues at all (I'd expect it to have safeguards against that), it tends to happen through nesting depth, not through which tags are used.
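To make the point about security concrete, here is a minimal sketch using only PHP's built-in strip_tags(). The `$allowed` string is a shortened, hypothetical stand-in for the full list in the question, and the sample inputs are invented for illustration. It shows that ordinary editor output passes through unchanged, and that a hand-crafted submission using only allowed tags also passes through unchanged, dangerous attributes included, because strip_tags() does not inspect attributes.

```php
<?php
// Hypothetical, shortened allow-list; the real one would be the full list from the question.
$allowed = '<p><b><a><img>';

// Typical editor output: only allowed tags, so strip_tags() removes nothing.
$benign = '<p>Hello <b>world</b></p>';
$benignUnchanged = (strip_tags($benign, $allowed) === $benign);

// Hand-crafted input: allowed tags only, but with dangerous attributes.
$malicious = '<a href="javascript:alert(1)">click</a>'
           . '<img src="x" onerror="alert(2)">';
$afterStrip = strip_tags($malicious, $allowed);

// strip_tags() leaves the attributes of allowed tags untouched,
// so the payload survives as-is.
$maliciousUnchanged = ($afterStrip === $malicious);

var_dump($benignUnchanged, $maliciousUnchanged);
```

In both cases the string still has to go through HTML Purifier, which is the step that actually deals with attributes and URL schemes.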

In brief:

I see no reason to recommend it in your case, but I see no reason to dissuade you from using it, either. strip_tags() is pretty fast, and it won't mangle anything if you use it before the parser.
