How can one prevent double encoding of HTML entities when they are allowed in the input?

How can I prevent double encoding of HTML entities, or fix them programmatically?

I am using the encode() function from the HTML::Entities Perl module to encode HTML entities in user input. The problem is that we also allow users to enter HTML entities directly, and these entities end up being double encoded.

For example, a user may enter:

Stackoverflow & Perl = Awesome&hellip;

This ends up being encoded to:

Stackoverflow &amp; Perl = Awesome&amp;hellip;

This renders in the browser as:

Stackoverflow & Perl = Awesome&hellip;

We want this to render as:

Stackoverflow & Perl = Awesome...

Is there a way to prevent this double encoding? Or is there a module or snippet of code that can easily correct these double encoding issues?

Any help is greatly appreciated!

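You can decode the string first:

use HTML::Entities;

my $input = from_user();  # placeholder for however the input is obtained

# Decode any existing entities first, so everything is encoded exactly once.
my $encoded = encode_entities( decode_entities($input) );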
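As a rough check of the decode-then-encode idea, the sketch below runs it on the example string from the question and then runs it again on its own output. The helper name `encode_once` is an assumption made for illustration; only `encode_entities` and `decode_entities` come from HTML::Entities.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# Hypothetical helper: decode whatever entities are present, then encode once.
sub encode_once {
    my ($input) = @_;
    return encode_entities( decode_entities($input) );
}

my $input = 'Stackoverflow & Perl = Awesome&hellip;';
my $once  = encode_once($input);
my $twice = encode_once($once);   # feed the encoded result back in

print "$once\n";                                   # Stackoverflow &amp; Perl = Awesome&hellip;
print $once eq $twice ? "stable\n" : "changed\n";  # stable
```

One trade-off to keep in mind: because everything is decoded first, a user who wants the literal text `&amp;` to appear on the page cannot get it this way.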

There is an extremely simple way to avoid this:

  1. Remove all the entities upon input (turn them into Unicode characters).
  2. Encode into entities again at the output stage.
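The two steps above could be sketched as a pair of functions, one for each stage. This is a minimal illustration under assumptions: `%store`, `save_comment`, and `render_comment` are hypothetical names, and the hash merely stands in for whatever storage the application really uses.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

my %store;  # hypothetical stand-in for a real database

# Input stage: replace entities with the characters they represent.
sub save_comment {
    my ($id, $raw) = @_;
    $store{$id} = decode_entities($raw);
    return;
}

# Output stage: encode exactly once, just before the text is placed in HTML.
sub render_comment {
    my ($id) = @_;
    return encode_entities( $store{$id} );
}

save_comment(1, 'Stackoverflow & Perl = Awesome&hellip;');
print render_comment(1), "\n";
```

Storing the decoded text keeps the stored value independent of HTML, so the same data can later be escaped differently for other output formats.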

Consider deferring the call to encode() until you retrieve the value for display, rather than calling it before you store the value. So long as you are consistent in your retrieval mechanism, the extra data in your database probably isn't worth fretting over.

Edit

Re-reading your question, I realize now that my answer doesn't fully address the issue, since calling encode() later will still produce the same result. I don't know of an alternative myself, so this may not be much help, but you may want to look for an encoding method that respects entities already present in the input.
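One possible shape for such a method, offered only as a sketch: escape just those ampersands that do not already begin something entity-like, then encode the remaining markup characters. The helper name `encode_preserving_entities` and the regular expression are assumptions made for illustration, not part of HTML::Entities; the pattern checks only the form of an entity (`&name;`, `&#123;`, `&#x1F;`), not whether the name is a real one.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: encode text but leave entity-like tokens untouched.
sub encode_preserving_entities {
    my ($text) = @_;

    # Escape only ampersands that are NOT followed by "name;", "#digits;"
    # or "#xhex;", i.e. ampersands that do not start an entity-like token.
    $text =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/&amp;/g;

    # Encode the other markup characters; "&" is deliberately left out of
    # the set because it was handled above.
    return encode_entities($text, q{<>"'});
}

print encode_preserving_entities('Stackoverflow & Perl = Awesome&hellip; <b>'), "\n";
# Stackoverflow &amp; Perl = Awesome&hellip; &lt;b&gt;
```

Because the check is purely syntactic, input such as `AT&T;` would be left alone even though `&T;` is not a defined entity, so this heuristic may need tightening against a list of known entity names.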
