
How can I convert every special character and emoji into its HTML entity using JavaScript?

My current code converts characters into entities as expected. But if I convert an emoji, it generates a pair of entities that render as �� instead of the emoji.

    String.prototype.toHtmlEntities = function() {
      return this.replace(/./gm, function(s) {
        // return "&#" + s.charCodeAt(0) + ";";
        return (s.match(/[a-z0-9\s]+/i)) ? s : "&#" + s.charCodeAt(0) + ";";
      });
    };

    console.log("😀".toHtmlEntities());
    document.write("😀".toHtmlEntities());

You're iterating over the code units of your string. Instead, you want to iterate over the code points. Most emojis consist of a single code point that is encoded as two code units, known as a surrogate pair: one high surrogate and one low surrogate. A surrogate displayed on its own doesn't represent a valid symbol, so it ends up being rendered as �. If you use the u (unicode) flag on your regular expression, the pattern matches by code point, allowing you to process each code point (rather than each code unit). You can then read the code point value using codePointAt(0) and encode it as an HTML entity:

    String.prototype.toHtmlEntities = function() {
      // The u flag makes the pattern match whole code points, not code units
      return this.replace(/[^a-z0-9\s]/ugm, s => "&#" + s.codePointAt(0) + ";");
    };

    console.log("a".toHtmlEntities());
    document.write("a".toHtmlEntities());
    console.log("&".toHtmlEntities());
    document.write("&".toHtmlEntities());
    console.log("😀".toHtmlEntities()); // surrogate pair test
    document.write("😀".toHtmlEntities());
    console.log("\u{1F468}\u200D\u{1F469}".toHtmlEntities()); // ZWJ test
    document.write("\u{1F468}\u200D\u{1F469}".toHtmlEntities());
    console.log("❤️".toHtmlEntities()); // variation selector test
    document.write("❤️".toHtmlEntities()); // variation selector test
    console.log("n\u0303".toHtmlEntities()); // decomposed character test (length of 2)
    document.write("n\u0303".toHtmlEntities()); // decomposed character test (length of 2)
    console.log("\u00F1".toHtmlEntities()); // composed character (length of 1)
    document.write("\u00F1".toHtmlEntities()); // composed character (length of 1)
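To make the difference between code units and code points concrete, here is a small sketch. It assumes U+1F600 as an arbitrary example emoji (not taken from the question); any character outside the Basic Multilingual Plane should behave the same way.

```javascript
// Assumption: U+1F600 is only an illustrative emoji.
const emoji = "\u{1F600}";

// length counts UTF-16 code units, so one emoji reports 2
const codeUnits = emoji.length;

// charCodeAt reads single code units: the high and low surrogate
const highSurrogate = emoji.charCodeAt(0);
const lowSurrogate = emoji.charCodeAt(1);

// codePointAt combines the surrogate pair into the real code point
const codePoint = emoji.codePointAt(0);

// Without the u flag, . matches each surrogate separately;
// with the u flag, . matches the whole code point once
const matchesPerUnit = emoji.match(/./g).length;
const matchesPerPoint = emoji.match(/./gu).length;

console.log(codeUnits);                    // 2
console.log(highSurrogate, lowSurrogate);  // 55357 56832
console.log(codePoint);                    // 128512
console.log(matchesPerUnit, matchesPerPoint); // 2 1
```

Encoding 55357 and 56832 as two separate entities is what produces the �� output, while the single value 128512 gives a valid entity.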

If you just want to replace the emoji characters, you can use \p{Emoji} to match those (or another regular expression to match your specific characters), and replace them with their code points, e.g.:

    String.prototype.toHtmlEntities = function() {
      // \p{Emoji} requires the u flag; only matching code points are encoded
      return this.replace(/\p{Emoji}/ugm, s => "&#" + s.codePointAt(0) + ";");
    };

    console.log("a".toHtmlEntities());
    document.write("a".toHtmlEntities());
    console.log("&".toHtmlEntities());
    document.write("&".toHtmlEntities());
    console.log("😀".toHtmlEntities()); // surrogate pair test
    document.write("😀".toHtmlEntities());
    console.log("\u{1F468}\u200D\u{1F469}".toHtmlEntities()); // ZWJ test
    document.write("\u{1F468}\u200D\u{1F469}".toHtmlEntities());
    console.log("❤️".toHtmlEntities()); // variation selector test
    document.write("❤️".toHtmlEntities()); // variation selector test
    console.log("n\u0303".toHtmlEntities()); // decomposed character test (length of 2)
    document.write("n\u0303".toHtmlEntities()); // decomposed character test (length of 2)
    console.log("\u00F1".toHtmlEntities()); // composed character (length of 1)
    document.write("\u00F1".toHtmlEntities()); // composed character (length of 1)
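One thing worth checking before relying on \p{Emoji}: the Unicode Emoji property also covers the ASCII digits, # and *, so plain numbers get encoded too. The sketch below compares it with \p{Extended_Pictographic}, which may be a closer fit for some inputs. The helper name encodeMatches and the sample string are made up for illustration.

```javascript
// Hypothetical helper: encode every match of `pattern` as a decimal entity.
const encodeMatches = (str, pattern) =>
  str.replace(pattern, (ch) => "&#" + ch.codePointAt(0) + ";");

// Assumption: sample text with a digit and one example emoji (U+1F600).
const sample = "Room 7 \u{1F600}";

// \p{Emoji} matches the digit 7 as well as the emoji
const withEmoji = encodeMatches(sample, /\p{Emoji}/gu);

// \p{Extended_Pictographic} leaves the digit alone
const withPictographic = encodeMatches(sample, /\p{Extended_Pictographic}/gu);

console.log(withEmoji);        // Room &#55; &#128512;
console.log(withPictographic); // Room 7 &#128512;
```

Neither property matches everything people think of as an emoji (regional indicator flags, for example, are not Extended_Pictographic), so it is worth testing against the characters you actually expect.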

As always, if you're going to modify the prototype of built-in JavaScript objects, make sure you know the consequences of doing so. It is instead recommended to create a new function and pass the string you want to convert into that function as an argument.
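A minimal sketch of that standalone-function approach, reusing the same pattern as the prototype version above. The function name is simply carried over from the examples; nothing here is a built-in API.

```javascript
// Standalone version: takes the string as an argument instead of
// extending String.prototype.
function toHtmlEntities(str) {
  // u flag: match whole code points so surrogate pairs stay together.
  // Everything except a-z, 0-9 and whitespace becomes a decimal entity.
  return str.replace(/[^a-z0-9\s]/gu, (ch) => "&#" + ch.codePointAt(0) + ";");
}

console.log(toHtmlEntities("a&b"));       // a&#38;b
console.log(toHtmlEntities("\u{1F600}")); // &#128512;
```

Because the pattern has no i flag, uppercase letters are encoded as well (for instance "A" becomes &#65;). That is harmless in HTML, but [^a-zA-Z0-9\s] can be used if they should pass through unchanged.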

