Using JavaScript and Regex to replace HTML Characters
Thanks in advance for your help.
Within an application, I need to replace all special HTML characters with their HTML number (numeric character reference) equivalents.
For example:
‡, •, -, ‰, € and ™
Become:
&#8225;, &#8226;, &#45;, &#8240;, &#8364; and &#8482;
There are lots of questions out there already, but they do it the other way round.
I have all of the chars I want to convert in a JSON object (this is just a snapshot of a much larger list, to show that my JSON is good):
{"ch":"‘","sub":"&#8216;"},
{"ch":"’","sub":"&#8217;"},
{"ch":"‚","sub":"&#8218;"},
{"ch":"“","sub":"&#8220;"},
{"ch":"”","sub":"&#8221;"},
{"ch":"„","sub":"&#8222;"},
{"ch":"†","sub":"&#8224;"},
{"ch":"‡","sub":"&#8225;"},
{"ch":"•","sub":"&#8226;"},
...
I currently loop through the list (using Prototype here) and attempt to replace each one:
oJSONItems.each(function(o){
    var oRG = new RegExp(o.ch,'g');
    oText = oText.replace(oRG,o.sub);
});
Some are being replaced, but some are not...
‡
•
-
‰
€
™
More than anything, I need to know why chars like ™ are failing to be converted.
Thanks.
Rather than coding for specific entities, how about a version that replaces anything outside the original 7-bit ASCII range:
str = str.replace(/[^\011\012\015\040-\177]/g, function(x) {
    // x is one matched character; emit its decimal character reference
    return '&#' + x.charCodeAt(0) + ';';
});
(The regexp matches anything that's not white space or a "normal" ASCII character.)
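As a rough sketch of the same idea wrapped in a function: the name `encodeNonAscii` is made up for illustration, and the `u` flag and `codePointAt` assume an ES2015 or later environment. The difference from the snippet above is that characters outside the Basic Multilingual Plane (emoji, for instance) come out as a single reference instead of two surrogate halves.

```javascript
// Hypothetical helper; the name is not from the original answer.
function encodeNonAscii(str) {
  // Match anything that is not tab, LF, CR, or in the range space..DEL.
  // With the `u` flag the regex works on whole code points, so a character
  // stored as a surrogate pair is passed to the callback as one match.
  return str.replace(/[^\t\n\r\x20-\x7F]/gu, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  });
}

console.log(encodeNonAscii('Price: 5€ ™')); // Price: 5&#8364; &#8482;
```

Plain ASCII text, including tabs and line breaks, passes through unchanged.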
Alternatively, write your map so that the keys are the characters you want to replace, and the values are the entities:
var map = { '£' : '&#163;' };
str = str.replace(/./g, function(x) {
    return (x in map) ? map[x] : x;
});
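A possible variation on the map approach, offered only as a sketch: build a character class from the map's keys so the callback runs only for characters that actually need replacing, rather than for every character. The names `escapeForClass`, `pattern` and `replaceMapped` are hypothetical, the map entries are just a few characters from this thread, and the sketch assumes every key is a single character from the Basic Multilingual Plane.

```javascript
// Illustrative map; a real one would hold the full list of characters.
var map = { '£': '&#163;', '€': '&#8364;', '™': '&#8482;' };

// Escape the few characters that are special inside a regex character class.
function escapeForClass(ch) {
  return ch.replace(/[\\\]\^\-]/g, '\\$&');
}

// One regex, built once, that matches any key of the map.
var pattern = new RegExp('[' + Object.keys(map).map(escapeForClass).join('') + ']', 'g');

function replaceMapped(str) {
  return str.replace(pattern, function (ch) {
    return map[ch];
  });
}

console.log(replaceMapped('£5 and €6™')); // &#163;5 and &#8364;6&#8482;
```

Escaping the keys matters if the map ever contains characters such as `-` or `]`, which would otherwise change the meaning of the character class.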
Note that both versions make the regexp call only once, rather than once for each possible entity in your set. This should make the code somewhat faster than your loop-based method.