Using Javascript and Regex to replace HTML Characters

Thanks in advance for your help.

I have a need within an application to find all special HTML characters and replace them with their HTML number (numeric entity) equivalents.

For example:

‡, •, -, ‰, € and ™

Become:

&#8225;, &#8226;, &#45;, &#8240;, &#8364; and &#8482;

There are lots of questions out there already, but they do it the other way round.

I have all of the chars I want to convert in a JSON object (this is just a snapshot of a much larger list, to show that my JSON is good):

{"ch":"‘","sub":"&#8216;"},
{"ch":"’","sub":"&#8217;"},
{"ch":"‚","sub":"&#8218;"},
{"ch":"“","sub":"&#8220;"},
{"ch":"”","sub":"&#8221;"},
{"ch":"„","sub":"&#8222;"},
{"ch":"†","sub":"&#8224;"},
{"ch":"‡","sub":"&#8225;"},
{"ch":"•","sub":"&#8226;"},
...

I currently loop through the list (using Prototype here) and attempt to replace each one:

oJSONItems.each(function(o){
    var oRG = new RegExp(o.ch,'g');
    oText = oText.replace(oRG,o.sub);
});

Some are being replaced, but some are not:

‡
•
-
‰
€
™

More than anything, I need to know why chars like these are failing to be converted.

Thanks.

Rather than coding for specific entities, how about a version that replaces anything outside the original 7-bit ASCII range:

str = str.replace(/[^\011\012\015\040-\177]/g, function(x) {
    return '&#' + x.charCodeAt(0) + ';'
})

(The regexp matches anything that's not white space or a "normal" ASCII character.)
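One caveat with the snippet above: `charCodeAt(0)` works on UTF-16 code units, so a character outside the Basic Multilingual Plane (most emoji, for instance) would come out as two surrogate values rather than one entity. A minimal sketch of a variant that handles this, assuming an ES2015+ environment; the function name `encodeNonAscii` is made up for illustration and is not from the original answer:

```javascript
// Hypothetical helper, not part of the original answer.
function encodeNonAscii(str) {
    // The `u` flag makes the regexp match whole code points, so a character
    // stored as a surrogate pair reaches the callback as a single string.
    // Octal escapes are not allowed with `u`, hence the \t \n \r \x20-\x7F form.
    return str.replace(/[^\t\n\r\x20-\x7F]/gu, function (ch) {
        return '&#' + ch.codePointAt(0) + ';';
    });
}

console.log(encodeNonAscii('‡, •, ‰, € and ™'));
// prints "&#8225;, &#8226;, &#8240;, &#8364; and &#8482;"
```

Tabs, newlines, carriage returns and the printable ASCII range pass through unchanged, the same as in the octal version.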

Alternatively, write your map so that the keys are the characters you want to replace, and the values are the entities:

var map = { '£' : '&#163;' }

str = str.replace(/./g, function(x) {
    return (x in map) ? map[x] : x;
});

Note that both versions make the regexp call only once, rather than once for each possible entity in your set. This should make the code somewhat faster than your loop-based method.
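If the entities already live in a list of `{ch, sub}` pairs like the one in the question, the map can be built from that list once and then reused. A rough sketch, assuming the list is a plain array; the three entries are a shortened stand-in for the real data, and `replaceFromMap` is a name invented here:

```javascript
// Shortened stand-in for the question's list of {ch, sub} pairs.
var oJSONItems = [
    { ch: '‡', sub: '&#8225;' },
    { ch: '•', sub: '&#8226;' },
    { ch: '€', sub: '&#8364;' }
];

// Build the lookup table once. Object.create(null) gives an object with no
// inherited properties, so the `in` check only ever sees our own keys.
var map = Object.create(null);
oJSONItems.forEach(function (o) {
    map[o.ch] = o.sub;
});

// Hypothetical helper: one pass over the string, one lookup per non-ASCII char.
function replaceFromMap(str) {
    return str.replace(/[^\x00-\x7F]/g, function (x) {
        return (x in map) ? map[x] : x;
    });
}

console.log(replaceFromMap('5 € • x ‡ é'));
// prints "5 &#8364; &#8226; x &#8225; é"
```

Characters that are not in the map (the `é` here) are left as they are. Because each key is used as a lookup key and never compiled into a pattern, this also avoids any problems with regexp metacharacters in the list.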
