Thanks in advance for your help.
I have a need within an application to replace all special characters with their HTML numeric entity equivalents.
For example:
‡, •, -, ‰, € and ™
Become:
&#8225;, &#8226;, &#45;, &#8240;, &#8364; and &#8482;
There are lots of questions currently out there, but these do it the other way round.
I have all of the chars I want to convert in a JSON object (this is just a snapshot of a much larger list, just to prove my JSON is good):
{"ch":"‘","sub":"&#8216;"},
{"ch":"’","sub":"&#8217;"},
{"ch":"‚","sub":"&#8218;"},
{"ch":"“","sub":"&#8220;"},
{"ch":"”","sub":"&#8221;"},
{"ch":"„","sub":"&#8222;"},
{"ch":"†","sub":"&#8224;"},
{"ch":"‡","sub":"&#8225;"},
{"ch":"•","sub":"&#8226;"},
...
And I currently loop through (using Prototype here) and attempt to replace them:
oJSONItems.each(function(o){
    var oRG = new RegExp(o.ch, 'g');
    oText = oText.replace(oRG, o.sub);
});
Some are being replaced, but some are not...
‡
•
-
‰
€
™
More than anything I need to know why chars like ™ are failing to be converted.
Thanks.
Rather than code for specific entities, how about one that replaces anything outside the original 7-bit ASCII range:
str = str.replace(/[^\011\012\015\040-\177]/g, function(x) {
    return '&#' + x.charCodeAt(0) + ';';
});
(The regexp matches any character other than tab, line feed, carriage return, or the ASCII range from space (octal 040) through DEL (octal 177).)
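As a rough sketch, the range-based idea can be wrapped in a function. The name `encodeNonAscii` is made up for illustration, and this variant assumes an ES2015+ engine: it uses the `u` flag and `codePointAt` so that characters outside the Basic Multilingual Plane (emoji, for instance) come out as a single entity, where `charCodeAt(0)` would yield one entity per surrogate half. Unlike the octal range, it also encodes DEL (0x7F).

```javascript
// Hypothetical helper: replace every character other than tab, LF, CR and
// printable ASCII (space through ~) with a decimal numeric entity.
function encodeNonAscii(str) {
  // The `u` flag makes the regexp match whole code points, so a
  // surrogate pair reaches the callback as a single match.
  return str.replace(/[^\t\n\r\x20-\x7E]/gu, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  });
}

console.log(encodeNonAscii('‡, •, ‰, € and ™'));
// &#8225;, &#8226;, &#8240;, &#8364; and &#8482;
```

Plain ASCII text, tabs and line breaks pass through unchanged, so the function can be run over a whole document in one call.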
Alternatively, write your map so that the keys are the characters you want to replace, and the values are the entities:
var map = { '£' : '&#163;' };
str = str.replace(/./g, function(x) {
    return (x in map) ? map[x] : x;
});
Note that both versions make only one replace call, rather than one for each possible entity in your set. This should make the code somewhat faster than your loop-based method.
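To tie the map approach back to the list in the question, here is a minimal sketch that builds the lookup from an array of `{ch, sub}` objects and then makes a single pass over the text. The names `items`, `buildMap` and `encodeWithMap` are hypothetical, and `items` holds only three entries from the question's list.

```javascript
// Hypothetical sample of the question's list; the real one is much longer.
var items = [
  { ch: '†', sub: '&#8224;' },
  { ch: '‡', sub: '&#8225;' },
  { ch: '•', sub: '&#8226;' }
];

// Turn the array into a lookup object keyed by character.
function buildMap(list) {
  var map = Object.create(null); // no inherited keys to trip over
  list.forEach(function (o) { map[o.ch] = o.sub; });
  return map;
}

// One replace call over the whole string; unmapped characters pass through.
function encodeWithMap(str, map) {
  return str.replace(/[\s\S]/g, function (x) {
    return (x in map) ? map[x] : x;
  });
}

console.log(encodeWithMap('a † b ‡ c •', buildMap(items)));
// a &#8224; b &#8225; c &#8226;
```

Building the map once and reusing it keeps the per-call cost down to one scan of the string, however many entries the list has.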