The Unicode character 𠮵, given by code point 134069, has the HTML escape &#x20bb5;.
Is there a (preferably native) way to get the HTML escape for a character like this in JavaScript?
You can get both the code point and its hex value for the character like this:
var codePoint = '𠮵'.codePointAt(0); // codePoint = 134069
var hexValue = codePoint.toString(16); // hexValue = '20bb5'
var htmlEscape = '&#x' + hexValue + ';'; // htmlEscape = '&#x20bb5;'
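The same three steps can be wrapped in a small helper. This is only a sketch: the name toHtmlEscapes is made up for illustration, and it assumes a non-empty string and an environment that supports codePointAt. It returns both the decimal form (matching the 134069 from the question) and the hex form.

```javascript
// Hypothetical helper (not part of any library): builds the decimal and
// hexadecimal numeric character references for the first code point of `ch`.
// Assumes `ch` is a non-empty string.
function toHtmlEscapes(ch) {
  var codePoint = ch.codePointAt(0);
  return {
    decimal: '&#' + codePoint + ';',
    hex: '&#x' + codePoint.toString(16) + ';'
  };
}

var escapes = toHtmlEscapes('\u{20BB5}');
console.log(escapes.decimal); // &#134069;
console.log(escapes.hex);     // &#x20bb5;
```

Both forms refer to the same character when placed in HTML, so which one to use is mostly a matter of taste.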
Here is a working example:
$('#doIt').click(function () {
  var hex = $('#inputText').val().codePointAt(0).toString(16);
  var htmlEscape = '&#x' + hex + ';';
  $('#outputHex').text(hex);              // hex value of the code point
  $('#outputString').text(htmlEscape);    // the escape shown literally
  $('#outputChar').html(htmlEscape);      // the escape rendered as the character
});
code {
  display: block;
  padding: 4px;
  background-color: #EFEFEF;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="inputText"></textarea>
<button id="doIt">do it</button>
<h3>result</h3>
<code id="outputHex"></code>
<code id="outputString"></code>
<code id="outputChar"></code>
One more thing: codePointAt is an ES6 method and isn't supported in older browsers. In case the browser blocks the snippet from running here, there is also a JSFiddle example.
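For browsers without codePointAt, the code point can be computed by hand from the UTF-16 surrogate pair using charCodeAt. The following is a rough sketch, not a full polyfill; the function name codePointOf is invented for this example.

```javascript
// Hypothetical fallback for environments without String.prototype.codePointAt.
// Returns the code point starting at index `i` of `s`.
function codePointOf(s, i) {
  var hi = s.charCodeAt(i);
  // A high surrogate followed by a low surrogate encodes one code point
  // above U+FFFF.
  if (hi >= 0xD800 && hi <= 0xDBFF && i + 1 < s.length) {
    var lo = s.charCodeAt(i + 1);
    if (lo >= 0xDC00 && lo <= 0xDFFF) {
      return (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
    }
  }
  // Otherwise the code unit is the code point (or an unpaired surrogate).
  return hi;
}

// '\uD842\uDFB5' is the surrogate-pair spelling of U+20BB5, written this way
// so the literal also parses in pre-ES6 engines.
var ch = '\uD842\uDFB5';
console.log(codePointOf(ch, 0));                            // 134069
console.log('&#x' + codePointOf(ch, 0).toString(16) + ';'); // &#x20bb5;
```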
Here is a function that converts all characters outside 7-bit ASCII, as well as <, > and &, to HTML entities:
function htmlEntities(s) {
  // Array.from splits the string by code point, so surrogate pairs stay together.
  return Array.from(s).map(function (c) {
    var cp = c.codePointAt(0);
    // Keep 7-bit ASCII except <, & and >; escape everything else.
    return cp < 128 && '<&>'.indexOf(c) === -1 ? c : '&#x' + cp.toString(16) + ';';
  }).join('');
}

var s = 'This is \u{20BB5}, a special character & encoded in HTML.';
document.body.innerHTML = htmlEntities(s);
Be aware that JavaScript strings are sequences of UTF-16 code units, so characters outside the Basic Multilingual Plane are counted as two characters (for example in length). ES6 constructs like Array.from and [...s] iterate by code point, which makes sure you get the right chunks.
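A short illustration of the difference, using a made-up sample string with one such character between two ASCII letters:

```javascript
// Sample string: 'a', U+20BB5 (stored as a surrogate pair), 'b'.
var s = 'a\u{20BB5}b';

console.log(s.length);             // 4 (counts UTF-16 code units)
console.log(Array.from(s).length); // 3 (counts code points)
console.log([...s].length);        // 3

// Indexing by code unit returns only half of the surrogate pair.
console.log(s.charAt(1) === '\u{20BB5}');      // false
console.log(Array.from(s)[1] === '\u{20BB5}'); // true
```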