
How can I convert a string into a Unicode character?

In JavaScript, '\uXXXX' produces a Unicode character. But how can I get a Unicode character when the XXXX part is a variable?

For example:

var input = '2122';
console.log('\\u' + input);             // returns a string: "\u2122"
console.log(new String('\\u' + input)); // returns a string: "\u2122"

The only way I can think of to make it work is to use eval, yet I hope there's a better solution:

var input = '2122';
var char = '\\u' + input;
console.log(eval("'" + char + "'"));    // returns a character: "™"

Use String.fromCharCode() like this: String.fromCharCode(parseInt(input, 16)). When you put a Unicode value in a string using \u, it is interpreted as a hexadecimal value, so you need to specify the base (16) when using parseInt.
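The approach in this answer can be wrapped in a small helper. This is a minimal sketch: the name `hexToChar` and the up-front validation are illustrative assumptions, not part of the original answer. The validation is there because parseInt stops at the first invalid digit, so a string such as '21zz' would otherwise be silently read as 0x21.

```javascript
// Hypothetical helper (name and validation are assumptions for illustration):
// converts a 1-4 digit hex string such as '2122' into the matching BMP character.
function hexToChar(hex) {
  // parseInt('21zz', 16) would quietly return 0x21, so reject bad input first.
  if (!/^[0-9a-fA-F]{1,4}$/.test(hex)) {
    throw new RangeError('Expected 1-4 hex digits, got: ' + hex);
  }
  // Radix 16 tells parseInt to read the digits as hexadecimal.
  return String.fromCharCode(parseInt(hex, 16));
}

console.log(hexToChar('2122')); // "™"
console.log(hexToChar('41'));   // "A"
```

Limiting the input to four hex digits keeps the value within the 16-bit range that String.fromCharCode handles directly.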

String.fromCharCode("0x" + input)

or

String.fromCharCode(parseInt(input, 16))

as they are 16-bit numbers (UTF-16).

JavaScript uses UCS-2 internally.

Thus, String.fromCharCode(codePoint) won't work for supplementary Unicode characters, for example if codePoint is 119558 (0x1D306, the '𝌆' character).

If you want to create a string based on a non-BMP Unicode code point, you could use Punycode.js's utility functions to convert between UCS-2 strings and Unicode code points:

// `String.fromCharCode` replacement that doesn’t make you enter the surrogate halves separately
punycode.ucs2.encode([0x1d306]); // '𝌆'
punycode.ucs2.encode([119558]); // '𝌆'
punycode.ucs2.encode([97, 98, 99]); // 'abc'
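For illustration, the surrogate-pair arithmetic that such a utility has to perform can also be sketched by hand. The function name `codePointToString` is hypothetical and this is not Punycode.js's actual implementation. The arithmetic itself is the standard UTF-16 encoding of supplementary code points.

```javascript
// Hypothetical helper (not part of Punycode.js): builds a string from any
// Unicode code point using only String.fromCharCode.
function codePointToString(codePoint) {
  // Reject non-integers (including NaN and strings) and out-of-range values.
  if (codePoint !== Math.floor(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError('Invalid code point: ' + codePoint);
  }
  // BMP code points fit in a single 16-bit code unit.
  if (codePoint <= 0xFFFF) {
    return String.fromCharCode(codePoint);
  }
  // Supplementary code points are split into a high and a low surrogate.
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10);   // top 10 bits of the offset
  var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits of the offset
  return String.fromCharCode(high, low);
}

console.log(codePointToString(0x1D306));                // "𝌆"
console.log(codePointToString(0x1D306).length);         // 2
// Passing the code point straight to fromCharCode keeps only the low 16 bits:
console.log(String.fromCharCode(0x1D306) === '\uD306'); // true
```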

Since ES2015 (ES6) you can use

String.fromCodePoint(number)

to get Unicode values bigger than 0xFFFF.

So, in every modern browser, you can write it this way:

var input = '2122';
// The string is coerced to the decimal number 2122 (U+084A), not 0x2122.
console.log(String.fromCodePoint(input));

or, if it is a hex number:

var input = '2122';
console.log(String.fromCodePoint(parseInt(input, 16)));
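As a sketch of how this extends beyond the BMP, the same call also handles hex strings longer than four digits. The helper name `hexToString` is an assumption for illustration. It relies on String.fromCodePoint throwing a RangeError for NaN or out-of-range values, so unparseable input does not pass silently.

```javascript
// Hypothetical helper: converts a hex code point string into a string.
function hexToString(hex) {
  var codePoint = parseInt(hex, 16); // NaN if hex has no leading hex digits
  // String.fromCodePoint throws a RangeError for NaN, negative numbers,
  // non-integers, and values above 0x10FFFF.
  return String.fromCodePoint(codePoint);
}

console.log(hexToString('2122'));         // "™"
console.log(hexToString('1D306'));        // "𝌆"
console.log(hexToString('1D306').length); // 2 (one code point, two UTF-16 code units)
```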

More info:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/fromCodePoint

var hex = '2122';
var char = unescape('%u' + hex);

console.log(char);

This returns "™". Note that unescape is deprecated, although it is still widely supported.
