
How do I correctly insert Unicode in an HTML title using JavaScript?

I'm seeing some weird behavior when I set the title of an HTML page using JavaScript. If I insert HTML character references directly into the title, the Unicode renders correctly, for instance:

<title>&#21543;&#20986;</title>

But if I attempt to use HTML character references via JavaScript, something seems to be converting the & to (& amp ;) (separating them so SO doesn't just turn it back into an ampersand) and thus breaking the encoding, causing the title to be rendered as the full coded string:

function execTitleChange() {
  document.title = "&#21543;&#20986;";
}

(I should note that this is a little bit of speculation; when I inspect the DOM using Firebug after executing this JavaScript function, that's where I see the &amp; instead of &.)

If I use \u-encoded Unicode characters when setting the value from JavaScript, then everything works correctly again:

function execTitleChange() {
  document.title = "\u5427\u51fa";
}

The fact that \u-encoded characters work kind of makes sense to me, since I think that's how JavaScript represents Unicode characters, but I'm stumped as to why the behavior would be different when using the HTML character references.

JavaScript string constants are parsed by the JavaScript parser. Text inside HTML tags is parsed by the HTML parser. The two languages (and, by extension, their parsers) are different, and in particular they have different ways of representing characters by character code.

Thus, what you've discovered is the way reality actually is :-) Use the \u escape notation in JavaScript, and use HTML character references (&#nnnn;) in HTML/XML.
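To make the difference concrete, here is a minimal sketch that can be run outside a browser (for example in Node.js). The variable names are only illustrative. It shows that the JavaScript parser decodes \u escapes, while the character-reference form stays a plain run of ASCII characters:

```javascript
// The JavaScript parser turns each \uXXXX escape into one character.
const escaped = "\u5427\u51fa";

// The same text written as HTML character references is, to JavaScript,
// just a run of ordinary ASCII characters; nothing decodes it.
const references = "&#21543;&#20986;";

console.log(escaped === "吧出");  // true
console.log(escaped.length);      // 2
console.log(references.length);   // 16
console.log(escaped.codePointAt(0), escaped.codePointAt(1)); // 21543 20986
```

Assigning `references` to document.title therefore sets the title to those sixteen literal characters, which matches what the question describes.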

Edit: the situation can get even more confusing when you're talking about creating/inserting HTML from JavaScript. When you use .innerHTML to update the DOM from JavaScript, you are basically handing HTML source code over to the HTML parser for interpretation. For that reason, you can use either JavaScript \u escapes or HTML character references, and things will work (excepting painful issues of character encoding mismatches etc.).

Finally, note that JavaScript also provides the String.fromCharCode() function to construct strings from numeric character codes.
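As a rough illustration, String.fromCharCode() can be given the same decimal numbers that appear in the character references from the question. The helper decodeDecimalReferences below is hypothetical (it is not a built-in) and only handles decimal references of the form &#nnnn;. It is a sketch of one way to convert such a string before assigning it to document.title:

```javascript
// Build the string from decimal character codes (the same numbers used in
// the &#21543;&#20986; references from the question).
const fromCodes = String.fromCharCode(21543, 20986);

// Hypothetical helper (not a built-in): replace decimal references such as
// &#21543; with the characters they name. Named and hexadecimal references
// are left untouched, as are numbers beyond the Unicode range.
function decodeDecimalReferences(text) {
  return text.replace(/&#(\d+);/g, (match, digits) => {
    const codePoint = Number(digits);
    // String.fromCodePoint throws a RangeError above 0x10FFFF, so skip those.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

const decoded = decodeDecimalReferences("&#21543;&#20986;");
console.log(fromCodes === "\u5427\u51fa"); // true
console.log(decoded === fromCodes);        // true
```

In a browser, the result could then be assigned with document.title = decoded.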

The best way to work with Unicode characters in JavaScript is to use the characters themselves, using an editor or other tool that can store them in UTF-8 encoding. You will avoid a lot of confusion. Naturally, you need to properly declare the character encoding of your .js or .html file.

The construct &#21543; has no special meaning in JavaScript; it is just eight ASCII characters. But if your JavaScript code has been embedded into an HTML document, then it will be processed by HTML rules before being passed to the JavaScript interpreter, and those rules vary by HTML version. That is yet another reason to avoid such constructs.

So just write

document.title = "吧出";

(Of course, there are very few situations where you should change the title element content in JavaScript instead of setting it in HTML, since the title is crucial to search engines and many other purposes. But that's beside the point.)
