JavaScript encoding issue with accented characters
I have a page with a UTF-8 header:
<meta charset="utf-8" />
In the page I use the Umbraco dictionary to fetch content in various languages. When I print this in German on the page, it appears fine:
<h1>@library.GetDictionaryItem("A")</h1>
resolves to:
<h1>Ä</h1>
in German.
However, if I insert it via a script:
<script type="text/javascript" charset="utf-8">
var a = "@library.GetDictionaryItem("A")";
alert(a);
</script>
The alert prints:
&#228;
If I do:
<script type="text/javascript" charset="utf-8">
var a = "Ä";
alert(a);
</script>
The alert prints:
Ä
So what could explain this behaviour, and how can I fix the alert? As far as I can see, everything is UTF-8, and the dictionary and the page encoding are fine. The problem happens within JavaScript.
Judging from the table here, JavaScript resolves the character into its numeric value. I tried escape, encodeURI, decodeURI, etc., with no luck.
chr   HexCode   Numeric   HTML entity   escape(chr)   encodeURI(chr)
ä     \xE4      &#228;    &auml;        %E4           %C3%A4
(FWIW: Character entity &#228; is ä, not Ä.)
This has nothing to do with character encoding. You're outputting an HTML entity to a JavaScript string, and then asking the browser to display that JavaScript string (via alert) without doing anything to interpret HTML. It's exactly as though you actually typed:
<h1>&#228;</h1>
...(which will show ä on the page), and
<script>
var a = "&#228;";
alert(a);
</script>
...which won't. The HTML entity isn't being used anywhere that understands HTML entities. alert doesn't interpret HTML.
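To make that concrete, here is a small sketch (assuming the dictionary emits the decimal entity &#228;, as in the question) showing that the JavaScript string simply holds six ordinary characters, and that the URL-decoding helpers tried in the question leave it untouched because they only understand %XX escapes:

```javascript
// Assumption: this is the text the server writes between the quotes.
var a = "&#228;";

// The string is six plain characters: & # 2 2 8 ;
console.log(a.length);    // 6
console.log(a.charAt(0)); // &

// URL decoders look for %XX sequences, not HTML entities,
// so they return the input unchanged.
console.log(decodeURIComponent(a) === a); // true
console.log(unescape(a) === a);           // true
```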
But if you did this:
<script>
var a = "&#228;";
var div = document.createElement('div');
div.innerHTML = a;
document.body.appendChild(div);
</script>
...you'd see the character on the page, because we're giving the entity to something (innerHTML) that will interpret HTML. And so if you make that first line:
var a = "@library.GetDictionaryItem("A")";
...and then use the variable a in an HTML context (as above), you'll get the ä in the document.
If you always get a decimal numeric character entity (like &#228;) from Umbraco, you can parse the entity easily enough, since those entities define Unicode code points and JavaScript (mostly) uses Unicode code points in its strings*:
function characterFromDecimalNumericEntity(str) {
    // Matches a string that is exactly one decimal numeric entity, e.g. "&#228;"
    var decNumEntRex = /^\&#(\d+);$/;
    var match = decNumEntRex.exec(str);
    var codepoint = match ? parseInt(match[1], 10) : null;
    var character = codepoint ? String.fromCharCode(codepoint) : null;
    return character;
}
alert(characterFromDecimalNumericEntity("&#228;")); // ä
* Why "mostly": JavaScript strings are made up of 16-bit "characters" that correspond to UTF-16 code units, not Unicode code points (you can't store a Unicode code point in 16 bits; you need 21). All characters from the Basic Multilingual Plane fit within one UTF-16 code unit, but characters from the Supplementary Multilingual Plane, Supplementary Ideographic Plane, and so on require two UTF-16 code units each. One of those characters will occupy two "characters" in a JavaScript string. The function above would fail for them. More in the JavaScript spec and the Unicode FAQ.
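If entities outside the Basic Multilingual Plane might turn up, one possible variant is sketched below. It is not part of the original answer: the name decodeNumericEntities is hypothetical, it assumes an ES2015 environment (for String.fromCodePoint), and it goes beyond the original function by also accepting hexadecimal entities and by replacing every numeric entity found in the string. Named entities such as &auml; are left untouched.

```javascript
// Hypothetical helper: decodes decimal (&#228;) and hexadecimal (&#xE4;)
// numeric character entities anywhere in a string.
function decodeNumericEntities(str) {
    var numEntRex = /&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g;
    return str.replace(numEntRex, function (match, hex, dec) {
        var codepoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
        // Leave anything beyond the Unicode range (U+10FFFF) as it was.
        if (codepoint > 0x10FFFF) {
            return match;
        }
        // fromCodePoint emits a surrogate pair when one is needed.
        return String.fromCodePoint(codepoint);
    });
}

console.log(decodeNumericEntities("&#228;"));           // ä
console.log(decodeNumericEntities("&#128512;").length); // 2
```

The second call returns a single supplementary-plane character, which is why its length in UTF-16 code units is 2.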