Is UTF-8 an Encoding or a Document Character Set?
According to the W3C Recommendation, every application requires its own document character set (not to be confused with a character encoding).
A document character set consists of:
A repertoire: a set of abstract characters, such as the Latin letter "A", the Cyrillic letter "I", the Chinese character meaning "water", etc.
Code positions: a set of integer references to characters in the repertoire.
Each document is a sequence of characters from the repertoire.
A character encoding is how those characters may be represented.
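To make the distinction concrete, here is a minimal Python sketch (Python is only a convenient choice here, it is not mentioned in the question). It takes the three example characters above, lists their Unicode code positions, and then encodes the same sequence in two different ways. The variable names are purely illustrative.

```python
# Three abstract characters from the Unicode repertoire:
# Latin "A", Cyrillic "И" (the letter I), Chinese "水" (water).
text = "A\u0418\u6c34"

# Code positions: the integer assigned to each character.
code_positions = [ord(ch) for ch in text]

# Character encodings: two different byte representations
# of the same sequence of code positions.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

print(code_positions)        # [65, 1048, 27700]
print(utf8_bytes.hex(" "))   # 41 d0 98 e6 b0 b4
print(utf16_bytes.hex(" "))  # 41 00 18 04 34 6c
```

The code positions stay the same in both cases; only the bytes differ, which is the part an encoding is responsible for.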
When I save a file in Windows Notepad, I'm guessing that these are the "document character sets":
Three simple questions:
I want to know whether those are the "document character sets". And if they are,
why is UTF-8 on the list? Isn't UTF-8 supposed to be an encoding?
If I'm not wrong about all this:
Are there other document character sets that Windows does not allow you to define?
How can I define another document character set?
In my understanding:
The purpose of that dropdown in the Save dialog is really to select both a character set and an encoding for it, but they've been a little careless with the naming of the options.
(Technically, though, an encoding just maps integers to byte sequences, so any encoding could be used with any character set that is small enough to "fit" the encoding. However, the UTF-* encodings are designed with Unicode in mind.)
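As a rough illustration of what "fitting" means, the Python sketch below checks whether a piece of text can be encoded with a given codec. The helper `fits` is a hypothetical name made up for this example. Note that Python's `latin-1` codec bundles a small character set (code positions 0 to 255) with a one-byte-per-character encoding, so it is only an approximation of the separation described above.

```python
def fits(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

cafe = "caf\u00e9"   # "café": every code position is below 256
water = "\u6c34"     # "水": code position 27700

print(fits(cafe, "latin-1"))   # True
print(fits(water, "latin-1"))  # False, 27700 does not fit in one byte
print(fits(water, "utf-8"))    # True, UTF-8 covers all of Unicode

# The same code position (233 for "é") maps to different byte sequences.
print("\u00e9".encode("latin-1"))  # b'\xe9'
print("\u00e9".encode("utf-8"))    # b'\xc3\xa9'
```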
Also, see Joel on Software's mandatory article on the subject.
UTF-8 is a character encoding that is also used to specify a character set for HTML and other textual documents. It is one of several Unicode encodings (UTF-16 is another).
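A small Python sketch of why the declared label matters, under the assumption that a reader of the file simply decodes the bytes with whatever label it is given. The function `decode_with_label` is a hypothetical stand-in for that step, not a real browser or Notepad API.

```python
def decode_with_label(data: bytes, label: str) -> str:
    """Decode bytes using the declared encoding label (hypothetical helper)."""
    return data.decode(label)

text = "\u6c34"  # "水"

# Two Unicode encodings of the same character.
as_utf8 = text.encode("utf-8")       # 3 bytes: e6 b0 b4
as_utf16 = text.encode("utf-16-be")  # 2 bytes: 6c 34

# With the matching label, both give back the original character.
print(decode_with_label(as_utf8, "utf-8") == text)       # True
print(decode_with_label(as_utf16, "utf-16-be") == text)  # True

# With a mismatched label, the same bytes turn into different characters.
garbled = decode_with_label(as_utf8, "latin-1")
print(len(garbled))  # 3
```

In the mismatched case nothing fails; the three bytes are just read as three separate Latin-1 characters, which is the usual source of garbled text.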
To answer your questions: