
<meta charset="utf-8"> vs <meta http-equiv="Content-Type">

To define the charset in an HTML5 document, which notation should I use?

  1. Short:

     <meta charset="utf-8" />
  2. Long:

     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

In HTML5, they are equivalent. Use the shorter one; it is easier to remember and type. Browser support is fine, since it was designed for backwards compatibility.

Both forms of the meta charset declaration are equivalent and should work the same across browsers. But there are a few things you need to remember when declaring your web files' character set as UTF-8:

  1. Save your file(s) in UTF-8 encoding without the byte-order mark (BOM).
  2. Declare the encoding in your HTML files using meta charset (as above).
  3. Your web server must serve your files with the UTF-8 encoding declared in the Content-Type HTTP header.

Apache servers are often configured to serve files as ISO-8859-1 by default, so you may need to add the following line to your .htaccess file:

AddDefaultCharset UTF-8

This configures Apache to declare UTF-8 in the Content-Type response header, but your files must actually be saved in UTF-8 (without BOM) to begin with.

Older versions of Windows Notepad cannot save your files in UTF-8 without the BOM. A free editor that can is Notepad++. On the program menu bar, select "Encoding > Encode in UTF-8 without BOM". You can also open existing files and re-save them in UTF-8 using "Encoding > Convert to UTF-8 without BOM".

More on the byte order mark (BOM) at Wikipedia.
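
If you would rather check files from a script than from an editor, here is a minimal Python sketch. The helper names `has_utf8_bom` and `strip_utf8_bom` are made up for illustration; the only thing assumed from the discussion above is that the UTF-8 BOM is the three bytes EF BB BF at the very start of the file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM; other input is unchanged."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

To clean a file on disk, read it in binary mode, pass the bytes through `strip_utf8_bom`, and write the result back in binary mode so that no other re-encoding happens along the way.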

Another reason to go with the short one is that it matches other places where a character set has been specified in markup. For example:

<script type="text/javascript" charset="UTF-8" src="/script.js"></script>

<p><a charset="UTF-8" href="http://example.com/">Example Site</a></p>

Consistency helps to reduce errors and makes code more readable. (The charset attribute on script and a is obsolete in current HTML, so treat these as illustrations of the pattern rather than markup to copy.)

Note that the charset attribute is case-insensitive. You can use UTF-8 or utf-8; however, UTF-8 is clearer, more readable, and more accurate.

Also, there is no reason to use any value other than UTF-8 in the meta charset attribute or page header. Unicode has been the document character set for Web documents since HTML4 in 1999, and UTF-8 is the only practical encoding for modern Web pages.

Also, you should not use HTML entities in UTF-8 documents. Characters like the copyright symbol should be typed directly. The only entities you should use are for the 5 reserved markup characters: less than, greater than, ampersand, apostrophe (single quote), and double quote. Entities need an HTML parser, which you may not always want to use going forward; they introduce errors, make your code less readable, increase your file sizes, and sometimes decode incorrectly in various browsers depending on which entities you used.

Learn how to type or insert copyright, trademark, open quote, close quote, apostrophe, em dash, en dash, bullet, Euro, and any other characters you encounter in your content, and use those actual characters in your code. The Mac has a Character Viewer that you can turn on in the Keyboard System Preference; you can find and then drag and drop the characters you need, or use the matching Keyboard Viewer to see which keys to type. For example, trademark is Option+2.

Unicode covers the characters and symbols of essentially every written human language, and UTF-8 can encode all of them. So there is no excuse for using -- instead of an em dash. It is also not a bad idea to learn the rules of punctuation and typography: for example, knowing that (in American usage) a period goes inside a close quote, not outside.

Using a tag for something like content-type and encoding is highly ironic, since without knowing those things, you couldn't parse the file to get the value of the meta tag.

No, that is not true. The browser starts out parsing the file in its default encoding, either UTF-8 or ISO-8859-1. Since US-ASCII is a subset of both ISO-8859-1 and UTF-8, the browser can read the markup just fine either way; the bytes are the same. When the browser encounters the meta charset tag, if the encoding is different from the one it is already using, it reloads the page in the specified encoding. That is why we put the meta charset tag at the top, right after the head tag, before anything else, even the title. That way you can use non-ASCII characters in your title.
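
The idea that the declaration can be found before the encoding is known can be illustrated with a rough Python sketch. This is a simplification, not the algorithm browsers actually run: it searches the first bytes of the file with an ASCII-only regular expression. The function names, the 1024-byte window, and the windows-1252 fallback are assumptions made for the example.

```python
import re
from typing import Optional

# Finds charset=... inside a <meta ...> tag, covering both the short form
# and the long http-equiv form. It operates on raw bytes, so it needs no
# knowledge of the document's encoding beyond ASCII compatibility.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_meta_charset(raw: bytes, limit: int = 1024) -> Optional[str]:
    """Return the first declared charset within the first `limit` bytes, if any."""
    match = _META_CHARSET.search(raw[:limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


def decode_html(raw: bytes, fallback: str = "windows-1252") -> str:
    """Decode the document using the sniffed charset, else the fallback."""
    encoding = sniff_meta_charset(raw) or fallback
    try:
        return raw.decode(encoding, errors="replace")
    except LookupError:
        # The declared name is not an encoding Python knows about.
        return raw.decode(fallback, errors="replace")
```

Because the tag itself consists only of ASCII bytes, the search works no matter which ASCII-compatible encoding the rest of the file uses, which is the point made above.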

You must save your file(s) in UTF-8 encoding without BOM

That is not strictly true. If you only have US-ASCII characters in your document, you can save it as US-ASCII and serve it as UTF-8, because US-ASCII is a subset of UTF-8. But if there are non-ASCII characters, you are correct: you must save as UTF-8 without BOM.
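
The subset relationship is easy to check for yourself. The following Python sketch uses a hypothetical helper, `encodes_identically`, to compare the bytes a string produces in each encoding; the sample strings are made up for illustration.

```python
def encodes_identically(text: str) -> bool:
    """True if text yields the same bytes in US-ASCII, ISO-8859-1 and UTF-8."""
    try:
        as_ascii = text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character is outside US-ASCII.
        return False
    return as_ascii == text.encode("iso-8859-1") == text.encode("utf-8")


ascii_only = "<title>Hello</title>"
with_symbol = "\u00a9 2024"  # starts with the copyright sign

# The copyright sign is one byte in ISO-8859-1 but two bytes in UTF-8.
copyright_latin1 = "\u00a9".encode("iso-8859-1")
copyright_utf8 = "\u00a9".encode("utf-8")
```

Here `encodes_identically(ascii_only)` is true and `encodes_identically(with_symbol)` is false. Reading the two UTF-8 bytes of the copyright sign as ISO-8859-1 gives the familiar two-character garbage, which is why the saved encoding and the declared encoding must agree once non-ASCII characters appear.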

If you want a good text editor that will save your files in UTF-8, I recommend Notepad++.

On the Mac, use Bare Bones TextWrangler (free) from the Mac App Store, or Bare Bones BBEdit, which is on the Mac App Store for $39.99: very cheap for such a great tool. In either app, there is a menu at the bottom of the document window where you specify the document encoding, and you can easily choose "UTF-8 no BOM". And of course you can set that as the default for new documents in Preferences.

But if your Webserver serves the encoding in the HTTP header, which is recommended, both [meta tags] are needless.

That is incorrect. You should of course set the encoding in the HTTP header, but you should also set it in the meta charset attribute, so that the page can be saved by the user out of the browser onto local storage and opened again later; in that case the only indication of the encoding that will be present is the meta charset attribute. You should also set a base tag for the same reason: on the server, the base tag is unnecessary, but when the page is opened from local storage, the base tag enables it to work as if it were on the server, with all the assets in place and no broken links.

AddDefaultCharset UTF-8

Or you can just change the encoding of particular file types like so:

AddType text/html;charset=utf-8 html

A tip for serving both UTF-8 and Latin-1 (ISO-8859-1) files is to give the UTF-8 files a "text" extension and the Latin-1 files a "txt" extension.

AddType text/plain;charset=iso-8859-1 txt
AddType text/plain;charset=utf-8 text

Finally, consider saving your documents with Unix line endings, not legacy DOS or (classic) Mac line endings, which don't help and may hurt, especially as we get further and further from those legacy systems. An HTML document with valid HTML5, UTF-8 encoding, and Unix line endings is a job well done. You can share, edit, store, read, recover, and rely on that document in many contexts. It's lingua franca. It's digital paper.

<meta charset="utf-8"> was introduced with/for HTML5.

As mentioned in the documentation, both are valid. However, <meta charset="utf-8"> is only for HTML5 (and easier to type/remember).

The old style is bound to become deprecated in the near future. I'd stick to the new <meta charset="utf-8">.

There's only one way to go, and that's up. In tech's case, that means phasing out the old (really, REALLY fast).

Documentation: HTML meta charset Attribute (W3Schools)

While not contesting the other answers, I think the following is worth mentioning.

  1. The “long” (http-equiv) notation and the “short” one are equivalent; whichever comes first wins;
  2. Web server headers will override all the <meta> tags;
  3. A BOM (byte order mark) will override everything, and in many cases it will affect HTML 4 (and probably other stuff, too);
  4. If you don't declare any encoding, you will probably get your text in the “fallback text encoding” defined by your browser. In neither Firefox nor Chrome is that UTF-8;
  5. In the absence of other clues, the browser will attempt to read your document as if it were ASCII to find the encoding, so you can't use any weird encodings (UTF-16 with a BOM should work, though);
  6. While the specs say that the encoding declaration must be within the first 1024 bytes of the document, most browsers will try reading more than that.
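
The precedence described in the list above can be sketched in Python. This is an illustration of the ordering only (BOM first, then the HTTP header, then the first meta declaration, then a fallback), not a reimplementation of any browser. The function name `pick_encoding`, the `prescan_bytes` parameter with its default of 1024, and the windows-1252 fallback are all assumptions made for the example.

```python
import codecs
import re
from typing import Optional

_HEADER_CHARSET = re.compile(
    r"""charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def pick_encoding(
    body: bytes,
    content_type: Optional[str] = None,
    fallback: str = "windows-1252",
    prescan_bytes: int = 1024,
) -> str:
    """Choose an encoding using the precedence BOM > header > meta > fallback."""
    # 1. A byte order mark overrides everything else.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    if body.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"
    # 2. The Content-Type HTTP header overrides any <meta> tag.
    if content_type:
        header_match = _HEADER_CHARSET.search(content_type)
        if header_match:
            return header_match.group(1).lower()
    # 3. The first <meta> declaration found near the top of the document wins.
    meta_match = _META_CHARSET.search(body[:prescan_bytes])
    if meta_match:
        return meta_match.group(1).decode("ascii").lower()
    # 4. Nothing declared: use the (browser-specific) fallback encoding.
    return fallback
```

Calling `pick_encoding` on a body that starts with the UTF-8 BOM returns "utf-8" regardless of what the header or the meta tags say, which matches item 3 of the list.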

You can test by running

echo 'HTTP/1.1 200 OK\r\nContent-type: text/html; charset=windows-1251\r\n\r\n\xef\xbb\xbf<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"><meta charset="windows-1251"><title>привет</title></head><body>привет</body></html>' | nc -lp 4500

and pointing your browser at localhost:4500. (Of course you will want to change or remove parts. The BOM part is \xef\xbb\xbf. Be wary of the encoding of your shell, and note that bash's built-in echo needs -e to interpret the backslash escapes.)
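
If netcat is not available, or the shell's handling of escapes gets in the way, the same experiment can be approximated in Python. This is a sketch under stated assumptions: the names `build_response` and `serve_once` are made up, the body is encoded as UTF-8, and Content-Length and Connection headers are added that the one-liner above does not send.

```python
import socket
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def build_response(header_charset: Optional[str] = "windows-1251",
                   with_bom: bool = True) -> bytes:
    """Build a raw HTTP response with conflicting encoding declarations."""
    html = (
        "<!DOCTYPE html><html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        '<meta charset="windows-1251">'
        "<title>привет</title></head><body>привет</body></html>"
    ).encode("utf-8")  # assumption: the page bytes are UTF-8
    body = (UTF8_BOM if with_bom else b"") + html
    content_type = "text/html"
    if header_charset is not None:
        content_type += "; charset=" + header_charset
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def serve_once(port: int = 4500) -> None:
    """Answer a single browser request on localhost, then exit."""
    with socket.create_server(("127.0.0.1", port)) as server:
        conn, _ = server.accept()
        with conn:
            conn.recv(65536)  # read and discard the request
            conn.sendall(build_response())
```

Call `serve_once()` and open localhost:4500 in a browser, then change the arguments passed to `build_response` (drop the BOM, drop the header charset) to see which declaration wins in each case.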

Please mind that it's very important that you explicitly declare the encoding. Letting browsers guess can lead to security issues.

Use <meta charset="utf-8" /> for web browsers when using HTML5.

Use <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> when using HTML4 or XHTML, or for outdated DOM parsers, like DOMDocument in PHP 5.3.

To embed a signature in an email, I would use the long version:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

The reason is that not many email readers support HTML5, so it's always better to use old HTML styles. In fact, it's also better to use tables than divs + CSS.

According to the Mozilla Foundation and SitePoint:

Do not use this value (http-equiv=content-type) as it is obsolete. Prefer the charset attribute on the <meta> element.

I would recommend doing it like this to keep things in line with HTML5.

<meta charset="UTF-8">

For example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
</body>
</html>
