
Does HTML5 permit encoding external scripts in UTF-16?

The HTML standard requires¹ the use of the UTF-8 encoding for HTML documents.

Does it permit the use of other encodings for externally loaded scripts?

<script src="/script1.js"></script>
<script type="module" src="/script2.mjs"></script>

These scripts would be encoded in UTF-16 rather than UTF-8, and the web server would serve them with the header Content-Type: text/javascript; charset=UTF-16. Does this setup comply with the HTML spec?
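To make the setup concrete, here is a minimal Python sketch of what such a response could look like on the wire. The helper name build_utf16_script_response is hypothetical, and the choice of little-endian byte order with a leading byte order mark is an assumption; the question does not say which UTF-16 variant the server would emit.

```python
import codecs


def build_utf16_script_response(source: str) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) for a script served as UTF-16 (hypothetical helper)."""
    # Assumption: little-endian UTF-16 with a byte order mark in front,
    # so that a decoder can detect the byte order from the body itself.
    body = codecs.BOM_UTF16_LE + source.encode("utf-16-le")
    headers = {
        "Content-Type": "text/javascript; charset=UTF-16",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_utf16_script_response('console.log("hi");')
```

Every ASCII character of the script takes two bytes in this encoding, so the body is roughly twice as large as its UTF-8 equivalent.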


  1. “The charset attribute [of a meta element] specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present, its value must be an ASCII case-insensitive match for the string "utf-8"” (§ 4.2.5). “Regardless of whether a character encoding declaration is present or not, the actual character encoding used to encode the document must be UTF-8” (§ 4.2.5.4).

“The HTML standard requires the use of the UTF-8 encoding for HTML documents”

No, it doesn't. It prefers UTF-8, but you can use any other charset you want, as long as you declare it explicitly in an appropriate <meta> element. See Declaring character encodings in HTML.

“Does it permit the use of other encodings for externally loaded scripts?”

The <script> element has a charset attribute, though this is deprecated in favor of the charset parameter of the Content-Type HTTP header sent when the script is retrieved. If charset is present on the <script>, it must match the charset of the Content-Type. If no charset is specified, the HTML document's charset is assumed.
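As a rough illustration of the precedence described above, here is a minimal Python sketch. The function names (charset_from_content_type, resolve_script_encoding) are hypothetical, and letting the Content-Type header win over a mismatched charset attribute is an assumption of this sketch, not something stated above.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, lowercased."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # None when no charset parameter is present


def resolve_script_encoding(
    content_type: Optional[str],
    script_charset_attr: Optional[str],
    document_encoding: str,
) -> str:
    """Pick an encoding: HTTP header first, then the attribute, then the document."""
    header_charset = charset_from_content_type(content_type)
    if header_charset:
        return header_charset
    if script_charset_attr:
        return script_charset_attr.lower()
    return document_encoding.lower()


chosen = resolve_script_encoding("text/javascript; charset=UTF-16", None, "utf-8")
```

With the header from the question, chosen comes out as "utf-16"; with no charset anywhere, the sketch falls back to the document's own encoding.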

It looks like HTML5 permits other encodings for regular (classic) scripts but mandates UTF-8 for JavaScript modules. The relevant steps of the spec's two fetch algorithms:

To fetch a classic script given a url, a settings object, some options, a CORS setting, and a character encoding, run these steps. The algorithm will asynchronously complete with either null (on failure) or a new classic script (on success).

[...]

  1. If response's Content Type metadata, if any, specifies a character encoding, and the user agent supports that encoding, then set character encoding to that encoding (ignoring the passed-in value).

  2. Let source text be the result of decoding response's body to Unicode, using character encoding as the fallback encoding.

[...]
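The two quoted steps can be sketched in Python roughly as follows. This is an illustration under assumptions, not the browser's implementation: the names decode_classic_script and _python_codec are hypothetical, Python codec names stand in for the Encoding Standard's labels, and the byte order mark check reflects my reading of the Encoding Standard's decode algorithm, in which a BOM in the body takes precedence over the fallback encoding.

```python
import codecs
from typing import Optional

# Byte order marks checked before the fallback encoding is used.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def _python_codec(label: str) -> str:
    """Map an encoding label to a Python codec name; raises LookupError if unknown."""
    name = codecs.lookup(label).name
    # Assumption: treat the bare label "utf-16" as little-endian, as the
    # Encoding Standard does, instead of Python's platform-dependent default.
    return "utf-16-le" if name == "utf-16" else name


def decode_classic_script(
    body: bytes, content_type_charset: Optional[str], passed_in_encoding: str
) -> str:
    encoding = _python_codec(passed_in_encoding)
    # Step 1: a supported charset from Content-Type overrides the passed-in value.
    if content_type_charset is not None:
        try:
            encoding = _python_codec(content_type_charset)
        except LookupError:
            pass  # unsupported encoding: keep the passed-in value
    # Step 2: decode, letting a byte order mark override the fallback encoding.
    for bom, bom_codec in _BOMS:
        if body.startswith(bom):
            return body[len(bom):].decode(bom_codec, errors="replace")
    return body.decode(encoding, errors="replace")


source = 'console.log("ü");'
utf16_body = codecs.BOM_UTF16_LE + source.encode("utf-16-le")
decoded = decode_classic_script(utf16_body, "UTF-16", "utf-8")
```

Under these assumptions, a classic script served as UTF-16 decodes back to its original source text, which matches the answer's reading that classic scripts may use other encodings.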

To fetch a single module script, given a url, a fetch client settings object, a destination, some options, a module map settings object, a referrer, and a top-level module fetch flag, run these steps. The algorithm will asynchronously complete with either null (on failure) or a module script (on success).

[...]

  1. Let source text be the result of UTF-8 decoding response's body.

[...]
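For contrast, here is a minimal Python sketch of what unconditional UTF-8 decoding does to a module served as UTF-16. The name decode_module_script is hypothetical, and Python's errors="replace" handling is used as an approximation of the spec's UTF-8 decode, which substitutes U+FFFD for malformed bytes.

```python
import codecs


def decode_module_script(body: bytes) -> str:
    """Decode as UTF-8 regardless of any declared charset (hypothetical helper)."""
    # Strip a UTF-8 byte order mark if present; no other BOM is honoured.
    if body.startswith(codecs.BOM_UTF8):
        body = body[len(codecs.BOM_UTF8):]
    # Malformed byte sequences are replaced with U+FFFD instead of raising.
    return body.decode("utf-8", errors="replace")


source = "export const x = 1;"
utf16_body = codecs.BOM_UTF16_LE + source.encode("utf-16-le")
decoded = decode_module_script(utf16_body)

# The UTF-16 BOM turns into replacement characters and every ASCII
# character is followed by a NUL, so the text no longer matches the source.
garbled = decoded != source
```

The decoded text starts with two U+FFFD characters and has NUL characters interleaved throughout, so a module served as UTF-16 would most likely fail to parse even with the charset=UTF-16 header in place.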
