The HTML standard requires¹ the use of the UTF-8 encoding for HTML documents.
Does it permit the use of other encodings for externally loaded scripts?
<script src="/script1.js"></script>
<script type="module" src="/script2.mjs"></script>
These scripts would be encoded in UTF-16 rather than UTF-8, and the web server would serve them with the header Content-Type: text/javascript; charset=UTF-16. Does this setup comply with the HTML spec?
¹ “The charset attribute [of a meta element] specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present, its value must be an ASCII case-insensitive match for the string "utf-8"” (§ 4.2.5). “Regardless of whether a character encoding declaration is present or not, the actual character encoding used to encode the document must be UTF-8” (§ 4.2.5.4).
The HTML standard requires the use of the UTF-8 encoding for HTML documents
No, it doesn't. It prefers UTF-8, but you can use any other charset you want, as long as you declare it explicitly in an appropriate <meta> element. See Declaring character encodings in HTML.
Does it permit the use of other encodings for externally loaded scripts?
The <script> element has a charset attribute, though this is deprecated in favor of the charset parameter of the Content-Type HTTP header sent when the script is retrieved. If charset is present on the <script>, it must match the charset of the Content-Type. If no charset is specified, the HTML document's charset is assumed.
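As an illustration of the Content-Type route, here is a minimal sketch of how a server could build such a response in Node.js. The function name `buildUtf16ScriptResponse` is hypothetical. The sketch assumes little-endian UTF-16 with a leading byte order mark, since the WHATWG Encoding Standard maps the label "UTF-16" to UTF-16LE.

```javascript
// Hypothetical helper: encode a script's source as UTF-16LE with a BOM
// and pair it with the Content-Type header from the question.
function buildUtf16ScriptResponse(sourceText) {
  const bom = Buffer.from([0xff, 0xfe]); // UTF-16LE byte order mark
  const encoded = Buffer.from(sourceText, "utf16le");
  const body = Buffer.concat([bom, encoded]);
  return {
    headers: {
      "Content-Type": "text/javascript; charset=UTF-16",
      "Content-Length": String(body.length),
    },
    body,
  };
}

const response = buildUtf16ScriptResponse('console.log("héllo");');
```

Inside a Node `http` request handler, the result could be sent with `res.writeHead(200, response.headers)` followed by `res.end(response.body)`.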
It looks like HTML5 permits different encodings for regular scripts and mandates UTF-8 for JavaScript modules.
To fetch a classic script given a url, a settings object, some options, a CORS setting, and a character encoding, run these steps. The algorithm will asynchronously complete with either null (on failure) or a new classic script (on success).
[...]
If response's Content Type metadata, if any, specifies a character encoding, and the user agent supports that encoding, then set character encoding to that encoding (ignoring the passed-in value).
Let source text be the result of decoding response's body to Unicode, using character encoding as the fallback encoding.
[...]
To fetch a single module script, given a url, a fetch client settings object, a destination, some options, a module map settings object, a referrer, and a top-level module fetch flag, run these steps. The algorithm will asynchronously complete with either null (on failure) or a module script (on success).
[...]
Let source text be the result of UTF-8 decoding response's body.
[...]
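The difference between the two quoted algorithms can be sketched with `TextDecoder`. This is a simplified illustration, not the spec's exact procedure: the helper names are hypothetical, and the sketch skips the BOM sniffing that the spec's "decode" step performs before applying the fallback encoding.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
function charsetFromContentType(contentType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1] : null;
}

// Classic script: a supported charset from Content-Type overrides the
// passed-in fallback encoding.
function decodeClassicScript(bytes, contentType, fallbackEncoding) {
  let encoding = fallbackEncoding;
  const declared = charsetFromContentType(contentType);
  if (declared) {
    try {
      new TextDecoder(declared); // throws RangeError for unsupported labels
      encoding = declared;
    } catch {
      // Unsupported label: keep the fallback encoding.
    }
  }
  return new TextDecoder(encoding).decode(bytes);
}

// Module script: always decoded as UTF-8; Content-Type charset is ignored.
function decodeModuleScript(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

const source = 'var s = "é";';
const utf16Bytes = Buffer.from(source, "utf16le");
const asClassic = decodeClassicScript(
  utf16Bytes,
  "text/javascript; charset=utf-16le",
  "utf-8"
);
const asModule = decodeModuleScript(utf16Bytes);
```

Under these assumptions, `asClassic` recovers the original source, while `asModule` comes out garbled, which is why a UTF-16 module script would not work as intended.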