
Why put an XHTML doctype declaration on HTML files? What does that do?

I wonder about the number of web pages I encounter that are HTML files, but that wear an XHTML DOCTYPE declaration.
Why are people doing this? What do they hope to achieve? Why not reserve the XHTML doctype declaration for actual XHTML files?

Or am I missing something?

Edit: there is some confusion about what "actual XHTML files" are. To demonstrate that the difference is not caused by the DOCTYPE declaration, compare this file to this one. The first is HTML, the second is XHTML, although the contents are identical; only the file types differ. Both display fine in compliant browsers, but the first one is parsed with the HTML parser and the second one with the XML parser.
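To make the parser difference concrete, here is a minimal sketch using Python's standard library as a stand-in for a browser's two parsers. This is an assumption for illustration only: `html.parser` is not a browser's HTML parser, and the sample markup, `TagCollector`, `parse_as_html` and `parse_as_xml` are hypothetical names. It only shows that the same bytes are tolerated by a forgiving HTML parser and rejected by an XML parser when they are not well-formed.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: <br> and <p> are never closed, which HTML
# tolerates but XML well-formedness rules forbid.
SLOPPY = "<html><body><p>Hello<br></body></html>"
WELL_FORMED = "<html><body><p>Hello<br/></p></body></html>"


class TagCollector(HTMLParser):
    """Records every start tag the forgiving HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_html(markup):
    """Feed the markup to the HTML parser and return the start tags seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def parse_as_xml(markup):
    """Report whether the XML parser accepts or rejects the markup."""
    try:
        ET.fromstring(markup)
        return "well-formed"
    except ET.ParseError:
        return "rejected"


print(parse_as_html(SLOPPY))      # the HTML parser carries on regardless
print(parse_as_xml(SLOPPY))       # the XML parser stops at the first error
print(parse_as_xml(WELL_FORMED))  # the same content, well-formed, is accepted
```

Note that neither function looks at a doctype: which parser runs is decided before the markup is read.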

Why put an XHTML doctype declaration on HTML files? What does that do?

All that does is tell markup validators that they're about to validate an XHTML document, as opposed to a regular, SGML-rooted, HTML document. It describes the content, or more specifically the markup that follows, but nothing else.

Why are people doing this? What do they hope to achieve? Why not reserve the XHTML doctype declaration for actual XHTML files?

Or am I missing something?

Kind of. What actually happened was that people weren't aware that just putting an XHTML doctype declaration on top of an HTML document didn't automatically transform it into an XHTML document, although admittedly that was what everybody was hoping for.

You see, most web applications out there aren't configured to serve XHTML documents as application/xhtml+xml, instead opting to serve pages as just text/html. (It's typically because of the .html file extension more than anything else, really; generally speaking, servers do correctly apply application/xhtml+xml to documents with .xhtml or .xht as the extension, but only static sites that actually use those extensions will benefit from this.) That leads browsers to decide that they received a regular HTML document, and so that tag soup parsing nonsense we've all come to know and love inevitably ensues.
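The extension-driven behaviour described above can be sketched roughly as follows. The `MIME_TYPES` table and `content_type_for` are hypothetical, loosely modelled on how a static file server is typically configured; real servers ship their own tables and defaults.

```python
import os

# Hypothetical extension-to-MIME-type table; an assumption for
# illustration, not the configuration of any particular server.
MIME_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xht": "application/xhtml+xml",
}


def content_type_for(path, default="application/octet-stream"):
    """Pick the Content-Type header value from the file extension alone.

    The file's contents (doctype included) are never inspected.
    """
    _, ext = os.path.splitext(path)
    return MIME_TYPES.get(ext.lower(), default)


print(content_type_for("index.html"))
print(content_type_for("index.xhtml"))
```

Under this kind of setup, an XHTML-doctyped page saved as index.html goes out as text/html no matter what it contains.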

Note that it doesn't matter even if you have a meta tag like this on your XHTML document:

<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />

Browsers will ignore that, and only look at the actual HTTP Content-Type header that was sent along with the XHTML document.
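As a rough sketch of that decision (not any browser's actual code; `media_type`, `parser_for` and the sample page are hypothetical), the choice of parser can be modelled as a function of the HTTP header only. The markup is passed in but deliberately never inspected, which is the whole point:

```python
# Hypothetical page: XHTML doctype plus the meta tag quoted above.
XHTML_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
<title>Test</title>
</head>
<body><p>Hello</p></body>
</html>"""


def media_type(content_type_header):
    """Drop parameters such as charset and normalize case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def parser_for(content_type_header, _markup):
    """Model which parser gets chosen, judged by the HTTP header only.

    _markup is intentionally unused: neither the doctype nor a
    meta http-equiv tag takes part in the decision.
    """
    mt = media_type(content_type_header)
    if mt in ("text/xml", "application/xml") or mt.endswith("+xml"):
        return "xml"
    if mt == "text/html":
        return "html"
    return "other"


print(parser_for("text/html; charset=utf-8", XHTML_PAGE))
print(parser_for("application/xhtml+xml; charset=utf-8", XHTML_PAGE))
```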

To make matters worse, Internet Explorer, the most-used browser during XHTML's heyday, never supported the application/xhtml+xml MIME type until version 9 was finally released: instead of parsing the markup, constructing the DOM and rendering the page, all it would do was offer the file as a download. That doesn't make a very usable XHTML page!

So, guess what we all had to live with until HTML5 became cool?

This, along with things like IE6 going quirky on pages with the XML declaration before the doctype declaration, is also one of the biggest factors leading to XHTML's downfall (along with XHTML 1.1 never gaining widespread usage, and XHTML 2.0 being canceled in favor of HTML5).

Most people use the XHTML doctype because they saw it in an old book or on a forum somewhere, not for any technical reason they are aware of. Hardly anyone uses it properly by serving it as application/xhtml+xml. Serving XHTML pages as text/html means they are parsed as "tag soup" or "broken HTML". It should not be done, but browsers generally handle it well.

You are correct in your wondering about this. It drives me crazy.

I assume that you're asking why people are serving XHTML documents as HTML, by using the text/html MIME type instead of application/xhtml+xml.

Mostly, it's because of a misguided understanding of compatibility: lots of browsers simply don't understand the application/xhtml+xml MIME type, which has led authors to serve their pages as HTML to work around this. Since browsers rarely complain about what they get, and people don't tend to research a lot, most people assume that browsers treat the XHTML-doctyped document as XHTML even though it was served as HTML. But they don't: they parse it as HTML. Since the two languages are so much alike, people rarely notice the difference.

So no, you're not missing anything; it's very bad practice. Luckily, now that HTML5 is here, it seems to be becoming less common.

The hilarious thing about XHTML is that because IE didn't understand the XHTML MIME type (application/xhtml+xml) at the peak of XHTML's popularity, most people never actually used the XML part of it, as IE8 and lower refuse to render content served that way.

This meant that millions of sites were thought to be standards-compliant XHTML when in fact they were being parsed as malformed/weird HTML4.

Luckily HTML5 came along and properly defined the parsing of documents, removing much of the ambiguity that surrounded XHTML (all that transitional and strict rubbish).

People who add the XML prolog before the doctype are doing themselves an extra disservice, as the prolog, like a comment before the doctype, will cause old IE to use quirks mode, which among other things brings back the old box model in IE6 and below. This undoubtedly has contributed to the mass hate of IE6, as in quirks mode it has significant bugs that cause modern layouts to be completely broken, rather than just lacking in newer features.
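A page can be checked for this with a trivial lint. The sketch below is an assumption for illustration (`doctype_comes_first` and both sample strings are hypothetical); it only reports whether anything other than whitespace precedes the doctype and does not model any browser's actual mode-switching rules.

```python
def doctype_comes_first(markup):
    """True when nothing but whitespace precedes the doctype declaration."""
    return markup.lstrip().lower().startswith("<!doctype")


# Hypothetical samples: the same doctype with and without an XML prolog.
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)
WITH_PROLOG = '<?xml version="1.0" encoding="utf-8"?>\n' + DOCTYPE
WITHOUT_PROLOG = DOCTYPE

print(doctype_comes_first(WITH_PROLOG))     # prolog precedes the doctype
print(doctype_comes_first(WITHOUT_PROLOG))  # doctype is the first thing
```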

The short answer is that in this industry many people just copy and paste code without understanding it.

