
How do I request an HTML document in JavaScript?

Original post: 2021-02-20 05:07:01 7 1 javascript/ html/ xml

I know I can send a request for an XML document with XMLHttpRequest, which will also parse the received XML document. Unfortunately, that's not going to work with HTML all the time. Unlike XML, HTML has some tags that don't require closing (such as the <img> tag). XML ALWAYS closes a tag, either as a self-closing tag like <mytag/> or as a pair of tags like <mytag></mytag>. XMLHttpRequest attempts to parse the received XML, but if it's HTML with unclosed tags, it's going to fail. Is there a better way of having JavaScript download and parse an HTML document that contains unclosed tags?

1 Answer

I know I can send a request for an XML document with XMLHttpRequest, which will also parse the received XML document.

Yes, but your understanding is incomplete. XMLHttpRequest can be used not only for XML documents, but for any HTTP request (with limitations only on cross-origin activity, not content).

Unfortunately, that's not going to work with HTML all the time. Unlike XML, HTML has some tags that don't require closing (such as the <img> tag). XML ALWAYS closes a tag, either as a self-closing tag like <mytag /> or as a pair of tags like <mytag></mytag>.

You are correct insofar as HTML is not XML (indeed, HTML was an application of SGML, as was XML, but since HTML5 it's its own thing), and HTML cannot be reliably parsed as XML. However, you are completely incorrect about XMLHttpRequest being unable to handle HTML responses: in a web-page JavaScript environment you can load the HTML responses into the main document DOM, or into a separate document fragment, and let the browser parse the HTML5 tag soup just like a normal web-page request.

XMLHttpRequest attempts to parse the received XML, but if it's HTML with unclosed tags, it's going to fail.

You are mistaken: XMLHttpRequest will only attempt to parse the response as XML if you tell it to. If you tell XMLHttpRequest to give you the response as text/string or as a JSON blob, you can do that too.
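As a rough illustration of choosing the response form, the sketch below wraps XMLHttpRequest in a Promise and sets responseType before sending. The function name requestAs and the injectable XHR parameter are hypothetical and exist only so the snippet can be exercised outside a browser; this is one possible shape, not the only one.

```javascript
// Minimal sketch: request a URL and resolve with the body in the requested form.
// `requestAs` and the `XHR` parameter are illustrative names, not a standard API.
function requestAs(url, responseType, XHR = globalThis.XMLHttpRequest) {
  return new Promise((resolve, reject) => {
    const xhr = new XHR();
    xhr.open('GET', url);
    // 'text' yields a string; 'json' yields the parsed JSON value.
    xhr.responseType = responseType;
    xhr.onload = () => {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve(xhr.response);
      } else {
        reject(new Error(`Request failed with status ${xhr.status}`));
      }
    };
    xhr.onerror = () => reject(new Error('Network error'));
    xhr.send();
  });
}
```

In a browser, requestAs('/some/page.html', 'text') would resolve with the raw markup as a string, with no XML parsing attempted (the path is a placeholder).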

Is there a better way of having JavaScript download and parse an HTML document that contains unclosed tags?

Yes: you set responseType = 'document'. That instructs XMLHttpRequest to expect and handle an HTML document response, which will be exposed through the (confusingly named!) responseXML property.
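A minimal sketch of that approach follows, assuming a browser environment. The name requestHtmlDocument and the injectable XHR parameter are hypothetical; only responseType, responseXML, and the usual XMLHttpRequest methods come from the platform.

```javascript
// Minimal sketch: load a URL and resolve with the parsed HTML Document.
// `requestHtmlDocument` and the `XHR` parameter are illustrative names.
function requestHtmlDocument(url, XHR = globalThis.XMLHttpRequest) {
  return new Promise((resolve, reject) => {
    const xhr = new XHR();
    xhr.open('GET', url);
    // Ask the browser to run its HTML parser on the response body.
    xhr.responseType = 'document';
    xhr.onload = () => {
      // Despite the name, responseXML holds the HTML Document here.
      const doc = xhr.responseXML;
      if (xhr.status >= 200 && xhr.status < 300 && doc) {
        resolve(doc);
      } else {
        reject(new Error(`Could not load document (status ${xhr.status})`));
      }
    };
    xhr.onerror = () => reject(new Error('Network error'));
    xhr.send();
  });
}
```

The resolved value is an ordinary Document, so unclosed tags such as <img> are handled by the browser's HTML parser and the result can be queried with, for example, doc.querySelectorAll('img').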

However, it's 2021 now: you should not be using XMLHttpRequest, you should be using fetch instead: https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API
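fetch has no direct equivalent of responseType = 'document', so one common pattern (sketched here, not prescribed by the answer) is to read the body as text and hand it to DOMParser with the 'text/html' type. The name fetchHtmlDocument and the deps parameter are hypothetical; deps exists only so the function can be tested with stand-ins outside a browser.

```javascript
// Minimal sketch: fetch a URL and parse the body with the browser's HTML parser.
// `fetchHtmlDocument` and `deps` are illustrative names, not a standard API.
async function fetchHtmlDocument(url, deps = {}) {
  const fetchImpl = deps.fetch ?? globalThis.fetch;
  const Parser = deps.DOMParser ?? globalThis.DOMParser;

  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  // 'text/html' selects the HTML parser, which tolerates unclosed tags.
  return new Parser().parseFromString(html, 'text/html');
}
```

In a browser this could be called as fetchHtmlDocument('/some/page.html') and the resulting Document queried like any other (the path is a placeholder).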