DOM parsing in JavaScript

Question

Some background:
I'm developing a web-based mobile application using JavaScript. HTML rendering is Safari-based. The cross-domain policy is disabled, so I can make calls to other domains using XMLHttpRequest. The idea is to parse external HTML and get the text content of a specific element.
In the past I parsed the text line by line to find the line I needed, then took the tag's content as a substring of that line. This is very troublesome and requires a lot of maintenance each time the target HTML changes.
So now I want to parse the HTML text into a DOM and run CSS or XPath queries on it.
This works well:

$('<div></div>').append(htmlBody).find('#theElementToFind').text()

The only problem is that when I use the browser to load HTML text into a DOM element, it tries to load all external resources (images, JS files, etc.). Although this isn't causing any serious problem, I would like to avoid it.

Now the question:
How can I parse HTML text into a DOM without the browser loading external resources or running scripts?
Some ideas I've been thinking about:

Create a new document object with a createDocument call ( document.implementation.createDocument() ), but I'm not sure it will skip loading external resources.
Use a third-party DOM parser in JS. The only one I've tried handled errors very badly.
Use an iframe to create a new document, so that external resources with relative paths don't throw errors in the console.

Answer 1

It seems that the following piece of code works well:

var doc = document.implementation.createHTMLDocument("");
doc.documentElement.innerHTML = htmlBody;
var text = $(doc).find('#theElementToFind').text();

External resources aren't loaded and scripts aren't evaluated.

Found it here: https://stackoverflow.com/a/9251106/95624

Origin: https://developer.mozilla.org/en/DOMParser#DOMParser_HTML_extension_for_other_browsers
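
For reference, the same idea can be sketched without jQuery by going through DOMParser directly. This is only a sketch, assuming a browser whose DOMParser accepts the "text/html" type (older engines did not, which may be why the snippet above goes through createHTMLDocument instead). The function name extractText and its optional parser argument are made up for illustration and are not part of any library.

```javascript
// Parse an HTML string into a detached document and return the text of the
// first element matching `selector`, or null when nothing matches.
// NOTE: `extractText` and the `parser` parameter are hypothetical names.
function extractText(htmlBody, selector, parser) {
  // Default to the browser's DOMParser; another object exposing
  // parseFromString(html, type) can be passed in instead.
  var domParser = parser || new DOMParser();

  // The document returned here is not attached to the page.
  var doc = domParser.parseFromString(htmlBody, "text/html");

  var el = doc.querySelector(selector);
  return el ? el.textContent : null;
}
```

In a browser it would be called as extractText(htmlBody, '#theElementToFind'). Unlike jQuery's .text(), which gives an empty string for an empty match, this sketch returns null when the element is missing, so the two cases can be told apart.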

Answer 2

You can construct a jQuery object from any HTML string without appending it to the DOM:

$(htmlBody).find('#theElementToFind').text();

2 answers

Answer 1: 4, accepted, 2012-08-15 11:49:54
Answer 2: 1, 2012-08-15 09:34:47