Firefox extensions and XUL: getting the page source code

Question

I am developing my first Firefox extension, and for that I need to get the complete source code of the current page. How can I do that with XUL?

Answer 1

You will need a XUL browser object to load the content into.

Load the "view-source:" version of your page into the browser object, in the same way as the "View Page Source" menu does. See the function viewSource() in chrome://global/content/viewSource.js. That function can load the page from the cache or bypass it.

Once the content is loaded, the original source is given by:

var source = browser.contentDocument.getElementById('viewsource').textContent;
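The steps above can be sketched roughly as follows. This is a minimal sketch, not the code from viewSource.js: the helper names toViewSourceUrl and loadPageSource are hypothetical, and it assumes `browser` is a XUL browser element exposing loadURI(), contentDocument, and a capturing "load" event, as in Firefox versions of that era.

```javascript
// Hypothetical helper: prefix a URL with the view-source: scheme once.
function toViewSourceUrl(url) {
  return url.indexOf("view-source:") === 0 ? url : "view-source:" + url;
}

// Hypothetical helper: load the view-source page into `browser` (assumed to be
// a XUL <browser> element) and hand the raw source text to `onSource`.
function loadPageSource(browser, url, onSource) {
  function onLoad() {
    // Only the first load matters, so detach the listener right away.
    browser.removeEventListener("load", onLoad, true);
    var body = browser.contentDocument.getElementById("viewsource");
    // textContent drops the syntax-highlighting markup around the source.
    onSource(body ? body.textContent : null);
  }
  // Load events from content reach the <browser> element in the capture phase.
  browser.addEventListener("load", onLoad, true);
  browser.loadURI(toViewSourceUrl(url));
}
```

The callback receives null if no element with the id viewsource is found, which may happen if the view-source page is structured differently in your Firefox version.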

Serialize a DOM Document
This method will not get the original source, but may be useful to some readers.

You can serialize the document object to a string. See "Serializing DOM trees to strings" in the MDC. You may need to use the alternate method of instantiation in your extension.

That article talks about XML documents, but it also works on any HTML DOMDocument.

var serializer = new XMLSerializer();
var source = serializer.serializeToString(document);

This even works in a web page or the Firebug console.
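For the "alternate method of instantiation" mentioned above, a hedged sketch follows. It assumes the legacy XPCOM contract ID @mozilla.org/xmlextras/xmlserializer;1 and the nsIDOMSerializer interface, which older Gecko versions provided for code running without a window. The `scope` parameter and both function names are hypothetical; `scope` stands in for whatever global object your code runs in.

```javascript
// Hypothetical helper: pick whichever serializer the given global offers.
function createSerializer(scope) {
  if (typeof scope.XMLSerializer === "function") {
    // Normal case: a window (or overlay script) with the DOM constructor.
    return new scope.XMLSerializer();
  }
  // Assumed legacy XPCOM path for code that has no XMLSerializer constructor.
  return scope.Components
    .classes["@mozilla.org/xmlextras/xmlserializer;1"]
    .createInstance(scope.Components.interfaces.nsIDOMSerializer);
}

// Serialize the parsed DOM; this is not the original source as sent by the server.
function serializeDocument(scope, doc) {
  return createSerializer(scope).serializeToString(doc);
}
```

In an overlay script, this would be called as serializeDocument(window, content.document), assuming `content` refers to the current tab's window.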

Answer 2

It really looks like there is no way to get "all the source code". You may use

document.documentElement.innerHTML

to get the innerHTML of the top element (usually html). If you have a PHP error message like

<h3>fatal error</h3>
segfault

<html>
    <head>
        <title>bla</title>
        <script type="text/javascript">
            alert(document.documentElement.innerHTML);
        </script>
    </head>
    <body>
    </body>
</html>

the innerHTML would be

<head>
<title>bla</title></head><body><h3>fatal error</h3>
segfault    
        <script type="text/javascript">
            alert(document.documentElement.innerHTML);
        </script></body>

but the error message would still be retained.

Edit: documentElement is described here: https://developer.mozilla.org/en/DOM/document.documentElement
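Since innerHTML leaves out the html tag itself and the doctype, a rough way to get closer to a full document is to rebuild the doctype from document.doctype and use outerHTML. This is only a sketch: doctypeToString and parsedMarkup are hypothetical names, outerHTML is assumed to be available (Firefox added it in version 11), and the result is still the parsed DOM written back out, not the bytes the server sent.

```javascript
// Hypothetical helper: rebuild a doctype declaration from a DocumentType node.
function doctypeToString(doctype) {
  if (!doctype) return "";
  var text = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    text += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    text += " SYSTEM";
  }
  if (doctype.systemId) {
    text += ' "' + doctype.systemId + '"';
  }
  return text + ">";
}

// Hypothetical helper: doctype (if any) followed by the serialized root element.
function parsedMarkup(doc) {
  var doctype = doctypeToString(doc.doctype);
  return (doctype ? doctype + "\n" : "") + doc.documentElement.outerHTML;
}
```

Calling parsedMarkup(document) in a page would then return the doctype line followed by the html element as the parser rebuilt it.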

Answer 3

You can get the URL with var URL = document.location.href and navigate to "view-source:" + URL.

Now you can fetch the whole source code (viewsource is the id of the body):

var code = document.getElementById('viewsource').innerHTML;

The problem is that the source code is formatted with highlighting markup, so you have to run strip_tags() and htmlspecialchars_decode() to fix it.

For example, line 1 should be the doctype and line 2 should look like:

&lt;<span class="start-tag">HTML</span>&gt;

So after strip_tags() it becomes:

&lt;HTML&gt;

And after htmlspecialchars_decode() we finally get the expected result:

<HTML>

The code is not passed through the DOM parser, so you can view invalid HTML too.
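strip_tags() and htmlspecialchars_decode() are PHP functions, so an extension would need JavaScript equivalents. The sketch below is one possible approximation with hypothetical function names. It assumes the view-source markup escapes every literal angle bracket in the page, so the only real tags are the highlighting spans, and it decodes only the few entities listed in ENTITIES.

```javascript
// Entities handled by this sketch; extend the map if your pages need more.
var ENTITIES = { lt: "<", gt: ">", amp: "&", quot: '"', "#39": "'" };

// Rough stand-in for PHP's strip_tags(): drop anything that looks like a tag.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Rough stand-in for htmlspecialchars_decode(): a single pass, so "&amp;lt;"
// becomes "&lt;" and is not decoded a second time.
function decodeEntities(text) {
  return text.replace(/&(lt|gt|amp|quot|#39);/g, function (match, name) {
    return ENTITIES[name];
  });
}

// Hypothetical helper combining both steps, in the order described above.
function sourceFromViewSourceMarkup(html) {
  return decodeEntities(stripTags(html));
}

// Example with the line shown above:
// sourceFromViewSourceMarkup('&lt;<span class="start-tag">HTML</span>&gt;')
// returns "<HTML>"
```

Reading textContent instead of innerHTML sidesteps both steps, because the DOM already returns the text without tags and with entities decoded.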

Answer 4

Maybe you can get it via the DOM, using

var source = document.getElementsByTagName("html");

and fetch the source using DOMParser

https://developer.mozilla.org/En/DOMParser

Answer 5

Use the first part of Sagi's answer, but with document.getElementById('viewsource').textContent instead.

Answer 6

More in line with Lachlan's answer, but there is a discussion of the internals here that gets quite in depth, going into the C++ code.

http://www.mail-archive.com/mozilla-embedding@mozilla.org/msg05391.html

and then follow the replies at the bottom.

6 answers

Answer 1: score 6, 2010-03-06 14:34:02
Answer 2: score 2, accepted, 2010-03-02 14:45:01
Answer 3: score 2, 2010-03-05 14:16:39
Answer 4: score 1, 2010-03-01 13:36:05
Answer 5: score 0, 2010-03-06 16:49:00
Answer 6: score 0, 2010-04-12 10:22:20
