Chrome extension - get the HTML DOM before the browser loads JS

Original post: 2015-06-11 22:43:57 · Tags: javascript, html, dom, google-chrome-extension, onbeforeload

I'm developing a Chrome extension that needs to block the loading of an HTML page, run some validations in my content script on the JavaScript that comes with the page, and then proceed (or not) with loading the page.

In my manifest, with "run_at": "document_start", the content script gets an empty HTML document and can't do the validation. With "run_at": "document_end", the JS that comes with the page has already executed, and only after that does my extension validate it...

Is there a way to set something like a DOMContentBeforeLoad in my content script? I'm really out of options.

Thanks

2 answers

Take a look at how TopLevel.js works: https://github.com/kristopolous/TopLevel (the interesting source is at https://github.com/kristopolous/TopLevel/blob/master/toplevel.js).

It's a library you explicitly include in your page. When it is reached in the page and run, it immediately document.write()s a <plaintext> element with style='display: none', which stops the browser from parsing the rest of the page and hides the plain-text result (plaintext is a deprecated element that stops interpreting page content and treats all the HTML after it as unparsed plain text: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/plaintext).

TopLevel then parses the text content of the <plaintext> element itself (and does some templating, which is the point of the library), and document.write()s the resulting new content to the page by hand.

You should be able to do something similar: inject a <plaintext> element to stop the browser from parsing the page, parse it yourself (or do whatever you want with it), and then, once you're happy, write out whatever you like (including the original content) to the page.
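A rough sketch of that idea, written for a script that is included synchronously in the page the way TopLevel.js is. The function names `haltParsing` and `captureAndRewrite` and the `validate` callback are made up for illustration; this is not TopLevel's actual code, and whether document.write() behaves the same way when called from a content script at document_start is an assumption you would need to test.

```javascript
// Sketch only: haltParsing, captureAndRewrite and validate are hypothetical names.

// Must run from a synchronous <script> while the page is still being parsed.
function haltParsing(doc) {
  // Everything after this tag is treated as raw text, so later <script> tags never run.
  doc.write('<plaintext style="display:none">');
}

// Reads the raw text captured by <plaintext>, passes it through `validate`,
// and writes whatever `validate` returns back out as a fresh document.
function captureAndRewrite(doc, validate) {
  const holder = doc.querySelector('plaintext');
  const rawHtml = holder ? holder.textContent : '';
  const approvedHtml = validate(rawHtml); // return '' to block the page entirely
  doc.open();
  doc.write(approvedHtml);
  doc.close();
  return approvedHtml;
}

// Browser wiring (skipped when there is no DOM, e.g. under Node).
if (typeof document !== 'undefined') {
  haltParsing(document);
  document.addEventListener('DOMContentLoaded', () => {
    // Identity validator: accepts the page unchanged. Replace with real checks.
    captureAndRewrite(document, (html) => html);
  });
}
```

Passing the document in as a parameter keeps the two steps easy to exercise with a stub object before trying them in a real page.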

I think that to do what you want, you will have to keep what you did with document_start, then load the HTML page via an AJAX call and parse it yourself.
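As a minimal sketch of the "fetch it and parse it yourself" step: the helper names `extractInlineScripts`, `looksSafe` and `fetchAndValidate` are hypothetical, the eval() rule is only a placeholder for whatever validation you actually need, and the regex is a simplification (in a content script, DOMParser would be a more robust way to find the scripts).

```javascript
// Sketch only: all function names here are hypothetical.

// Collects the bodies of inline <script> blocks from raw HTML text.
// Scripts that only have a src attribute have an empty body and are skipped.
function extractInlineScripts(html) {
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script\s*>/gi;
  const bodies = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const body = match[1].trim();
    if (body) bodies.push(body);
  }
  return bodies;
}

// Placeholder rule: reject any script that calls eval().
function looksSafe(scriptSource) {
  return !/\beval\s*\(/.test(scriptSource);
}

// Re-requests the page as text and reports whether every inline script passes.
async function fetchAndValidate(url) {
  const response = await fetch(url);
  const html = await response.text();
  return extractInlineScripts(html).every(looksSafe);
}
```

From a content script this would be called as something like `fetchAndValidate(location.href)`. Note that it issues a second request for the page, so the text you validate is not guaranteed to match what the browser itself received, and external scripts referenced by src are not checked at all.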

Browsers typically don't load all the scripts first and then execute them. Loading and execution happen asynchronously, in the order the scripts appear in the page, so there is no point you can hook at which all the JavaScript has loaded but none of it has executed (unless you control the content of the page as well).