
Chrome Extension: Get Page Variables in Content Script

Is there any way to retrieve a page's JavaScript variables from a Google Chrome content script?

If you really need to, you can insert a <script> element into the page's DOM; the code inside your <script> element will be executed, and that code will have access to JavaScript variables at the scope of the window. You can then communicate them back to the content script using data- attributes and firing custom events.

Sound awkward? Why yes, it is, and intentionally so, for all the reasons in the documentation that serg has cited. But if you really, really need to do it, it can be done. See here and here for more info. And good luck!
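The data- attribute plus custom event idea described above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the attribute name `data-page-var`, the event name `pageVarReady`, and both function names are hypothetical, and it assumes the page allows the injected inline script to run.

```javascript
// Builds the source of a snippet that runs in the page's own context.
// It copies one global into a data- attribute and then fires a custom event.
function buildReporterSource(variableName, eventName) {
  const reporter = function (name, evt) {
    const json = JSON.stringify(window[name]);
    // JSON.stringify returns undefined for undefined values and functions
    document.documentElement.setAttribute(
      'data-page-var', json === undefined ? 'null' : json);
    document.dispatchEvent(new CustomEvent(evt));
  };
  return '(' + reporter.toString() + ')(' +
    JSON.stringify(variableName) + ', ' + JSON.stringify(eventName) + ');';
}

// Content-script side (browser only): listen first, then inject the snippet.
function readPageVariable(variableName, callback) {
  const eventName = 'pageVarReady'; // hypothetical event name
  document.addEventListener(eventName, function onReady() {
    document.removeEventListener(eventName, onReady);
    const root = document.documentElement;
    const raw = root.getAttribute('data-page-var');
    root.removeAttribute('data-page-var');
    callback(JSON.parse(raw));
  });
  const script = document.createElement('script');
  script.textContent = buildReporterSource(variableName, eventName);
  (document.head || document.documentElement).appendChild(script);
  script.remove();
}
```

In a content script you would call something like `readPageVariable('someGlobal', value => console.log(value));`. Only JSON-compatible values survive the trip through the attribute.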

I created a little helper method, have fun :)

To retrieve the window's variables "lannister", "always", "pays", "his", "debts", you execute the following:

var windowVariables = retrieveWindowVariables(["lannister", "always", "pays", "his", "debts"]);
console.log(windowVariables.lannister);
console.log(windowVariables.always);

My code:

function retrieveWindowVariables(variables) {
    var ret = {};

    var scriptContent = "";
    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        scriptContent += "if (typeof " + currVariable + " !== 'undefined') $('body').attr('tmp_" + currVariable + "', " + currVariable + ");\n"
    }

    var script = document.createElement('script');
    script.id = 'tmpScript';
    script.appendChild(document.createTextNode(scriptContent));
    (document.body || document.head || document.documentElement).appendChild(script);

    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        ret[currVariable] = $("body").attr("tmp_" + currVariable);
        $("body").removeAttr("tmp_" + currVariable);
    }

    $("#tmpScript").remove();

    return ret;
}

Please note that I used jQuery, both in the content script and in the injected snippet, so the page itself must also load jQuery. You can easily use the native JS "setAttribute", "getAttribute", "removeAttribute" and "removeChild" instead.

Using Liran's solution, I'm adding a fix for objects. Here's the corrected solution:

function retrieveWindowVariables(variables) {
    var ret = {};

    var scriptContent = "";
    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        scriptContent += "if (typeof " + currVariable + " !== 'undefined') $('body').attr('tmp_" + currVariable + "', JSON.stringify(" + currVariable + "));\n"
    }

    var script = document.createElement('script');
    script.id = 'tmpScript';
    script.appendChild(document.createTextNode(scriptContent));
    (document.body || document.head || document.documentElement).appendChild(script);

    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        ret[currVariable] = $.parseJSON($("body").attr("tmp_" + currVariable));
        $("body").removeAttr("tmp_" + currVariable);
    }

     $("#tmpScript").remove();

    return ret;
}

Chrome's documentation gives you a good starting point: https://developer.chrome.com/extensions/content_scripts#host-page-communication

This method allows you to extract a global page variable to your content script. It also only accepts incoming messages that it recognizes by your handshake. You can also just use Math.random() for the handshake, but I was having some fun.

Explanation

  1. This method creates a script tag.
  2. It stringifies the function propagateVariable and passes the current handShake and targeted variable name into the string for preservation, since the function will not have access to our content script scope.
  3. Then it injects that script tag into the page.
  4. We then create a listener in our content script, waiting to hear back from the page to pass back the variable we're after.
  5. By now the injected script has hit the page.
  6. The injected code was wrapped in an IIFE, so it runs itself, pushing the data to the listener.
  7. Optional: The listener makes sure that it had the correct handshake, and voila, we can trust the source of the data (it's not actually secure, but it helps create an identifier in this case, which gives us some level of trust).
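The check in step 7 can be pulled out into a small predicate. The sketch below is an assumption-laden illustration: the name `isOwnMessage` is hypothetical, and the extra `event.source` comparison is not part of this answer's code but follows the pattern used in Chrome's host-page-communication documentation.

```javascript
// Returns true only for messages that were posted by this same window
// and that carry the handshake token we generated.
function isOwnMessage(event, currentWindow, handShake) {
  return event.source === currentWindow &&
    event.data !== null &&
    typeof event.data === 'object' &&
    event.data.handShake === handShake;
}

// Content-script usage (browser only):
// window.addEventListener('message', (event) => {
//   if (!isOwnMessage(event, window, handShake)) return;
//   console.log('Content script received:', event.data);
// });
```

As the step itself says, this is an identifier rather than real security: any script on the page can read messages posted with a "*" target.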

Round 1

v1.0

const globalToExtract = 'someVariableName';
const array = new Uint32Array(5);
const handShake = window.crypto.getRandomValues(array).toString();

function propagateVariable(handShake, variableName) {
  const message = { handShake };
  message[variableName] = window[variableName];
  window.postMessage(message, "*");
}

(function injectPropagator() {
  const script = `( ${propagateVariable.toString()} )('${handShake}', '${globalToExtract}');`;
  const scriptTag = document.createElement('script');
  const scriptBody = document.createTextNode(script);

  scriptTag.id = 'chromeExtensionDataPropagator';
  scriptTag.appendChild(scriptBody);
  document.body.append(scriptTag);
})();

window.addEventListener("message", function({data}) {
  console.log("INCOMINGGGG!", data);
  // We only accept messages from ourselves
  if (data.handShake != handShake) return;
  console.log("Content script received: ", data);
}, false);

v1.1 With Promise!

function extractGlobal(variableName) {
  const array = new Uint32Array(5);
  const handShake = window.crypto.getRandomValues(array).toString();

  function propagateVariable(handShake, variableName) {
    const message = { handShake };
    message[variableName] = window[variableName];
    window.postMessage(message, "*");
  }

  (function injectPropagator() {
    const script = `( ${propagateVariable.toString()} )('${handShake}', '${variableName}');`;
    const scriptTag = document.createElement('script');
    const scriptBody = document.createTextNode(script);

    scriptTag.id = 'chromeExtensionDataPropagator';
    scriptTag.appendChild(scriptBody);
    document.body.append(scriptTag);
  })();

  return new Promise(resolve => {
    window.addEventListener("message", function({data}) {
      // We only accept messages from ourselves
      if (data.handShake != handShake) return;
      resolve(data);
    }, false);
  });
}

extractGlobal('someVariableName').then(data => {
  // Do Work Here
});

Round 2 - Class & Promises

v2.0

I would recommend tossing the class into its own file and exporting it as a default if using ES modules. Then it simply becomes:

new ExtractPageVariable('someGlobalPageVariable').data.then(pageVar => {
  // Do work here 💪
});

class ExtractPageVariable {
  constructor(variableName) {
    this._variableName = variableName;
    this._handShake = this._generateHandshake();
    this._inject();
    this._data = this._listen();
  }

  get data() {
    return this._data;
  }

  // Private

  _generateHandshake() {
    const array = new Uint32Array(5);
    return window.crypto.getRandomValues(array).toString();
  }

  _inject() {
    function propagateVariable(handShake, variableName) {
      const message = { handShake };
      message[variableName] = window[variableName];
      window.postMessage(message, "*");
    }

    const script = `( ${propagateVariable.toString()} )('${this._handShake}', '${this._variableName}');`;
    const scriptTag = document.createElement('script');
    const scriptBody = document.createTextNode(script);

    scriptTag.id = 'chromeExtensionDataPropagator';
    scriptTag.appendChild(scriptBody);
    document.body.append(scriptTag);
  }

  _listen() {
    return new Promise(resolve => {
      window.addEventListener("message", ({data}) => {
        // We only accept messages from ourselves
        if (data.handShake != this._handShake) return;
        resolve(data);
      }, false);
    });
  }
}

const windowData = new ExtractPageVariable('somePageVariable').data;

windowData.then(console.log);
windowData.then(data => {
  // Do work here
});
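One thing to keep in mind with the Promise versions above: if the page never posts a matching message (for example because the injected script was blocked), the Promise stays pending forever. A possible guard, sketched here with a hypothetical helper name `withTimeout` that is not part of the answer, is to race it against a timer:

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`No reply within ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Browser usage with the class above:
// withTimeout(new ExtractPageVariable('somePageVariable').data, 2000)
//   .then(data => { /* do work here */ })
//   .catch(err => console.warn(err.message));
```

The 2000 ms figure is arbitrary; pick whatever suits how quickly the page defines the variable.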

As explained partially in other answers, the JS variables from the page are isolated from your Chrome extension content script. Normally, there's no way to access them.

But if you inject a <script> tag into the page, the code in it will have access to whichever variables are defined there.

I use a utility function to inject my script into the page:

/**
 * inject - Inject some javascript in order to expose JS variables to our content JavaScript
 * @param {string} source - the JS source code to execute
 * Example: inject('(' + myFunction.toString() + ')()');
 */
function inject(source) {
  const j = document.createElement('script'),
    f = document.getElementsByTagName('script')[0];
  j.textContent = source;
  f.parentNode.insertBefore(j, f);
  f.parentNode.removeChild(j);
}

Then you can do:

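function getJSvar(whichVar) {
   // Runs in the page context: store the variable's value (not its name) in a data- attribute
   document.body.setAttribute('data-' + whichVar, window[whichVar]);
}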
inject('(' + getJSvar.toString() + ')("somePageVariable")');

var pageVar = document.body.getAttribute('data-somePageVariable');

Note that if the variable is a complex data type (object, array...), you will need to store the value as a JSON string in getJSvar(), and JSON.parse it back in your content script.
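A minimal sketch of that JSON round trip is shown below. The helper names `toAttributeValue` and `fromAttributeValue` are hypothetical, not part of the answer; the first would be used inside the injected function, the second in the content script.

```javascript
// Page side: turn a value into a string that can live in a data- attribute.
function toAttributeValue(value) {
  const json = JSON.stringify(value);
  // JSON.stringify returns undefined for undefined values and functions
  return json === undefined ? 'null' : json;
}

// Content-script side: getAttribute returns null when the attribute is absent.
function fromAttributeValue(raw) {
  return raw === null ? undefined : JSON.parse(raw);
}

// Sketch of how they would slot into the code above (browser only):
// function getJSvar(whichVar) {
//   document.body.setAttribute('data-' + whichVar, JSON.stringify(window[whichVar]));
// }
// var pageVar = fromAttributeValue(document.body.getAttribute('data-somePageVariable'));
```

Note that the injected function is stringified, so it cannot call helpers defined in the content script; that is why the commented `getJSvar` calls JSON.stringify directly.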

This is way late, but I just had the same requirement and created a simple standalone class to make getting variable values (or calling functions on objects in the page) really, really easy. I used pieces from other answers on this page, which were very useful.

It works by injecting a script tag into the page that accesses the variable you want and writes the serialised value into a div as innerText. It then reads and deserialises this value and deletes the div and script elements it injected, so the DOM is back to exactly what it was before.

    var objNativeGetter = {

        divsToTidyup: [],
        DIVID: 'someUniqueDivId',
        _tidyUp: function () {
            console.log(['going to tidy up ', this.divsToTidyup]);
            var el;
            while(el = this.divsToTidyup.shift()) {
                console.log('removing element with ID : ' + el.getAttribute('id'));
                el.parentNode.removeChild(el);
            }
        },

        // create a div to hold the serialised version of what we want to get at
        _createTheDiv: function () {
            var div = document.createElement('div');
            div.setAttribute('id', this.DIVID);
            div.innerText = '';
            document.body.appendChild(div);
            this.divsToTidyup.push(div);
        },

        _getTheValue: function () {
            return JSON.parse(document.getElementById(this.DIVID).innerText);
        },

        // find the page variable from the stringified version of what you would normally use to look in the symbol table
        // eg. pbjs.adUnits would be sent as the string: 'pbjs.adUnits'
        _findTheVar: function (strIdentifier) {
            var script = document.createElement('script');
            script.setAttribute('id', 'scrUnique');
            script.textContent = "\nconsole.log(['going to stringify the data into a div...', JSON.stringify(" + strIdentifier + ")]);\ndocument.getElementById('" + this.DIVID + "').innerText = JSON.stringify(" + strIdentifier + ");\n";
            (document.head||document.documentElement).appendChild(script);
            this.divsToTidyup.push(script);
        },

        // this is the only call you need to make eg.:
        // var val = objNativeGetter.find('someObject.someValue');
        // sendResponse({theValueYouWant: val});
        find: function(strIdentifier) {
            this._createTheDiv();
            this._findTheVar(strIdentifier);
            var ret = this._getTheValue();
            this._tidyUp();
            return ret;
        }
    };

You use it like this:

chrome.runtime.onMessage.addListener(
    function(request, sender, sendResponse) {

        var objNativeGetter = {
        .... the object code, above
        }

        // do some validation, then carefully call objNativeGetter.find(...) with a known string (don't use any user generated or dynamic string - keep tight control over this)
        var val = objNativeGetter.find('somePageObj.someMethod()');
        sendResponse({theValueYouWant: val});
    }
);
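Because the string passed to find() is concatenated straight into the injected script, the "keep tight control" advice in the comment above matters. One possible guard, sketched here with a hypothetical helper `isSafeIdentifierPath` that is not part of the answer, is to accept only plain dotted identifier paths:

```javascript
// Matches dotted identifier paths such as "pbjs.adUnits".
// Calls, operators, quotes, brackets and whitespace are all rejected.
const IDENTIFIER_PATH = /^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$/;

function isSafeIdentifierPath(str) {
  return typeof str === 'string' && IDENTIFIER_PATH.test(str);
}

// Usage sketch inside the message listener (browser only):
// if (!isSafeIdentifierPath(request.path)) return; // request.path is hypothetical
// sendResponse({ theValueYouWant: objNativeGetter.find(request.path) });
```

This deliberately rejects method calls such as 'somePageObj.someMethod()'; strings like that would have to stay hard-coded, as in the example above.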

I actually worked around it using the localStorage API. Note: to use this, our content script should be able to read the page's localStorage. In the manifest.json file, I added the "storage" string (strictly speaking, that permission governs the chrome.storage API; a content script can read the page's window.localStorage without it):

"permissions": [...,"storage"]

The hijack function lives in the content script:

function hijack(callback) {
    "use strict";
    var code = function() {
      //We have access to topframe - no longer a contentscript          
      var ourLocalStorageObject = {
        globalVar: window.globalVar,
        globalVar2: window.globalVar2
      };
      var dataString = JSON.stringify(ourLocalStorageObject);
      localStorage.setItem("ourLocalStorageObject", dataString);
    };
    var script = document.createElement('script');
    script.textContent = '(' + code + ')()';
    (document.head||document.documentElement).appendChild(script);
    script.parentNode.removeChild(script);
    callback();
  }

Now we can call it from the content script:

document.addEventListener("DOMContentLoaded", function(event) { 
    hijack(callback);
});

Or, if you use jQuery in your content script like I do:

$(document).ready(function() { 
    hijack(callback);
});

To extract the content:

function callback() {
    var localStorageString = localStorage.getItem("ourLocalStorageObject");
    var ourLocalStorageObject= JSON.parse(localStorageString);

    console.log("I can see now on content script", ourLocalStorageObject);
    //(optional cleanup):
    localStorage.removeItem("ourLocalStorageObject");
}

This can be called multiple times, so if your page changes elements or internal code, you can add event listeners to update your extension with the new data.

Edit: I've added callbacks so you can be sure your data won't be invalid (I had this issue myself).

No.

Content scripts execute in a special environment called an isolated world. They have access to the DOM of the page they are injected into, but not to any JavaScript variables or functions created by the page. It looks to each content script as if there is no other JavaScript executing on the page it is running on. The same is true in reverse: JavaScript running on the page cannot call any functions or access any variables defined by content scripts.

Isolated worlds allow each content script to make changes to its JavaScript environment without worrying about conflicting with the page or with other content scripts. For example, a content script could include JQuery v1 and the page could include JQuery v2, and they wouldn't conflict with each other.

Another important benefit of isolated worlds is that they completely separate the JavaScript on the page from the JavaScript in extensions. This allows us to offer extra functionality to content scripts that should not be accessible from web pages without worrying about web pages accessing it.

If you know which variables you want to access, you can make a quick custom content script to retrieve their values.

In popup.js:

chrome.tabs.executeScript(null, {code: 'var name = "property"'}, function() {
    chrome.tabs.executeScript(null, {file: "retrieveValue.js"}, function(ret) {
        for (var i = 0; i < ret.length; i++) {
            console.log(ret[i]); //prints out each returned element in the array
        }
    });
});

In retrieveValue.js:

function returnValues() {
    return document.getElementById("element")[name];
    //return any variables you need to retrieve
}
returnValues();

You can modify the code to return arrays or other objects.
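The snippet above uses the older chrome.tabs.executeScript API, which runs code in the isolated world and therefore sees DOM properties rather than page variables. As a rough sketch of an alternative, assuming a Manifest V3 extension with the "scripting" permission and host access to the tab, chrome.scripting.executeScript can run a function in the page's own world. The function name `readPageGlobal` is hypothetical.

```javascript
// Reads one global variable from the page in the given tab.
// Assumes Manifest V3, the "scripting" permission, and host access to the tab.
async function readPageGlobal(tabId, variableName) {
  const [injection] = await chrome.scripting.executeScript({
    target: { tabId },
    world: 'MAIN',                 // run in the page's world, not the isolated one
    func: (name) => window[name],  // serialized and executed inside the page
    args: [variableName],
  });
  return injection.result;         // only serializable values come back
}

// Usage from a popup or service worker (browser only):
// readPageGlobal(tab.id, 'somePageVariable').then(value => console.log(value));
```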

Works with any JSON-serializable data type. Dates come back as strings and need to be parsed after retrieval.

/**
 * Retrieves page variable or page function value in content script.
 * 
 * Example 1:
 * var x = 'Hello, World!';
 * var y = getPageValue('x'); // Hello, World!
 * 
 * Example 2:
 * function x() { return 'Hello, World!' }
 * var y = getPageValue('x()'); // Hello, World!
 * 
 * Example 3:
 * function x(a, b) { return a + b }
 * var y = getPageValue('x("Hello,", " World!")'); // Hello, World!
 */
function getPageValue(code) {
    const dataname = (new Date()).getTime(); 
    const content = `(()=>{document.body.setAttribute('data-${dataname}', JSON.stringify(${code}));})();`;
    const script = document.createElement('script');
    
    script.textContent = content;
    document.body.appendChild(script);
    script.remove();

    const result = JSON.parse(document.body.getAttribute(`data-${dataname}`));
    document.body.removeAttribute(`data-${dataname}`);
   
    return result;
}
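To illustrate the remark about dates: JSON.stringify turns a Date into an ISO string, so it has to be converted back after retrieval. One way, sketched with a hypothetical reviver named `reviveDates`, is to pass a reviver to JSON.parse:

```javascript
// Matches the format produced by Date.prototype.toJSON, e.g. 1970-01-01T00:00:00.000Z
const ISO_DATE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// JSON.parse reviver: turns ISO-looking strings back into Date objects.
function reviveDates(key, value) {
  return typeof value === 'string' && ISO_DATE.test(value) ? new Date(value) : value;
}

const wire = JSON.stringify({ created: new Date(0), label: 'epoch' });
const plain = JSON.parse(wire);                // created is a string here
const revived = JSON.parse(wire, reviveDates); // created is a Date again
```

The caveat is that any string that happens to look like an ISO timestamp is converted too, so a reviver like this only suits data whose shape you know.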

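Disclaimer: The technical posts on this site are licensed under CC BY-SA 4.0. If you need to reproduce them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.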

 