Scrape a website's JavaScript variables using NodeJS

I am trying to scrape the runtime values of a website's JavaScript variables from my NodeJS application.

I have tried "cheerio", but it didn't work. It only returns the target's HTML as a string, not the runtime values of the variables.
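The gap described above can be reproduced without any network access. Below is a minimal sketch, assuming a made-up HTML snippet (the `1 + 1` value is purely illustrative, not what Google serves) and using Node's built-in `vm` module as a stand-in for a browser's script engine. The helper names `extractScriptSource` and `runScript` are hypothetical. A static parser only ever sees the script as text; the variable gets a value only once something executes that text.

```javascript
const vm = require('vm');

// Hypothetical page source: the variable is only assigned when the script runs.
const html =
  '<html><body><script>var google = { authuser: 1 + 1 };</script></body></html>';

// What a static HTML parser gives you: the script source as plain text.
function extractScriptSource(markup) {
  const match = markup.match(/<script>([\s\S]*?)<\/script>/);
  return match ? match[1] : '';
}

// What a browser does in addition: execute the script in a global context.
function runScript(source) {
  const sandbox = {};
  vm.createContext(sandbox);
  vm.runInContext(source, sandbox);
  return sandbox; // top-level `var` declarations end up as properties here
}

const source = extractScriptSource(html);
const globals = runScript(source);

console.log(typeof source);           // the parser's view: just a string
console.log(globals.google.authuser); // the runtime view: an evaluated value
```

Real pages load many scripts and depend on the DOM, so `vm` alone is not a practical scraper. It only illustrates why a tool that executes the page's JavaScript is needed.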

For example, I want the value of "google.authuser" on "www.google.com".

Please suggest a simple solution. Thanks.

Thanks for recommending a headless browser. Puppeteer, the NodeJS API for headless Chrome, worked for me.

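const puppeteer = require('puppeteer');

async function crawl() {
    // Launch headless Chrome so the page's scripts actually execute.
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto('http://www.google.com');

        // The callback runs inside the page, where `google` is a global.
        return await page.evaluate(() => {
            return { number: google.authuser };
        });
    } finally {
        // Always close the browser, even if navigation or evaluation fails.
        await browser.close();
    }
}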
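Some globals are only defined after the page's scripts have finished running, so it can help to wait for the variable before reading it. The following is a rough sketch, not part of the original answer: `resolvePath` and `readGlobal` are hypothetical helper names, the 5000 ms default timeout is an arbitrary assumption, and `page` is assumed to be a Puppeteer `Page` such as the one created in `crawl()`. It relies only on `page.waitForFunction` and `page.evaluate`.

```javascript
// Resolve a dotted path such as "google.authuser" against the global object.
// This function is passed to page.evaluate, so it runs inside the page and
// must not reference any variables from the Node side.
function resolvePath(path) {
  return path.split('.').reduce(
    (obj, key) => (obj == null ? undefined : obj[key]),
    globalThis
  );
}

// Hypothetical helper: wait until the global exists in the page, then read it.
async function readGlobal(page, path, timeout = 5000) {
  // Poll inside the page until the path resolves to something other than undefined.
  await page.waitForFunction(
    (p) =>
      p.split('.').reduce((o, k) => (o == null ? undefined : o[k]), globalThis) !==
      undefined,
    { timeout },
    path
  );
  // Extra arguments to page.evaluate are serialized and passed to the function.
  return page.evaluate(resolvePath, path);
}
```

Inside `crawl()`, the `page.evaluate(...)` call could then be swapped for `await readGlobal(page, 'google.authuser')`. Keep in mind that only serializable values (numbers, strings, plain objects) survive the trip back from the page to Node.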


 