
How do you return an object from the browser environment to the Node environment in Puppeteer?

I have the following code that attempts to scrape all the 'Add to basket' button elements from the page, put them in an array and return that array to the Node environment.

const puppeteer = require('puppeteer');

let getArrayofButtons = async () => {
  const browser = await puppeteer.launch({
    devtools: true,
  });

  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 1800 });

  await page.goto('http://books.toscrape.com/', {
    waitUntil: 'domcontentloaded',
  });

  await page.waitForSelector('.product_pod');
  let buttons = [];

  await page.evaluate(() => {
    buttons = [...document.querySelectorAll('*')].filter(e =>
      [...e.childNodes].find(n => n.nodeValue?.match('basket'))
    );
    console.log(buttons);
  });
  // browser.close();
};
getArrayofButtons().then(returnedButtons => {
  console.log(returnedButtons);
});

When I console.log(buttons) inside page.evaluate(), I can see the array of button elements in the browser console, but when I try to return that array to the Node environment I get undefined.

My understanding is that page.evaluate() will return the value of the function passed to it, so if I replace:

buttons = [...document.querySelectorAll('*')].filter(e => [...e.childNodes].find(n => n.nodeValue?.match('basket')));

with:

return [...document.querySelectorAll('*')].filter(e => [...e.childNodes].find(n => n.nodeValue?.match('basket')));

it seems like it should work. Am I not resolving the Promise correctly?

You can call page.evaluateHandle() to get a handle (a pointer) to that result, which stays in the browser.

const arrayHandle = await page.evaluateHandle(() => {
  // Declare the array locally so it does not leak into the page as an implicit global
  const buttons = [...document.querySelectorAll('*')].filter(e =>
    [...e.childNodes].find(n => n.nodeValue?.match('basket'))
  );
  return buttons;
});

Notice that arrayHandle is not an array. It is a JSHandle pointing to the array in the browser.

If you want to work with each button on the Node side, you need to unpack that handle by calling its getProperties() function.

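// getProperties() resolves to a Map of property name -> JSHandle (one entry per array index)
const properties = await arrayHandle.getProperties();
// The array handle itself is no longer needed once its properties are fetched
await arrayHandle.dispose();
const buttons = [];
for (const property of properties.values()) {
  // asElement() returns null when the handle does not point to a DOM element
  const elementHandle = property.asElement();
  if (elementHandle) {
    buttons.push(elementHandle);
  }
}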

Yes, that is quite a bit of boilerplate. Alternatively, you can pass the handle itself back into an evaluate function, as long as you have not disposed of it yet:

await page.evaluate(elements => elements[0].click(), arrayHandle);
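The unpacking loop above could be wrapped in a small reusable helper. The sketch below is only an illustration: the name toElementHandles is hypothetical, and it assumes the handle passed in points to an in-page array. It uses only the getProperties(), asElement() and dispose() methods already shown in this answer.

```javascript
// Hypothetical helper: turns a JSHandle that points at an in-page array
// into a plain Node array of ElementHandles.
async function toElementHandles(arrayHandle) {
  // Map of property name -> JSHandle, one entry per array index
  const properties = await arrayHandle.getProperties();
  const elements = [];
  for (const property of properties.values()) {
    const element = property.asElement(); // null for non-element values
    if (element) {
      elements.push(element);
    } else {
      await property.dispose(); // release handles we are not going to keep
    }
  }
  // The array handle is no longer needed once its entries are extracted
  await arrayHandle.dispose();
  return elements;
}
```

With this in place, something like const buttons = await toElementHandles(arrayHandle); followed by await buttons[0].click(); would click the first match from the Node side. Each returned ElementHandle should still be disposed once you are done with it.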

Unfortunately, page.evaluate() can only transfer serializable data (roughly, what JSON can represent). DOM elements are not serializable. Consider returning an array of strings or similar plain data instead (HTML markup, attributes, text content, etc.).

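Also, buttons is declared in the Puppeteer (Node.js) context and is not available in the browser context (inside the function passed to page.evaluate()). Assigning to it there does not change the Node variable. So you need const buttons = await page.evaluate(...) here, with the inner function returning the data, and getArrayofButtons() itself must return buttons, otherwise returnedButtons will still be undefined.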
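Putting both points together, a minimal sketch might look like the following. The function names collectBasketButtons and getButtonData and the { tag, text } result shape are assumptions made for illustration. The URL, selector and filter logic come from the question.

```javascript
// Runs inside the page. `document` is the browser global; no Node variables
// are visible here. Each element is reduced to plain, serializable data.
function collectBasketButtons() {
  return [...document.querySelectorAll('*')]
    .filter(e => [...e.childNodes].some(n => n.nodeValue?.match('basket')))
    .map(e => ({ tag: e.tagName, text: e.textContent.trim() }));
}

async function getButtonData() {
  // Loaded here so the helper above can be tried without a browser installed
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('http://books.toscrape.com/', {
      waitUntil: 'domcontentloaded',
    });
    await page.waitForSelector('.product_pod');
    // evaluate() resolves to whatever the in-page function returns
    const buttons = await page.evaluate(collectBasketButtons);
    return buttons; // returned to the caller, so .then() receives it
  } finally {
    await browser.close();
  }
}
```

Calling getButtonData().then(returnedButtons => console.log(returnedButtons)) should then log an array of plain objects rather than undefined. If you need to interact with the elements afterwards (for example to click them), plain data is not enough and a handle-based approach is the better fit.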


 